# Dataset

### Overview

The Dataset page in LLM Labs collects all datasets available for [automated evaluation](https://docs.datasaur.ai/llm-projects/evaluation/automated-evaluation) or [fine-tuning](https://docs.datasaur.ai/llm-projects/models/fine-tuning), providing a centralized location for managing your data.

### Prerequisites

A dataset must be formatted as a CSV (comma-separated values) file with the following columns:

1. `prompt`: This column contains the input prompt that you will feed to your LLM.
2. `expected_completion`: This column holds the desired or ideal output that your LLM should generate in response to the given prompt.
3. `system_instruction` (Optional): This column contains global or contextual instructions that control how the LLM should interpret and respond to the prompt.

{% file src="/files/xM4DiBmBYdfzxLKH3YqG" %}

{% file src="/files/U9JQq45ugPPy8wNneCoW" %}
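As a rough illustration of the format described above, the following Python sketch builds CSV text with the three columns using the standard `csv` module. The helper name `build_dataset_csv` and the example rows are hypothetical, and the underscore spelling `expected_completion` is an assumption; check the sample files above for the authoritative header.

```python
import csv
import io

# Column names as listed on this page; system_instruction is optional.
# The underscore in expected_completion is an assumption.
FIELDNAMES = ["prompt", "expected_completion", "system_instruction"]


def build_dataset_csv(rows):
    """Return CSV text with a header row followed by one line per item."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing optional values are written as empty cells.
        writer.writerow({name: row.get(name, "") for name in FIELDNAMES})
    return buffer.getvalue()


# Illustrative rows only; they are not taken from the sample files.
example_rows = [
    {
        "prompt": "Translate 'good morning' to French.",
        "expected_completion": "Bonjour",
        "system_instruction": "You are a concise translator.",
    },
    {
        "prompt": "What is 2 + 2?",
        "expected_completion": "4",
    },
]

csv_text = build_dataset_csv(example_rows)
```

Writing `csv_text` to a file with a `.csv` extension should give something suitable for upload. Using `csv.DictWriter` rather than manual string joining means prompts that contain commas, quotes, or line breaks are quoted correctly.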

### Create dataset

1. Navigate to the **Dataset** menu on the left sidebar.
2. Click the **Create dataset** button.

   <figure><img src="/files/kaaIuirRh1LRTI7RzMHp" alt=""><figcaption><p>Create dataset</p></figcaption></figure>
3. Enter a dataset name, then click **Create**. You will be redirected to the dataset table.

   <figure><img src="/files/Bvci4Cgwg87SsgXIKmiO" alt=""><figcaption><p>Dataset name</p></figcaption></figure>
4. Click the **Upload dataset** button and select a .csv file containing the following columns: `prompt`, `expected_completion`, and `system_instruction` (optional).

   <figure><img src="/files/ohS45NZeVKtfuP2seeD7" alt=""><figcaption></figcaption></figure>
5. Once the file is uploaded, the dataset will be automatically added to the table.

   <figure><img src="/files/S7TDxrlvlLP7EKqSn7Sv" alt=""><figcaption><p>Dataset created</p></figcaption></figure>
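Before uploading, it can help to check the file locally. The sketch below is a hypothetical pre-upload check, not part of LLM Labs: it verifies that the required columns are present in the header and flags rows with an empty prompt. The function name `check_dataset_csv`, the underscore spelling `expected_completion`, and the empty-prompt rule are all assumptions; the platform may apply different or additional validation.

```python
import csv
import io

# Required columns per this page; the underscore spelling is an assumption.
REQUIRED_COLUMNS = {"prompt", "expected_completion"}


def check_dataset_csv(csv_text):
    """Return a list of problems found; an empty list means the checks passed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    problems = []

    missing = REQUIRED_COLUMNS - header
    if missing:
        problems.append("missing columns: " + ", ".join(sorted(missing)))
        return problems

    # Local sanity check only: flag items whose prompt cell is blank.
    for row_number, row in enumerate(reader, start=1):
        if not (row["prompt"] or "").strip():
            problems.append(f"data row {row_number}: empty prompt")
    return problems
```

For a file on disk, read its contents first, for example `check_dataset_csv(open("my_dataset.csv", newline="", encoding="utf-8").read())`, where `my_dataset.csv` is a placeholder path.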

### Modify dataset item

Once a dataset is uploaded, you can **add more**, **edit**, or **delete** dataset items.

#### Add more dataset items

1. Click the **Add dataset** button next to the Search field.
2. Upload the .csv file.
3. The new dataset items will be added to the table, and the existing ones will remain.

   <figure><img src="/files/kbZImSJTkI3ihrOdHcNG" alt=""><figcaption><p>Adding more dataset items</p></figcaption></figure>

#### Edit dataset item

1. Find the dataset item you want to edit using the **Search** field and right-click the dataset item to open a popover menu.

   <figure><img src="/files/RXeoh4gnhz6mvR3eU2B8" alt=""><figcaption></figcaption></figure>
2. Then click the **Edit** option.

   <figure><img src="/files/Me2D4ZMNWf2dk0aTrnMW" alt=""><figcaption></figcaption></figure>
3. Modify the necessary details, then press **Enter** to apply the updates.

Note: As a shortcut, you can double-click a dataset item to edit it.

#### Delete dataset item

1. Find the dataset item you want to delete using the **Search** field and right-click the dataset item to open a popover menu.

   <figure><img src="/files/ypjGYNuGhYvfMCXDn1UY" alt=""><figcaption></figcaption></figure>
2. Then click the **Delete** option.

   <figure><img src="/files/VxhSxnISVBKXfpK3oiie" alt=""><figcaption></figcaption></figure>
3. The dataset item will be deleted immediately.

{% hint style="info" %}
Please note that this action cannot be undone.
{% endhint %}

### Delete entire dataset

1. On the main Dataset page, find the dataset you want to delete using the search field or filter options. Click the **More** menu (three-dots icon), then select the **Delete** option.

   <figure><img src="/files/OdXQjlhBwCOyWRMiWuoK" alt=""><figcaption><p>Delete dataset option</p></figcaption></figure>
2. Confirm the deletion by clicking the **Delete** button.

   <figure><img src="/files/CvQRy54UPkSXH63n91bS" alt=""><figcaption></figcaption></figure>

To delete multiple datasets:

1. Select the datasets and click the **Delete** button above the table.

   <figure><img src="/files/ZxpRjyzbXlGKtatiUVEc" alt=""><figcaption></figcaption></figure>
2. Confirm the deletion by clicking the **Delete** button.

   <figure><img src="/files/aFZu2TbgqzducKyijpPL" alt=""><figcaption></figcaption></figure>

### Access via Automated evaluation

Once you've created the dataset, it will be available for use in automated evaluation projects. [Learn more about Automated evaluation](https://docs.datasaur.ai/llm-projects/evaluation/automated-evaluation).

Click **Use existing dataset** in Step 1 when creating an automated evaluation project.

<figure><img src="/files/NkuLnkUYUZQwCX2lGBdL" alt=""><figcaption></figcaption></figure>

A dialog will appear where you can choose the dataset to use for the project.

<figure><img src="/files/fNuTx0CXhJbfuT9EMua7" alt=""><figcaption></figcaption></figure>

### Access via Fine-tuning

Once you've created the dataset, it will be available for fine-tuning base models. [Learn more about Fine-tuning](https://docs.datasaur.ai/llm-projects/models/fine-tuning).

Click **Use existing dataset** in Step 1 when configuring fine-tuning.

<figure><img src="/files/OLg3tgyA7Q5619c7t5oO" alt=""><figcaption></figcaption></figure>

A dialog will appear where you can choose a dataset for fine-tuning.

<figure><img src="/files/XHDafUhuPieKtNQyHMqU" alt=""><figcaption></figcaption></figure>

