# Data Formats

Datasaur supports a wide range of data import formats. The available formats depend on the [task type](https://datasaurai.gitbook.io/datasaur/overview/task-type), as described in the table below. Click on any format to see a detailed explanation of the file structure Datasaur expects.

{% hint style="info" %}
If you don’t see your preferred file format below, you can use [file transformers](https://datasaurai.gitbook.io/datasaur/workspace-management/file-transformer/upload-file-transformer) to upload a custom format.
{% endhint %}

## Available formats

| Task type | Supported formats |
| --- | --- |
| [**Span-based**](https://docs.datasaur.ai/data-studio-projects/nlp-task-types/span-based) | [.txt](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#txt), [.tsv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#iob-specialized-tsv), [.json](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#json) |
| [**Span-based with arrows**](https://docs.datasaur.ai/lets-get-labeling/span-based#draw-arrows)                    | [.txt](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#txt), [.tsv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#iob-specialized-tsv), [.tsv-non-iob](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#tsv_non_iob), [.json](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#json), [.conllu](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#conll-u)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| [**Span-based with audio**](https://docs.datasaur.ai/data-studio-projects/nlp-task-types/span-based/audio-project) | <p><strong>Media:</strong> <a href="../../../compatibility-and-updates/supported-formats#mp3">.mp3</a>, <a href="../../../compatibility-and-updates/supported-formats#m4a">.m4a</a>, <a href="../../../compatibility-and-updates/supported-formats#aac">.aac</a>, <a href="../../../compatibility-and-updates/supported-formats#flac">.flac</a>, <a href="../../../compatibility-and-updates/supported-formats#wav">.wav</a><br><strong>Transcription:</strong> <a href="../../../compatibility-and-updates/supported-formats#srt">.srt</a>, <a href="../../../compatibility-and-updates/supported-formats#txt">.txt</a>, <a href="../../../compatibility-and-updates/supported-formats#vtt">.vtt</a>, <a href="../../../compatibility-and-updates/supported-formats#json">.json</a></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| [**Span-based with document**](https://docs.datasaur.ai/data-studio-projects/nlp-task-types/document-based)        | <p><strong>Media:</strong> <a href="../../../compatibility-and-updates/supported-formats#bmp">.bmp</a>, <a href="../../../compatibility-and-updates/supported-formats#docx-doc">.doc</a>, <a href="../../../compatibility-and-updates/supported-formats#docx-doc">.docx</a>, <a href="../../../compatibility-and-updates/supported-formats#pdf">.pdf</a>, <a href="../../../compatibility-and-updates/supported-formats#pptx-ppt">.ppt</a>, <a href="../../../compatibility-and-updates/supported-formats#pptx-ppt">.pptx</a>, <a href="../../../compatibility-and-updates/supported-formats#jpeg-and-jpg">.jpeg</a>, <a href="../../../compatibility-and-updates/supported-formats#jpeg-and-jpg">.jpg</a>, <a href="../../../compatibility-and-updates/supported-formats#png">.png</a>, <a href="../../../compatibility-and-updates/supported-formats#tiff-and-tif">.tiff</a>, <a href="../../../compatibility-and-updates/supported-formats#tiff-and-tif">.tif</a>, <a href="../../../compatibility-and-updates/supported-formats#webp">.webp</a></p><p><strong>Transcription:</strong> <a href="../../../compatibility-and-updates/supported-formats#json">.json</a>, <a href="../../../compatibility-and-updates/supported-formats#txt">.txt</a>, <a href="../../../compatibility-and-updates/supported-formats#iob-specialized-tsv">.tsv</a></p>                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| [**Bounding box labeling**](https://docs.datasaur.ai/data-studio-projects/lets-get-labeling/bounding-box-labeling) | [.bmp](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#bmp), [.gif](https://datasaurai.gitbook.io/datasaur/supported-formats#gif), [.jpeg](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#jpeg-and-jpg), [.jpg](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#jpeg-and-jpg), [.pdf](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#pdf), [.png](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#png), [.svg](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#svg), [.tiff](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#tiff-and-tif), [.tif](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#tiff-and-tif), [.webp](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#webp)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| [**Conversational labeling**](https://docs.datasaur.ai/data-studio-projects/nlp-task-types/conversational)         | [.txt](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#txt), [.json](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#json)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| [**Row-based (text classification)**](https://docs.datasaur.ai/data-studio-projects/nlp-task-types/row-based)      | [.csv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#csv), [.json](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#json), [.jsonl](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#jsonl-json-lines), [.tsv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#tsv), [.txt](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#txt), [.xls](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#xls-and-xlsx), [.xlsx](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#xls-and-xlsx)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| [**Document-based\***](https://docs.datasaur.ai/advanced/extensions/document-and-row-labeling)                     | [.bmp](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#bmp), [.csv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#csv), [.gif](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#gif), [.html](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#html), [.jpeg](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#jpeg-and-jpg), [.jpg](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#jpeg-and-jpg), [.json](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#json), [.md](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#md-markdown), [.mp4](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#mp4), [.pdf](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#pdf), [.png](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#png), [.svg](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#svg), [.tiff](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#tiff-and-tif), [.tif](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#tiff-and-tif), [.txt](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#txt), [.tsv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#iob-specialized-tsv), [.uri](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#url-urls-uri), [.url](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#url-urls-uri), [.urls](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#url), [.webp](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#webp) |
| [**LLM Evaluation (fine tuning)**](https://datasaurai.gitbook.io/datasaur/llm-projects/evaluation)                 | [.csv](https://datasaurai.gitbook.io/datasaur/getting-started/lets-get-labeling/llm-project-type#creating-an-llm-evaluation-project-in-datasaur-a-4-step-guide)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| [**LLM Ranking (RLHF)**](https://datasaurai.gitbook.io/datasaur/llm-projects/ranking-rlhf)                         | [.csv](https://datasaurai.gitbook.io/datasaur/getting-started/lets-get-labeling/llm-project-type#creating-an-llm-ranking-project-in-datasaur-a-4-step-guide)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |

## Size limit

* Text-based files: 50 MB per file
  * Examples: [.txt](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#txt), [.tsv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#iob-specialized-tsv), [.json](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#json), and [.csv](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#csv)
* Multimedia and image files: 500 MB per file
  * Examples: [Video](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#mp4), [Image](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#jpeg-and-jpg), [Audio](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#mp3), and [PDF](https://docs.datasaur.ai/compatibility-and-updates/supported-formats#pdf)
* Project size: 1.5 GB

{% hint style="info" %}
To create projects with larger files, use [Robosaur](https://docs.datasaur.ai/integrations/robosaur). For assistance, contact us at **<support@datasaur.ai>**.
{% endhint %}
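
The size limits above can be checked locally before you upload. The following is a minimal sketch, not part of Datasaur's tooling: the function name `check_upload`, the extension set, and the use of decimal megabytes (1 MB = 1,000,000 bytes) are all assumptions made for illustration. It also assumes that the project limit applies to the combined size of all files, and that any extension not listed as media falls under the stricter 50 MB limit.

```python
from pathlib import PurePath

MB = 1_000_000  # assumption: decimal megabytes; the limits do not specify MB vs. MiB
TEXT_LIMIT = 50 * MB        # per text-based file
MEDIA_LIMIT = 500 * MB      # per multimedia, image, or PDF file
PROJECT_LIMIT = 1_500 * MB  # 1.5 GB, assumed to be the total across all files

# Illustrative, not exhaustive: extensions treated as multimedia, image, or PDF.
MEDIA_EXTENSIONS = {
    ".mp4", ".mp3", ".m4a", ".aac", ".flac", ".wav",
    ".bmp", ".gif", ".jpeg", ".jpg", ".png", ".svg",
    ".tiff", ".tif", ".webp", ".pdf",
}


def check_upload(files):
    """Return a list of problems for an iterable of (name, size_in_bytes) pairs."""
    problems = []
    total = 0
    for name, size in files:
        total += size
        is_media = PurePath(name).suffix.lower() in MEDIA_EXTENSIONS
        # Assumption: anything not recognized as media gets the stricter limit.
        limit = MEDIA_LIMIT if is_media else TEXT_LIMIT
        if size > limit:
            problems.append(
                f"{name}: {size} bytes exceeds the {limit // MB} MB per-file limit"
            )
    if total > PROJECT_LIMIT:
        problems.append(
            f"project: {total} bytes exceeds the {PROJECT_LIMIT // MB} MB project limit"
        )
    return problems
```

An empty list means no limit was exceeded under these assumptions. To check real files, pass pairs such as `(path.name, path.stat().st_size)` built from `pathlib.Path` objects.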

## Important notes

When uploading documents for OCR labeling or audio labeling, make sure each media file (image or audio) and its corresponding transcription file have the same name, differing only in their extension. For example, `unicef.jpg` and `unicef.txt`.
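
The naming rule above can be verified before uploading. This is a small sketch under stated assumptions: `find_unpaired` is a hypothetical helper, not a Datasaur function, and it assumes names are matched case-sensitively on the file name without its final extension.

```python
from pathlib import PurePath


def find_unpaired(media_files, transcription_files):
    """Return (media names without a transcription, transcription names without media)."""
    # Compare on the stem, i.e. the file name with its final extension removed.
    media = {PurePath(f).stem for f in media_files}
    transcripts = {PurePath(f).stem for f in transcription_files}
    return sorted(media - transcripts), sorted(transcripts - media)


missing_transcripts, missing_media = find_unpaired(
    ["unicef.jpg", "who.png"],
    ["unicef.txt"],
)
print(missing_transcripts)  # ['who']
print(missing_media)        # []
```

Here `who.png` is reported because no transcription file shares its name, while `unicef.jpg` and `unicef.txt` pair up.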
