Data Samples

This page provides sample datasets so you can immediately create a project and test the labeling interface. As we mentioned, Datasaur has many different project types. Sample datasets are provided for the following projects: span-based, span-based with arrows, row-based, document-based, bounding-box, and conversational. If you would like to create an Audio or LLM project, select their respective links. Both of those pages contain sample data for you to upload.

Span-based

The following zip files each include a sample dataset and a label set. The .txt files in each zip folder contain the dataset to be labeled; upload them in Step 1 of the project creation wizard (PCW). The .csv file is the label set (taxonomy) to be applied to the dataset; upload it in Step 3 of the PCW.

NER samples
POS samples
OCR samples
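
Before uploading, it can help to sort the contents of a sample zip by the wizard step each file belongs to. The following is a minimal Python sketch of that idea, not part of Datasaur itself. The function name, the dictionary keys, and the archive name in the comment are hypothetical; the only rule taken from this page is that .txt files go to Step 1 and .csv files go to Step 3.

```python
from pathlib import PurePosixPath


def split_sample_files(names):
    """Group archive member names by the wizard step they belong to.

    .txt files are treated as the dataset (Step 1) and .csv files as the
    label set (Step 3). Anything else is ignored.
    """
    dataset, label_sets = [], []
    for name in names:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix == ".txt":
            dataset.append(name)
        elif suffix == ".csv":
            label_sets.append(name)
    return {"step_1_dataset": dataset, "step_3_label_set": label_sets}


# Usage with a downloaded archive (the file name here is hypothetical):
#   import zipfile
#   names = zipfile.ZipFile("ner-samples.zip").namelist()
#   print(split_sample_files(names))
```

The same grouping applies to the span-based with arrows samples if the ".txt" check is changed to ".tsv".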

Span-based with arrows

The following zip files each include a sample dataset and a label set. The .tsv files in each zip folder contain the dataset to be labeled; upload them in Step 1 of the project creation wizard (PCW). The .csv file is the label set (taxonomy) to be applied to the dataset; upload it in Step 3 of the PCW.

Dependency samples
Coreference samples
Relation samples

Row-based (textual classification)

Upload this .csv in Step 1 of the PCW. In Step 3, you can create your question set either through the UI or by uploading a .csv. This example is a sentiment analysis task.

Multiple files samples
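
If you would like to try a row-based project with your own text instead of the sample, a small .csv can be generated with a few lines of Python. This is only a sketch: the single "text" column header is an assumption made for illustration, and the header used in the sample file may differ, so compare against the download before relying on it.

```python
import csv
import io


def build_row_based_csv(texts, column="text"):
    """Return CSV text with one header row and one data row per input text.

    The default column name "text" is an assumption, not a Datasaur
    requirement.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for text in texts:
        # csv.writer quotes any value that contains a comma or a quote.
        writer.writerow([text])
    return buffer.getvalue()


sample = build_row_based_csv([
    "The delivery was fast and the packaging was great.",
    "The app crashed twice, and support never replied.",
])

# To save it for upload in Step 1 (the file name is hypothetical):
#   with open("sentiment-sample.csv", "w", newline="", encoding="utf-8") as f:
#       f.write(sample)
```

Using the csv module rather than joining strings by hand keeps rows with commas or quotation marks intact.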

Document-based (document/image classification)

The following zip files include images and PDFs for you to create a document-based project. Make sure to choose only one file type when uploading the dataset in Step 1 of the PCW. You can create your question set either in the UI or by uploading a .csv.

Image sample files
PDF samples
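
Because only one file type should be uploaded per project, a quick check over the files you plan to upload can catch a mixed selection early. The following Python sketch is one way to do that. The function name is hypothetical, and the check compares file extensions only; it does not inspect file contents.

```python
from pathlib import PurePosixPath


def single_file_type(names):
    """Return the shared extension of the given files.

    Raises ValueError if the list is empty or mixes extensions.
    """
    extensions = {PurePosixPath(name).suffix.lower() for name in names}
    if len(extensions) != 1:
        raise ValueError(f"expected one file type, found: {sorted(extensions)}")
    return extensions.pop()


# Usage (the file names are hypothetical):
#   single_file_type(["invoice-1.pdf", "invoice-2.pdf"])  returns ".pdf"
#   single_file_type(["invoice-1.pdf", "scan.jpg"])       raises ValueError
```

Note that this comparison treats ".jpg" and ".jpeg" as different types, so rename files to one spelling first if both appear.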

Bounding-box based

If you would like to create a Bounding-box project, you can use the datasets below. We have included PDFs and .jpg images; please upload only one file type in Step 1 of the PCW for your project. In Step 3 of the PCW, you can provide your label set (taxonomy) either by uploading a .csv or by creating it in the UI.

Conversational labeling

In Step 1 of the Project Creation Wizard, upload the sample JSON file below to start creating a Conversational project. In Step 3, you can configure your label set either through the UI or by uploading a label set file. The sample CSV file below can be used to configure your label set classes instantly.
