Pre-Labeled Project
Overview
The Pre-labeled Project feature allows users to initiate a new labeling project using a file that already contains pre-defined labels. This capability enables you to jumpstart the labeling process by leveraging existing annotations, simplifying project setup, and eliminating the need to manually add labels from scratch. To use this feature, upload a pre-labeled file alongside the document you want to label.
Use Cases
This feature is especially useful in the following scenarios:
Streamlined Onboarding: When starting a new project that shares the same labeling schema as a previously completed project, you can use a pre-labeled file to quickly set up the new project.
Consistency in Labeling: When you have a set of standard labels that should be consistently applied across multiple projects, pre-labeled projects help ensure uniformity.
Efficiency: Save time by using pre-defined labels for projects with known labeling requirements.
Data Preparation: Import data with preliminary labels from external sources directly into Datasaur projects.
How to Create a Pre-Labeled Project
Span Labeling
This project is for labeling specific parts of text within a document. To get started, follow these steps:
Prepare the pre-labeled file. Supported formats for importing pre-labeled span projects include:
Span with arrows: TSV non-IOB, JSON Advanced, CoNLL-U, Datasaur Schema (.json)
Span with character-based labeling: JSON Simplified, JSON Advanced, Datasaur Schema (.json)
Open the Project Creation Wizard to start creating the project.
Upload the pre-labeled file.
Complete the project creation process.
Row Labeling
This project is for assigning pre-labeled answers to row data. To get started, follow these steps:
Prepare the pre-labeled file. Supported formats for importing pre-labeled row projects include: CSV, JSON Tabular, TSV, XLS, XLSX, JSON Lines, and Datasaur Schema (.json).
Note: Include a column in this file containing the answers to the questions that will be configured in Step 3 of the Project Creation Wizard.
Open the Project Creation Wizard to start creating the project.
Upload the pre-labeled file.
In Step 3 of the Project Creation Wizard, provide the questions. Then, link the answers to the questions by using the “Refer answer to table column…” option. Select the column that will serve as the answer to each question.
Complete the project creation process.
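As an illustration of the file described in the steps above, the following minimal sketch builds a CSV in which one column holds the text and another holds the pre-labeled answer. The column names (text, sentiment) and the sample rows are hypothetical; any column name can be used, as long as it is the one selected with the “Refer answer to table column…” option in Step 3.

```python
import csv
import io

# Hypothetical rows: one text column plus one pre-labeled answer column.
rows = [
    {"text": "The delivery arrived two days late.", "sentiment": "negative"},
    {"text": "Setup took less than five minutes.", "sentiment": "positive"},
]


def write_prelabeled_csv(rows, fieldnames):
    """Return CSV text with a header row followed by one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = write_prelabeled_csv(rows, ["text", "sentiment"])
print(csv_text)
```

To produce a file for upload, the returned text can be written to disk, for example under a name such as reviews.csv (also a hypothetical name).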
Document Labeling
This project is for assigning labels to whole documents based on their content. To get started, follow these steps:
Prepare the media and pre-labeled answer files. Media files are supported for pre-labeling as long as they are uploaded together with their answer files.
Notes:
Answer file: This is a JSON file containing the answers to the given question set. The filename should end with the suffix .answer.json. Below is an example of an accepted format for the answer file, given the question set from the example above:
Media and answer file naming: The media file and its corresponding pre-labeled answer file should share the same base name. For example, if the media file is named a.jpg, its answer file should be named a.answer.json.
Multiple media/documents: If you have multiple media files or documents to be pre-labeled, prepare your files as follows:
a.jpg, a.answer.json
b.jpg, b.answer.json
c.jpg, c.answer.json
Open the Project Creation Wizard to start creating the project.
Upload the pre-labeled answer file with its media file.
In Step 3 of the Project Creation Wizard, provide the questions. Ensure that the answers in the pre-labeled answer file are configured in this step.
Complete the project creation process.
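Before uploading, it can help to confirm that every media file has an answer file that follows the naming rule above. The sketch below is one possible local check, not part of Datasaur itself; the function name find_unpaired_media and the sample file names are hypothetical.

```python
from pathlib import PurePath

# Suffix that the answer file name must end with, per the naming rule above.
ANSWER_SUFFIX = ".answer.json"


def find_unpaired_media(filenames):
    """Return the media files that have no matching <name>.answer.json file."""
    names = set(filenames)
    missing = []
    for name in filenames:
        if name.endswith(ANSWER_SUFFIX):
            continue  # This is an answer file, not a media file.
        expected = PurePath(name).stem + ANSWER_SUFFIX
        if expected not in names:
            missing.append(name)
    return missing


# b.jpg has no b.answer.json next to it, so it is reported.
unpaired = find_unpaired_media(["a.jpg", "a.answer.json", "b.jpg"])
print(unpaired)
```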
Bounding Box Labeling
This project is for drawing boxes around objects or text in images or documents to identify them. To get started, follow these steps:
Prepare the media and pre-labeled answer files. Media files are supported for pre-labeling as long as they are uploaded together with their answer files.
Notes:
Answer file format: Supported answer file formats for importing pre-labeled bounding box projects include YOLO (.txt), LabelMe (.xml), Pascal VOC (.xml), and Datasaur Schema (.json).
Media and answer file naming: The media file and its corresponding pre-labeled answer file should have the same name. For example, if the media file is named a.jpg, its answer file in YOLO format should be named a.txt.
Multiple media/documents: If you have multiple media files or documents to be pre-labeled, prepare your files as follows:
a.jpg, a.txt
b.jpg, b.txt
c.jpg, c.txt
Open the Project Creation Wizard to start creating the project.
Upload the pre-labeled answer file with its media file.
Notes:
Labels in your pre-labeled file will automatically match the labels in your label set.
Extra labels in the file will lead to the automatic creation of new classes during setup.
In Step 3 of the Project Creation Wizard, provide the labels.
Complete the project creation process.
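For the YOLO (.txt) option, the sketch below shows how one line of an answer file could be produced from a pixel bounding box. It follows the widely used YOLO convention of one box per line, written as a class index followed by the box center, width, and height, each normalized by the image size. Whether Datasaur expects exactly this layout is an assumption that should be checked against its format documentation; the function name to_yolo_line and the sample values are hypothetical.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO-style line."""
    x_min, y_min, x_max, y_max = box
    # Center coordinates and box size, normalized to the 0..1 range.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box inside a 400x500 pixel image, labeled as class 0.
line = to_yolo_line(0, (100, 50, 300, 250), 400, 500)
print(line)  # 0 0.500000 0.300000 0.500000 0.400000
```

Following the naming rule above, the lines for a.jpg would be saved in a.txt.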