Pre-Labeled Project
The Pre-Labeled Project feature lets you start a new labeling project from a file that already contains labels. By reusing existing annotations, you simplify project setup and avoid adding labels manually from scratch. To use this feature, upload a pre-labeled file alongside the document you want to label.
Use Cases
This feature is especially useful in the following scenarios:
Streamlined Onboarding: When starting a new project that shares the same labeling schema as a previously completed project, you can use a pre-labeled file to quickly set up the new project.
Consistency in Labeling: When you have a set of standard labels that should be consistently applied across multiple projects, pre-labeled projects help ensure uniformity.
Efficiency: Save time by using pre-defined labels for projects with known labeling requirements.
Data Preparation: Import data with preliminary labels from external sources directly into Datasaur projects.
How to Create a Pre-Labeled Project
Span Labeling
This project is for labeling specific parts of text within a document. To get started, follow these steps:
Prepare the pre-labeled file. Supported formats for importing pre-labeled span projects include:
Span with arrows: TSV non-IOB, JSON Advanced, CoNLL-U, Datasaur Schema (.json)
Span with character-based labeling: JSON Simplified, JSON Advanced, Datasaur Schema (.json)
Open Project Creation Wizard to start the project creation.
Upload the pre-labeled file.
Complete the project creation process.
Row Labeling
This project is for assigning pre-labeled answers to row data. To get started, follow these steps:
Prepare the pre-labeled file. Supported formats for importing pre-labeled row projects include: CSV, TSV, XLS, XLSX, JSON Tabular, JSON Lines, and Datasaur Schema (.json).
Note: Prepare a column in this file containing the answers to the questions that will be configured in Step 3 of the Project Creation Wizard.
Open Project Creation Wizard to start the project creation.
Upload the pre-labeled file.
In Step 3 of the Project Creation Wizard, provide the questions. Then, link the answers to the questions by using the “Refer answer to table column…” option. Select the column that will serve as the answer to each question.
Complete the project creation process.
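As a rough illustration of the file preparation step above, the sketch below builds a CSV with an extra answer column using Python's standard csv module. The column names (text, sentiment), the sample rows, and the helper names are hypothetical; use whichever column you plan to select with the “Refer answer to table column…” option.

```python
import csv
import io

# Hypothetical rows: "text" is the data to label,
# "sentiment" holds the pre-labeled answer for one question.
ROWS = [
    {"text": "The delivery was fast.", "sentiment": "positive"},
    {"text": "The package arrived damaged.", "sentiment": "negative"},
]


def build_prelabeled_csv(rows, fieldnames=("text", "sentiment")):
    """Return CSV text with a header row followed by one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def write_prelabeled_csv(path, rows):
    """Write the CSV to disk so it can be uploaded in the wizard."""
    # newline="" keeps the line endings produced by the csv module unchanged.
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write(build_prelabeled_csv(rows))
```

Calling write_prelabeled_csv("prelabeled_rows.csv", ROWS) would produce a file whose sentiment column can then be linked to a question in Step 3.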
Document Labeling
This project is for assigning labels to whole documents based on their content. To get started, follow these steps:
Prepare the media and pre-labeled answer files. As long as the media files are uploaded with the answer files, they will be supported for pre-labeling.
Notes:
Answer file: This is a JSON file containing the answers to the given question set. The filename should end with the suffix .answer.json. Below is an example of an accepted format for the answer file:
{ "caption": "A realistic photograph of a white sedan parked on an asphalt road, facing the camera at a front and slightly right angle. The car is centered slightly to the left, with a visible license plate reading 'HZ20 SBV'. The natural daylight provides bright and clear lighting across the scene. In the midground, the asphalt road extends horizontally, flanked by green grassy areas with scattered bushes. The background features a clear, blue sky and a line of four wind turbines with white blades and pale orange towers positioned along a grassy landscape with a body of water visible behind them. The entire composition centers on the car with the wind turbines providing a modern, eco-friendly backdrop." }
Media and answer file naming: The media file and its corresponding pre-labeled answer file should have the same base name. For example, if the media file is named a.jpg, its answer file should be named a.answer.json.
Multiple media/documents: If you have multiple media files or documents to be pre-labeled, prepare your files as follows:
a.jpg, a.answer.json
b.jpg, b.answer.json
c.jpg, c.answer.json
Open Project Creation Wizard to start the project creation.
Upload the pre-labeled answer file with its media file.
In Step 3 of the Project Creation Wizard, provide the questions. Ensure that the answers in the pre-labeled answer file are configured in this step.
Complete the project creation process.
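The naming rule for answer files can be checked before upload. The following is a minimal sketch, not part of Datasaur itself: the helper names are hypothetical, and it assumes each key in the answer JSON (such as caption) corresponds to a question configured in Step 3.

```python
import json
from pathlib import Path

ANSWER_SUFFIX = ".answer.json"


def answer_name_for(media_name):
    """Map a media file name to its answer file name, e.g. a.jpg -> a.answer.json."""
    return Path(media_name).stem + ANSWER_SUFFIX


def find_unpaired(filenames):
    """Return the media files that have no matching answer file in the list."""
    names = set(filenames)
    media = [n for n in filenames if not n.endswith(ANSWER_SUFFIX)]
    return [m for m in media if answer_name_for(m) not in names]


def build_answer_json(answers):
    """Serialize a dict of answers as the text of an answer file."""
    return json.dumps(answers, ensure_ascii=False, indent=2)
```

For example, find_unpaired(["a.jpg", "a.answer.json", "b.jpg"]) reports b.jpg, which signals that b.answer.json still needs to be prepared.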
Bounding Box Labeling
This project is for drawing boxes around objects or text in images or documents to identify them. To get started, follow these steps:
Prepare the media and pre-labeled answer files. As long as the media files are uploaded with the answer files, they will be supported for pre-labeling.
Notes:
Answer file format: The supported answer file formats for importing pre-labeled bounding box projects include YOLO (.txt), LabelMe (.xml), Pascal VOC (.xml), and Datasaur Schema (.json).
Media and answer file naming: The media file and its corresponding pre-labeled answer file should have the same base name. For example, if the media file is named a.jpg, its answer file in YOLO format should be named a.txt.
Multiple media/documents: If you have multiple media files or documents to be pre-labeled, prepare your files as follows:
a.jpg, a.txt
b.jpg, b.txt
c.jpg, c.txt
Open Project Creation Wizard to start the project creation.
Upload the pre-labeled answer file with its media file.
Notes:
Labels in your pre-labeled file will automatically match the labels in your label set.
Extra labels in the file will lead to the automatic creation of new classes during setup.
In Step 3 of the Project Creation Wizard, provide the labels.
Complete the project creation process.
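To make the YOLO answer file concrete, here is a small sketch that converts a pixel bounding box into one line of the widely used YOLO text layout (class index, then box center and size normalized by the image dimensions). The function names are hypothetical, and whether Datasaur expects exactly this YOLO variant is an assumption to verify against your own exports.

```python
from pathlib import Path


def yolo_line(class_id, box, image_width, image_height):
    """Format one box as 'class x_center y_center width height', normalized to 0..1.

    box is (x_min, y_min, x_max, y_max) in pixels.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def yolo_answer_name(media_name):
    """Map a media file name to its YOLO answer file name, e.g. a.jpg -> a.txt."""
    return Path(media_name).with_suffix(".txt").name
```

For a 400 by 500 pixel image, yolo_line(0, (100, 50, 300, 250), 400, 500) yields the line "0 0.500000 0.300000 0.500000 0.400000", which would be saved in a.txt next to a.jpg.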
Pre-Labeled Propagation
By default, pre-labeled data is automatically recognized as the labeler’s submission. Once applied in Labeler mode, these labels instantly appear in Reviewer mode. Depending on the project’s consensus configuration, they may appear as either accepted or conflicted labels.
This default behavior accelerates the labeling process when the pre-labeled source is reliable. However, automatic propagation may not be desirable in situations where the pre-labeled data originates from a model under evaluation or when explicit human verification is required to ensure labeling accuracy and quality.
Require Approval for Pre-Labeled Labels Setting
This project-level setting introduces an additional quality control step for imported pre-labeled data. It is particularly valuable when the pre-labeled source is less reliable or when a mandatory verification process is required before the data can be accepted.
Instead of automatically accepting all pre-labeled entries, this option requires labelers to manually accept or reject each pre-labeled label, so that only verified labels proceed to the review stage.
This setting is especially useful in the following scenarios:
Untrusted External Data: When importing labels from external systems or tools, this setting prevents low-quality or inconsistent labels from automatically appearing in the reviewer’s view, ensuring that only verified labels are displayed during the review process.
Initial Quality Verification: Enables labelers to validate simple pre-labeled labels before proceeding with more complex labeling tasks, ensuring a reliable baseline.
Assisted Labeling: Allows labelers to treat pre-labeled labels as suggestions, approving those that meet project requirements and rejecting those that do not.
Difference from Default Behavior
Propagation
Default behavior: Automatic. Pre-labels are instantly treated as the labeler's own work and propagated to the reviewer's shared copy.
With approval required: Manual approval required. Pre-labels remain only in each labeler’s document copy and are not propagated to the reviewer's copy until explicitly accepted by the labeler.
Labeler Action
Default behavior: No action required. Labelers can modify or delete pre-labels as needed.
With approval required: Manual review required. Labelers must explicitly accept or reject each pre-labeled label.
State in Reviewer Copy
Default behavior: Immediately visible (depends on consensus rules).
With approval required: Not visible in the reviewer’s copy. Pre-labels are only propagated to the reviewer’s copy when the labeler accepts them.
This setting can be configured in Step 5 of the project creation process by enabling the “Require approval for pre-labeled labels” toggle.
Note that this setting can only be defined during project creation and cannot be modified afterwards.
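The difference between the two modes can be summarized as a small decision function. This is only a conceptual model of the behavior described in this section, not Datasaur's implementation; the Decision enum and the function name are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    """A labeler's decision on one pre-labeled label (hypothetical model)."""
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


def propagates_to_reviewer(require_approval, decision):
    """Return True if a pre-labeled label reaches the reviewer's copy."""
    if not require_approval:
        # Default behavior: pre-labels count as the labeler's own work
        # and propagate without any explicit decision.
        return True
    # With "Require approval for pre-labeled labels" enabled,
    # only explicitly accepted pre-labels propagate.
    return decision is Decision.ACCEPTED
```

In this model, a pre-label that is still pending or has been rejected stays out of the reviewer's copy whenever approval is required.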



Labeler Workflow and Outcomes
When this setting is enabled, labelers will encounter pre-labeled labels within their individual document copies. These pre-labeled labels are visually distinguished and include options to either accept or reject them.
Labelers must review and decide on each pre-labeled label individually:
Accept
The pre-labeled label is approved by the labeler and treated as the labeler’s own work. It is then converted into an accepted pre-label and propagated to the reviewer’s copy.
In Span Labeling: Labelers can accept a pre-labeled label by right-clicking on it and selecting Accept, or by applying the same label to the same span.
In Row Labeling: Labelers can accept a pre-labeled label by selecting the corresponding answer in the Row Labeling extension and submitting it.
Reject
The pre-labeled label is permanently removed from the labeler’s copy and will not appear in the reviewer’s view.
In Span Labeling: Labelers can reject a pre-labeled label by right-clicking and selecting Reject.
In Row Labeling: Labelers can reject a pre-labeled label by leaving the answer unselected and submitting it.
This workflow ensures that all pre-labeled data undergoes explicit human labeler verification before progressing to the review stage, thereby enhancing overall labeling accuracy, data integrity, and quality assurance.