Labeling Agent (beta)
In some projects, ML models are just as important as human labelers. Labeling Agents allow you to assign ML models as labelers in your project and evaluate their performance alongside human labelers. This helps you understand which labeling approach works best for your needs: human, machine, or both.
Why Use Labeling Agents?
Labeling Agents simplify the process of testing and comparing ML models inside Datasaur:
You no longer need to create separate accounts or log in as the model to run predictions.
Model outputs are now part of the same analytics and comparison tools used for human labelers.
It’s easier to measure performance and decide what labeling strategy to use.
Requirements
Models must be in the same team workspace as the Data Studio project.
ML models must be deployed applications from LLM Labs with “Deployed” status.
This feature is currently supported only for Span Labeling projects. This is a current limitation; support for additional project types is planned.
Using Labeling Agents
1. Assign Models as Labelers
You can assign models during the project creation process:
Go to the Projects page > Create New Project.
Upload files and select Span Labeling.
In the Assignment step, open the Labeling agents tab.
Select one or more deployed models to assign them as labelers.
Complete the project setup.
2. Launch the Project and Trigger Labeling
When you click Launch Project, models will automatically begin applying labels.
Current limitations:
Only the first label set is used.
Each span will only have one label.
Labeling agents cannot yet draw arrows.
3. Review Labels Applied by the Labeling Agent
Once all documents are fully labeled, whether through Labeling Agent assistance or manual input, the project can undergo a final review. In this stage, a reviewer uses Reviewer Mode to check the consistency and accuracy of all annotations before submission or export.
4. View and Compare Performance
You can track the performance of both human labelers and models from the Analytics page. From there, you can compare inter-annotator agreement (IAA) scores and other metrics across all labelers, human and model alike.
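Datasaur computes these metrics for you on the Analytics page. Purely as a mental model of what span-level agreement measures, here is a minimal Python sketch; the (start, end, label) span representation and the F1-style score are illustrative assumptions, not Datasaur's exact formula.

# Illustrative only: Datasaur computes IAA internally. This sketch assumes
# spans are (start, end, label) tuples and scores pairwise agreement as
# span-level F1, which is not necessarily the exact Analytics-page metric.

def span_f1(spans_a: set, spans_b: set) -> float:
    """Span-level F1 between two labelers' (start, end, label) sets."""
    if not spans_a and not spans_b:
        return 1.0
    matched = len(spans_a & spans_b)  # exact start/end/label matches
    precision = matched / len(spans_a) if spans_a else 0.0
    recall = matched / len(spans_b) if spans_b else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

model = {(0, 8, "PERSON"), (35, 46, "ORG"), (60, 64, "DATE")}
human = {(0, 8, "PERSON"), (35, 46, "ORG")}
print(f"model vs. human agreement: {span_f1(model, human):.2f}")  # 0.80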
How to Create a Span-Based Labeling Agent
If you want to use a machine learning model as a labeling agent for a span-based task, you can do so by setting up your prompt using system and user instructions.
1. Define Your Instructions
To help the model understand what to label, you’ll need to provide clear instructions. Below is an example setup:
System Instruction
You are an expert data labeler.
User Instruction
Given the document text, please extract the following information and present it in JSON format as shown below:
PERSON: People, including fictional.
DATE: Absolute or relative dates or periods.
ORG: Companies, agencies, institutions, etc.
Instructions Summary:
1. Extract and present the information in the specified JSON format.
2. Ensure that all extracted data is accurate and corresponds directly to the content of each document.
Return the extracted fields as a JSON structure in plain text, following this JSON format:
{
"PERSON": ["People, including fictional."],
"DATE": ["Absolute or relative dates or periods."],
"ORG": ["Companies, agencies, institutions, etc."],
}
VERY IMPORTANT
RETURN THE ANSWER WITHOUT ```json FENCES.
ANSWER PRECISELY FROM THE GIVEN SENTENCE AND DON'T MASK THE ANSWER; ANSWER ONLY BASED ON THE GIVEN SENTENCE.
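While iterating on these instructions, it helps to sanity-check that a raw response really is fence-free JSON containing only the expected keys. Below is a minimal Python sketch; the parse_answer helper, the EXPECTED_KEYS set, and the sample response are illustrative assumptions, not part of Datasaur's workflow.

import json

# Hypothetical sanity check for use while iterating on the prompt; the
# EXPECTED_KEYS set and sample response below are assumptions for this sketch.
EXPECTED_KEYS = {"PERSON", "DATE", "ORG"}

def parse_answer(raw: str) -> dict:
    """Parse a model answer, tolerating stray ```json fences despite the prompt."""
    cleaned = raw.strip().removeprefix("```json").removesuffix("```").strip()
    data = json.loads(cleaned)  # raises ValueError if the answer is not valid JSON
    unexpected = set(data) - EXPECTED_KEYS
    if unexpected:
        raise ValueError(f"unexpected keys in answer: {unexpected}")
    return data

print(parse_answer('{"PERSON": ["Ivan Lee"], "DATE": [], "ORG": ["Yahoo"]}'))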
2. Prepare the Label Set
Your Labeling Agent must use the same label set as your Span Labeling project. Below is a simple example of how to define your labels:
{
"name": "Labeling agent Label set",
"options": [
{ "id": "NhsjWIgaAQH3g6dsvtW6a", "color": "#f93b90", "parentId": null, "label": "PERSON" },
{ "id": "X1bKK7Nxf9SGaBfDpzH7g", "color": "#d4e455", "parentId": null, "label": "DATE" },
{ "id": "NP2RJr7tD5aMfVBnG6TOm", "color": "#85c98e", "parentId": null, "label": "ORG" }
]
}
Make sure the labels match between the prompt and the project configuration.
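If the prompt and the project disagree, some model output may not map to any project label, so it is worth checking the two against each other. Here is a minimal Python sketch, assuming the label set above is saved as label_set.json and the prompt's keys are listed by hand; both the file name and the PROMPT_KEYS set are assumptions for illustration.

import json

# Hypothetical consistency check: the file name "label_set.json" and the
# PROMPT_KEYS set are assumptions for this sketch, not part of Datasaur's UI.
PROMPT_KEYS = {"PERSON", "DATE", "ORG"}  # keys the user instruction extracts

with open("label_set.json") as f:
    label_set = json.load(f)

project_labels = {option["label"] for option in label_set["options"]}

only_in_prompt = PROMPT_KEYS - project_labels
only_in_project = project_labels - PROMPT_KEYS
assert not only_in_prompt, f"prompt asks for labels missing from the project: {only_in_prompt}"
assert not only_in_project, f"project labels the prompt never extracts: {only_in_project}"
print("Prompt and label set are consistent.")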
3. Test with a Prompt Example
To check if your instructions work as expected, you can test them using an example sentence. Here's how you might write a prompt:
Labelset:
- PERSON
- DATE
- ORG
Sentence:
Ivan Lee is the CEO and Founder of Datasaur.ai. He graduated with a Computer Science B.S. from Stanford University. He was chosen for the selective Mayfield Fellows entrepreneurship program in 2010. Ivan went on to found Loki Studios, an iOS game studio. After raising institutional funding from DCM's A-Fund and launching a profitable game, Loki was acquired by Yahoo.
Expected Output:
{
"PERSON": ["Ivan Lee"],
"DATE": ["2010"],
"ORG": ["Datasaur.ai", "Stanford University", "Mayfield Fellows", "Loki Studios", "DCM's A-Fund", "Yahoo"]
}
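Note that the model returns matched strings, while a Span Labeling project ultimately needs character offsets. The platform handles this mapping when the agent applies labels; purely as a mental model, a naive first-occurrence search like the sketch below would recover spans. The search strategy is a simplifying assumption that breaks on repeated mentions.

import json

# Illustrative only: Datasaur maps answers to spans internally. This sketch
# uses a naive first-occurrence search, which mislocates repeated mentions.
sentence = "Ivan Lee is the CEO and Founder of Datasaur.ai."
model_output = '{"PERSON": ["Ivan Lee"], "DATE": [], "ORG": ["Datasaur.ai"]}'

spans = []
for label, values in json.loads(model_output).items():
    for value in values:
        start = sentence.find(value)
        if start == -1:
            continue  # skip values the model invented; they are not in the text
        spans.append({"start": start, "end": start + len(value), "label": label})

print(spans)
# [{'start': 0, 'end': 8, 'label': 'PERSON'},
#  {'start': 35, 'end': 46, 'label': 'ORG'}]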
Best Practices
Use the external model as a time-saving aid, but always include a human review step.
Train your model with high-quality data to improve suggestion accuracy.
Communicate clearly with labelers about how to handle model predictions.
Automate part of the work with consensus by using multiple models: for example, set the project's consensus requirement to 3, deploy 3 Labeling Agents, and focus human review only on labels that are not accepted through consensus (see the sketch below).
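For illustration, with three agents' extractions for one label in hand, a 2-of-3 majority vote might look like the following minimal Python sketch. The agent outputs and threshold are assumptions, and in Datasaur itself consensus is configured in the project settings rather than in code.

from collections import Counter
from itertools import chain

# Illustrative only: in Datasaur, consensus is configured per project.
# This sketch shows the underlying idea of a 2-of-3 majority vote.
agent_outputs = [
    {"ORG": ["Datasaur.ai", "Stanford University", "Yahoo"]},
    {"ORG": ["Datasaur.ai", "Stanford University"]},
    {"ORG": ["Datasaur.ai", "Yahoo", "Loki Studios"]},
]

CONSENSUS = 2  # a value needs votes from at least 2 of the 3 agents
votes = Counter(chain.from_iterable(out["ORG"] for out in agent_outputs))

accepted = [value for value, n in votes.items() if n >= CONSENSUS]
needs_review = [value for value, n in votes.items() if n < CONSENSUS]
print("auto-accepted:", accepted)      # Datasaur.ai, Stanford University, Yahoo
print("route to human:", needs_review) # Loki Studios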
FAQs
Can I assign multiple models to the same project?
Yes. You can assign up to 10 Labeling Agents.
Can I use Labeling Agents in Line Labeling?
Not yet. They can be assigned to Span + Line projects but will only apply labels for Span Labeling.
How are Labeling Agent labels shown in the UI?
They are treated like human labelers, but their identities are masked. You'll see their labels in Reviewer Mode and in analytics.