
Programmatic API Labeling

This feature allows you to use an API to apply labels to multiple projects at once.

Last updated 1 year ago

The Programmatic API Labeling feature automates ML-assisted labeling. It is best suited for users who want to compare their model against a human labeler, or compare two models.

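After successfully creating the project, follow these steps:

  1. Initiate the backend to run Programmatic API Labeling
  2. Receive Programmatic API Labeling API calls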

Example

In this example, we create a token-based project with two documents and one labeler. We then run the auto-labeling process against this project to add labels to the labeler's documents.

Sample documents

  • little-prince.txt (994B)
  • tragedy-of-hamlet.txt (2KB)

Create the project

{
 "operationName": "launchTextProjectAsync",
 "variables": {
   "input": {
     "name": "Stories",
     "kinds": ["TOKEN_BASED"],
     "documents": ["little-prince.txt", "tragedy-of-hamlet.txt"],
     "documentSettings": {
       "viewer": "TOKEN"
       // ... other document settings
     }
     // ... other project configurations
   }
 },
 "query": "mutation ..."
}

The API request above returns a response containing the project ID, "PROJECT_ID_1", which is used in the subsequent API requests.

Initiate Backend to Perform Programmatic API Labeling

This operation asks our backend to perform the auto-labeling task. The backend sends the files in chunks. For example, if you have 500 files and 5 files are sent per request, 100 API calls are required.

Note: the number of files that can be sent per request depends on the capacity of your internal server.
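
The chunking arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not Datasaur's implementation; the function names `count_requests` and `chunk_files` are hypothetical.

```python
import math
from typing import List, Sequence


def count_requests(total_files: int, files_per_request: int) -> int:
    """Return how many API calls are needed to send every file in chunks."""
    if files_per_request <= 0:
        raise ValueError("files_per_request must be positive")
    return math.ceil(total_files / files_per_request)


def chunk_files(files: Sequence[str], files_per_request: int) -> List[List[str]]:
    """Split a file list into consecutive chunks of at most files_per_request."""
    if files_per_request <= 0:
        raise ValueError("files_per_request must be positive")
    return [
        list(files[start:start + files_per_request])
        for start in range(0, len(files), files_per_request)
    ]


# 500 files sent 5 at a time, as in the example above.
print(count_requests(500, 5))  # 100
```

When the file count is not an exact multiple of the chunk size, the last request simply carries fewer files.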

Initiate auto-labeling for the Velociraptor API for labeler@datasaur.ai

{
 "operationName": "AutoLabelTokenBasedProject",
 "variables": {
   "input": {
     "projectId": "PROJECT_ID_1",
     "labelerEmail": "labeler@datasaur.ai",
     "targetAPI": {
       "endpoint": "https://velociraptor.api/...",
       "secretKey": "raQa9of3jDj9Ksde6dLDdycr"
     },
     "options": {
       "numberOfFilesPerRequest": 2,
       // Optional
       "serviceProvider": "CUSTOM"
     },
     "role": "REVIEWER"
   }
 },
 "query": "mutation ..."
}
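
If you build this request body in code, serializing a dictionary avoids hand-written JSON mistakes such as trailing commas. The sketch below mirrors only the fields shown in the sample above. The helper name `build_auto_label_request` is hypothetical, the mutation text is passed in by the caller because it is elided in this page, and treating `serviceProvider` as the optional field is an assumption based on the comment in the sample.

```python
import json
from typing import Optional


def build_auto_label_request(
    project_id: str,
    labeler_email: str,
    endpoint: str,
    secret_key: str,
    files_per_request: int,
    role: str,
    query: str,
    service_provider: Optional[str] = None,
) -> str:
    """Serialize an AutoLabelTokenBasedProject request body as JSON text."""
    options = {"numberOfFilesPerRequest": files_per_request}
    # Assumption: serviceProvider is optional, so it is omitted when not given.
    if service_provider is not None:
        options["serviceProvider"] = service_provider

    body = {
        "operationName": "AutoLabelTokenBasedProject",
        "variables": {
            "input": {
                "projectId": project_id,
                "labelerEmail": labeler_email,
                "targetAPI": {"endpoint": endpoint, "secretKey": secret_key},
                "options": options,
                "role": role,
            }
        },
        # The caller supplies the full mutation text.
        "query": query,
    }
    return json.dumps(body, indent=1)
```

Keep the secret key out of source control; reading it from an environment variable before calling the helper is one common option.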

Receive Programmatic API Labeling API calls

Sample API request

{
 "id": "PROJECT_ID_1",
 "name": "Stories",
 "documents": [
   {
       "id": "DOCUMENT_ID_1",
       "fileName": "little-prince.txt",
       "sentences": [
         { "id": 0, "text": "The Little Prince is a novella by French aristocrat, writer, and aviator Antoine de Saint-Exupéry." },
         { "id": 1, "text": "It was first published in English and French in the US by Reynal & Hitchcock in April 1943, and posthumously in France following the liberation of France as Saint-Exupéry's works had been banned by the Vichy Regime." }
       ]
   },
   {
       "id": "DOCUMENT_ID_2",
       "fileName": "tragedy-of-hamlet.txt",
       "sentences": [
         { "id": 0, "text": "The Tragedy of Hamlet, Prince of Denmark, often shortened to Hamlet (/ˈhæmlɪt/), is a tragedy written by William Shakespeare sometime between 1599 and 1601." },
         { "id": 1, "text": "It is Shakespeare's longest play with 30,557 words." }
       ]
   }
 ]
}
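
On the receiving side, your API needs to walk the documents and sentences in this payload. A minimal sketch of that traversal follows, assuming the request body has already been parsed into a Python dictionary with the shape shown above; `iter_sentences` is a hypothetical helper name.

```python
from typing import Iterator, Tuple


def iter_sentences(payload: dict) -> Iterator[Tuple[str, int, str]]:
    """Yield (document id, sentence id, sentence text) for every sentence."""
    for document in payload.get("documents", []):
        for sentence in document.get("sentences", []):
            yield document["id"], sentence["id"], sentence["text"]


# Usage with a trimmed-down payload in the same shape as the sample request.
sample_payload = {
    "id": "PROJECT_ID_1",
    "documents": [
        {
            "id": "DOCUMENT_ID_1",
            "fileName": "little-prince.txt",
            "sentences": [{"id": 0, "text": "First sentence."}],
        }
    ],
}
for document_id, sentence_id, text in iter_sentences(sample_payload):
    print(document_id, sentence_id, text)
```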

Sample API response

The sample response below will apply two labels to the first document and three labels to the second document.

{
 "id": "PROJECT_ID_1",
 "documents": [
   {
     "id": "DOCUMENT_ID_1",
     "labels": [
       {
         "id": 0,
         "entities": [
           { "label": "TITLE", "start_char": 0, "end_char": 16 }
         ]
       },
       {
         "id": 1,
         "entities": [
           { "label": "YEAR", "start_char": 86, "end_char": 89 }
         ]
       }
     ]
   },
   {
     "id": "DOCUMENT_ID_2",
     "labels": [
       {
         "id": 0,
         "entities": [
           { "label": "TITLE", "start_char": 0, "end_char": 20 },
           { "label": "YEAR", "start_char": 142, "end_char": 145 },
           { "label": "YEAR", "start_char": 151, "end_char": 154 }
         ]
       }
     ]
   }
 ]
}

Results

Once the auto-labeling process is complete, the labeler's document will look like the screenshots below:

Layer Auto Labeling

Layer Auto Labeling allows you to specify a layer for each label.

In the example below, we auto-label TITLE on Layer 1 and YEAR on Layer 2.

The instructions are the same as above, except that each entity in the API response carries an additional "layer" key.

{
 "id": "PROJECT_ID_1",
 "documents": [
   {
     "id": "DOCUMENT_ID_1",
     "labels": [
       {
         "id": 0,
         "entities": [
           { "label": "TITLE", "start_char": 0, "end_char": 16, "layer": 1 }
         ]
       },
       {
         "id": 1,
         "entities": [
           { "label": "YEAR", "start_char": 86, "end_char": 89, "layer": 2 }
         ]
       }
     ]
   }
   // ... other documents
 ]
}
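
If your API already produces responses without layers, one way to add the "layer" key is a small post-processing step. The sketch below assumes a fixed mapping from label name to layer number, matching the example above; `LAYER_BY_LABEL` and `add_layers` are hypothetical names.

```python
import copy
from typing import Dict

# Mapping used in the example above: TITLE on Layer 1, YEAR on Layer 2.
LAYER_BY_LABEL: Dict[str, int] = {"TITLE": 1, "YEAR": 2}


def add_layers(response: dict, layer_by_label: Dict[str, int]) -> dict:
    """Return a copy of the response with a "layer" key on each known entity."""
    result = copy.deepcopy(response)  # leave the caller's response untouched
    for document in result.get("documents", []):
        for sentence_labels in document.get("labels", []):
            for entity in sentence_labels.get("entities", []):
                layer = layer_by_label.get(entity["label"])
                if layer is not None:
                    entity["layer"] = layer
    return result
```

Entities whose label is missing from the mapping are left without a "layer" key in this sketch.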

The labeler's document will look like the screenshots below:

Detailed guidelines can be found here.

Our backend makes several API requests based on the configuration provided in the AutoLabelTokenBasedProject request. With the sample configuration above, our backend makes API requests to https://velociraptor.api/...
