
Common Terminology

A glossary of terms used in the app, so that all users share the same understanding.


Applied Labels/Answers

Labels or answers that are applied in:

  1. Labeler Mode: This includes manually applied labels, assisted labeling features, and pre-labeled data.

  2. Reviewer Mode: This includes all labels or answers reviewed manually, applied through assisted labeling features, and pre-labeled data, but excludes those automatically accepted through consensus.

Pre-labeled Data: The labels or answers that are already present in the input file when a project is first created.

Assisted Labeling Features: Search (bulk label application to the search results), ML-assisted Labeling, Data Programming, and Predictive Labeling.

How to Count for Each Labeling Type

  1. Span Labeling: The total number of labels assigned to spans.

  2. Row or Document Labeling: The total number of answers.

  3. Bounding Box Labeling: The total number of bounding boxes created.

  4. Conversational Labeling: The total number of labels assigned to spans, including those assigned to each utterance.

  5. Mixed Labeling: The sum of all labels or answers based on the above calculations.
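The counting rules above can be sketched in a few lines of Python. This is only an illustration: the `ProjectCounts` class, its field names, and the `applied_total` function are hypothetical and are not part of any Datasaur API.

```python
from dataclasses import dataclass


@dataclass
class ProjectCounts:
    """Hypothetical container for the raw counts in one project."""
    span_labels: int = 0      # labels assigned to spans (and to utterances in conversational projects)
    answers: int = 0          # row or document answers
    bounding_boxes: int = 0   # bounding boxes created


def applied_total(labeling_type: str, counts: ProjectCounts) -> int:
    """Return the applied labels/answers total for a labeling type."""
    if labeling_type in ("span", "conversational"):
        return counts.span_labels
    if labeling_type in ("row", "document"):
        return counts.answers
    if labeling_type == "bounding_box":
        return counts.bounding_boxes
    if labeling_type == "mixed":
        # Mixed Labeling is the sum of all the calculations above.
        return counts.span_labels + counts.answers + counts.bounding_boxes
    raise ValueError(f"unknown labeling type: {labeling_type}")


example = ProjectCounts(span_labels=4, answers=2, bounding_boxes=1)
print(applied_total("span", example))
print(applied_total("mixed", example))
```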

Cell

A Cell is a box that is used to display data in the Editor. For example, in the above picture, the box that contains the text "Sherlock Holmes become widely popular in 1891" is a Cell. Cells are structured in a matrix-like manner.

A Cell's Line is its position along the vertical axis, numbered from 0 from top to bottom.

A Cell's Index is its position along the horizontal axis, numbered from 0 from left to right. A Cell is referred to by its Line and Index. For example, the Cell with Line 3 and Index 0 is the Cell that contains the text "All but one are set in the Victorian or Edwardian eras, between about 1880 and 1914."
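One way to picture Line and Index is as a list of rows, where the outer position is the Line and the inner position is the Index. The sketch below is an assumption for illustration only: the `cells` grid and the `get_cell` helper are hypothetical, and the texts on Lines 0 to 2 are placeholders.

```python
# Hypothetical grid: the outer list position is the Line, the inner one is the Index.
cells = [
    ["(text of Line 0)"],
    ["(text of Line 1)"],
    ["(text of Line 2)"],
    ["All but one are set in the Victorian or Edwardian eras, between about 1880 and 1914."],
]


def get_cell(grid, line, index):
    """Return the Cell text at the given Line (vertical) and Index (horizontal)."""
    return grid[line][index]


print(get_cell(cells, 3, 0))
```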

Characters and Spans

A Span is typically the atomic unit in a document. It can be a single word, but it can also be a punctuation mark such as '.'.

Spans are indexed starting from 0 within each Cell, and the characters within each Span are indexed starting from 0 as well. For example, the Span "popularity" has index 1 in the current Cell and the character "u" has index 3 in "popularity".
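The zero-based indexing can be checked with a short sketch. The cell text and the `split_spans` helper below are hypothetical, and the regular expression is only a stand-in for whichever tokenizer a project actually uses, so real span boundaries may differ.

```python
import re


def split_spans(cell_text):
    """Split a Cell's text into Spans: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", cell_text)


# Hypothetical Cell text chosen so that "popularity" is the second Span.
spans = split_spans("Rising popularity followed.")

print(spans[1])      # the Span at index 1
print(spans[1][3])   # the character at index 3 of that Span
print(spans[3])      # punctuation forms its own Span
```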

Consensus

The level of agreement required among labelers before a label is automatically accepted. This mechanism ensures that multiple labelers reach a mutual agreement on the label/answer.

  • For example, if the consensus is set at 2, and Labeler 1 (L1) labels a span as a PERSON, it will not be automatically accepted since it hasn't met the minimum requirement of two agreements. However, if Labeler 2 (L2) labels the same span as a PERSON, thereby meeting the consensus requirement, the label will be automatically accepted.
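The example above can be expressed as a small check. This is a minimal sketch, assuming each labeler applies at most one label to a given span; the `auto_accepted` function and its input shape are hypothetical and do not reflect how Datasaur implements consensus internally.

```python
from collections import Counter


def auto_accepted(labels_by_labeler, consensus):
    """Return the labels on one span that at least `consensus` labelers agree on."""
    votes = Counter(labels_by_labeler.values())
    return sorted(label for label, count in votes.items() if count >= consensus)


# Only Labeler 1 has labeled the span: the consensus of 2 is not met.
print(auto_accepted({"L1": "PERSON"}, consensus=2))

# Labeler 2 applies the same label: the consensus of 2 is met.
print(auto_accepted({"L1": "PERSON", "L2": "PERSON"}, consensus=2))
```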

Label Set and Label Class

  • Label Set: A collection of related label classes. In the example above, "NER" is the name of the Label Set.

  • Label Class: The specific labels used for annotation. In the example above, "GEO," "ORG," "PER," and "GPE" are label classes.

Labeler

As the name suggests, the main goal of a labeler is to label all types of data across assigned projects. A labeler can be defined in two contexts: as a role in a project or as a role in a Workspace.

  • Project Level: Anyone can be assigned as a labeler for a project. See the Labeler Mode explanation below.

  • Workspace Level: A user assigned as a labeler in a Workspace has limited access to various features within the Workspace and is primarily granted access to the assigned projects only.

Labeler Mode and Reviewer Mode

Labeler Mode is an isolated version of a project in which individual team members work independently. This approach allows labelers to work on their own set of documents, which are later consolidated in Reviewer Mode if any conflicts arise.

  • Each labeler has their own mode, meaning they have their own version of each assigned document.

  • The reviewers, on the other hand, have a single shared cabinet.

For example, in a project with ten labelers and three reviewers, there will be ten Labeler Modes and only one Reviewer Mode.

Project

A project represents a highly customized configuration of labeling tasks and can consist of multiple documents to be labeled.

Reviewer

As the name suggests, the main goal of a reviewer is to review all labelers' work and manage conflicts. The reviewer's decisions will be used as the final result of the labeling process. Like a labeler, a reviewer can be defined in two contexts: as a role in a project or as a role in a Workspace.

  • Project Level: Anyone can be assigned as a reviewer for a project. See the Reviewer Mode explanation above.

  • Workspace Level: A user assigned as a reviewer in a Workspace can have access to various things within the Workspace. For a more detailed scope, see this page.