
Row & Document Based

Last updated 7 months ago

Row-based and Document-based

If you choose row-based or document-based labeling as the task type, the goal of labeling is to answer the questions. You answer them in the Document Labeling extension on the right side.

  • You can navigate to the next question by using your mouse or pressing Tab on the keyboard.

Go Menu

You can move to the desired row via the Go menu.

  • Go to Start will take you to the first row.

  • Go to End will take you to the last row.

  • Go to Line will take you to a specific row.

  • Go to Next Unlabeled Line will take you to the next unlabeled line.

  • Go to Previous Unlabeled Line will take you to the previous unlabeled line.

  • Go to Next File will take you to the next file.

  • Go to Previous File will take you to the previous file.

Row Page Navigation

For row labeling, you can display all rows at once or customize the number of rows displayed on a single page. Adjust this preference by changing the Number of rows displayed per page in the Preview Settings in Project Creation Wizard Step 2.

The Navigation Bar is supported only for row-based projects.

Here are some examples of the Navigation Bar view:

  • Tabular view

  • URL view (image)

  • URL view (website)

Document Navigation

You can also navigate to your next document if you upload more than one. On the bottom left side of your labeling interface, there is a navigation bar that you can use to navigate to your next document.

Required question

The asterisk (*) next to a question indicates that the question requires an answer. Leaving a required field blank will trigger an error.

Sort and filter column

If you create a question of type Text Field, Text Area, Dropdown, Hierarchical Dropdown, Date, Time, Checkbox, Slider, Grouped Attributes, or URL, you can sort and filter its column.

For the Text Field, URL and Text Area columns, you can filter by searching for a keyword.

For the Dropdown and Hierarchical Dropdown columns, you can filter based on the dropdown value.

For the Date and Time columns, you can filter by date range and by time range.

For the Checkbox column, you can filter on the true or false value, which indicates whether the box is checked.

For the Slider column, you can filter it based on the value provided or based on the value range.
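The per-column filters described above can be pictured as simple predicates applied to each row. The following is only an illustrative sketch, not Datasaur's implementation: the column names, the sample rows, and the helper functions (`keyword_filter`, `value_filter`, `range_filter`, `filter_rows`) are all hypothetical, and case-insensitive substring matching for keyword search is an assumption.

```python
from datetime import date

def keyword_filter(keyword):
    """Text Field / Text Area / URL: assumed case-insensitive substring match."""
    needle = keyword.lower()
    return lambda value: value is not None and needle in str(value).lower()

def value_filter(expected):
    """Dropdown / Checkbox: keep rows whose answer equals the chosen value."""
    return lambda value: value == expected

def range_filter(low, high):
    """Date / Time / Slider: keep rows whose answer lies in [low, high]."""
    return lambda value: value is not None and low <= value <= high

def filter_rows(rows, column, predicate):
    """Return the rows whose value in `column` satisfies `predicate`."""
    return [row for row in rows if predicate(row.get(column))]

# Hypothetical labeled rows; keys are question (column) names.
rows = [
    {"comment": "Great battery life", "sentiment": "POSITIVE",
     "reviewed_on": date(2024, 3, 1), "is_spam": False, "score": 8},
    {"comment": "Battery died fast", "sentiment": "NEGATIVE",
     "reviewed_on": date(2024, 3, 9), "is_spam": False, "score": 2},
    {"comment": "Buy now!!!", "sentiment": "NEGATIVE",
     "reviewed_on": date(2024, 4, 2), "is_spam": True, "score": 1},
]

battery_rows = filter_rows(rows, "comment", keyword_filter("battery"))
negative_rows = filter_rows(rows, "sentiment", value_filter("NEGATIVE"))
march_rows = filter_rows(rows, "reviewed_on",
                         range_filter(date(2024, 3, 1), date(2024, 3, 31)))
spam_rows = filter_rows(rows, "is_spam", value_filter(True))
low_score_rows = filter_rows(rows, "score", range_filter(1, 2))
```

Each filter type maps to one small predicate, so combining filters on several columns amounts to applying `filter_rows` repeatedly.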

For the Grouped Attributes column, you can filter by clicking the down-arrow button beside the question header name.

It will display a dropdown list of each question in the Grouped Attributes question.

To filter based on the labeled answer, click the question and then apply the filter.

Once a filter is applied, the down-arrow button beside the question header name changes into a filter icon.

Please note that filtering is not yet supported for Radio questions inside a Grouped Attributes question.

Keyboard shortcuts for dropdown questions

When the answer type is Dropdown, keyboard shortcuts are displayed in the extension. In the example below, you can press 1 on your keyboard to apply POSITIVE as an answer.

Filter rows

You can see all rows or only the unlabeled rows by clicking the View menu. This feature helps when your project has many rows.

This feature is only available in Reviewer Mode on row-based projects.

Filter Unlabeled row only

Filtering for unlabeled rows allows you to quickly identify and prioritize data points that require labeling. With this option selected, you can easily pinpoint unprocessed data, ensuring no crucial information gets overlooked.

Filter Conflicted only

Handling conflicts in labeled data is essential to achieve accurate results. By isolating conflicted rows, you gain a clear view of data points that have discrepancies in their labels. This empowers you to address conflicts promptly and make informed decisions to resolve them effectively.

Filter Unreviewed only

Efficiently manage your review process by filtering for unreviewed rows. This feature helps you keep track of which data points require verification, allowing you to ensure that all labeled data has undergone the necessary review before being finalized.

Unreviewed only includes both consensus and conflict rows, which essentially means all the rows where the reviewer has not yet submitted an answer.

Filter Reviewed only

If you need to work exclusively with reviewed and approved data, this filtering option is your go-to solution. By focusing solely on reviewed rows, you can confidently utilize accurate and reliable labeled data in your projects.
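The four filters can be summarized as predicates over a row's labeler answers and reviewer answer. The sketch below is a simplified model under stated assumptions, not Datasaur's actual logic: the `Row` class, its field names, and the rule that "conflicted" means labelers submitted more than one distinct answer are hypothetical, and real projects may apply a consensus threshold instead.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Row:
    # One entry per labeler; None means that labeler has not answered yet.
    labeler_answers: List[Optional[str]]
    # None means the reviewer has not submitted an answer yet.
    reviewer_answer: Optional[str] = None

def is_unlabeled(row: Row) -> bool:
    return all(answer is None for answer in row.labeler_answers)

def is_conflicted(row: Row) -> bool:
    # Assumption: any disagreement among submitted answers counts as a conflict.
    submitted = {a for a in row.labeler_answers if a is not None}
    return len(submitted) > 1

def is_unreviewed(row: Row) -> bool:
    # Covers both consensus and conflict rows, as long as no reviewer answer exists.
    return row.reviewer_answer is None

def is_reviewed(row: Row) -> bool:
    return row.reviewer_answer is not None

FILTERS = {
    "Unlabeled only": is_unlabeled,
    "Conflicted only": is_conflicted,
    "Unreviewed only": is_unreviewed,
    "Reviewed only": is_reviewed,
}

def apply_filter(rows: List[Row], name: str) -> List[int]:
    """Return the indexes of the rows that the named filter keeps."""
    return [i for i, row in enumerate(rows) if FILTERS[name](row)]

rows = [
    Row([None, None]),                                          # unlabeled
    Row(["POSITIVE", "POSITIVE"]),                              # consensus, unreviewed
    Row(["POSITIVE", "NEGATIVE"]),                              # conflict, unreviewed
    Row(["NEGATIVE", "NEGATIVE"], reviewer_answer="NEGATIVE"),  # reviewed
]
```

In this model, `apply_filter(rows, "Unreviewed only")` keeps the first three rows, which mirrors the note that unreviewed rows include both consensus and conflict rows.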

Hide and rename the headers

You can hide and rename headers by right-clicking on the header.

Mark as complete

Once you have finished labeling, click Mark as complete. This will signify to your team you are done with the project, and it is ready for Review or Export.

Personalization Setting

This setting allows users to customize their labeling experience according to their preferences. This is accessible through File > Settings menu > Personalization tab.

Automatically jump to next document when marking as complete

When this setting is enabled, marking a document as complete will automatically move you to the next document. This can be done either from the extension or by using a shortcut (Ctrl + m).

This setting eliminates the need for manual navigation between documents after marking one as complete.

Row-based with URL view

You can label multiple images by providing the URLs of the images in a column of your row-based file.

Prepare a Row-based file that contains a URL column

You can store your images in your preferred storage option, as long as the URLs are accessible. You can also add information about each image by adding attributes as extra columns. This data can't be edited later.
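As a rough illustration of such a file, the sketch below builds CSV text with one URL column and two attribute columns. The column names (`image_url`, `source`, `captured_on`) and the example.com URLs are placeholders chosen for this example, not names that Datasaur requires.

```python
import csv
import io

# Hypothetical rows: one URL column plus extra attribute columns.
ROWS = [
    {"image_url": "https://example.com/images/cat-001.jpg",
     "source": "catalog-a", "captured_on": "2024-01-15"},
    {"image_url": "https://example.com/images/dog-002.jpg",
     "source": "catalog-a", "captured_on": "2024-01-16"},
    {"image_url": "https://example.com/images/bird-003.jpg",
     "source": "catalog-b", "captured_on": "2024-02-01"},
]

def build_url_csv(rows):
    """Return CSV text: a header row followed by one line per image."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = build_url_csv(ROWS)
print(csv_text)
```

Saving `csv_text` to a `.csv` file gives a row-based file whose URL column can later be selected in the viewer settings.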

Check and set the preview of the row-based labeling

You can set how the media will be previewed on the labeling page. Here are some of the options:

  • Don't expand: Does not preview the image from the URL

  • Thumbnail: Previews a smaller version of the image from the URL

  • Large: Previews a larger version of the image from the URL

Set the viewer setting to URL view

Make sure to change the Viewer Setting from the Tabular View to the URL View. Also, set the URL Columns to the column name of your Row-based file that contains the URLs.

Labeling page of the row-based with URL View

Here you can see the content retrieved from the URLs that you provided in the Row-based file. Now you can conduct the Row-based labeling with the help of URL View.

See additional information on the row labeling extension

The additional information available in the columns of the Row-based file can be found in the Row Labeling extension.

Applying answers for multiple rows

Some users may notice patterns in their dataset and need to apply the same answer to multiple rows at once. Datasaur supports that!

Users can select multiple rows and apply an answer to the selected rows. There are two ways to select rows:

  1. Select the checkbox that is available per row

  2. Hold Ctrl on the keyboard while selecting multiple rows

After selecting the rows, choose answers for the questions in the Document Labeling extension, then click Submit to apply them.

How it works

  • Select multiple rows that have no answer, then bulk answer

    • All selected rows receive the same answers

  • Select two rows where one row has an answer and one row has no answer, then bulk answer

    • Both rows receive the same answers

  • Select two rows where one row has an answer and one row has no answer, then answer question one and leave the rest

    • The same answer is applied to question one only

  • If you select multiple rows whose answers to a question differ, the Mixed value is shown below that question:

    Case 1: If you change the answer value and submit, it overrides the answers to that question for all selected rows.

    Case 2: A Reset button is shown in case you change your mind and want to return to the Mixed value.

  • This capability is only available in labeler mode.

  • This capability does not apply to row labeling projects with any of these settings:

    • Number of rows per page: 1

    • URL viewer
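The bulk-answer behavior listed above can be sketched in a few lines. This is a simplified model, not Datasaur's code: rows are represented as plain dictionaries from question name to answer, the function names are hypothetical, and treating a missing answer as different from an existing one when deciding whether to show Mixed is an assumption.

```python
MIXED = "Mixed"

def displayed_value(selected_rows, question):
    """Value shown below a question when several rows are selected.

    Assumption: a missing answer counts as a distinct value.
    """
    values = {row.get(question) for row in selected_rows}
    return values.pop() if len(values) == 1 else MIXED

def apply_bulk_answer(selected_rows, submitted):
    """Copy each submitted answer onto every selected row.

    Questions that are not part of `submitted` keep their current values.
    """
    for row in selected_rows:
        row.update(submitted)
    return selected_rows

# Two selected rows with different answers for "sentiment".
row_a = {"sentiment": "POSITIVE", "topic": "billing"}
row_b = {"sentiment": "NEGATIVE"}

before = displayed_value([row_a, row_b], "sentiment")

# Answer only the "sentiment" question and leave the rest untouched.
apply_bulk_answer([row_a, row_b], {"sentiment": "NEUTRAL"})

after = displayed_value([row_a, row_b], "sentiment")
```

Only the submitted question is overwritten, so `row_a` keeps its existing `topic` answer while both rows end up with the same `sentiment`.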

Auto-saved answers

Auto-save stores your answers as a draft before you submit them, and the draft remains even if you accidentally refresh or close the page. This feature is particularly suitable for row-based or document-based projects that have many questions.

Auto-saved answers apply to both labelers and reviewers. If another reviewer submits an answer in the extension, the current draft is replaced with the latest answers, determined by comparing timestamps.
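One way to picture the timestamp comparison is the sketch below. It is a hypothetical model rather than Datasaur's implementation: the `Snapshot` class, the `resolve_current` function, and the rule that a tie keeps the local draft are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Snapshot:
    answers: dict        # question name -> answer
    saved_at: datetime   # when the draft was saved or the answer submitted

def resolve_current(draft: Optional[Snapshot],
                    submitted: Optional[Snapshot]) -> Optional[Snapshot]:
    """Pick what the extension shows: the local draft or the submitted answers."""
    if submitted is None:
        return draft
    if draft is None:
        return submitted
    # A newer submission replaces the draft; otherwise the draft is kept.
    return submitted if submitted.saved_at > draft.saved_at else draft

draft = Snapshot({"sentiment": "POSITIVE"}, datetime(2024, 5, 1, 10, 0))
newer_submission = Snapshot({"sentiment": "NEGATIVE"}, datetime(2024, 5, 1, 10, 5))
older_submission = Snapshot({"sentiment": "NEUTRAL"}, datetime(2024, 5, 1, 9, 55))

shown_after_newer = resolve_current(draft, newer_submission)
shown_after_older = resolve_current(draft, older_submission)
```

Here the 10:05 submission replaces the 10:00 draft, while the 09:55 submission leaves the draft in place.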

Additionally, you can discard drafts by clicking the three-dot menu in the Document Labeling questions.

Please note that the auto-saved answer feature is only available for single row selection, and this capability also applies to discarding drafts.

  • For row-based projects, please note that the draft will only be shown in the Document Labeling extension, not in the table

    • Answers will be shown in the table once the submission is successful

  • Changing question sets will remove the auto-saved answers

Once you set the number-of-rows preference in Step 2, you can navigate and paginate your labeling process using the Navigation Bar above the table. See the Table Data documentation for more information.
