Predictive Labeling

Predictive Labeling predicts labels for your data from a small set of manually labeled entries, reducing the time annotators spend on labeling. Label less, predict more!

Introduction

Predictive labeling leverages machine learning to predict labels for data based on a subset of manually labeled entries. This feature is designed to significantly reduce the time and effort required for manual data labeling. Once predictions are made, they can be accepted or rejected based on project requirements. Predictive Labeling is especially useful for handling large volumes of data. However, it is important to review the predicted labels to avoid errors.

Key Features

Datasaur Predictive Labeling offers several benefits:

  • Enhanced efficiency: Predictions replace much of the manual labeling, reducing the time and effort required from annotators.

  • Consistency: Ensures consistent labeling across large datasets, reducing variability that can occur with manual labeling.

  • Scalability: Efficiently manage and label large volumes of data, making it easier to scale projects and handle extensive datasets.

Quick Start Guide

Here's a quick guide to using Predictive Labeling in a row labeling project:

Step 1: Enable Predictive Labeling Extension

In a row labeling project, click the gear icon in the extension panel on the right to open the Manage extensions dialog. Then, enable the Predictive labeling extension.

Manage extensions

Step 2: Configure Input and Output Fields

After enabling the extension, configure the input and output fields. For Input column(s), select the columns that will be used as context. For Target field, select the field where the predicted answers will be added.

Predictive labeling extension

Step 3: Save Configuration and Start Prediction

Click Save configuration to start the prediction. If the project already contains some labeled data, the system starts showing predictions immediately. Otherwise, first label at least 5 data points for each answer category.

  • For instance, if there are two categories: POSITIVE and NEGATIVE, label at least 5 data points as POSITIVE and 5 as NEGATIVE.
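
The minimum-label requirement above can be checked before starting a prediction. The following is a minimal sketch, not part of Datasaur: the function name `missing_labels`, the `target_field` list, and the use of `None` for unlabeled rows are all assumptions made for illustration. It counts the labels per category and reports how many more each category still needs to reach the threshold of 5.

```python
from collections import Counter

MIN_LABELS_PER_CATEGORY = 5  # threshold stated in this guide


def missing_labels(labels, categories, minimum=MIN_LABELS_PER_CATEGORY):
    """Return how many more labeled rows each category still needs.

    `labels` holds one value per row; None marks an unlabeled row
    (a hypothetical convention, not Datasaur's export format).
    """
    counts = Counter(label for label in labels if label is not None)
    return {
        category: minimum - counts[category]
        for category in categories
        if counts[category] < minimum
    }


# Hypothetical target-field values: 5 POSITIVE, 3 NEGATIVE, 12 unlabeled rows.
target_field = ["POSITIVE"] * 5 + ["NEGATIVE"] * 3 + [None] * 12

shortfall = missing_labels(target_field, ["POSITIVE", "NEGATIVE"])
print(shortfall)  # {'NEGATIVE': 2}
```

An empty result means every category has reached the minimum, so predictions should be able to start.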

Step 4: Review and Accept or Reject Predictions

Once predictions are displayed, they can be reviewed. Accept or reject the labels based on their accuracy and relevance. This iterative review process helps to refine the model and improve prediction accuracy over time.
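
The accept-or-reject step can be pictured as simple bookkeeping over rows. The sketch below is only an illustration under stated assumptions: the row layout (`id`, `label`, `prediction`), the `decisions` mapping, and the `apply_review` function are hypothetical and do not describe Datasaur's internal implementation or API. Accepted predictions become labels, rejected ones are discarded, and unreviewed rows are left unchanged.

```python
def apply_review(rows, decisions):
    """Apply reviewer decisions to predicted labels.

    rows: list of dicts with hypothetical keys "id", "label", "prediction".
    decisions: dict mapping a row id to "accept" or "reject".
    Returns new row dicts; the input rows are not modified.
    """
    reviewed = []
    for row in rows:
        updated = dict(row)
        decision = decisions.get(row["id"])
        if decision == "accept":
            # An accepted prediction becomes the row's label.
            updated["label"] = row["prediction"]
        if decision in ("accept", "reject"):
            # Either way, the prediction has been handled.
            updated["prediction"] = None
        reviewed.append(updated)
    return reviewed


# Hypothetical rows awaiting review.
rows = [
    {"id": 1, "text": "Great product", "label": None, "prediction": "POSITIVE"},
    {"id": 2, "text": "Not bad at all", "label": None, "prediction": "NEGATIVE"},
    {"id": 3, "text": "Arrived late", "label": None, "prediction": "NEGATIVE"},
]
decisions = {1: "accept", 2: "reject"}  # row 3 has not been reviewed yet

reviewed = apply_review(rows, decisions)
for row in reviewed:
    print(row["id"], row["label"], row["prediction"])
```

In this sketch, row 2 stays unlabeled after rejection so that an annotator can label it manually.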

Predictive labeling results

By following these steps, projects can use Datasaur Predictive Labeling to label large datasets more efficiently and consistently.
