# Predictive Labeling

## Introduction

**Predictive labeling** uses machine learning to predict labels for data based on a subset of manually labeled entries. It is designed to reduce the time and effort required for manual data labeling, and it is especially useful for large volumes of data. Once predictions are made, they can be accepted or rejected based on project requirements. Predictions can be wrong, so review the predicted labels before relying on them.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-f3781fc96281d4cb245211ab39a31427a5090b58%2FExtension%20-%20Predictive%20Labeling%20-%20cover.png?alt=media" alt=""><figcaption></figcaption></figure>

## Key Features

Using Datasaur **Predictive labeling** offers several benefits:

* **Enhanced efficiency**: Reduces the time and effort annotators spend by using predictions to minimize manual labeling.
* **Consistency**: Ensures consistent labeling across large datasets, reducing variability that can occur with manual labeling.
* **Scalability**: Efficiently manages and labels large volumes of data, making it easier to scale projects and handle extensive datasets.

## Quick Start Guide

Here's a quick guide to using **Predictive labeling** in a row labeling project:

### Step 1: Enable Predictive Labeling Extension

In a row labeling project, click the gear icon in the extension panel on the right to open the **Manage extensions** dialog. Then, enable the **Predictive labeling** extension.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-8c1fe60f8e099c41be3844f68f710590550ddd71%2FExtension%20-%20Manage%20extensions%20-%20Predictive%20labeling.png?alt=media" alt=""><figcaption><p>Manage extensions</p></figcaption></figure>

### Step 2: Configure Input and Output Fields

After enabling the extension, configure the input and output fields. For **Input column(s)**, select the columns that will be used as context. For **Target field**, select the field where the predicted answers will be added.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-25db4dbaea602fadeb5ba510e82d97f1336b507d%2FExtension%20-%20Predictive%20Labeling%20-%20highlight%20-%20initials.png?alt=media" alt=""><figcaption><p><strong>Predictive labeling</strong> extension</p></figcaption></figure>

### Step 3: Save Configuration and Start Prediction

Click **Save configuration** to start the prediction. If the project already contains some labeled data, the system will immediately start showing predictions. If not, first label at least 5 data points for each category of the answer.

* For instance, if there are two categories, `POSITIVE` and `NEGATIVE`, label **at least 5 data points** as `POSITIVE` and 5 as `NEGATIVE`.
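
The minimum-label rule above can be checked offline before opening the project. The following is a minimal sketch, not part of Datasaur: it assumes the target field values have already been exported to a plain Python list, and the function name `missing_labels` and the sample data are hypothetical.

```python
from collections import Counter

# Threshold stated in this guide: at least 5 labeled data points per category.
MIN_LABELS_PER_CATEGORY = 5


def missing_labels(labels, categories, minimum=MIN_LABELS_PER_CATEGORY):
    """Return {category: labels still needed} for categories below the minimum."""
    # Skip unlabeled rows, represented here as None or an empty string.
    counts = Counter(label for label in labels if label)
    return {
        category: minimum - counts[category]
        for category in categories
        if counts[category] < minimum
    }


# Hypothetical target-field values, one per row; None means not yet labeled.
target_field = ["POSITIVE"] * 5 + ["NEGATIVE"] * 3 + [None] * 4
shortfall = missing_labels(target_field, ["POSITIVE", "NEGATIVE"])
print(shortfall)  # {'NEGATIVE': 2}
```

An empty result means every category already meets the minimum. In the sample data, two more rows still need to be labeled `NEGATIVE`.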

### Step 4: Review and Accept or Reject Predictions

Once predictions are displayed, they can be reviewed. Accept or reject the labels based on their accuracy and relevance. This iterative review process helps to refine the model and improve prediction accuracy over time.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-2abcc7575ec7f63c078000ee3fe5bce0691aa53b%2FExtension%20-%20Predictive%20labeling%20-%20project%20-%20results%20available.png?alt=media" alt=""><figcaption><p><strong>Predictive labeling</strong> results</p></figcaption></figure>

By following these steps, a project can use Datasaur **Predictive labeling** to label large datasets faster, while annotators remain in control of which predictions are accepted.
