# Predictive Labeling

## Overview

Datasaur's **Predictive labeling** extension uses machine learning to automate the labeling process: it predicts labels for your data based on a subset of manually labeled entries. This reduces the time and effort required for manual labeling, which is especially useful for large datasets. The extension helps users streamline their labeling workflow, improve consistency, and make data labeling more cost-effective.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-25db4dbaea602fadeb5ba510e82d97f1336b507d%2FExtension%20-%20Predictive%20Labeling%20-%20highlight%20-%20initials.png?alt=media" alt=""><figcaption><p><strong>Predictive labeling</strong> extension</p></figcaption></figure>

## Use Case

Let's use **Predictive labeling** to annotate data for a spam message detection model.

1. Create a project: Follow the guide [here](https://docs.datasaur.ai/nlp-projects/creating-a-project) to create a row labeling project. Here’s what the data looks like.

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-602d93006562b86feb8af098c3cef49c216b0229%2FExtension%20-%20Predictive%20labeling%20-%20project%20-%20unlabeled.png?alt=media" alt=""><figcaption><p>Empty Row Based Project</p></figcaption></figure>
2. Enable **Predictive labeling**: Click the gear icon in the extension panel on the right to open the **Manage extensions** dialog, then enable the **Predictive labeling** extension.

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-8c1fe60f8e099c41be3844f68f710590550ddd71%2FExtension%20-%20Manage%20extensions%20-%20Predictive%20labeling.png?alt=media" alt=""><figcaption><p>Manage Extension</p></figcaption></figure>
3. Manually label the data: Once it's enabled, you can start labeling your data. Make sure to label a minimum of five items for each answer category. For example, if you have two categories—`True` and `False`—label at least five items as `True` and five items as `False`.

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-e96104b41b6ea6fa36f21ac4576bf07fbd69e7d1%2FExtension%20-%20Predictive%20labeling%20-%20project%20-%20data%20labeled.png?alt=media" alt=""><figcaption><p>Manual Labeling</p></figcaption></figure>
4. Predict the labels: Select the **Input column(s)** to use as context (in this case, **Message**) and the **Target field**, which is the column for the predicted answer. Then, click **Save configuration**. The **Predictive labeling** extension then predicts the labels automatically.

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-2abcc7575ec7f63c078000ee3fe5bce0691aa53b%2FExtension%20-%20Predictive%20labeling%20-%20project%20-%20results%20available.png?alt=media" alt=""><figcaption><p>Predictive Labeling Result</p></figcaption></figure>
5. Review the results: Review the predictions and accept or reject them individually, or click **Accept all** or **Reject all** to handle them in bulk.

For further details, see the [Assisted Labeling - Predictive Labeling](https://docs.datasaur.ai/assisted-labeling/predictive-labeling) page.
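
The minimum-label requirement from step 3 can be checked outside Datasaur before you run predictions. The following is a minimal sketch, not part of Datasaur's API: it assumes the labeled rows are available as CSV text, and the `Spam` target column, the sample messages, and the helper names `count_labels` and `categories_below_minimum` are all hypothetical. Only the `Message` column, the `True`/`False` categories, and the minimum of five come from the walkthrough above.

```python
import csv
import io
from collections import Counter

MIN_PER_CATEGORY = 5  # minimum number of manual labels per category (step 3)


def count_labels(rows, target_field):
    """Count manually labeled rows per category, skipping empty cells."""
    counts = Counter()
    for row in rows:
        value = (row.get(target_field) or "").strip()
        if value:
            counts[value] += 1
    return counts


def categories_below_minimum(rows, target_field, categories,
                             minimum=MIN_PER_CATEGORY):
    """Return {category: current_count} for categories that need more labels."""
    counts = count_labels(rows, target_field)
    return {c: counts[c] for c in categories if counts[c] < minimum}


# Hypothetical sample data: "Message" is the input column,
# "Spam" is an assumed name for the target column.
SAMPLE_CSV = """Message,Spam
Win a free prize now,True
Claim your reward today,True
Cheap loans approved instantly,True
You have been selected for a gift card,True
Urgent: verify your account,True
See you at lunch,False
Meeting moved to 3pm,False
Thanks for the update,False
Can you call me back,
Limited time offer inside,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shortfall = categories_below_minimum(rows, "Spam", ["True", "False"])
print(shortfall)  # {'False': 3}
```

In this sample, `True` already has five labels while `False` has only three, so two more `False` rows would need to be labeled manually before running the extension.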
