Predictive Labeling

Predictive Labeling uses your existing labeled data to generate labels for the questions or tasks provided.

Overview

Datasaur's Predictive Labeling uses machine learning to automate labeling by predicting labels from a subset of manually labeled entries. This reduces the time and effort required for manual labeling, which is especially useful for large datasets. The Predictive Labeling extension helps users streamline their labeling workflow, improve consistency, and keep the labeling process cost-effective.
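To make the idea concrete, here is a minimal sketch of predicting labels from a small manually labeled subset. It is not Datasaur's implementation: the word-count Naive Bayes classifier, the function names (`tokenize`, `train`, `predict`), the label names, and the sample messages are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(labeled_rows):
    """Count words per label from (message, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for message, label in labeled_rows:
        label_counts[label] += 1
        for word in tokenize(message):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary


def predict(model, message):
    """Return the label with the highest Naive Bayes log score."""
    word_counts, label_counts, vocabulary = model
    total_rows = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_rows)
        total_words = sum(word_counts[label].values())
        for word in tokenize(message):
            if word not in vocabulary:
                continue  # ignore words never seen in the labeled subset
            # Add-one (Laplace) smoothing avoids zero probabilities.
            smoothed = word_counts[label][word] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical manually labeled subset: five rows per category.
labeled = [
    ("win a free prize now", "spam"),
    ("free cash claim now", "spam"),
    ("claim your free prize", "spam"),
    ("urgent win cash now", "spam"),
    ("free entry win prize", "spam"),
    ("are we meeting for lunch", "not spam"),
    ("see you at home tonight", "not spam"),
    ("call me when you are home", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
    ("thanks see you tomorrow", "not spam"),
]
unlabeled = ["claim your free cash", "see you at lunch"]

model = train(labeled)
predictions = {message: predict(model, message) for message in unlabeled}
print(predictions)
```

The point of the sketch is the workflow rather than the model: a handful of manual labels is enough to produce suggestions for the remaining rows, which a reviewer can then accept or reject.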

Use Case

Let's use Predictive Labeling to annotate data for a spam message detection model.

  1. Create Project: Follow the guide here to create a row-based project. Here’s what the data looks like below.

  2. Enable Predictive Labeling: Click the 'Manage extension' icon on the right bar and toggle on the Predictive Labeling feature.

  3. Manually label the data: Once it's enabled, you can start labeling your data. Make sure to label a minimum of five items for each answer category. For example, if you have two categories—True and False—label at least five items as True and five items as False.

  4. Predict the labels: Select the "Input Column(s)" to use as context (in this case, "Message") and the "Target Field" as the column for the predicted answer. Then click "Save Configuration". The Predictive Labeling feature then automatically predicts the labels.

  5. Review the results: You can now review the results, individually label them, or click "Accept All" or "Reject All".
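The five-per-category minimum from step 3 can be checked before running predictions. The sketch below is a hypothetical helper, not part of Datasaur: the function name `categories_needing_labels`, the constant, and the sample labels are assumptions, and `None` is assumed to stand for an unlabeled row.

```python
from collections import Counter

# Minimum number of manual labels per answer category, as stated in step 3.
MIN_LABELS_PER_CATEGORY = 5


def categories_needing_labels(labels, expected_categories,
                              minimum=MIN_LABELS_PER_CATEGORY):
    """Return {category: labels still needed} for categories below the minimum.

    `labels` holds one entry per row; None marks a row not yet labeled.
    """
    counts = Counter(label for label in labels if label is not None)
    return {
        category: minimum - counts[category]
        for category in expected_categories
        if counts[category] < minimum
    }


# Hypothetical project state: 5 rows labeled True, 3 labeled False, 4 unlabeled.
labels = ["True"] * 5 + ["False"] * 3 + [None] * 4
missing = categories_needing_labels(labels, ["True", "False"])
print(missing)  # {'False': 2}
```

Passing the expected categories explicitly matters here, because a category with no labels at all would otherwise never show up in the counts.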

For further details, see the Assisted Labeling - Predictive Labeling page.
