Predictive Labeling

Predictive Labeling generates labels from your existing labeled data, helping you answer the questions or tasks in a project.

Overview

Datasaur's Predictive labeling uses machine learning to automate labeling: it predicts labels for your data from a subset of manually labeled entries. This reduces the time and effort that manual labeling requires, which is especially useful for large datasets. The Predictive labeling extension lets users streamline their labeling workflow, improve consistency, and keep the labeling process cost-effective.

Predictive labeling extension
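
To make the idea of predicting labels from a manually labeled subset concrete, here is a minimal sketch in Python. It is not Datasaur's implementation: this page does not say which model the extension uses, so a small naive Bayes text classifier stands in as an assumption. The function names and the sample messages are hypothetical, and the example uses fewer labeled rows than the extension requires, purely to stay short.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a message and split it on whitespace."""
    return text.lower().split()


def train(labeled_rows):
    """Count rows and words per label from (message, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for message, label in labeled_rows:
        label_counts[label] += 1
        word_counts[label].update(tokenize(message))
    return label_counts, word_counts


def predict(message, label_counts, word_counts):
    """Return the most probable label (naive Bayes, add-one smoothing)."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_rows = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_rows in label_counts.items():
        n_words = sum(word_counts[label].values())
        score = math.log(n_rows / total_rows)
        for word in tokenize(message):
            seen = word_counts[label][word]
            score += math.log((seen + 1) / (n_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical manually labeled rows: (Message, spam label).
labeled = [
    ("win a free prize now", "True"),
    ("free cash claim now", "True"),
    ("are we meeting for lunch", "False"),
    ("see you at lunch tomorrow", "False"),
]
label_counts, word_counts = train(labeled)

# Rows that have not been labeled yet receive a predicted label.
unlabeled = ["claim your free prize", "lunch tomorrow at noon"]
predictions = [predict(m, label_counts, word_counts) for m in unlabeled]
print(predictions)
```

The classifier learns only from the labeled rows, so the quality of its predictions depends on how many examples each category has and how representative they are.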

Use Case

Let's use Predictive labeling to annotate data for a spam message detection model.

  1. Create a project: Follow the guide here to create a row labeling project. Here’s what the data looks like.

    Empty Row Based Project
  2. Enable Predictive labeling: Click the gear icon from the extension panel on the right to open the Manage extensions dialog, then enable the Predictive labeling extension.

    Manage Extension
  3. Manually label the data: Once it's enabled, you can start labeling your data. Make sure to label a minimum of five items for each answer category. For example, if you have two categories—True and False—label at least five items as True and five items as False.

    Manual Labeling
  4. Predict the labels: Select the Input column(s) to use as context (in this case, Message) and the Target field, which is the column that receives the predicted answer. Then, click Save configuration. Voilà, the Predictive labeling extension automatically predicts all the labels.

    Predict Labeling Result
  5. Review the results: You can now review the results and accept or reject them individually, or click Accept all or Reject all.
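
The bookkeeping behind steps 3 and 5 can be sketched in Python as follows. This only illustrates the logic described above and is not Datasaur's API: the function names, the row ids, and the use of None for an unlabeled row are all assumptions made for the example.

```python
from collections import Counter

MIN_PER_CATEGORY = 5  # minimum number of labeled items per category (step 3)


def categories_needing_labels(labels, categories, minimum=MIN_PER_CATEGORY):
    """Map each category below the minimum to the number of labels it still needs."""
    counts = Counter(label for label in labels if label is not None)
    return {c: minimum - counts[c] for c in categories if counts[c] < minimum}


def review(predictions, decisions):
    """Split predictions into accepted, rejected, and still-pending rows.

    predictions: {row_id: predicted label}
    decisions:   {row_id: True to accept, False to reject}
    Rows without a decision stay pending.
    """
    accepted = {r: p for r, p in predictions.items() if decisions.get(r) is True}
    rejected = [r for r in predictions if decisions.get(r) is False]
    pending = [r for r in predictions if r not in decisions]
    return accepted, rejected, pending


# Hypothetical Target column: None marks a row that is not labeled yet.
target = ["True"] * 5 + ["False"] * 3 + [None] * 4
missing = categories_needing_labels(target, ["True", "False"])
print(missing)  # "False" has only three labels, so two more are needed

# Hypothetical predictions for the four unlabeled rows, reviewed one by one.
predictions = {8: "False", 9: "True", 10: "False", 11: "True"}
accepted, rejected, pending = review(predictions, {8: True, 9: False, 10: True})
print(accepted, rejected, pending)
```

In this sketch, Accept all corresponds to passing True for every row id, and Reject all to passing False for every row id.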

For further details, see the Assisted Labeling - Predictive Labeling page.
