Datasaur Dinamic with Hugging Face

Hugging Face

The Datasaur Dinamic integration with Hugging Face Auto Train lets you build a model end to end. You train a model on your labeled data, then combine it with ML-Assisted Labeling to predict labels for the remaining unlabeled data, all within the same workflow. This feature is available for both Row Labeling and Span Labeling projects.

Follow these steps to train a model with Datasaur Dinamic using the Hugging Face Auto Train provider:

  1. Create a custom project with your data, which can be pre-labeled or unlabeled.

    • For projects with unlabeled data, you can label the whole dataset or start with a subset. For instance, if you have 20 rows of data, label 10 or 15 rows first; the rest can be automated later.

  2. Enable the Datasaur Dinamic extension in the extension settings.

  3. Fill in the extension details:

    • Provide your Hugging Face authentication credentials, including your username/organization name and API token.

    • Define the target text, which is the column selected for data classification.

    • Choose what the model should predict: a target question with its answer options (for Row Labeling projects) or a label set (for Span Labeling projects).

  4. Click “Train” to start training on the labeled data.

  5. Wait for the model to be deployed. In the meantime, you can monitor updates to your datasets and model on your Hugging Face profile.

  6. After the model is deployed, the extension displays its URL. Copy this URL and use it with the ML-Assisted Labeling extension through the Hugging Face provider.
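
If you want to sanity-check the deployed model outside Datasaur before moving on, the sketch below builds a request against the URL shown by the extension. It is an illustration under assumptions, not part of the Datasaur workflow: it assumes the URL accepts a JSON `POST` body with an `inputs` field and a `Bearer` token header, as hosted Hugging Face inference endpoints typically do. The function name, the placeholder URL, and the token are all hypothetical.

```python
import json
import urllib.request


def build_inference_request(model_url: str, api_token: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a hosted model.

    Assumption: the endpoint expects {"inputs": "<text>"} as JSON and a
    Bearer token in the Authorization header.
    """
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        model_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: substitute the URL shown by the extension and your own token.
request = build_inference_request(
    "https://example.invalid/models/your-username/your-model",
    "hf_xxx",
    "The delivery arrived two days late.",
)

# To actually send the request, uncomment the next line.
# response_body = urllib.request.urlopen(request).read()
```

The request is only constructed here, so the snippet runs without network access; sending it is left as an explicit, commented-out step.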

Automate the rest of the data with ML Assisted

This section assumes that you have already deployed a model on Hugging Face.

  1. Enable the ML-Assisted Labeling extension and select Hugging Face as your provider.

  2. Fill in the ML-Assisted Labeling fields: paste the model name from the Datasaur Dinamic extension (or enter another model of your choice), provide your API token, and set the desired confidence score.

  3. Click “Predict labels” to generate labels for the corresponding rows.
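
To make the role of the confidence score concrete, here is a minimal sketch of how a threshold can decide which predictions are kept. How Datasaur applies the score internally is not described on this page, so this is an assumed interpretation: a row receives the top-scoring label only when its score meets the threshold, and is otherwise left for manual labeling. The function name and sample data are hypothetical.

```python
def pick_label(predictions, min_confidence):
    """Return the top-scoring label, or None if it falls below the threshold.

    `predictions` is a list of {"label": str, "score": float} dicts,
    an assumed shape for a text-classification model's output.
    """
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_confidence else None


# Hypothetical model output for two unlabeled rows.
row_predictions = {
    "row-11": [{"label": "positive", "score": 0.93}, {"label": "negative", "score": 0.07}],
    "row-12": [{"label": "positive", "score": 0.55}, {"label": "negative", "score": 0.45}],
}

# With a threshold of 0.8, only the confident prediction is applied.
auto_labels = {
    row_id: pick_label(preds, min_confidence=0.8)
    for row_id, preds in row_predictions.items()
}
```

Under this interpretation, a higher confidence score yields fewer but more reliable automatic labels, and rows that map to `None` remain for a human labeler.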

Following these steps can significantly reduce the time required to label the entire dataset, since only a subset needs to be labeled by hand.
