Hugging Face

Supported Labeling Types: Span Labeling, Row Labeling

Datasaur integrates directly with Hugging Face, giving you access to more than 10,000 of its models.

After selecting Hugging Face as the option, go to Hugging Face and choose one of the available models. If you host your own private models on Hugging Face, you can use those as well.

Span Labeling

For Span Labeling, enter either the model name or, if you're using a self-hosted model, the endpoint URL. When you use your own endpoint, you don't need to provide a model name or API token. You can also set the confidence score to adjust the prediction threshold to your needs.

Image of ML Assisted with Hugging Face for Span Based

Row Labeling

In Row Labeling, you can choose the Target Text as your input and the Target Question as your desired output. To get started, enter either the model name or the Hugging Face Inference Endpoint URL, along with your API token.
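Before entering a model name or endpoint URL, it can help to check what the model returns for a sample row. The sketch below only builds the HTTP request and does not send it. It assumes the endpoint accepts the common Hugging Face JSON payload with an `inputs` field and a bearer token in the `Authorization` header. The function `build_request`, the example URL, and the placeholder token are hypothetical and are not part of Datasaur.

```python
import json
import urllib.request
from typing import List, Optional


def build_request(endpoint_url: str, texts: List[str],
                  api_token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text-classification endpoint.

    Assumption: the endpoint expects {"inputs": [...]} as JSON and, when a
    token is required, an "Authorization: Bearer <token>" header.
    """
    headers = {"Content-Type": "application/json"}
    if api_token:
        headers["Authorization"] = f"Bearer {api_token}"
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(endpoint_url, data=body, headers=headers, method="POST")


# Hypothetical endpoint URL and token, for illustration only.
request = build_request(
    "https://example.invalid/my-endpoint",
    ["The service was great.", "The package arrived late."],
    api_token="hf_xxx",
)
```

To send the request, pass it to `urllib.request.urlopen` and parse the response body with `json.loads`. The parsed result can then be compared against the response shapes that Row Labeling expects.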

When choosing a model for predicting labels, use a text-classification model. The model should return a list in which each item is itself a list of objects that together cover every prediction (for example positive, negative, and neutral), like this:

[
[ { label: "positive", score: 0.8 }, { label: "neutral", score: 0.15 }, { label: "negative", score: 0.05 } ],
[ { label: "negative", score: 0.6 }, { label: "neutral", score: 0.3 }, { label: "positive", score: 0.1 } ]
]

Alternatively, the model can return a single list of objects that each hold only one prediction (the one with the highest score), like this:

[
{ label: "positive", score: 0.8 },
{ label: "negative", score: 0.6 }
]
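Both response shapes above can be reduced to one top prediction per item. The following is a minimal sketch of that normalization, useful for checking a model's output before using it. The helper name `top_predictions` is hypothetical, and this is not a description of how Datasaur processes responses internally.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def top_predictions(response: List[Union[Prediction, List[Prediction]]]) -> List[Prediction]:
    """Return one prediction per item: the highest-scoring label."""
    result = []
    for item in response:
        if isinstance(item, list):
            # Nested shape: every label is scored, so keep the best one.
            result.append(max(item, key=lambda p: p["score"]))
        else:
            # Flat shape: the item already holds a single top prediction.
            result.append(item)
    return result


nested = [
    [{"label": "positive", "score": 0.8}, {"label": "neutral", "score": 0.15}, {"label": "negative", "score": 0.05}],
    [{"label": "negative", "score": 0.6}, {"label": "neutral", "score": 0.3}, {"label": "positive", "score": 0.1}],
]
flat = [{"label": "positive", "score": 0.8}, {"label": "negative", "score": 0.6}]

# Both shapes reduce to the same top predictions.
same = top_predictions(nested) == top_predictions(flat)
```

With the sample data from this page, the nested and flat responses reduce to the same two predictions: positive at 0.8 and negative at 0.6.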

This feature also includes a Faster Prediction Speed option, which significantly improves performance by processing entire rows at once. Note that this action can’t be undone.

Finally, you can adjust the Confidence Score to manually set the prediction threshold according to your preference.
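As a rough illustration of what a confidence threshold does, the sketch below keeps a label only when its score meets the threshold. The function name `apply_threshold`, the value 0.7, and the inclusive comparison (`>=`) are assumptions for the example. How Datasaur treats a score exactly equal to the threshold is not stated here.

```python
from typing import Dict, List, Optional, Union

Prediction = Dict[str, Union[str, float]]


def apply_threshold(predictions: List[Prediction], threshold: float) -> List[Optional[str]]:
    """Keep a label only if its score is at least `threshold`; otherwise None."""
    return [p["label"] if p["score"] >= threshold else None for p in predictions]


predictions = [
    {"label": "positive", "score": 0.8},
    {"label": "negative", "score": 0.6},
]

# With a threshold of 0.7, only the first prediction is confident enough.
labels = apply_threshold(predictions, 0.7)
```

Raising the threshold leaves more rows unlabeled but makes the applied labels more reliable. Lowering it does the opposite.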

Image of ML Assisted with Hugging Face for Row Based

When you click the Predict Labels button, the project automatically applies labels to the document based on the loaded model.
