LLM Assisted Labeling

Datasaur ML Assisted Labeling with Large Language Model supports users to integrate the labeling process to be assisted labeled with recents top notch model from OpenAI, Azure OpenAI, Anthropic, Gemini and Cohere. This will make labeling easier and general for all use cases!

Providers and Model Support

Provider NameModel

OpenAI

gpt-3.5-turbo, gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini

Azure Open AI

Anthropic

claude-2, claude-2.1, claude-3-haiku-20240307, claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-5-sonnet-20240620

Gemini

gemini-pro, gemini-1.5-flash, gemini-1.5-pro

Cohere

command-light, command-r, command-r-plus

Quick Guide

After successfully creating the project, you need to activate the ML-assisted labeling extension and select LLM Assisted Labeling as the provider. Once you have chosen LLM Assisted Labeling, you can access several fields under the extension. These fields include:

  1. LLM provider: You can choose from the variety of LLM that Datasaur support.

  2. Target text: Define your text column(s) that is going to be treated as input and prompt context.

  3. Target question: Select your question to be answered.

  4. System prompt: Sets the behavior and context for the language model.

  5. User prompt: User definition of a task to be completed in a specific labeling workflow.

  6. API key: The LLM Provider secret key

  7. API version: The API Version from your Azure OpenAI.

  8. API base URL: The base URL for your Azure OpenAI API model.

  9. Model deployment: The deployment model name from Azure OpenAI.

  10. Advanced Settings

    1. Top P: Limits predictions to the smallest set with a cumulative probability of P.

    2. Temperature: Controls randomness; lower values make responses more predictable.

    3. Maximum tokens: Limits the length of the generated response.

    4. Model Name: The specific version of the language model.

For guidance, you can refer to our prompt examples: Row-Based and Span-Based.

Once all fields have been filled, you can predict the label by clicking “Predict label” then you will see the assisted labeling recommendation from your prompts and settings.

If you are experiencing the 429 error, the limitation came from the LLM Provider package. Please take a look at your current usage of LLM Provider API.

Last updated