Custom model

Overview

Integrate your own LLMs into Datasaur as custom models, and use them for exploration in Sandbox or for evaluation.

Integration

While Datasaur offers direct integrations with some providers, you can also connect to models hosted on various third-party platforms or your own infrastructure using the "Custom model" feature.

To simplify integration, Datasaur's Custom Model connection adheres to the widely adopted API structure defined by OpenAI for its completions or chat completions endpoints. This means that if your self-hosted model or third-party serving framework exposes an endpoint that mimics the OpenAI API format, connecting it to Datasaur is straightforward.

API specification

To make things easy, Datasaur connects to custom models using a common method: an API structure that looks just like the one used by OpenAI (specifically, the "Chat Completions" API).

Think of it like using a standard plug: if your model hosting tool (like TGI) provides this standard API interface (an OpenAI-compatible API), Datasaur can plug right in.
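
For illustration, here is a minimal sketch of what such an OpenAI-compatible endpoint could look like on your side. It assumes FastAPI as the web framework, and the generate_reply function is a hypothetical placeholder for your actual inference code; neither is part of Datasaur or any particular serving framework:

    # Minimal sketch of an OpenAI-compatible chat completions endpoint.
    # Assumes FastAPI and uvicorn are installed; generate_reply() is a
    # hypothetical stand-in for your real inference code.
    import time
    import uuid

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Message(BaseModel):
        role: str
        content: str

    class ChatRequest(BaseModel):
        model: str
        messages: list[Message]
        temperature: float = 1.0
        max_tokens: int | None = None

    def generate_reply(messages: list[Message]) -> str:
        # Placeholder: call your model here and return its text.
        return "Hello from your custom model!"

    @app.post("/v1/chat/completions")
    def chat_completions(req: ChatRequest):
        content = generate_reply(req.messages)
        # Shape the response the way the OpenAI Chat Completions API does,
        # matching the example response shown later in this page.
        return {
            "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
            "object": "chat.completion",
            "created": int(time.time()),
            "model": req.model,
            "choices": [
                {
                    "index": 0,
                    "message": {"role": "assistant", "content": content},
                    "finish_reason": "stop",
                }
            ],
            "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
        }

Run it with uvicorn (for example, uvicorn app:app --port 8080) and the endpoint will answer at /v1/chat/completions in the format Datasaur expects.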

How the API connection works

Please note that the streaming option is currently disabled for custom models.

When Datasaur uses your custom model, here’s basically what happens (a runnable Python sketch of the same exchange follows this list):

  1. Datasaur Sends a Request: Datasaur sends information to your model's address using a standard web method (POST). This request goes to a specific path, usually /v1/chat/completions, added to the main address you provide. The request includes:

    • Extra Info (Headers): Tells the server the data is in JSON format. If you added an API Key in Datasaur, it sends that key for security (Authorization: Bearer YOUR_API_KEY).

    • The Actual Data (JSON Body): This contains the important bits:

      • model: The name of the specific model you want to use.

      • messages: The conversation history, including instructions ("system" message) and the user's input ("user" message).

      • Optional settings like temperature (for creativity) or max_tokens (to limit response length).

    Example Request Data:

    {
      "model": "tgi",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        },
        {
          "role": "user",
          "content": "Explain what an LLM is in simple terms."
        }
      ],
      "temperature": 0.7,
      "max_tokens": 150
    }
  2. Your Model Sends a Response: If everything works, your model hosting tool sends back a success message (200 OK) with its own JSON data, including:

    • id: A unique ID for this conversation turn.

    • model: Which model actually answered.

    • choices: An array (usually just one item) containing the model's reply:

      • message: The actual text generated by the model (content) and its role (assistant).

    • usage (Optional): Information about how many "pieces" of text (tokens) were used for the prompt and the answer.

    Example Response Data:

    {
      "id": "chatcmpl-randomid12345",
      "object": "chat.completion",
      "created": 1713867000, // Timestamp
      "model": "your-model-name",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "An LLM, or Large Language Model, is like a super-smart computer program that's read tons of text, so it can understand and write text almost like a human!"
          },
          "finish_reason": "stop" // Why the model stopped writing
        }
      ],
      "usage": {
        "prompt_tokens": 30,
        "completion_tokens": 55,
        "total_tokens": 85
      }
    }
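
To sanity-check your endpoint before connecting it to Datasaur, you can replicate this exchange yourself. The following is a minimal sketch using Python's requests library; the base URL, API key, and model name are placeholders for your own values, not real ones:

    # Sketch: send the same kind of request Datasaur would send.
    # BASE_URL, API_KEY, and the model name below are placeholders.
    import requests

    BASE_URL = "http://your-model-server:8080"  # your server's base address
    API_KEY = "YOUR_API_KEY"                    # only if your server requires one

    response = requests.post(
        f"{BASE_URL}/v1/chat/completions",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        json={
            "model": "tgi",
            "messages": [
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "Explain what an LLM is in simple terms."},
            ],
            "temperature": 0.7,
            "max_tokens": 150,
        },
        timeout=60,
    )
    response.raise_for_status()
    data = response.json()

    # The reply lives in choices[0].message.content, as in the example above.
    print(data["choices"][0]["message"]["content"])

If this prints a sensible reply, the same endpoint, key, and model name should work in Datasaur's Custom model settings.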
    

Text Generation Inference (TGI) & Hugging Face

  • Main Address (Base URL): Enter the address where your TGI server is running, with /v1 added at the end. This might look like http://your-tgi-server-ip:8080/v1, or use port 80 with certain hosting such as Hugging Face Inference Endpoints.

  • Model Name (in Request Data): TGI usually runs one main model at a time. You might just need to put "tgi" as the model name, or use the specific Hugging Face name the model was started with (like "NousResearch/Nous-Hermes-2"). Check your TGI setup. Learn more about TGI.

  • API Key: If you're using TGI, especially through services like Hugging Face, you'll likely need an API Key or Token. Get this key from your TGI provider (like your Hugging Face Access Token) and put it in the API Key field in Datasaur's Custom Model settings.
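
One quick way to confirm that your TGI endpoint really is OpenAI-compatible is to point the official openai Python client at it. This is a sketch only; the base URL and token below are placeholders for your own deployment:

    # Sketch: verify a TGI endpoint with the openai Python client (v1+).
    # The base URL and token are placeholders for your own deployment.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://your-tgi-server-ip:8080/v1",  # note the /v1 suffix
        api_key="hf_your_access_token",                # e.g. a Hugging Face token
    )

    completion = client.chat.completions.create(
        model="tgi",  # TGI usually accepts "tgi" as the model name
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
        max_tokens=50,
    )
    print(completion.choices[0].message.content)

If this works, the same base URL, key, and model name can go straight into Datasaur's Custom model fields.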

Connect custom models

To connect a custom model:

  1. Navigate to the Models catalog page, then click Manage providers.

  2. Select Custom model.

  3. Input your model credentials. The required credentials are:

    1. Endpoint URL: The base URL where your model is served.

    2. API key: The API key used to authenticate with your model, if one is required.

    3. Model name: The name you want the model to appear under in LLM Labs.

If you are adding a custom model from an LLM provider like Hugging Face, you only need to input the endpoint URL without the /v1/chat/completions suffix.

  4. Once you’ve added your credentials, click the Add custom model button, and your custom model will be available in LLM Labs.


Manage custom models

To manage your custom models, click the three-dots icon on the model card. From there, you can:

  • View details

  • Try in Sandbox

  • Edit

  • Delete

Try in Sandbox

Click Try in Sandbox and you'll be automatically taken to a Sandbox project, where you can use your custom model as a base model and test how it works with additional instructions and various prompts. Learn more about Sandbox.


Edit custom model

Click Edit to modify the endpoint URL, API key, and model name. Once you've updated the model credentials, click Save custom model to save your changes.

Delete custom model

Click Delete to delete your model from LLM Labs. Confirm the deletion by clicking the Delete custom model button.

