Deployment API

Overview

This page explains how to call the Deployment API. It covers the endpoint, the request body, examples for common scenarios, response formats, error handling, and rate limits.

Run Deployed LLM Application with Chat Completion

Executes a RAG-enhanced chat completion request against a deployed LLM application.

Endpoint

POST https://deployment.datasaur.ai/api/deployment/:teamId/:deploymentId/chat/completions

Path Parameters

Parameter     Type    Description
teamId        string  Your team identifier
deploymentId  string  The ID of your LLM application deployment

Query Parameters

Parameter  Type    Description
source     string  (Optional) Source identifier
sourceId   string  (Optional) Specific source ID
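
As a minimal sketch of how the path and query parameters fit together, the helper below assembles the endpoint URL. The function name `build_chat_completions_url` and the sample IDs are hypothetical; only the URL pattern and the parameter names `source` and `sourceId` come from this page.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://deployment.datasaur.ai/api/deployment"


def build_chat_completions_url(team_id, deployment_id, source=None, source_id=None):
    """Return the chat completions URL for one deployment (hypothetical helper)."""
    # Percent-encode the path parameters so unusual IDs cannot break the path.
    team = quote(str(team_id), safe="")
    deployment = quote(str(deployment_id), safe="")
    path = f"{BASE_URL}/{team}/{deployment}/chat/completions"

    # Only include the optional query parameters that were actually provided.
    query = {}
    if source is not None:
        query["source"] = source
    if source_id is not None:
        query["sourceId"] = source_id
    return f"{path}?{urlencode(query)}" if query else path


# Sample IDs are placeholders, not real identifiers.
print(build_chat_completions_url("123", "abc-456", source="docs-example"))
```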

Request Body

The request body follows the OpenAI-compatible chat completion format, with additional RAG-specific enhancements.

Key Properties

Property         Type             Required  Description
messages         array            Yes       Array of message objects representing the conversation
stream           boolean          No        Enable streaming responses
include_usage    boolean          No
rewrite_query    string           No        Text to use as the retrieval query, overriding the default behavior of summarizing the messages into a query
filter_metadata  object | string  No        Filter data based on the File Properties attached to each file

Message Types

Each message has one of the following roles:

  • user: User inputs

  • assistant: Assistant responses

  • tool: Tool/function responses
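
The sketch below shows one way to assemble a POST request from the properties and roles listed above, using only the Python standard library. It builds the request object without sending it. The `Authorization: Bearer ...` header is an assumption: this page only says the API key goes in a request header, so check your credentials page for the exact header format. The function name `build_request` and the placeholder values are hypothetical.

```python
import json
import urllib.request

# Roles listed on this page.
ALLOWED_ROLES = {"user", "assistant", "tool"}


def build_request(url, api_key, messages, stream=False):
    """Build (but do not send) a chat completion request (hypothetical helper)."""
    if not messages:
        raise ValueError("messages must not be empty")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")

    body = {"messages": messages, "stream": stream}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: bearer-token scheme; the real header format may differ.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request(
    "https://deployment.datasaur.ai/api/deployment/TEAM_ID/DEPLOYMENT_ID/chat/completions",
    "YOUR_API_KEY",  # placeholder
    [{"role": "user", "content": "What is the capital of France?"}],
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.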

Advanced Features

Multimedia Content Types

The content field in messages supports various types:

type ContentPart =
	| TextContent
	| ImageContent
	| URLContent;

Text Content

{
  "type": "text",
  "text": "Your text here"
}

Image Content

{
  "type": "image",
  "image_url": {
    "url": "https://example.com/image.jpg",
    "detail": "high"
  }
}

URL Content

{
  "type": "url",
  "url": "https://example.com",
  "name": "Optional name",
  "options": {
    "select_pages": "1-3",
    "include_page_screenshot_as_image": false
  }
}

The URL content type supports both standard web URLs and base64-encoded data URLs, allowing you to:

  1. Reference external web content: Use standard URLs (https://example.com)

  2. Embed file content directly: Use base64 data URLs for PDF, HTML, or other content types

    "url": "data:application/pdf;base64,JVBERi0xLjMKJcTl8uXrp..."

Options:

  • select_pages: Specifies which pages to process from multi-page documents (PDFs)

    • Format: "1-5" (range), "1,3,5" (specific pages), or "1-3,7,9-11" (combination)

    • Example: "select_pages": "1-3,5,8-10"

    • Default: All pages if not specified

  • include_page_screenshot_as_image: When set to true, includes visual representations of pages. These are passed to the model if the model selected in the sandbox application supports visual input.

    • For PDFs: Renders page as image for visual analysis

    • For websites: Captures screenshot of the rendered page

    • Enables the model to analyze visual layouts, charts, and non-text elements

    • Default: false
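
To make the select_pages format concrete, here is a small client-side sketch that expands a specification string into page numbers, which can be handy for validating input before sending a request. It assumes page numbers are 1-based and ranges are inclusive; the server's exact parsing rules are not documented here, and `parse_select_pages` is a hypothetical name.

```python
def parse_select_pages(spec):
    """Expand a select_pages string such as "1-3,7,9-11" into sorted page numbers.

    ASSUMPTION: pages are 1-based and ranges are inclusive on both ends.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)  # raises ValueError for empty or non-numeric parts
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


print(parse_select_pages("1-3,7,9-11"))  # [1, 2, 3, 7, 9, 10, 11]
```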

Base64 encoding is particularly useful for:

  • Embedding content directly without requiring separate file uploads

  • Processing temporary or dynamically generated content

  • Working with content that doesn't have a public URL
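
A base64 data URL can be produced from raw bytes with the standard library. The sketch below wraps the result in a URL content part shaped like the one shown earlier; the helper name `make_data_url_part` is hypothetical.

```python
import base64


def make_data_url_part(data, mime_type, name=None, include_screenshot=False):
    """Wrap raw bytes as a "url" content part using a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    part = {
        "type": "url",
        "url": f"data:{mime_type};base64,{encoded}",
        "options": {"include_page_screenshot_as_image": include_screenshot},
    }
    if name is not None:
        part["name"] = name
    return part


part = make_data_url_part(b"Hello World", "text/html", name="Example Page")
print(part["url"])  # data:text/html;base64,SGVsbG8gV29ybGQ=
```

For a file on disk, the bytes could come from `open(path, "rb").read()` with a MIME type such as `application/pdf`.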

Advanced Retrieval: Query Rewriting

Query rewriting modifies or enhances the original search query to improve retrieval results. By default, the messages are summarized into a query. If you prefer your own wording, set rewrite_query and that text is used as the query instead. For example:

{
  "messages": [
    { "role": "user", "content": "What's the weather in Bali?" },
    { "role": "assistant", "content": "It's hot and humid." },
    { "role": "user", "content": "How about in Jakarta?" }
  ],
  "rewrite_query": "What's the weather in Jakarta?"
}

Advanced Retrieval: Metadata Filtering

Another way to improve retrieval accuracy is to apply additional filtering so that only relevant information is retrieved. The example below filters search results by jurisdiction and date.

{
  "filter_metadata": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "term": { "jurisdiction": "alabama" } },
              { "term": { "jurisdiction": "florida" } },
              { "term": { "jurisdiction": "nevada" } }
            ]
          }
        },
        {
          "range": { "date": { "gt": "2024-10-29" } }
        }
      ]
    }
  }
}
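
Nested filters like the one above are easy to mistype by hand. The sketch below composes the same structure from three small helpers. The helper names (`any_of`, `after`, `all_of`) are hypothetical, and only the operators that appear in the example (`bool`, `must`, `should`, `term`, `range`, `gt`) are used; whether other operators are supported is not stated on this page.

```python
def any_of(field, values):
    """Match documents whose field equals any of the given values."""
    return {"bool": {"should": [{"term": {field: value}} for value in values]}}


def after(field, value):
    """Match documents whose field is strictly greater than value."""
    return {"range": {field: {"gt": value}}}


def all_of(*clauses):
    """Require every clause to match."""
    return {"bool": {"must": list(clauses)}}


filter_metadata = all_of(
    any_of("jurisdiction", ["alabama", "florida", "nevada"]),
    after("date", "2024-10-29"),
)
body = {
    "messages": [{"role": "user", "content": "Summarize the recent rulings."}],
    "filter_metadata": filter_metadata,
}
```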

Example Usage

Simple Text Query

{
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}

Multi-turn Conversation

{
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    },
    {
      "role": "assistant",
      "content": "The capital of France is Paris. It's often called the 'City of Light' (Ville Lumière)."
    },
    {
      "role": "user",
      "content": "Tell me more about its population."
    }
  ]
}

Text with Images

{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What's in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg",
            "detail": "high"
          }
        }
      ]
    }
  ]
}

Text with PDF Analysis

{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please summarize this research paper"
        },
        {
          "type": "url",
          "url": "https://doompdf.pages.dev/doom.pdf",
          "options": {
            "select_pages": "1-5",
            "include_page_screenshot_as_image": true
          }
        }
      ]
    }
  ]
}

URL with Base64 Screenshot

{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What does this webpage contain?"
        },
        {
          "type": "url",
          "url": "data:text/html;base64,SGVsbG8gV29ybGQ=",
          "name": "Example Page",
          "options": {
            "include_page_screenshot_as_image": true
          }
        }
      ]
    }
  ]
}

Metadata Filtering

{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What's in pages 2-5 of the technical documentation?"
        }
      ]
    }
  ],
  "filter_metadata": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "jurisdiction": "alabama"
                }
              },
              {
                "term": {
                  "jurisdiction": "florida"
                }
              },
              {
                "term": {
                  "jurisdiction": "nevada"
                }
              }
            ]
          }
        },
        {
          "range": {
            "date": {
              "gt": "2024-10-29"
            }
          }
        }
      ]
    }
  }
}

Each example can be enhanced with additional parameters such as:

  • stream: true for streaming responses

  • tools for function calling capabilities

  • rewrite_query for optimized RAG queries

Response

Non-streaming Response

{
  id: string;
  choices: Array<{
    message: {
      role: 'assistant';
      content: string;
      tool_calls?: Array<ToolCall>;
    };
    finish_reason: 'stop' | 'length' | 'tool_calls' | 'content_filter';
    index: number;
  }>;
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
    prompt_embedding_tokens?: number;
  };
  contexts?: Array<RagContext>;
}
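
As a rough illustration of reading this shape, the sketch below pulls the commonly needed fields out of a decoded response. The sample dictionary is made up for demonstration, and `summarize_response` is a hypothetical helper; optional fields are treated as possibly absent.

```python
def summarize_response(response):
    """Extract the main fields from a decoded non-streaming response."""
    choice = response["choices"][0]
    message = choice["message"]
    return {
        "content": message["content"],
        "finish_reason": choice["finish_reason"],
        # tool_calls and contexts are optional, so fall back to empty lists.
        "tool_calls": message.get("tool_calls") or [],
        "total_tokens": response["usage"]["total_tokens"],
        "context_count": len(response.get("contexts") or []),
    }


# Hypothetical sample data, shaped like the type above.
sample = {
    "id": "example-id",
    "choices": [
        {
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
            "index": 0,
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 2, "total_tokens": 14},
}
print(summarize_response(sample))
```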

Streaming Response

Sends chunks of the response as Server-Sent Events (SSE) with the following format:

{
  id: string;
  choices: Array<{
    delta: {
      content?: string;
      role?: 'assistant';
      tool_calls?: Array<ToolCall>;
    };
    finish_reason: string | null;
    index: number;
  }>;
  usage: null | UsageInfo;
  contexts?: Array<RagContext> | null;
}
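
A minimal sketch of consuming such a stream: it reads `data:` lines, decodes each chunk, and joins the `delta.content` pieces. It assumes each event carries one JSON chunk on a single `data:` line and that the stream may end with a `[DONE]` sentinel, as in OpenAI-style streams; neither detail is confirmed on this page. The sample lines are made up.

```python
import json


def collect_stream(lines):
    """Join delta.content across SSE lines and return (text, usage)."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        if payload == "[DONE]":  # ASSUMPTION: OpenAI-style end-of-stream marker
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text_parts.append(content)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Hypothetical sample stream.
sample_lines = [
    'data: {"id": "c1", "choices": [{"delta": {"role": "assistant", "content": "Par"}, "finish_reason": null, "index": 0}], "usage": null}',
    "",
    'data: {"id": "c1", "choices": [{"delta": {"content": "is"}, "finish_reason": "stop", "index": 0}], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample_lines)
print(text)
```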

Error Handling

The API uses standard HTTP status codes. Each error below is listed with recommended actions for resolution:

400 Bad Request

Cause: Invalid parameters or malformed request

Recommended resolution:

  • Check request body format and required fields

  • Validate parameter types and values

  • Ensure message array is not empty

  • Check if URLs are properly formatted and accessible

401 Unauthorized

Cause: Missing or invalid authentication

Recommended resolution:

  • Check if API key is included in the request header

  • Verify API key is valid and not expired

  • Ensure API key has correct format

  • Generate a new API key if necessary

403 Forbidden

Cause: Insufficient permissions for the requested operation

Recommended resolution:

  • Verify team membership and permissions

  • Check if you have access to the specified LLM application

  • Ensure your subscription covers the requested features

  • Request necessary permissions from team admin

  • Upgrade subscription tier if needed

429 Too Many Requests

Recommended resolution:

  • Implement exponential backoff retry logic

  • Check rate limits in response headers

  • Reduce request frequency

  • Consider upgrading your plan for higher limits

  • Optimize batch operations to reduce API calls
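
The exponential backoff suggested above can be sketched as follows. The `send` argument is a hypothetical zero-argument callable that performs one request and returns `(status_code, body)`. The delays, the attempt count, and the jitter amount are illustrative defaults, not values recommended by the API.

```python
import random
import time

# 429 is retried per the advice above; 500 is included because this page
# also suggests retrying server errors after a brief delay.
RETRYABLE_STATUSES = {429, 500}


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, body
        # Double the delay on every retry, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Add random jitter so many clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without waiting.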

500 Internal Server Error

Cause: Server-side error

Recommended resolution:

  • Retry request after a brief delay

  • Verify request payload size is within limits

  • Save error response for troubleshooting

For all errors, the response will include a detailed error message to help diagnose the issue. If problems persist after taking the recommended actions, please contact support@datasaur.ai with the error details.

Rate Limiting

The API enforces the following rate limits. If any of these limits is reached, subsequent requests will be rejected with a 429 (Too Many Requests) status code until the limit resets:

  • Origin IP address-based: 1500 requests per 60 seconds

  • Deployment ID-based: 300 requests per 60 seconds

  • Team-based: Daily limits apply for free trial accounts

Note: These limits are evaluated independently - hitting any single limit will result in request rejection, regardless of the status of other limits.
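
To stay under a per-deployment limit such as 300 requests per 60 seconds, a client can track its own request timestamps. The sketch below is a client-side sliding-window counter under stated assumptions: how the server measures its window is not documented here, so treat this as a conservative approximation. The class name `SlidingWindowLimiter` is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=300, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps = deque()

    def try_acquire(self):
        """Record a request and return True if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter()
if limiter.try_acquire():
    pass  # safe to send one request here
```

Because the limits are evaluated independently, one limiter per deployment ID does not account for the IP-based or team-based limits.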

Best Practices

  1. Set appropriate temperature values in the sandbox:

    1. Lower (0.2) for factual responses

    2. Higher (0.8) for creative responses

  2. Enable stream: true for real-time responses

  3. Use rewrite_query for optimized RAG queries

  4. Include relevant file and URL content for context-aware responses

Last updated 2 months ago

Notes

  • System role: This API doesn't support the system role and will inherit the System Instruction from the sandbox. See: https://docs.datasaur.ai/llm-projects/deployment

  • 401 Unauthorized: Contact support@datasaur.ai if the API key should be valid.

  • 429 Too Many Requests: Cause: Rate limit exceeded. See the Rate Limiting section for more details.

  • 500 Internal Server Error: Check the system status page for outages. Contact support@datasaur.ai if the error persists.