Direct Access LLMs
Datasaur offers Direct Access LLMs, a feature that allows users to instantly access and call the most popular Large Language Models (LLMs) within the platform. This feature eliminates the need for complex API key setup and multi-cloud configuration. Additionally, users can skip waitlists and immediately access the latest state-of-the-art models.
Datasaur's Direct Access LLM feature currently supports Azure OpenAI, OpenAI, Amazon Bedrock, Google Vertex AI, Azure AI, and Hugging Face, each offering a unique set of LLM models. Below, we delve into the details of each provider and the models they offer.
With Azure OpenAI, users can utilize the following models:
gpt-4o: A highly advanced model boasting an expansive knowledge base for richer and more comprehensive responses.
gpt-4-32k: A variant of the gpt-4 model, with greater capacity to handle longer inputs.
gpt-4 turbo: A high-performance model optimized for speed and efficiency.
gpt-4: A powerful model offering advanced language understanding and generation capabilities.
gpt-35-turbo-16k: A variant of the gpt-35-turbo model, with greater capacity to handle longer inputs.
gpt-35-turbo: A fast and efficient model ideal for applications requiring rapid response times.
With OpenAI, users can utilize the following models:
o1-mini-2024-09-12: A compact and efficient model well-suited for tasks that require fast inference and lower computational resources. It excels in short-form text generation, question answering, and text summarization.
o1-preview-2024-09-12: A preview model offering advanced capabilities and access to the latest developments in OpenAI's LLM technology. This model is ideal for exploring cutting-edge language processing tasks and experimenting with potential future functionalities.
gpt-4o-mini: A streamlined and efficient version of the advanced gpt-4o model, designed to deliver rich responses while requiring less computational power.
gpt-4o: A highly advanced model boasting an expansive knowledge base for richer and more comprehensive responses.
gpt-4-turbo: A high-performance model optimized for speed and efficiency.
gpt-4: A powerful model offering advanced language understanding and generation capabilities.
gpt-3.5-turbo-16k: A variant of the gpt-3.5-turbo model, with greater capacity to handle longer inputs.
gpt-3.5-turbo: A fast and efficient model ideal for applications requiring rapid response times.
With Amazon Bedrock, Datasaur is able to provide foundation models from several providers, such as:
Claude 3.5 Sonnet: An enhanced version of Claude 3 Sonnet, with updated knowledge and improved reasoning capabilities.
Claude 3 Sonnet: A balanced Claude model, offering deeper analysis and extended conversations.
Claude 3 Opus: The most comprehensive Claude model, providing in-depth expertise across a wide range of subjects.
Claude 3 Haiku: A concise and efficient AI assistant, perfect for brief, focused interactions.
Claude 2.1: An updated version of Claude 2.0, featuring refinements in language understanding and generation.
Claude 2.0: An upgraded Claude model with expanded knowledge and improved conversational abilities.
Claude Instant: A rapid-response AI assistant for quick, concise interactions.
Meta Llama 3 70b Instruct: A variant of the Meta Llama 3 8b Instruct model, with increased capacity and performance.
Meta Llama 3 8b Instruct: A newer model optimized for instruction-following tasks, offering high accuracy and reliability.
Meta Llama 2 Chat 70B: A variant of the Meta Llama 2 Chat 13B model, with increased capacity and performance.
Meta Llama 2 Chat 13B: A highly advanced model designed for conversational AI applications.
Mistral Large: A more expansive version of Mistral, offering deeper knowledge and more nuanced interactions.
Mixtral 8x7B Instruct: An advanced instruction-following model combining multiple expert systems for enhanced performance.
Mistral 7B Instruct: A compact yet powerful model designed for following instructions with precision.
Mistral Small: A nimble AI assistant optimized for quick responses and everyday tasks.
Command R+: The most advanced Command model, featuring superior problem-solving and creative abilities.
Command R: An enhanced version of Command, with improved reasoning and analytical skills.
Command: A versatile AI assistant balancing speed and capability for various applications.
Command Light: A streamlined AI model for efficient, straightforward task completion.
Amazon Titan Text Premier: Amazon's most advanced text AI, offering sophisticated language understanding and generation.
Amazon Titan Text Express: A mid-range AI assistant balancing efficiency and capability for various text-based applications.
Amazon Titan Text Lite: A lightweight AI model for basic text processing and generation tasks.
With Vertex AI, users can utilize the following models:
Gemini 1.5 Pro: A high-performance model offering advanced language understanding and generation capabilities.
Gemini 1.5 Flash: A lighter variant of the Gemini 1.5 Pro model, optimized for speed and efficiency.
Gemini 1.0 Pro: A highly advanced model offering exceptional language understanding and generation capabilities.
With Azure AI, users can utilize the following models:
Meta-Llama-3-1-405B-Instruct: A massive 405 billion parameter version of Meta-Llama hosted on Azure AI, specifically optimized for following instructions and demonstrating exceptional proficiency in complex language understanding and generation.
Meta-Llama-3-1-70B-Instruct: A powerful 70 billion parameter model hosted on Azure AI, fine-tuned for instruction following. This model offers a balance between scale and efficiency, making it well-suited for a wide range of LLM tasks.
Meta-Llama-3-1-8B-Instruct: An efficient 8 billion parameter version of Meta-Llama on Azure AI, suitable for tasks where resource efficiency is crucial while maintaining respectable language processing capabilities.
With Hugging Face, users can utilize the following models:
Meta-Llama-3.1-70B-Instruct: A powerful 70 billion parameter model from Meta fine-tuned for following instructions. This model excels at complex language tasks, text generation, question answering, and code generation.
Meta-Llama-3.1-8B-Instruct: A smaller 8 billion parameter version of Meta-Llama, offering a good balance between performance and efficiency. It is suitable for tasks where resource constraints are a factor while still maintaining good language understanding and generation capabilities.
Mistral-7B-Instruct-v0.1: The first iteration of the Mistral-7B model fine-tuned for instruction following. This model excels in code generation, reasoning tasks, and understanding complex instructions.
Mistral-7B-Instruct-v0.2: An improved version of the Mistral-7B-Instruct model with enhanced instruction-following capabilities and performance.
Mistral-7B-Instruct-v0.3: The latest iteration of the Mistral-7B-Instruct model, further refined for better accuracy, coherence, and instruction adherence in text generation tasks.
Mistral-Nemo-Instruct-2407: An instruction-tuned 12-billion-parameter model built by Mistral AI in collaboration with NVIDIA, offering a large context window and strong performance on multilingual and code-related tasks.
Mixtral-8x7B-Instruct-v0.1: A sparse mixture-of-experts model built from eight 7-billion-parameter experts, of which only a subset is active for each token. This architecture demonstrates strong performance across a variety of natural language processing tasks.
To get started with Direct Access LLM, simply navigate to the Datasaur LLM Labs platform and create a new Sandbox.
Once the Sandbox is created, choose your desired provider and model by clicking the Models button.
You can configure your application with your desired settings. The parameters you can adjust based on your needs are described below, with illustrative sketches after the list:
Temperature: This parameter controls the randomness of the generated text. A higher temperature will result in more diverse and creative responses, while a lower temperature will produce more focused and predictable responses.
Top P: This parameter, also known as nucleus sampling, controls the cumulative probability threshold for token selection. A lower Top P value restricts sampling to the most probable tokens, producing more focused and predictable responses, while a higher Top P value allows for more diverse and unexpected responses.
Maximum output tokens: This parameter sets the maximum number of tokens that will be generated in the response. A higher limit allows for longer and more detailed responses, while a lower limit results in shorter and more concise responses.
Maximum knowledge base tokens: This parameter defines the upper limit on the number of tokens that can be stored in the knowledge base. This limit is crucial for maintaining efficient storage and retrieval times: it ensures that the knowledge base doesn't become overloaded with data, which can slow down queries and degrade performance.
Similarity score: This parameter sets the minimum similarity required between your query and a knowledge base entry for that entry to be retrieved as context. A higher similarity score restricts retrieval to closely matching content, while a lower similarity score allows more loosely related content to be included.
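To make Temperature and Top P concrete, here is a minimal, illustrative Python sketch of how the two parameters interact during token sampling. The vocabulary and logits are toy values invented for demonstration; this is not Datasaur's implementation.

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_p=0.9, rng=None):
    """Toy temperature + nucleus (Top P) sampling over a small vocabulary."""
    rng = rng or np.random.default_rng()
    # Temperature scaling: higher values flatten the distribution
    # (more diverse output), lower values sharpen it (more predictable).
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    # Nucleus (Top P) filtering: keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize and sample.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    nucleus = order[:cutoff]
    return rng.choice(nucleus, p=probs[nucleus] / probs[nucleus].sum())

# Toy vocabulary and logits, purely for illustration.
vocab = ["the", "a", "cat", "dog", "quantum"]
logits = np.array([2.0, 1.5, 1.0, 0.5, -1.0])
print(vocab[sample_token(logits, temperature=0.7, top_p=0.9)])
```

Raising temperature above 1.0 or pushing Top P toward 1.0 in this sketch visibly increases how often the unlikely tokens appear.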
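Similarly, the knowledge base parameters can be pictured as a similarity-threshold filter over embedded text chunks. The sketch below, with invented toy embeddings and a hypothetical retrieve helper, shows how a Similarity score threshold typically gates which knowledge base entries are used as context; it does not reflect Datasaur's internals.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, similarity_score=0.75):
    """Return indices of knowledge base chunks whose cosine similarity
    to the query meets the configured threshold."""
    query = query_vec / np.linalg.norm(query_vec)
    docs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = docs @ query  # cosine similarity of each chunk to the query
    return [i for i, s in enumerate(sims) if s >= similarity_score]

# Toy embeddings, purely for illustration.
rng = np.random.default_rng(0)
chunks = rng.normal(size=(5, 8))              # five embedded knowledge base chunks
query = chunks[2] + 0.1 * rng.normal(size=8)  # a query close to chunk 2
print(retrieve(query, chunks, similarity_score=0.75))  # most likely [2]
```

A higher threshold keeps only close matches in context, while a lower one admits loosely related chunks.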