SAML

Integrated authentication with your preferred Identity Provider.

Last updated 2 months ago

Datasaur supports SAML as an authentication method, which requires an Identity Provider (IdP). The integration is managed under "Settings" for each Workspace, so the scope of SAML authentication is tied to that Workspace.

Terminology

  1. IdP = Identity Provider, e.g. Okta, Microsoft Entra ID, etc.

  2. SP = Service Provider. In this case, it's Datasaur.

Scope

Users who have been invited to a Workspace with SAML integration can automatically sign in using SAML. A user who belongs to multiple Workspaces, each with its own SAML integration, can sign in using different IdPs. This is why Datasaur needs the Company ID attribute: it selects the appropriate IdP for the authentication process when using SAML.

How to Integrate

To integrate your IdP with Datasaur, start by enabling SAML on the SAML page under Settings, as mentioned above. Here is an overview of the process:

  1. Open the Datasaur app and select a Workspace. Then, navigate to Settings > SAML and click the enable SAML button. Ensure you do not close this form until the process is completed.

  2. In your preferred IdP console, connect it to Datasaur. Detailed guides for specific IdPs can be found in the section below. Use both the "SP Sign-in URL" and "SP Issuer" values on the form to successfully integrate the IdP with Datasaur.

  3. After the integration is complete on the IdP app, fill in these three fields by referencing the values directly from the IdP to complete the form in Datasaur:

    1. IdP Sign-in URL

    2. IdP Issuer

    3. Public certificate

  4. Complete any additional configuration that your IdP requires. These steps vary by IdP.

Specific Guide for an Identity Provider App

  • Okta

  • Microsoft Entra ID

SAML Integration Form

Values for IdP

  1. SP Sign-in URL: This is the Datasaur endpoint where SAML responses are posted. You will need to provide this link to your IdP during the integration process.

  2. SP Issuer: The default value is datasaur, but you can customize it to your preferences. This value must match the one you set on the IdP.

Values to be Filled

  1. Company ID: A unique value that lets Datasaur distinguish between multiple IdPs. Users are asked for this value when signing in using SAML so that Datasaur can connect to the appropriate IdP.

  2. IdP Sign-in URL: This URL will be requested by Datasaur to perform SAML authentication.

  3. IdP Issuer: Datasaur uses this value to determine which IdP a received SAML response belongs to.

  4. Public Certificate: Datasaur will use this certificate to validate the signature of SAML responses when they arrive.
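
As an illustration of how the form fields relate to each other, the following is a minimal Python sketch, not Datasaur's implementation. The `SamlIntegration` record, the `select_integration` and `issuer_matches` helpers, and all URLs and certificate strings are hypothetical placeholders; only the field names come from the form described above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SamlIntegration:
    """Hypothetical record mirroring the fields of the SAML Integration Form."""
    company_id: str          # unique per integration; users enter it at sign-in
    sp_issuer: str           # "datasaur" by default, must match the IdP setting
    idp_sign_in_url: str     # URL used to perform SAML authentication
    idp_issuer: str          # issuer expected in incoming SAML responses
    public_certificate: str  # used to validate SAML response signatures


def select_integration(
    company_id: str, integrations: Iterable[SamlIntegration]
) -> Optional[SamlIntegration]:
    """Return the integration registered under company_id, or None if absent."""
    for integration in integrations:
        if integration.company_id == company_id:
            return integration
    return None


def issuer_matches(integration: SamlIntegration, response_issuer: str) -> bool:
    """Check that a SAML response names the IdP configured for this integration."""
    return integration.idp_issuer == response_issuer


# Placeholder data: two Workspaces, each with its own IdP.
integrations = [
    SamlIntegration(
        company_id="company-a",
        sp_issuer="datasaur",
        idp_sign_in_url="https://idp-a.example.com/sso",
        idp_issuer="https://idp-a.example.com",
        public_certificate="PLACEHOLDER-CERT-A",
    ),
    SamlIntegration(
        company_id="company-b",
        sp_issuer="datasaur",
        idp_sign_in_url="https://idp-b.example.com/sso",
        idp_issuer="https://idp-b.example.com",
        public_certificate="PLACEHOLDER-CERT-B",
    ),
]

chosen = select_integration("company-b", integrations)
print(chosen.idp_sign_in_url if chosen else "unknown Company ID")
```

The lookup by Company ID is what allows one user to authenticate against different IdPs, depending on which Workspace they are signing in to.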

Authentication

  1. Click the SAML button on the authentication page.

  2. Ensure the Company ID is accurately entered. If you are unsure about the Company ID, please consult your Admin. This ID should already be configured during the SAML integration setup.

  3. Proceed with the authentication process on your IdP.

Registration

If you have never signed in using your IdP account before, registration is required. This process is also conducted through SAML by clicking the corresponding button, as described above. Upon successful registration, the new user will be automatically added to the Workspace that has the SAML integration.

Provisioning Through Custom Attributes

To enable role provisioning through custom attributes, your IdP should send the roles attribute, containing an array of items formatted as <yourCompanyId>_<UPPER CASED-ROLE>. For example, suppose company-a is the company ID for Workspace A and company-b is the company ID for Workspace B. If a user needs to be assigned to both Workspaces (Workspace A as an Admin and Workspace B as a Reviewer), set the roles attribute as in the example below:

...
<saml2:Attribute
  Name="roles"
  NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:unspecified"
>
  <saml2:AttributeValue
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:type="xs:string"
  >
    company-a_ADMIN
  </saml2:AttributeValue>
  <saml2:AttributeValue
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:type="xs:string"
  >
    company-b_REVIEWER
  </saml2:AttributeValue>
</saml2:Attribute>
...
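
To check what your IdP actually sends, the roles values can be read out of an attribute statement with a few lines of Python. This is a minimal sketch using only the standard library. The `extract_roles` helper and the `SAMPLE` fragment are illustrative: the fragment declares the standard SAML 2.0 assertion namespace, and omits the `xsi:type` attributes for brevity. A real SAML response should have its signature validated before any attribute is trusted.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 assertion namespace, bound to the "saml2" prefix.
SAML2_URI = "urn:oasis:names:tc:SAML:2.0:assertion"
SAML2_NS = {"saml2": SAML2_URI}

# Illustrative fragment shaped like the roles attribute in this guide.
SAMPLE = """\
<saml2:AttributeStatement xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml2:Attribute
    Name="roles"
    NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:unspecified"
  >
    <saml2:AttributeValue>
      company-a_ADMIN
    </saml2:AttributeValue>
    <saml2:AttributeValue>
      company-b_REVIEWER
    </saml2:AttributeValue>
  </saml2:Attribute>
</saml2:AttributeStatement>
"""


def extract_roles(xml_text):
    """Return the trimmed values of every attribute named "roles"."""
    root = ET.fromstring(xml_text)
    roles = []
    for attribute in root.iter("{%s}Attribute" % SAML2_URI):
        if attribute.get("Name") != "roles":
            continue
        for value in attribute.findall("saml2:AttributeValue", SAML2_NS):
            text = (value.text or "").strip()  # drop surrounding whitespace
            if text:
                roles.append(text)
    return roles


print(extract_roles(SAMPLE))  # ['company-a_ADMIN', 'company-b_REVIEWER']
```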

The additional process of handling custom attributes only occurs if the roles attribute is defined.

  • If there is no match (the input company ID is not found in any of the roles), the authentication process will fail.

    • If a specific user has been invited to a Workspace but the custom attribute is missing, the user won't be able to authenticate but will still exist in the Workspace.

  • If there is at least one match (the input company ID is found in one of the roles), authentication will succeed, and the following side effects will occur:

    1. Auto register if the user doesn't exist.

    2. Auto invite (accepted, not just an email invitation) if the user hasn't been invited to the Workspace.

    3. Auto sync the role if the current Workspace role differs from the role specified in the custom attribute.

Datasaur supports role assignment through custom SAML attributes provided in the SAML response. This feature is optional but useful for managing multiple Workspaces with a single Identity Provider app without using SCIM. There are four available Workspace roles: Admin, Supervisor, Reviewer, and Labeler.
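
The matching rules for the roles attribute can be sketched in Python as follows. This is an illustration of the documented behavior, not Datasaur's code. The function names and outcome labels are hypothetical, and two points are assumptions: that Supervisor and Labeler are upper-cased the same way as the ADMIN and REVIEWER values shown in the example, and that a role sync is unnecessary for a user who has just been auto-invited.

```python
# Assumed upper-cased forms of the four Workspace roles.
WORKSPACE_ROLES = {"ADMIN", "SUPERVISOR", "REVIEWER", "LABELER"}


def parse_role_entry(entry):
    """Split '<companyId>_<ROLE>' on the last underscore; None if malformed."""
    company_id, separator, role = entry.rpartition("_")
    if not separator or not company_id or role not in WORKSPACE_ROLES:
        return None
    return company_id, role


def resolve_role(company_id, roles):
    """Return (outcome, role) for the Company ID entered at sign-in.

    roles is None when the IdP did not send the roles attribute, in which
    case the custom attribute handling is skipped entirely.
    """
    if roles is None:
        return ("skipped", None)
    for entry in roles:
        parsed = parse_role_entry(entry)
        if parsed is not None and parsed[0] == company_id:
            return ("matched", parsed[1])
    return ("failed", None)  # no match: authentication fails


def plan_side_effects(matched_role, user_exists, current_role):
    """List the side effects of a successful match.

    current_role is None when the user is not yet in the Workspace.
    """
    actions = []
    if not user_exists:
        actions.append("register")
    if current_role is None:
        actions.append("invite")     # added as accepted, not an email invite
    elif current_role != matched_role:
        actions.append("sync_role")  # Workspace role differs from attribute
    return actions


roles = ["company-a_ADMIN", "company-b_REVIEWER"]
print(resolve_role("company-a", roles))  # ('matched', 'ADMIN')
print(resolve_role("company-c", roles))  # ('failed', None)
print(plan_side_effects("ADMIN", user_exists=True, current_role="REVIEWER"))
```

Splitting on the last underscore keeps the sketch working for company IDs that themselves contain underscores, since none of the role names do.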
