Action: Create Projects

Automate creating projects using files directly from your object storage

Last updated 25 days ago

This Action can be accessed from the Automation menu in the sidebar.

External object storage is required for this feature to work. Datasaur will access the files from a specified path in your bucket for the automations.

Project template is also required. Datasaur uses the template to configure the kind of projects that will be created.

⚠️ Please note that support for Bounding Box labeling with Action is still in progress.

Creating and/or Editing an Action

  • To create, click the New Action button on the page.

  • To edit, click on the triple dot menu at the top right corner of an Action card, then click Edit.

Some of the fields are explained in more detail below.

Step 1

All fields are required unless marked as optional.

  1. External object storage. It must be chosen from the existing ones, so you have to integrate your object storage before setting up the Action. When the Action runs, it fetches the data from this bucket.

  2. Project template. As above, you have to create a project template before setting up the Action. Projects created through the Action follow the exact configuration of the project template.

  3. Input path. When the Action runs, Datasaur iterates over the files under this path. Each folder represents a project, and each file inside a folder is treated as a document of that project.

    1. Please note that this path is relative to the root of the bucket. For example, to process s3://test-external-object-storage/input, enter only input as the value. The bucket itself is already configured through the External Object Storage chosen in the first field.

  4. Result path. After successfully creating a project, Datasaur moves the files to the result folder so that they are not processed again the next time the same Action runs. Do not move any files inside this result path; otherwise Datasaur will no longer be able to access them properly. If a file conversion is required (e.g. converting docx to PDF), the converted files will also be available in this folder.

  5. Tags (optional). Fill in this field to add custom tags to every project created through the Action.
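The Input path rule in item 3 (each folder is a project, each file inside it is a document) can be illustrated with a small sketch. This is not Datasaur's implementation: the function name group_by_project and the list of object keys are hypothetical, and the sketch assumes that only files sitting directly inside a project folder are considered. The folder and file names are the ones used in this page's own example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_project(object_keys, input_path):
    """Map each folder directly under input_path to the files inside it.

    Hypothetical helper: it only illustrates the folder-to-project rule and
    makes no claim about how Datasaur lists or filters objects internally.
    """
    root = PurePosixPath(input_path)
    projects = defaultdict(list)
    for key in object_keys:
        try:
            relative = PurePosixPath(key).relative_to(root)
        except ValueError:
            continue  # the key is outside the input path
        if len(relative.parts) != 2:
            continue  # assumption: keep only <project folder>/<file>
        folder, file_name = relative.parts
        projects[folder].append(file_name)
    return dict(projects)


keys = [
    "input/Project 1/lorem.txt",
    "input/Project 1/ipsum.txt",
    "input/Project 2/lipsum.txt",
    "input/readme.txt",  # not inside a project folder, skipped
    "other/skip.txt",    # outside the input path, skipped
]
print(group_by_project(keys, "input"))
# {'Project 1': ['lorem.txt', 'ipsum.txt'], 'Project 2': ['lipsum.txt']}
```

Note that the input path is passed as input, without the bucket name and without a trailing slash, which matches how the field is filled in.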

Step 2

  1. Assignees. Select the pool of labelers and reviewers for the Action; they will be distributed across projects based on your preferences. At least one labeler and one reviewer are required.

  2. Number of labelers per project. Each time a project is created with this Action, this field determines how many labelers (out of all the labelers selected above) are assigned to it. If this number is lower than the total number of selected labelers, the assignments are distributed using a round-robin algorithm.

  3. Number of reviewers per project. Works exactly like the field above, but for reviewers.
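To make the round-robin distribution more concrete, here is a minimal sketch of a generic round-robin assignment. The function name assign_round_robin and the labeler names are hypothetical, and the exact ordering Datasaur uses is not documented on this page, so treat the result as an illustration of the idea only.

```python
from itertools import cycle, islice


def assign_round_robin(labelers, project_names, per_project):
    """Give each project `per_project` labelers, cycling through the pool.

    Generic round-robin sketch (an assumption, not Datasaur's actual code).
    """
    if not 1 <= per_project <= len(labelers):
        raise ValueError("per_project must be between 1 and the pool size")
    pool = cycle(labelers)  # endless iterator over the pool, in order
    # Each project takes the next `per_project` labelers from the cycle,
    # so consecutive projects continue where the previous one stopped.
    return {name: list(islice(pool, per_project)) for name in project_names}


assignment = assign_round_robin(
    labelers=["alice", "bob", "carol"],
    project_names=["Project 1", "Project 2", "Project 3"],
    per_project=2,
)
print(assignment)
# {'Project 1': ['alice', 'bob'], 'Project 2': ['carol', 'alice'], 'Project 3': ['bob', 'carol']}
```

With three labelers and two labelers per project, every labeler ends up on two of the three projects, which is the even spread that round-robin is meant to provide.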

Step 3

  1. Number of labelers per document. This works like the per-project setting above, but the assignment applies to each document inside the project.

  2. Conflict resolution. Please click here to get a more detailed explanation.

Step 4

This step is a recap of how all the settings come together. It also shows a preview, using dummy data, of how the projects will be configured, including an illustration of the assignment when the Action runs.

Input path example (Step 1). Suppose the bucket contains two folders, Project 1 and Project 2, both located in input/. In that case, fill the Input path field with input as the value (without a trailing slash). When the Action runs, Datasaur creates two projects. The first one is named Project 1 and has two documents, i.e. lorem.txt and ipsum.txt. The second one is named Project 2 and has only one document, i.e. lipsum.txt.

Running the Action and Seeing the Result

From the App

  1. Go to Actions.

  2. Click the Run button on a specific Action card. To view, delete, or update an Action, click the triple-dot menu in the top right corner of the card.

  3. To view the activities of an Action, click View Run.

    1. Each run of an Action is represented as one Action Run (one row), which contains information about the automation process. Previous runs are listed as well.

    2. Note that one Action Run can create more than one project; these are listed under View Details.

    3. If View Details is not available, no new data was found in the bucket, so the Action could not create any projects because there were no folders with files to process.

  4. Action Run Detail shows information about each attempt to create a project (one row represents one attempt).

Through API

To do an API call, start from here.

Mutation: runAction
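As a rough sketch of what a call to the runAction mutation could look like, the snippet below builds a standard GraphQL-over-HTTP request body. Only the mutation name runAction comes from this page. The argument names (teamId, actionId), the selected field (id), and the helper build_run_action_request are placeholders, so copy the real mutation document and its variables from the Datasaur API reference before using it. Sending the request (endpoint and credentials) is intentionally left out.

```python
import json

# Placeholder mutation document: only the name `runAction` is taken from this
# page. The arguments and the selected field are assumptions for illustration.
RUN_ACTION_MUTATION = """
mutation RunAction($teamId: ID!, $actionId: ID!) {
  runAction(teamId: $teamId, actionId: $actionId) {
    id
  }
}
"""


def build_run_action_request(team_id, action_id):
    """Return the JSON body of a GraphQL request for the placeholder mutation."""
    payload = {
        "operationName": "RunAction",
        "query": RUN_ACTION_MUTATION,
        # Hypothetical variable names, matching the placeholder document above.
        "variables": {"teamId": team_id, "actionId": action_id},
    }
    return json.dumps(payload)


body = build_run_action_request("my-team-id", "my-action-id")
print(json.loads(body)["variables"])
# {'teamId': 'my-team-id', 'actionId': 'my-action-id'}
```

The resulting string would be sent as the body of an HTTP POST to the GraphQL endpoint, with the authorization header obtained as described in the API credentials documentation.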