
# Create Projects

## How it works <a href="#using-pcw-payload-1" id="using-pcw-payload-1"></a>

{% code overflow="wrap" %}

```bash
$ npm run start -- create-projects -h
Usage: robosaur create-projects [options] <configFile>

Create Datasaur projects based on the given config file

Options:
  --dry-run      Simulates what the script is doing without creating the projects
  --without-pcw  Use legacy Robosaur configuration (default: false)
  --use-pcw      Use the payload from Project Creation Wizard in Datasaur UI (default: true)
  -h, --help     display help for command
```

{% endcode %}

Robosaur creates a project for each folder inside the `create.files` directory.

For example, if `quickstart/token-based/documents` contains the structure below, Robosaur creates:

* `Project 1` with 1 document: `lorem.txt`
* `Project 2` with 1 document: `ipsum.txt`

The `create.files` attribute can point to either a local directory or any supported [object storage provider](/integrations/robosaur/storage-options.md).

```bash
$ ls -lR quickstart/token-based/documents
total 0
drwxr-xr-x  3 user  group  Project 1
drwxr-xr-x  3 user  group  Project 2

quickstart/token-based/documents/Project 1:
total 8
-rw-r--r--  1 user  group  lorem.txt

quickstart/token-based/documents/Project 2:
total 8
-rw-r--r--  1 user  group  ipsum.txt
```
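
The folder-to-project rule can be sketched in a few lines of Python. This only illustrates the mapping for a local directory and is not Robosaur's actual implementation; the helper name `plan_projects` is hypothetical.

```python
import tempfile
from pathlib import Path


def plan_projects(files_dir: Path) -> dict[str, list[str]]:
    """Map each immediate subfolder to the sorted list of files inside it."""
    plan = {}
    for folder in sorted(p for p in files_dir.iterdir() if p.is_dir()):
        plan[folder.name] = sorted(f.name for f in folder.iterdir() if f.is_file())
    return plan


# Recreate the example layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for project, document in [("Project 1", "lorem.txt"), ("Project 2", "ipsum.txt")]:
        (root / project).mkdir()
        (root / project / document).write_text("sample text")
    example_plan = plan_projects(root)

print(example_plan)
```

Each key in the result corresponds to one project, and each value lists the documents that project would contain.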

All successfully created projects are tracked in the state file configured through the `projectState` attribute. When you run the same command again, Robosaur skips projects that were already created successfully to prevent duplication. Only new or previously failed projects are processed.
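
The skip behavior can be pictured as a filter over project names. The sketch below is only an illustration: the real state file format is not described on this page, so the `previous_state` dictionary and its status values are hypothetical stand-ins.

```python
def projects_to_process(folder_names, state):
    """Return the folders that still need a project.

    `state` is a hypothetical mapping of project name -> status; it is not
    the actual layout of Robosaur's state file.
    """
    return [name for name in folder_names if state.get(name) != "CREATED"]


# "Project 1" succeeded earlier, "Project 2" failed, "Project 3" is new.
previous_state = {"Project 1": "CREATED", "Project 2": "FAILED"}
pending = projects_to_process(["Project 1", "Project 2", "Project 3"], previous_state)
print(pending)
```

Only the failed and the new project remain in `pending`, which mirrors the rule that successfully created projects are not processed again.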

## Recommended steps

1. Select a configuration example from the `quickstart` folder.
2. Specify the `create.files` value. This attribute defines the data source for the projects.
3. Go to Datasaur and select the workspace you want to use from the profile menu in the top-right corner.
4. Click **Create project**.
5. Configure the project settings you want to automate. Complete all steps in the wizard, including assigning labelers and reviewers.
6. On the final step, click **View script** in the top-right corner.

   <figure><img src="/files/IQUNMsxZOzrLJGHAkbop" alt=""><figcaption><p>View script</p></figcaption></figure>
7. Copy the generated values.
8. Paste the values into `create.pcwPayload` and make sure `create.pcwPayloadSource` is set correctly. Learn more in [PCW payload](#pcw-payload).
9. Specify `create.pcwAssignmentStrategy`. The value can be `ALL` (default) or `AUTO`. Learn more in [Distribution](#distribution).
10. Run the command.

## PCW Payload

You can configure the PCW payload in two ways.

* Inline payload.
* External storage.

### Inline payload (recommended)

Paste the payload directly into the configuration file using `create.pcwPayload`. Make sure `create.pcwPayloadSource` is set to `inline`.

```json
{
  ...
  "create": {
    ...
    "pcwPayloadSource": { "source": "inline" },
    "pcwPayload": <paste the values from PCW>
  }
  ...
}
```

### External storage

You can also store the payload in:

* A local file.
* Any supported cloud object storage provider.

The example below uses Google Cloud Storage (GCS). Paste the payload into a JSON file in your bucket and set `create.pcwPayload` to the path of that file. You must also fill in `create.pcwPayloadSource` and `credentials`. For other supported object storage providers, see [storage options](/integrations/robosaur/storage-options.md).

```json
{
  ...
  "credentials": {
    "gcs": { "gcsCredentialJson": "<path-to-JSON-service-account-credential>" }
  },
  "create": {
    ...
    "pcwPayloadSource": {
      "source": "gcs",
      "bucketName": "my-bucket-name"
    },
    "pcwPayload": <path-to-the-payload-in-JSON-file>
  }
  ...
}
```
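
Before running the command, it can help to check that `pcwPayloadSource` and `pcwPayload` agree with each other. The following is a minimal sketch of such a check, based only on the two layouts shown above; `check_pcw_payload` is a hypothetical helper and not part of Robosaur.

```python
def check_pcw_payload(create: dict) -> str:
    """Return "ok" or a short description of the mismatch."""
    source = create.get("pcwPayloadSource", {}).get("source")
    payload = create.get("pcwPayload")
    if source is None:
        return "create.pcwPayloadSource.source is missing"
    if source == "inline":
        # Inline: the payload copied from the wizard is pasted as an object.
        if isinstance(payload, dict):
            return "ok"
        return "inline source expects the pasted payload object"
    # Any other source: pcwPayload holds the path to the JSON file.
    if isinstance(payload, str):
        return "ok"
    return "external source expects a path string"


inline_result = check_pcw_payload(
    {"pcwPayloadSource": {"source": "inline"}, "pcwPayload": {"variables": {}}}
)
gcs_result = check_pcw_payload(
    {
        "pcwPayloadSource": {"source": "gcs", "bucketName": "my-bucket-name"},
        "pcwPayload": {"variables": {}},  # mistake: object instead of a path
    }
)
print(inline_result)
print(gcs_result)
```

The second call reports a mismatch because an external source is paired with a pasted object rather than a path.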

## Assignment <a href="#pcw-assignment" id="pcw-assignment"></a>

### List of assignees

There are two ways to define the list of labelers and reviewers.

1. Use assignees from the PCW payload (default). By default, Robosaur uses the labelers and reviewers already configured in the project creation wizard (PCW). No additional setup is required because the assignees are included automatically in the copied PCW payload.
2. Specify assignees manually. Create a file and specify its path in the `create.assignment` attribute. Configuration rules:

   * If `useTeamMemberId` is `true`, fill both `labelers` and `reviewers` with `teamMemberId`.
   * If `useTeamMemberId` is `false`, fill both `labelers` and `reviewers` with email addresses.

   ```json
   {
     "labelers": [...], // list of emails
     "reviewers": [...], // list of emails
     "useTeamMemberId": false
   }
   ```
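
An assignment file that follows these rules could be generated with a short script. This is a sketch under assumptions: `build_assignment` is a hypothetical helper, the email addresses are placeholders, and the check that treats any value containing `@` as an email is a simplification.

```python
import json


def build_assignment(labelers, reviewers, use_team_member_id=False):
    """Build the assignment file content, checking the identifier style."""
    for value in [*labelers, *reviewers]:
        looks_like_email = "@" in value
        if use_team_member_id and looks_like_email:
            raise ValueError(f"expected a teamMemberId, got an email: {value}")
        if not use_team_member_id and not looks_like_email:
            raise ValueError(f"expected an email address, got: {value}")
    return {
        "labelers": list(labelers),
        "reviewers": list(reviewers),
        "useTeamMemberId": use_team_member_id,
    }


assignment = build_assignment(
    labelers=["labeler1@example.com", "labeler2@example.com"],
    reviewers=["reviewer@example.com"],
)
assignment_json = json.dumps(assignment, indent=2)
print(assignment_json)
```

Save the printed JSON to a file and reference that file from `create.assignment`.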

### Distribution

Robosaur currently supports two assignment distribution methods.

#### Across documents (default)

This approach distributes assignments across documents within a project. To use it, set the `create.pcwAssignmentStrategy` value. Supported strategies:

* `AUTO`: Distributes documents to labelers using a round-robin algorithm. Each document is assigned to exactly one labeler.
* `ALL`: Assigns all labelers to all documents.

Reviewers are always assigned to all projects and documents.
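
The difference between the two strategies can be sketched as follows. This is an illustration of round-robin versus assign-all, not Robosaur's code; `assign_documents` is a hypothetical helper, and the order in which Robosaur actually iterates over documents and labelers is an assumption.

```python
def assign_documents(documents, labelers, strategy="ALL"):
    """Return a mapping of document -> list of assigned labelers."""
    if strategy == "ALL":
        # Every labeler is assigned to every document.
        return {doc: list(labelers) for doc in documents}
    if strategy == "AUTO":
        # Round-robin: document i goes to labeler i modulo the labeler count.
        return {doc: [labelers[i % len(labelers)]] for i, doc in enumerate(documents)}
    raise ValueError(f"unsupported strategy: {strategy}")


docs = ["a.txt", "b.txt", "c.txt"]
people = ["labeler1@example.com", "labeler2@example.com"]
auto_result = assign_documents(docs, people, strategy="AUTO")
all_result = assign_documents(docs, people, strategy="ALL")
print(auto_result)
print(all_result)
```

With `AUTO`, the third document wraps around to the first labeler; with `ALL`, each document lists both labelers.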

#### Across projects

This approach distributes assignments across projects instead of documents. To use this approach:

1. Create a custom assignment file as described in the [List of assignees](#list-of-assignees) section.
2. Specify the location of that file in `create.assignment`.
3. Set the value of `create.assignment.by` to `PROJECT`.
4. Set the value of `create.assignment.strategy` to `AUTO` or `ALL`.
   1. `AUTO`: Distributes both labelers and reviewers using a round-robin algorithm. Each project is assigned to exactly one labeler and one reviewer.
   2. `ALL`: Assigns all labelers and reviewers to every project.
5. Remove the `create.pcwAssignmentStrategy` attribute, and remove the `documentAssignments` attribute from `pcwPayload`.

Example:

```json
{
  ...
  "create": {
    ...
    "assignment": {
      "source": "local",
      "path": "quickstart/token-based/config/assignment.json",
      "by": "PROJECT",
      "strategy": "AUTO"
    },
    // remove pcwAssignmentStrategy
    // remove documentAssignments from pcwPayload
    ...
  }
}
```
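
Project-level distribution can be pictured the same way as document-level distribution, with reviewers rotating alongside labelers. The sketch below is illustrative only; `assign_projects` is a hypothetical helper and the iteration order is an assumption.

```python
from itertools import cycle


def assign_projects(projects, labelers, reviewers, strategy="AUTO"):
    """Return a mapping of project -> assigned labelers and reviewers."""
    if strategy == "ALL":
        return {
            p: {"labelers": list(labelers), "reviewers": list(reviewers)}
            for p in projects
        }
    if strategy == "AUTO":
        # Rotate through both lists so each project gets one of each.
        labeler_cycle, reviewer_cycle = cycle(labelers), cycle(reviewers)
        return {
            p: {"labelers": [next(labeler_cycle)], "reviewers": [next(reviewer_cycle)]}
            for p in projects
        }
    raise ValueError(f"unsupported strategy: {strategy}")


project_plan = assign_projects(
    ["Project 1", "Project 2", "Project 3"],
    labelers=["labeler1@example.com", "labeler2@example.com"],
    reviewers=["reviewer@example.com"],
)
print(project_plan)
```

With a single reviewer, every project receives that reviewer, while the two labelers alternate across projects.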

## Project tags

Robosaur can automatically apply tags to newly created projects.

If you pasted the PCW payload directly into the configuration file (the recommended approach from the [PCW payload](#pcw-payload) section), add a field called `tagNames` under `create.pcwPayload.variables.input` and list the tags to apply to the projects. If a tag does not already exist, it is created automatically.

```json
{
  ...
  "create": {
    ...
    "pcwPayload": {
      ...
      "variables": {
        ...
        "input": {
          ...
          "tagNames": ["TAG 1", "TAG 2"]
        }
      }
    }
  }
  ...
}
```

If the PCW payload is stored in an external file (local or in cloud storage), add the `tagNames` field under `variables.input` in that file and specify the tags for the projects.

```json
{
  ...
  "variables": {
    ...
    "input": {
      ...
      "tagNames": ["TAG 1", "TAG 2"]
    }
  }
}
```
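
For a payload stored in a local file, the field can be added with a small script instead of editing the JSON by hand. This is a minimal sketch: `add_tag_names` is a hypothetical helper, the file name is a placeholder, and `existingField` stands in for whatever the wizard generated.

```python
import json
import tempfile
from pathlib import Path


def add_tag_names(payload_path: Path, tags: list[str]) -> dict:
    """Set variables.input.tagNames in the payload file and return the result."""
    payload = json.loads(payload_path.read_text())
    # setdefault leaves any existing content under variables.input untouched.
    payload.setdefault("variables", {}).setdefault("input", {})["tagNames"] = list(tags)
    payload_path.write_text(json.dumps(payload, indent=2))
    return payload


# Demonstrate on a throwaway file with stand-in content.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pcw-payload.json"
    path.write_text(json.dumps({"variables": {"input": {"existingField": "kept"}}}))
    add_tag_names(path, ["TAG 1", "TAG 2"])
    updated = json.loads(path.read_text())

print(updated)
```

The existing fields under `variables.input` are preserved and only `tagNames` is added or overwritten.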

## ML-assisted labeling

You can automate labeling for newly created projects using **ML-assisted labeling**.

Add the `autoLabel` field under `create` in the configuration file and fill in the required values. The target API requires the project to already have a label set configured.

```json
{
  ...
  "create": {
    ...
    "autoLabel": {
      "enableAutoLabel": true,
      "labelerEmail": "<EMAIL>", // use your Datasaur account email
      "targetApiEndpoint": "<API_ENDPOINT>", // your custom API model
      "targetApiSecretKey": "<API_SECRET>", // if needed
      "numberOfFilesPerRequest": 1
    }
  }
  ...
}
```

After configuration, ML-assisted labeling automatically runs whenever a new project is created. Labels are applied based on the response from your custom API model.

