# Create Projects

### How It Works <a href="#using-pcw-payload-1" id="using-pcw-payload-1"></a>

```bash
$ npm run start -- create-projects -h
Usage: robosaur create-projects [options] <configFile>

Create Datasaur projects based on the given config file

Options:
  --dry-run      Simulates what the script is doing without creating the projects
  --without-pcw  Use legacy Robosaur configuration (default: false)
  --use-pcw      Use the payload from Project Creation Wizard in Datasaur UI (default: true)
  -h, --help     display help for command
```

* Robosaur creates one project for each folder inside the `create.files` folder. If the contents of `quickstart/token-based/documents` look like the example below, Robosaur creates two projects named `Project 1` and `Project 2`, each containing one document: `lorem.txt` and `ipsum.txt` respectively. The `create.files` attribute can point to a path on your local drive or to any supported object storage; see [storage options](https://docs.datasaur.ai/integrations/robosaur/storage-options) for details.

  ```bash
  $ ls -lR quickstart/token-based/documents
  total 0
  drwxr-xr-x  3 user  group  Project 1
  drwxr-xr-x  3 user  group  Project 2

  quickstart/token-based/documents/Project 1:
  total 8
  -rw-r--r--  1 user  group  lorem.txt

  quickstart/token-based/documents/Project 2:
  total 8
  -rw-r--r--  1 user  group  ipsum.txt
  ```
* Every successfully created project is recorded in the **state** configured by the `projectState` attribute, so running the same command again does not create duplicate projects. Only new projects and previously failed ones are processed.
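
The folder-to-project mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Robosaur's actual implementation: the function name `plan_projects` and the `created_projects` set (a stand-in for whatever the project state records) are hypothetical.

```python
from pathlib import Path


def plan_projects(files_dir, created_projects):
    """Map each subfolder of files_dir to a project name and its documents.

    created_projects is a hypothetical stand-in for the project names already
    recorded in the state; folders listed there are skipped.
    """
    plan = {}
    for folder in sorted(Path(files_dir).iterdir()):
        # Only folders become projects; loose files at the top level are ignored.
        if not folder.is_dir() or folder.name in created_projects:
            continue
        plan[folder.name] = sorted(p.name for p in folder.iterdir() if p.is_file())
    return plan
```

Under these assumptions, calling `plan_projects("quickstart/token-based/documents", set())` on the folder shown above would map `Project 1` to `lorem.txt` and `Project 2` to `ipsum.txt`, while passing `{"Project 1"}` as the second argument would leave only `Project 2` to process.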

### Recommended Steps

* Select a configuration example from the `quickstart` folder.
* Specify the `create.files` value. As mentioned above, this attribute will be the data source of the projects.
* Open the [app](https://app.datasaur.ai) and select the team you want to work in by clicking your profile in the top right corner.
* Create a new project using the Project Creation Wizard (PCW) by clicking `+ Custom Project`.
* Configure the kind of project you want to automate. Go through every step up to the last one, including choosing labelers and reviewers, then click `<> View Script` in the top right corner.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-7897ebe40e17fbd7181247379c5eddc156fbcbc2%2FCreate%20Projects%20Recommended%20Steps.gif?alt=media" alt=""><figcaption><p>View Script</p></figcaption></figure>

* Copy the values shown in the script.
* Paste them directly into `create.pcwPayload` and make sure `create.pcwPayloadSource` is filled in correctly. See [PCW Payload](#pcw-payload) below for details.
* Specify `create.pcwAssignmentStrategy`. The value can be `ALL` (default) or `AUTO`. See [Distribution](#distribution) below for details.
* Run the command.

### PCW Payload

There are two ways to provide the payload.

1. Directly in the configuration file (recommended). Paste the payload into `create.pcwPayload` and set `create.pcwPayloadSource` as in the example below.

   ```json
   {
     ...
     "create": {
       ...
       "pcwPayloadSource": { "source": "inline" },
       "pcwPayload": <paste the values from PCW>
     }
     ...
   }
   ```
2. From storage (a local file or any supported cloud storage). The example below uses GCS. Save the payload as a JSON file in your bucket and set `create.pcwPayload` to its path. You must also fill in `create.pcwPayloadSource` and `credentials`. For other supported object storage, see [storage options](https://docs.datasaur.ai/integrations/robosaur/storage-options).

   ```json
   {
     ...
     "credentials": {
       "gcs": { "gcsCredentialJson": "<path-to-JSON-service-account-credential>" }
     },
     "create": {
       ...
       "pcwPayloadSource": {
         "source": "gcs",
         "bucketName": "my-bucket-name"
       },
       "pcwPayload": <path-to-the-payload-in-JSON-file>
     }
     ...
   }
   ```
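
If you generate configuration files with a script, the two options above could be wired in with a small helper like the one below. It is a sketch under stated assumptions: `set_pcw_payload` is a hypothetical name, the helper assumes that an inline payload is a JSON object and that a storage payload is a path string, and it does not fill in `credentials`, which storage sources such as GCS also need.

```python
import copy


def set_pcw_payload(config, payload, source="inline", bucket_name=None):
    """Return a copy of config with create.pcwPayload and create.pcwPayloadSource set.

    Hypothetical helper: inline payloads are the object copied from View Script,
    storage payloads are a path to the JSON file holding that object.
    """
    updated = copy.deepcopy(config)
    create = updated.setdefault("create", {})
    if source == "inline":
        if not isinstance(payload, dict):
            raise TypeError("inline source expects the payload object itself")
        create["pcwPayloadSource"] = {"source": "inline"}
    else:
        if not isinstance(payload, str):
            raise TypeError("storage sources expect a path to the payload JSON file")
        payload_source = {"source": source}
        if bucket_name is not None:
            payload_source["bucketName"] = bucket_name
        create["pcwPayloadSource"] = payload_source
    create["pcwPayload"] = payload
    return updated
```

Returning a modified copy keeps the original configuration untouched, which makes it easier to produce several variants from one base config.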

### Assignment <a href="#pcw-assignment" id="pcw-assignment"></a>

#### List of Assignees (Labelers and Reviewers)

There are two ways to specify the list.

1. Use the labelers and reviewers already assigned in PCW. This is the **default approach** and **requires no extra work**, because the assignees are already included in the payload you paste from PCW.
2. Specify the list yourself. Create a file and set its path in the `create.assignment` attribute. The file should look like the example below.

   * If `useTeamMemberId` is `true`, fill both `labelers` and `reviewers` with `teamMemberId` values.
   * If `useTeamMemberId` is `false`, fill both `labelers` and `reviewers` with email addresses.

   ```json
   {
     "labelers": [...], // list of emails
     "reviewers": [...], // list of emails
     "useTeamMemberId": false
   }
   ```
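
When the list of assignees comes from another system, the assignment file can be generated rather than written by hand. The sketch below writes a file with the three fields shown above; the function name `write_assignment_file` is hypothetical and not part of Robosaur.

```python
import json
from pathlib import Path


def write_assignment_file(path, labelers, reviewers, use_team_member_id=False):
    """Write an assignment file with labelers, reviewers, and useTeamMemberId.

    Pass emails when use_team_member_id is False and team member IDs when it
    is True, matching the rule described above.
    """
    assignment = {
        "labelers": list(labelers),
        "reviewers": list(reviewers),
        "useTeamMemberId": use_team_member_id,
    }
    Path(path).write_text(json.dumps(assignment, indent=2) + "\n", encoding="utf-8")
    return assignment
```

For example, `write_assignment_file("assignment.json", ["labeler@example.com"], ["reviewer@example.com"])` produces a file that can then be referenced from `create.assignment`; the email addresses here are placeholders.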

#### Distribution

Two assignment distributions are currently supported.

1. **Across documents** (default approach). You only need to set `create.pcwAssignmentStrategy`. The supported values are:

   * **AUTO**: distributes documents among labelers using a round-robin algorithm, i.e. each document is assigned to exactly one labeler.
   * **ALL**: every labeler is assigned to all documents.

   Please note that the reviewers **will be assigned to all projects and documents**.
2. **Across projects**. To use this approach, you must specify the labelers and reviewers list yourself, as described in the [List of Assignees](#list-of-assignees-labelers-and-reviewers) section. Follow the steps below.

   ```json
   {
     ...
     "create": {
       ...
       "assignment": {
         "source": "local",
         "path": "quickstart/token-based/config/assignment.json",
         "by": "PROJECT",
         "strategy": "AUTO"
       },
       // remove pcwAssignmentStrategy
       // remove documentAssignments from pcwPayload
       ...
     }
   }
   ```

   1. Create the assignment file and reference it in `create.assignment`.
   2. Set `create.assignment.by` to `PROJECT`.
   3. Choose the assignment strategy by setting `create.assignment.strategy`. Two values are supported.
      1. **AUTO**: distributes both labelers and reviewers using round-robin, so each project gets exactly one labeler and one reviewer.
      2. **ALL**: all labelers and reviewers are assigned to every project.
   4. Remove the `create.pcwAssignmentStrategy` attribute, and remove the `documentAssignments` attribute from `pcwPayload`.
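
To make the difference between `AUTO` and `ALL` concrete, the sketch below models both distributions in Python. It illustrates round-robin assignment in general and is not Robosaur's code: the function names are hypothetical, and the order in which Robosaur actually rotates through assignees is an assumption.

```python
def assign_documents(documents, labelers, strategy="ALL"):
    """Across documents: return {document: [labelers]} for the given strategy."""
    if not labelers:
        raise ValueError("at least one labeler is required")
    if strategy == "ALL":
        return {doc: list(labelers) for doc in documents}
    if strategy == "AUTO":
        # Round-robin: document i goes to labeler i modulo the number of labelers.
        return {doc: [labelers[i % len(labelers)]] for i, doc in enumerate(documents)}
    raise ValueError(f"unsupported strategy: {strategy}")


def assign_projects(projects, labelers, reviewers, strategy="AUTO"):
    """Across projects: return {project: {"labelers": [...], "reviewers": [...]}}."""
    if not labelers or not reviewers:
        raise ValueError("at least one labeler and one reviewer are required")
    if strategy == "ALL":
        return {
            name: {"labelers": list(labelers), "reviewers": list(reviewers)}
            for name in projects
        }
    if strategy == "AUTO":
        # Labelers and reviewers rotate independently, one of each per project.
        return {
            name: {
                "labelers": [labelers[i % len(labelers)]],
                "reviewers": [reviewers[i % len(reviewers)]],
            }
            for i, name in enumerate(projects)
        }
    raise ValueError(f"unsupported strategy: {strategy}")
```

With three documents and two labelers, the `AUTO` document strategy in this sketch gives the first and third documents to the first labeler and the second document to the second labeler. Reviewers are not modeled in `assign_documents` because, as noted above, they are assigned to all projects and documents in that mode.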

### Tagging Projects

Projects newly created by Robosaur can be tagged automatically.

In the PCW payload that you pasted using the recommended approach from the [PCW Payload](#pcw-payload) section (directly in the config file), add a new field called `tagNames` under `create.pcwPayload.variables.input` and list the tags for the projects. Tags that do not exist yet are created for you.

```json
{
  ...
  "create": {
    ...
    "pcwPayload": {
      ...
      "variables": {
        ...
        "input": {
          ...
          "tagNames": ["TAG 1", "TAG 2"]
        }
      }
    }
  }
  ...
}
```

Alternatively, if the PCW payload is stored in an external file (local or in cloud storage), add the `tagNames` field under `variables.input` in that file and list the tags for the projects.

```json
{
  ...
  "variables": {
    ...
    "input": {
      ...
      "tagNames": ["TAG 1", "TAG 2"]
    }
  }
}
```
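
If you prefer to add the tags programmatically instead of editing the JSON by hand, a helper along these lines would work for either layout, since both place `tagNames` under `variables.input` of the payload object. The function name `add_tag_names` is hypothetical and not part of Robosaur.

```python
import copy


def add_tag_names(pcw_payload, tag_names):
    """Return a copy of the PCW payload with variables.input.tagNames set."""
    updated = copy.deepcopy(pcw_payload)
    # Create the intermediate objects only if the payload lacks them.
    input_vars = updated.setdefault("variables", {}).setdefault("input", {})
    input_vars["tagNames"] = list(tag_names)
    return updated
```

For an inline payload, pass the object stored in `create.pcwPayload`; for an external file, load the file's JSON, pass it through the helper, and write the result back.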

### ML-Assisted Labeling

Automate the labeling process on the newly created projects using ML-assisted labeling.

In the config file, add the `autoLabel` field under `create` and fill in the required fields. The target API requires the project to have a label set in order to work properly.

```json
{
  ...
  "create": {
    ...
    "autoLabel": {
      "enableAutoLabel": true,
      "labelerEmail": "<EMAIL>", // use your Datasaur account email
      "targetApiEndpoint": "<API_ENDPOINT>", // your custom API model
      "targetApiSecretKey": "<API_SECRET>", // if needed
      "numberOfFilesPerRequest": 1
    }
  }
  ...
}
```

With this configuration, ML-assisted labeling is triggered every time a project is created, and labels are applied to the new project based on the response of your custom API model.


