# Create Projects

### How It Works <a href="#using-pcw-payload-1" id="using-pcw-payload-1"></a>

```bash
$ npm run start -- create-projects -h
Usage: robosaur create-projects [options] <configFile>

Create Datasaur projects based on the given config file

Options:
  --dry-run      Simulates what the script is doing without creating the projects
  --without-pcw  Use legacy Robosaur configuration (default: false)
  --use-pcw      Use the payload from Project Creation Wizard in Datasaur UI (default: true)
  -h, --help     display help for command
```

* Robosaur tries to create one project for each folder inside the path set in `create.files`. If `quickstart/token-based/documents` looks like the example below, Robosaur creates two projects named `Project 1` and `Project 2`, containing one document each: `lorem.txt` and `ipsum.txt`, respectively. The `create.files` attribute can point to a local path or to any supported object storage; see [storage options](https://docs.datasaur.ai/integrations/robosaur/storage-options) for details.

  ```bash
  $ ls -lR quickstart/token-based/documents
  total 0
  drwxr-xr-x  3 user  group  Project 1
  drwxr-xr-x  3 user  group  Project 2

  quickstart/token-based/documents/Project 1:
  total 8
  -rw-r--r--  1 user  group  lorem.txt

  quickstart/token-based/documents/Project 2:
  total 8
  -rw-r--r--  1 user  group  ipsum.txt
  ```
* Every successfully created project is recorded in the **state** configured by the `projectState` attribute. The next time you run the same command, projects are not duplicated: Robosaur only processes new projects and those that previously failed.
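
The folder-to-project mapping and the skip-on-rerun behavior described above can be sketched in a few lines of Python. This is only an illustration, not Robosaur's implementation: `plan_projects` is a hypothetical helper, and the `already_created` set is a stand-in for whatever Robosaur records in its state, whose actual format is not described here.

```python
from pathlib import Path
import tempfile


def plan_projects(files_dir, already_created):
    """Return {project name: [document names]} for folders not yet created."""
    plan = {}
    for folder in sorted(Path(files_dir).iterdir()):
        if not folder.is_dir():
            continue  # only folders are turned into projects
        if folder.name in already_created:
            continue  # already recorded in the state, so skipped on re-runs
        plan[folder.name] = sorted(p.name for p in folder.iterdir() if p.is_file())
    return plan


# Rebuild the example layout in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    for project, document in [("Project 1", "lorem.txt"), ("Project 2", "ipsum.txt")]:
        (Path(root) / project).mkdir()
        (Path(root) / project / document).write_text("sample text")

    first_run = plan_projects(root, already_created=set())
    second_run = plan_projects(root, already_created={"Project 1"})

print(first_run)   # {'Project 1': ['lorem.txt'], 'Project 2': ['ipsum.txt']}
print(second_run)  # {'Project 2': ['ipsum.txt']}
```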

### Recommended Steps

* Select a configuration example from the `quickstart` folder.
* Specify the `create.files` value. As mentioned above, this attribute will be the data source of the projects.
* Open the [app](https://app.datasaur.ai) and select the team you want to work on by clicking your profile in the top right corner.
* Create a new project with the Project Creation Wizard (PCW) by clicking `+ Custom Project`.
* Configure the kind of project you want to automate. Go through every step up to the last one, including choosing labelers and reviewers, then click `<> View Script` in the top right corner.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-7897ebe40e17fbd7181247379c5eddc156fbcbc2%2FCreate%20Projects%20Recommended%20Steps.gif?alt=media" alt=""><figcaption><p>View Script</p></figcaption></figure>

* Copy the values from the script.
* Paste the values directly into `create.pcwPayload` and make sure `create.pcwPayloadSource` is filled in correctly. See the details [below](#pcw-payload).
* Specify `create.pcwAssignmentStrategy`. The value can be `ALL` (default) or `AUTO`. See the details [below](#distribution).
* Run the command.
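
Before the real run, it can be useful to try the command with the `--dry-run` option from the help output above. The Python sketch below only builds and prints the command line. `build_create_projects_command` is a hypothetical helper and `path/to/config.json` is a placeholder for your own configuration file.

```python
import shlex


def build_create_projects_command(config_file, dry_run=True):
    """Build the argument list for `npm run start -- create-projects`."""
    command = ["npm", "run", "start", "--", "create-projects"]
    if dry_run:
        command.append("--dry-run")  # simulate without creating the projects
    command.append(config_file)
    return command


command = build_create_projects_command("path/to/config.json")
print(shlex.join(command))
# To execute it from the Robosaur folder, something like this should work:
#   import subprocess
#   subprocess.run(command, check=True)
```

Once the simulated run looks right, call the helper with `dry_run=False` to build the command that actually creates the projects.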

### PCW Payload

The payload copied from PCW can be provided in two ways.

1. Directly in the configuration file (recommended). Paste the payload into `create.pcwPayload` and set `create.pcwPayloadSource` as in the example below.

   ```json
   {
     ...
     "create": {
       ...
       "pcwPayloadSource": { "source": "inline" },
       "pcwPayload": <paste the values from PCW>
     }
     ...
   }
   ```
2. Use a storage (a local file or any supported cloud storage). The example below uses GCS. Paste the payload into a JSON file in your bucket and set `create.pcwPayload` to its path. You must also fill in `create.pcwPayloadSource` and `credentials`. For other supported object storage, see [storage options](https://docs.datasaur.ai/integrations/robosaur/storage-options).

   ```json
   {
     ...
     "credentials": {
       "gcs": { "gcsCredentialJson": "<path-to-JSON-service-account-credential>" }
     },
     "create": {
       ...
       "pcwPayloadSource": {
         "source": "gcs",
         "bucketName": "my-bucket-name"
       },
       "pcwPayload": <path-to-the-payload-in-JSON-file>
     }
     ...
   }
   ```
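
As a rough pre-flight check, the two layouts above can be verified with a small Python script before running Robosaur. This is a sketch under assumptions: `check_pcw_payload` is a hypothetical helper, it only covers the `inline` and `gcs` sources shown on this page, and it assumes that the pasted payload is a JSON object while a stored payload is referenced by a path string.

```python
def check_pcw_payload(config):
    """Return a list of problems found in the PCW payload settings."""
    create = config.get("create", {})
    payload_source = create.get("pcwPayloadSource", {})
    source = payload_source.get("source")
    payload = create.get("pcwPayload")

    problems = []
    if source is None:
        problems.append("create.pcwPayloadSource.source is missing")
    if payload is None:
        problems.append("create.pcwPayload is missing")
    elif source == "inline" and not isinstance(payload, dict):
        problems.append("inline source expects the pasted payload object")
    elif source not in (None, "inline") and not isinstance(payload, str):
        problems.append("storage source expects a path string")

    if source == "gcs":
        # The GCS example also needs a bucket name and credentials.
        if "bucketName" not in payload_source:
            problems.append("create.pcwPayloadSource.bucketName is missing")
        if "gcs" not in config.get("credentials", {}):
            problems.append("credentials.gcs is missing")
    return problems


inline_config = {
    "create": {
        "pcwPayloadSource": {"source": "inline"},
        "pcwPayload": {"variables": {"input": {}}},
    }
}
gcs_config = {
    "create": {
        "pcwPayloadSource": {"source": "gcs"},
        "pcwPayload": "payload.json",  # placeholder path
    }
}

print(check_pcw_payload(inline_config))  # []
print(check_pcw_payload(gcs_config))
# ['create.pcwPayloadSource.bucketName is missing', 'credentials.gcs is missing']
```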

### Assignment <a href="#pcw-assignment" id="pcw-assignment"></a>

#### List of Assignees (Labelers and Reviewers)

There are two ways to specify the list.

1. Use the labelers and reviewers already assigned in PCW. This is the **default approach** and **requires no extra work**, because the assignees are already included in the configuration you paste from PCW.
2. Specify the list yourself. Create a file and reference it from the `create.assignment` attribute. The file should look like the example below.

   * If `useTeamMemberId` is `true`, fill both `labelers` and `reviewers` with `teamMemberId` values.
   * If `useTeamMemberId` is `false`, fill both `labelers` and `reviewers` with email addresses.

   ```json
   {
     "labelers": [...], // list of emails
     "reviewers": [...], // list of emails
     "useTeamMemberId": false
   }
   ```
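
If the list of assignees comes from another system, the assignment file can be generated instead of written by hand. The sketch below is one possible way to do it in Python. `build_assignment` is a hypothetical helper, the addresses are placeholders, and the `@` check is only a loose sanity test for the email case.

```python
import json


def build_assignment(labelers, reviewers, use_team_member_id=False):
    """Build the content of the assignment file described above."""
    if not use_team_member_id:
        # With useTeamMemberId set to false, every entry should be an email.
        for entry in [*labelers, *reviewers]:
            if "@" not in entry:
                raise ValueError(f"expected an email address, got {entry!r}")
    return {
        "labelers": list(labelers),
        "reviewers": list(reviewers),
        "useTeamMemberId": use_team_member_id,
    }


assignment = build_assignment(
    labelers=["labeler-1@example.com", "labeler-2@example.com"],
    reviewers=["reviewer@example.com"],
)
print(json.dumps(assignment, indent=2))
```

Writing the printed JSON to a file and pointing `create.assignment` at it gives the same result as the hand-written example.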

#### Distribution

Two assignment distributions are currently supported.

1. **Across documents** (default approach). You only need to set `create.pcwAssignmentStrategy`. The supported values are:

   * **AUTO**: distributes documents to labelers using a round-robin algorithm, i.e. each document is assigned to exactly one labeler.
   * **ALL**: every labeler is assigned to all documents.

   Please note that reviewers **are assigned to all projects and documents**.
2. **Across projects**. To use this approach, you have to specify the labelers and reviewers list yourself, as described in the [List of Assignees](#list-of-assignees-labelers-and-reviewers) section. Use the configuration and steps below.

   ```json
   {
     ...
     "create": {
       ...
       "assignment": {
         "source": "local",
         "path": "quickstart/token-based/config/assignment.json",
         "by": "PROJECT",
         "strategy": "AUTO"
       },
       // remove pcwAssignmentStrategy
       // remove documentAssignments from pcwPayload
       ...
     }
   }
   ```

   1. Create the assignment file and reference it in `create.assignment`.
   2. Set `create.assignment.by` to `PROJECT`.
   3. Select the assignment strategy by setting `create.assignment.strategy`. Two values are supported.
      1. **AUTO**: distributes both labelers and reviewers using round-robin, so each project gets exactly one labeler and one reviewer.
      2. **ALL**: all labelers and reviewers are assigned to every project.
   4. Remove the `create.pcwAssignmentStrategy` attribute, and remove the `documentAssignments` attribute from `pcwPayload`.
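
To make the difference between the strategies concrete, the Python sketch below applies `AUTO` and `ALL` to a few placeholder documents and projects. It illustrates round-robin in general and is not Robosaur's code: the function names are hypothetical, and the order in which Robosaur actually walks through documents and assignees is not specified on this page.

```python
from itertools import cycle


def assign_documents(documents, labelers, strategy="ALL"):
    """Across documents: map each document to its labelers."""
    if strategy == "ALL":
        return {doc: list(labelers) for doc in documents}
    if strategy == "AUTO":
        turn = cycle(labelers)  # round-robin over the labelers
        return {doc: [next(turn)] for doc in documents}
    raise ValueError(f"unsupported strategy: {strategy}")


def assign_projects(projects, labelers, reviewers, strategy="AUTO"):
    """Across projects: map each project to its labelers and reviewers."""
    if strategy == "ALL":
        return {
            project: {"labelers": list(labelers), "reviewers": list(reviewers)}
            for project in projects
        }
    if strategy == "AUTO":
        next_labeler, next_reviewer = cycle(labelers), cycle(reviewers)
        return {
            project: {"labelers": [next(next_labeler)], "reviewers": [next(next_reviewer)]}
            for project in projects
        }
    raise ValueError(f"unsupported strategy: {strategy}")


labelers = ["ann@example.com", "bob@example.com"]
reviewers = ["rev@example.com"]

by_document = assign_documents(["a.txt", "b.txt", "c.txt"], labelers, strategy="AUTO")
by_project = assign_projects(["Project 1", "Project 2", "Project 3"], labelers, reviewers)

print(by_document)
# {'a.txt': ['ann@example.com'], 'b.txt': ['bob@example.com'], 'c.txt': ['ann@example.com']}
print(by_project["Project 2"])
# {'labelers': ['bob@example.com'], 'reviewers': ['rev@example.com']}
```

With `AUTO`, the third document wraps around to the first labeler again, which is the round-robin behavior. With `ALL`, every entry simply receives the full list.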

### Tagging Projects

Newly created projects from Robosaur can be tagged automatically.

In the PCW payload that you copied using the recommended approach from the [previous section](#pcw-payload) (directly in the config file), add a new field called `tagNames` under `create.pcwPayload.variables.input` and list the tags for the projects. Tags that do not exist yet are created for you.

```json
{
  ...
  "create": {
    ...
    "pcwPayload": {
      ...
      "variables": {
        ...
        "input": {
          ...
          "tagNames": ["TAG 1", "TAG 2"]
        }
      }
    }
  }
  ...
}
```

Alternatively, if the PCW payload is in an external file (local or in cloud storage), add the `tagNames` field under `variables.input` in that file and list the tags for the projects.

```json
{
  ...
  "variables": {
    ...
    "input": {
      ...
      "tagNames": ["TAG 1", "TAG 2"]
    }
  }
}
```
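
When the same tags need to be applied to several payloads, the field can be added programmatically. The Python sketch below shows one way to do it for a payload that has already been loaded as a dictionary. `add_tag_names` is a hypothetical helper, and the nearly empty payload is a placeholder for the real one copied from PCW.

```python
import copy


def add_tag_names(payload, tags):
    """Return a copy of the PCW payload with tagNames set under variables.input."""
    updated = copy.deepcopy(payload)  # leave the original payload untouched
    project_input = updated.setdefault("variables", {}).setdefault("input", {})
    project_input["tagNames"] = list(tags)
    return updated


payload = {"variables": {"input": {}}}  # placeholder for the copied payload
tagged = add_tag_names(payload, ["TAG 1", "TAG 2"])

print(tagged)  # {'variables': {'input': {'tagNames': ['TAG 1', 'TAG 2']}}}
```

For an inline payload, apply the helper to the value of `create.pcwPayload`. For an external file, apply it to the whole file content.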

### ML-Assisted Labeling

Automate labeling on newly created projects with ML-assisted labeling.

In the config file, add the `autoLabel` field under `create` and fill in the required fields. The target API needs the project to have a label set in order to work properly.

```json
{
  ...
  "create": {
    ...
    "autoLabel": {
      "enableAutoLabel": true,
      "labelerEmail": "<EMAIL>", // use your Datasaur account email
      "targetApiEndpoint": "<API_ENDPOINT>", // your custom API model
      "targetApiSecretKey": "<API_SECRET>", // if needed
      "numberOfFilesPerRequest": 1
    }
  }
  ...
}
```

With this configuration, ML-assisted labeling is triggered every time a project is created, and labels are applied to the new project according to the response of your custom API model.
