Export Project

There are two types of exports: exporting a single file in a specific project or exporting all files in a project.

The Export API

Export a File

Please note that this API will only return the latest state of the project.

Export All Files

For exporting all files specifically, there is a Python script example that you can refer to or use. This API returns a zip file containing the latest state of the project as well as each labeler's work.

Asynchronous Process

All of the processes above run asynchronously. To check the status of an export job, call the getExportDeliveryStatus query.

For the FILE_STORAGE method specifically, the fileUrl in the response may return 404. This means the export result has not been uploaded yet. Wait, or keep polling the query above, until the fileUrl is ready to be downloaded.
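The wait-or-poll advice above can be sketched as a small generic helper. This is only an illustration: `wait_for_file_url` and its `is_ready` argument are hypothetical names, and the actual check (for example, calling getExportDeliveryStatus or requesting the fileUrl and treating 404 as "not ready") is left to the caller because its exact shape is not described here.

```python
import time
from typing import Callable


def wait_for_file_url(
    is_ready: Callable[[], bool],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses.

    is_ready is a caller-supplied (hypothetical) check, e.g. one that
    returns False while the fileUrl still responds with 404.
    Returns True if the export became ready, False on timeout.
    """
    waited = 0.0
    while True:
        if is_ready():
            return True
        if waited >= timeout_seconds:
            return False
        # Not ready yet: pause before asking again.
        sleep(interval_seconds)
        waited += interval_seconds
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays; in normal use the default `time.sleep` applies.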

Required Request Payloads

Document ID

See how to get the value here.

Method

There are four different methods to obtain the export result, i.e. download, email, webhook, and external object storage. These methods are fully explained here.

The API payload reference can be accessed here.

The methods are the same for both types of export. Each method is explained further below. Make sure you fill in all the required attributes, then follow these hints.

  1. Download

    • For the method attribute, fill it with FILE_STORAGE.

  2. Email

    • For the method attribute, fill it with EMAIL.

  3. Webhook

    • For the method attribute, fill it with CUSTOM_WEBHOOK.

    • You also need to fill the url and secret attributes.

    • The detailed explanation can be seen here.

  4. External Object Storage

    • For the method attribute, fill it with EXTERNAL_OBJECT_STORAGE.

    • You also need to fill externalObjectStorageParameter.
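The per-method requirements above can be captured in a small validation helper. This is a minimal sketch, not the official payload shape: the attribute names (method, url, secret, externalObjectStorageParameter) and the method values come from the list above, while `build_method_fields`, the flat dictionary layout, and the placeholder values in the usage example are assumptions made for illustration. Consult the API payload reference for the real structure.

```python
# Extra attributes each method requires, per the hints above.
REQUIRED_EXTRAS = {
    "FILE_STORAGE": (),
    "EMAIL": (),
    "CUSTOM_WEBHOOK": ("url", "secret"),
    "EXTERNAL_OBJECT_STORAGE": ("externalObjectStorageParameter",),
}


def build_method_fields(method: str, **extras) -> dict:
    """Return the method-related fields, checking required attributes.

    Hypothetical helper: the flat dict layout is an assumption, not the
    documented payload structure.
    """
    if method not in REQUIRED_EXTRAS:
        raise ValueError(f"Unknown export method: {method}")
    missing = [name for name in REQUIRED_EXTRAS[method] if not extras.get(name)]
    if missing:
        raise ValueError(f"{method} requires: {', '.join(missing)}")
    return {"method": method, **extras}


# Usage with placeholder values (not real endpoints or secrets).
webhook_fields = build_method_fields(
    "CUSTOM_WEBHOOK",
    url="https://example.com/hook",
    secret="placeholder-secret",
)
```

Validating on the client side surfaces a missing url or secret before the export request is sent, rather than after the asynchronous job has been queued.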

Format - Extension Mapping

  • DATASAUR_SCHEMA can be used for any kind of project.

  • XLSX, CSV, JSON_TABULAR, and TSV for Row and Doc Labeling.

    • CSV is also compatible with Hugging Face.

  • TSV_IOB, TSV_NON_IOB, JSON_ADVANCED for Token Labeling.

  • CUSTOM for export using File Transformer.

  • JSON for JSON Simplified format.

  • PLAIN for exporting only the text (without any labels) of a Token Labeling project.

  • AMAZON_COMPREHEND_CSV can be used for both Token Labeling and Row Labeling.

    • There is a specific scenario to consider with Token Labeling. Since Comprehend can only reference a file on S3, it is important to export the text after making any edits by doing another export with the PLAIN extension as explained above. This ensures that you can correctly reference the annotation data.

  • GCP_VERTEX_AI_CSV can be used for both Token Labeling and Row Labeling.
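The mapping above can also be written as a lookup table, which may help when choosing a format programmatically. This is a sketch under assumptions: the format names come from the list above, but the project-kind labels ("token", "row", "doc") and the `formats_for` helper are hypothetical names chosen for this example. CUSTOM and JSON are left out because the list does not tie them to a particular project kind.

```python
ANY_KIND = None  # marker for formats usable with any kind of project

# Format name -> set of project kinds it applies to (labels are illustrative).
EXPORT_FORMATS = {
    "DATASAUR_SCHEMA": ANY_KIND,
    "XLSX": {"row", "doc"},
    "CSV": {"row", "doc"},
    "JSON_TABULAR": {"row", "doc"},
    "TSV": {"row", "doc"},
    "TSV_IOB": {"token"},
    "TSV_NON_IOB": {"token"},
    "JSON_ADVANCED": {"token"},
    "PLAIN": {"token"},
    "AMAZON_COMPREHEND_CSV": {"token", "row"},
    "GCP_VERTEX_AI_CSV": {"token", "row"},
}


def formats_for(project_kind: str) -> list:
    """Return the format names listed for a project kind, sorted by name."""
    return sorted(
        name
        for name, kinds in EXPORT_FORMATS.items()
        if kinds is ANY_KIND or project_kind in kinds
    )


print(formats_for("doc"))
```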
