# Custom OCR

## OCR Custom Text Extraction API

The custom text extraction API is a Datasaur feature that lets you create a custom [OCR](https://docs.datasaur.ai/data-studio-projects/nlp-task-types/project-templates#optical-character-recognition) project backed by your own text extraction API.

### Request from Datasaur

> **POST** <https://custom-text-extractor.com/text-extraction/example>

| **Request headers** |                              |
| ------------------- | ---------------------------- |
| Accept              | application/json, text/plain |

| **Form Data Parameters** |                                                                                                                                                |
| ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| upload                   | Your document file (e.g., [receipt.jpg](https://user-images.githubusercontent.com/1897341/108043465-b844d200-7073-11eb-9beb-a69305024e15.jpg)) |

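On the receiving side, your API has to read the `upload` form field from the multipart request body. The following is a minimal sketch using only the Python standard library; the function name `extract_upload` is hypothetical, and it assumes your web framework hands you the raw `Content-Type` header value and the raw request body.

```python
from email import message_from_bytes


def extract_upload(content_type, body):
    """Return (filename, file_bytes) for the `upload` field of a multipart body.

    `content_type` is the request's Content-Type header value, boundary included.
    `body` is the raw request body as bytes.
    """
    # Re-attach the header so the stdlib MIME parser can locate the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    for part in message_from_bytes(raw).walk():
        # Each form field carries its name in the Content-Disposition header.
        if part.get_param("name", header="content-disposition") == "upload":
            return part.get_filename(), part.get_payload(decode=True)
    raise ValueError("request has no 'upload' form field")
```

Most web frameworks expose uploaded files directly, in which case this parsing step is unnecessary; the sketch only illustrates which field carries the document.
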
### **Expected API Response**

Datasaur processes the response differently depending on the **`Content-Type`** header your API returns.

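As an illustration, a handler could choose the header from the kind of result it produced. This is a sketch, not Datasaur code: `encode_response` is a hypothetical helper, and UTF-8 encoding of the body is an assumption.

```python
import json


def encode_response(result):
    """Return (headers, body_bytes) for an extraction result.

    A `str` is sent as plain text; a `dict` is sent as JSON.
    """
    if isinstance(result, str):
        # Assumption: the body is encoded as UTF-8.
        return {"Content-Type": "text/plain"}, result.encode("utf-8")
    if isinstance(result, dict):
        return {"Content-Type": "application/json"}, json.dumps(result).encode("utf-8")
    raise TypeError("result must be a str (plain text) or a dict (JSON)")
```

Both content types match the `Accept` header that Datasaur sends with the request.
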
#### Text response (`Content-Type: text/plain`)

```
SHIHLIN TAIWAN
STREET SNACKS
Grand Galaxy Park
DATE 26/02/20 15:53
CASHIER: Reny
No. Customer: 1
```

#### JSON response (`Content-Type: application/json`)

Datasaur uses the [Importable format](https://docs.datasaur.ai/api/custom-ocr/importable-format) to process the API response.

```json
{
  "cells": [
    {
      "content": "SHIHLIN TAIWAN",
      "index": 0,
      "line": 0,
      "metadata": [],
      "tokens": [
        "SHIHLIN",
        "TAIWAN"
      ]
    },
    {
      "content": "STREET SNACKS",
      "index": 0,
      "line": 1,
      "metadata": [],
      "tokens": [
        "STREET",
        "SNACKS"
      ]
    }
  ],
  "labelSets": [],
  "labels": [
    {
      "startCellLine": 0,
      "startCellIndex": 0,
      "startTokenIndex": 0,
      "startCharIndex": 0,
      "endCellLine": 0,
      "endCellIndex": 0,
      "endTokenIndex": 0,
      "endCharIndex": 6,
      "layer": 0,
      "counter": 0,
      "pageIndex": 0,
      "type": "BOUNDING_BOX",
      "nodeCount": 4,
      "x0": 130,
      "y0": 154,
      "x1": 255,
      "y1": 154,
      "x2": 255,
      "y2": 186,
      "x3": 130,
      "y3": 186
    },
    {
      "startCellLine": 0,
      "startCellIndex": 0,
      "startTokenIndex": 1,
      "startCharIndex": 0,
      "endCellLine": 0,
      "endCellIndex": 0,
      "endTokenIndex": 1,
      "endCharIndex": 5,
      "layer": 0,
      "counter": 0,
      "pageIndex": 0,
      "type": "BOUNDING_BOX",
      "nodeCount": 4,
      "x0": 261,
      "y0": 154,
      "x1": 375,
      "y1": 154,
      "x2": 375,
      "y2": 186,
      "x3": 261,
      "y3": 186
    }
  ],
  "name": "receipt.jpg",
  "pages": [
    {
      "pageIndex": 0,
      "pageHeight": 619,
      "pageWidth": 551
    }
  ],
  "type": "BOUNDING_BOX"
}
```
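
If your OCR engine returns words with axis-aligned boxes, a payload shaped like the sample above could be assembled as follows. This is a sketch under assumptions inferred from the sample only: `endCharIndex` is the inclusive index of the last character, corners run clockwise from the top-left, every line is a single cell with `index` 0, and the document has one page. The function name and its input layout are hypothetical; check the Importable format documentation for the authoritative field definitions.

```python
def build_ocr_importable(name, page_width, page_height, lines):
    """Build a BOUNDING_BOX payload shaped like the sample response.

    `lines` is a list of text lines; each line is a list of
    (token, (left, top, right, bottom)) tuples in page pixels.
    """
    cells, labels = [], []
    for line_no, words in enumerate(lines):
        tokens = [token for token, _ in words]
        cells.append({
            "content": " ".join(tokens),
            "index": 0,
            "line": line_no,
            "metadata": [],
            "tokens": tokens,
        })
        for token_no, (token, (left, top, right, bottom)) in enumerate(words):
            labels.append({
                "startCellLine": line_no, "startCellIndex": 0,
                "startTokenIndex": token_no, "startCharIndex": 0,
                "endCellLine": line_no, "endCellIndex": 0,
                "endTokenIndex": token_no,
                "endCharIndex": len(token) - 1,  # inclusive, as in the sample
                "layer": 0, "counter": 0, "pageIndex": 0,
                "type": "BOUNDING_BOX", "nodeCount": 4,
                # Corners: top-left, top-right, bottom-right, bottom-left.
                "x0": left, "y0": top,
                "x1": right, "y1": top,
                "x2": right, "y2": bottom,
                "x3": left, "y3": bottom,
            })
    return {
        "cells": cells,
        "labelSets": [],
        "labels": labels,
        "name": name,
        "pages": [{"pageIndex": 0, "pageHeight": page_height, "pageWidth": page_width}],
        "type": "BOUNDING_BOX",
    }
```

For example, `build_ocr_importable("receipt.jpg", 551, 619, [[("SHIHLIN", (130, 154, 255, 186)), ("TAIWAN", (261, 154, 375, 186))]])` produces the first cell and both labels of the sample.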

### Apply custom API

* Upload the PDF or images through the [Project Creation Wizard](https://docs.datasaur.ai/data-studio-projects/creating-a-project#project-creation-wizard)
* In Step 2,
  * Select **+Add new API...** as the OCR method

    <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-f8cac34147577f5b995cdac3ef1a6fedcceac643%2FPCW%20-%20Step%202%20-%20OCR%20labeling%20-%20document%20preview%20-%20apply%20OCR%20method%20-%20dropdown.png?alt=media" alt=""><figcaption></figcaption></figure>
  * Enter your API name, API URL, and secret

    <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-f90846985d6c248613d83c81336d2479a77c03ac%2FPCW%20-%20Step%202%20-%20OCR%20labeling%20-%20add%20custom%20OCR%20API%20dialog.png?alt=media" alt=""><figcaption></figcaption></figure>
  * After clicking Save, the custom API will be saved to the list, and you can choose it as the OCR method
* [Create a label set, assign labelers as appropriate, then launch the project](https://docs.datasaur.ai/workspace-management/workspace#creating-a-project)
* The labeling interface shows the document on the left and the transcription on the right, side by side

  <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-3bcc2fec0e45d2b8efd94cc9bbbb547838ee8e8f%2FProject%20-%20OCR%20labeling%20-%20labeler%20mode%20-%20labeled.png?alt=media" alt=""><figcaption></figcaption></figure>

## ASR Custom Text Extraction API

The custom text extraction API is a Datasaur feature that lets you create a custom [Audio](https://datasaurai.gitbook.io/datasaur/nlp-projects/nlp-task-types/audio-project) project backed by your own text extraction API.

### Request from Datasaur

> **POST** <https://custom-text-extractor.com/text-extraction/example>

| **Request headers** |                              |
| ------------------- | ---------------------------- |
| Accept              | application/json, text/plain |

| **Form Data Parameters** |                                        |
| ------------------------ | -------------------------------------- |
| upload                   | Your audio file (e.g., audio2.flac)    |

### **Expected API Response**

Datasaur processes the response differently depending on the **`Content-Type`** header your API returns.

#### Text response (`Content-Type: text/plain`)

```
A quick brown fox jumps over a lazy dog
Speaker 2: A quick brown fox jumps over a lazy dog
Speaker 1: A quick brown fox jumps over a lazy dog
Speaker 2: A quick brown fox jumps over a lazy dog
Speaker 1: A quick brown fox jumps over a lazy dog
```
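
A plain-text transcript in this layout could be produced from speech segments as sketched below. The helper name `format_transcript` and the `(speaker, text)` input shape are hypothetical. Whether Datasaur treats the `Speaker N:` prefix specially is not stated here; the sketch only reproduces the layout of the sample.

```python
def format_transcript(segments):
    """Join (speaker, text) segments into one line per segment.

    `speaker` may be None, in which case the line has no prefix.
    """
    lines = []
    for speaker, text in segments:
        # Collapse internal whitespace so each segment stays on a single line.
        text = " ".join(text.split())
        lines.append(f"{speaker}: {text}" if speaker else text)
    return "\n".join(lines)
```
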

#### JSON response (`Content-Type: application/json`)

Datasaur uses the [Importable format](https://docs.datasaur.ai/api/custom-ocr/importable-format) to process the API response.

```json
{
  "cells": [
    {
      "content": "SHIHLIN TAIWAN",
      "index": 0,
      "line": 0,
      "metadata": [],
      "tokens": ["SHIHLIN", "TAIWAN"]
    },
    {
      "content": "STREET SNACKS",
      "index": 0,
      "line": 1,
      "metadata": [],
      "tokens": ["STREET", "SNACKS"]
    }
  ],
  "labelSets": [],
  "labels": [
    {
      "id": 1,
      "startCellLine": 0,
      "startCellIndex": 0,
      "startTokenIndex": 0,
      "startCharIndex": 0,
      "endCellLine": 0,
      "endCellIndex": 0,
      "endTokenIndex": 1,
      "endCharIndex": 5,
      "layer": 0,
      "counter": 0,
      "startTimestampMillis": 1375,
      "endTimestampMillis": 4250,
      "type": "TIMESTAMP"
    },
    {
      "id": 2,
      "startCellLine": 1,
      "startCellIndex": 0,
      "startTokenIndex": 0,
      "startCharIndex": 0,
      "endCellLine": 1,
      "endCellIndex": 0,
      "endTokenIndex": 1,
      "endCharIndex": 5,
      "layer": 0,
      "counter": 0,
      "startTimestampMillis": 4437,
      "endTimestampMillis": 8218,
      "type": "TIMESTAMP"
    }
  ],
  "name": "ASR API Response Sample",
  "type": "TIMESTAMP"
}
```
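
A payload shaped like the sample above could be built from timed speech segments as follows. This is a sketch under assumptions inferred from the sample only: each segment becomes one cell, each label spans its whole cell, `endCharIndex` is the inclusive index of the last character in the last token, and `id` counts from 1. The function name and its input layout are hypothetical; check the Importable format documentation for the authoritative field definitions.

```python
def build_asr_importable(name, segments):
    """Build a TIMESTAMP payload from (text, start_ms, end_ms) segments."""
    cells, labels = [], []
    for text, start_ms, end_ms in segments:
        tokens = text.split()
        if not tokens:
            continue  # skip empty segments; there is nothing to label
        line_no = len(cells)
        cells.append({
            "content": " ".join(tokens),
            "index": 0,
            "line": line_no,
            "metadata": [],
            "tokens": tokens,
        })
        labels.append({
            "id": len(labels) + 1,
            "startCellLine": line_no, "startCellIndex": 0,
            "startTokenIndex": 0, "startCharIndex": 0,
            "endCellLine": line_no, "endCellIndex": 0,
            "endTokenIndex": len(tokens) - 1,
            "endCharIndex": len(tokens[-1]) - 1,  # inclusive, as in the sample
            "layer": 0, "counter": 0,
            "startTimestampMillis": start_ms,
            "endTimestampMillis": end_ms,
            "type": "TIMESTAMP",
        })
    return {
        "cells": cells,
        "labelSets": [],
        "labels": labels,
        "name": name,
        "type": "TIMESTAMP",
    }
```

Calling it with the two segments `("SHIHLIN TAIWAN", 1375, 4250)` and `("STREET SNACKS", 4437, 8218)` yields the cells and labels of the sample.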

### Apply custom API

* Upload the audio files through the [Project Creation Wizard](https://docs.datasaur.ai/data-studio-projects/creating-a-project#project-creation-wizard)
* In Step 2,
  * Select **+Add new API...** as the ASR method

    <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-4577d0eb117b849f457e695d7ff4d23a317d8cbf%2FPCW%20-%20Step%202%20-%20audio%20labeling%20-%20audio%20preview%20-%20apply%20ASR%20method%20-%20dropdown.png?alt=media" alt=""><figcaption></figcaption></figure>
  * Enter your API name, API URL, and secret

    <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-cd6b8509c4479e8179641c35515cb8157b3b2a90%2FPCW%20-%20Step%202%20-%20audio%20labeling%20-%20add%20custom%20ASR%20API%20dialog.png?alt=media" alt=""><figcaption></figcaption></figure>
  * After clicking Save, the custom API will be saved to the list, and you can choose it as the ASR method
* [Create a label set, assign labelers as appropriate, then launch the project](https://docs.datasaur.ai/workspace-management/workspace#creating-a-project)
* The labeling interface looks like the screenshot below, with the audio on top and the transcription underneath

  <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-4ded718a9b324db5a70b7a985c1fe1028c55aa38%2FProject%20-%20audio%20labeling%20-%20labeled.png?alt=media" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
Custom API capabilities are **only supported in team workspaces**. If you would like access, please email us at <support@datasaur.ai>.
{% endhint %}
