# Search

The **Search** extension helps users quickly find specific words, phrases, or labeled tokens within their data. It’s useful for navigating both individual documents and entire projects, with features like label-specific searches, regex searches, and exact word matching. Results are clearly displayed in a list, making it easier to analyze and work with large datasets efficiently.

## Span Labeling

In a span labeling project, two types of searches are available: **Standard** and **Advanced**.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-2639d82d30bf9f86efe983e52ea7b3ab5e62cc3a%2FExtension%20-%20Search%20-%20search%20type%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>

### Standard search

The **Standard search** allows users to perform simple searches based on text and labels using keywords or regular expressions (regex). This search type is intuitive and provides quick access to relevant data by matching the input with the text or labels in the project.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-3b51e02d6a59a6b9c76f9e11e1e9c58f7f84fa1e%2FExtension%20-%20Search%20-%20basic%20-%20initial.png?alt=media" alt=""><figcaption></figcaption></figure>

#### **Search based on text**

Text-based search allows users to search for specific words or patterns within the data by specifying a word filter and entering a keyword to locate matching text in the project.

**Word filter**

This option lets users define how their search keywords are matched to results. The available options are:

* **Contains any word:** Matches results that contain any of the specified words, including as part of a longer word.
  * Example: Searching for <mark style="color:red;">`men`</mark> will match <mark style="color:red;">`men`</mark>, <mark style="color:red;">`mentioned`</mark>, and <mark style="color:red;">`abandonment`</mark>.
* **Exact word:** Displays only exact matches for the search keyword.
  * Example: Searching for <mark style="color:red;">`men`</mark> will match <mark style="color:red;">`men`</mark> but not <mark style="color:red;">`mentioned`</mark>.
* **Regex:** Allows users to search using regular expressions for advanced pattern matching.
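  * Example: Searching for <mark style="color:red;">`\bmen\w*`</mark> will match whole words starting with <mark style="color:red;">`men`</mark>, such as <mark style="color:red;">`men`</mark> and <mark style="color:red;">`mentioned`</mark>. Note that <mark style="color:red;">`men*`</mark> on its own means <mark style="color:red;">`me`</mark> followed by zero or more <mark style="color:red;">`n`</mark> characters, so it would also match inside words such as <mark style="color:red;">`home`</mark>.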
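
The three word filters can be approximated with Python's standard `re` module. This is a minimal sketch, not Datasaur's implementation: the function names are hypothetical, case-sensitive matching is an assumption, and Datasaur's regex dialect may differ from Python's.

```python
import re


def contains(keyword: str, text: str) -> bool:
    """Approximates 'Contains any word' for one keyword: a plain substring test."""
    return keyword in text


def exact_word(keyword: str, text: str) -> bool:
    """Approximates 'Exact word': the keyword must appear as a whole word."""
    return re.search(rf"\b{re.escape(keyword)}\b", text) is not None


def regex(pattern: str, text: str) -> bool:
    """Approximates 'Regex': the pattern may match anywhere in the text."""
    return re.search(pattern, text) is not None


for word in ["men", "mentioned", "abandonment"]:
    print(word, contains("men", word), exact_word("men", word))

# `men*` means "me" followed by zero or more "n", so it also matches "home".
print(regex(r"men*", "home"))
# A word boundary restricts the match to words that start with "men".
print(regex(r"\bmen\w*", "home"), regex(r"\bmen\w*", "she mentioned it"))
```

The last two calls show why an unanchored pattern can return more hits than expected: adding `\b` limits the match to the start of a word.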

#### **Search based on label**

Label-based search allows users to find specific labels or categories in the data, with the word filter set to **Contains any word.**

#### **Search result**

The search operates at the **span and/or label level**, evaluating each labeled span or label instance individually based on the defined search criteria.

In reviewer mode, search results include not only **reviewed labels** (accepted and consensus), but also **conflicted** and **rejected** labels.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-c28f3efb423c2b3d7d714e7555e2f97b30e56a7e%2FExtension%20-%20Search%20-%20basic%20-%20initial-1.png?alt=media" alt=""><figcaption></figcaption></figure>

To enhance readability, users can enable **Show only matching lines in the text viewer** from the three-dot menu in the top-right corner of the extension. Since the search operates at the span or label level, this option hides lines that do not contain any spans or labels from the search results, allowing users to focus only on relevant content in the text viewer.

### Advanced search

The **Advanced search** provides a more sophisticated way to search by allowing the combination of multiple conditions to refine results. This search type supports complex queries using MongoDB query syntax.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-3f70e5306972bf93064d905b421d8db6e6e2bfc4%2FExtension%20-%20Search%20-%20advanced%20-%20initial%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>

#### **Configure conditions**

There are two ways to configure the search conditions:

1. **Logic builder:** A user-friendly interface to create conditions visually.
2. **Query:** Directly input advanced queries for complex conditions.

#### Configure conditions via logic builder

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-7438683cbbe403d710741ce63911657c4c71f8c5%2FExtension%20-%20Search%20-%20advanced%20-%20logic%20builder%20-%20filled.png?alt=media" alt=""><figcaption></figcaption></figure>

Users can create searches with multiple conditions, where each condition includes a **search target**, a **filter operation**, and a **keyword**. These conditions can be combined using **logical operators** such as "OR" or "AND" to define the relationship between the conditions.

**Search target:**

* <mark style="color:red;">**`Text`**</mark>: Matches words or content in the spans.
* <mark style="color:red;">**`Label`**</mark>: Matches the labels applied to the text.
* <mark style="color:red;">**`Metadata`**</mark>: Matches information attached to the line (as key-value pairs).

**Filter operation:**

* <mark style="color:red;">**`is`**</mark>: The search target exactly matches the specified keyword.
* <mark style="color:red;">**`is not`**</mark>: The search target does not exactly match the specified keyword.
* <mark style="color:red;">**`contains`**</mark>: The search target contains the specified keyword.
* <mark style="color:red;">**`does not contain`**</mark>: The search target does not contain the specified keyword.
* <mark style="color:red;">**`matches regex`**</mark>: The search target matches the regular expression pattern.

**Keyword:**

* For **Text** and **Label**, this is the word or phrase to match.
* For **Metadata**, this is the **key: value** pair used to filter information.

**Logical operator:**

* <mark style="color:red;">**`OR`**</mark>: Matches results that meet at least one condition.
* <mark style="color:red;">**`AND`**</mark>: Matches results that meet all conditions.

**Sub-condition:** An additional filter available when the <mark style="color:red;">`Label`</mark> search target is selected. It allows users to narrow results based on who applied or reviewed the label.

{% hint style="info" %}
Sub-conditions are available **only in reviewer mode**.
{% endhint %}

* When using sub-conditions, users can combine multiple filters using logical operators such as **AND** and **OR**, just like parent conditions.
* Sub-condition filter operations:
  * <mark style="color:red;">**`is applied by`**</mark>: Matches labels that were applied by the selected user, or by any user if “any labeler” is selected.
  * <mark style="color:red;">**`is not applied by`**</mark>: Matches labels that were not applied by the selected user, or by any user if “any labeler” is selected.
  * <mark style="color:red;">**`is reviewed by`**</mark>: Matches labels that were reviewed by the selected user, or by any user if “any reviewer” is selected (both accepted and rejected).
  * <mark style="color:red;">**`is not reviewed by`**</mark>: Matches labels that were not reviewed by the selected user, or by any user if “any reviewer” is selected (both accepted and rejected).

#### Configure conditions via query

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-ed9f998e4771e82764c68eb8332867b11016ab43%2FExtension%20-%20Search%20-%20advanced%20-%20query.png?alt=media" alt=""><figcaption></figcaption></figure>

You can set up conditions using MongoDB queries. Datasaur supports a subset of the MongoDB [query selectors](https://www.mongodb.com/docs/manual/reference/operator/query/#query-selectors), which are listed below.

**Key Operators**

* <mark style="color:red;">**`$regex`**</mark> — Search for text patterns (combine with <mark style="color:red;">`$options`</mark> for behavior like case-insensitivity using <mark style="color:red;">`$options: "i"`</mark>).
* <mark style="color:red;">**`$not`**</mark> — Exclude matches.
* <mark style="color:red;">**`$or`**</mark> — Requires at least one condition to match.
* <mark style="color:red;">**`$and`**</mark> — Requires all conditions to match.

{% hint style="info" %}
**Note:** <mark style="color:red;">`$or`</mark> and <mark style="color:red;">`$and`</mark> each require exactly two conditions.
{% endhint %}
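
Because `$or` and `$and` take exactly two conditions, combining three or more conditions means nesting one operator inside another. The sketch below folds a list of conditions into such a nested query. `combine` and `text_contains` are hypothetical helpers, and the sketch assumes that Datasaur accepts nested `$and`/`$or` operators as MongoDB does, which is worth verifying in the query editor.

```python
import json
from functools import reduce


def combine(operator: str, conditions: list[dict]) -> dict:
    """Fold conditions into nested two-element `$and`/`$or` clauses."""
    if operator not in ("$and", "$or"):
        raise ValueError("operator must be '$and' or '$or'")
    if len(conditions) < 2:
        raise ValueError("need at least two conditions")
    # ((c1 op c2) op c3) op ...: every clause holds exactly two entries.
    return reduce(lambda left, right: {operator: [left, right]}, conditions)


def text_contains(keyword: str) -> dict:
    """Case-insensitive text condition, shaped like the text examples in this document."""
    return {"cellFragment.content": {"$regex": keyword, "$options": "i"}}


query = combine(
    "$or",
    [text_contains("France"), text_contains("John"), text_contains("Paris")],
)
print(json.dumps(query, indent=2))
```

The printed query has an outer `$or` with two entries, the first of which is itself an `$or` of the first two conditions.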

**Search condition**

1. **Text condition** — Searches for text content.
   * **First example:** Find text containing "is". This will match text like "This is used to train data."

     ```mongodb
     {
       "cellFragment.content": {
         "$regex": "is",
         "$options": "i"
       }
     }
     ```
   * **Second example:** Find text not containing "is". This will match text like "Labeling the data will be done in Datasaur."

     ```mongodb
     {
       "cellFragment.content": {
         "$not": {
           "$regex": "is",
           "$options": "i"
         }
       }
     }
     ```
2. **Label condition** — Filters based on labeled spans.

   * **First example:** Find spans labeled with a label containing "GEO". This will match labels like “GEO”, “Location Geo”, and “Geospatial Data”.

     ```mongodb
     {
       "spanLabels": {
         "$elemMatch": {
           "labelClassName": {
             "$regex": "GEO",
             "$options": "i"
           }
         }
       }
     }
     ```
   * **Second example:** Find spans labeled exactly with "GEO". This will match labels like "GEO", "geo", or any other case variations, but the entire label must be "GEO" with no additional characters.

     ```mongodb
     {
       "spanLabels": {
         "$elemMatch": {
           "labelClassName": {
             "$regex": "^GEO$",
             "$options": "i"
           }
         }
       }
     }
     ```
   * **Third example:** Find spans labeled as “GEO” by the labeler with user ID 1. The value `"1"` is the unique user ID of the contributor who applied the label. See the guide below for how to obtain a user ID.

     ```mongodb
     {
       "spanLabels": {
         "$elemMatch": {
           "labelClassName": {
             "$regex": "^GEO$",
             "$options": "i"
           },
           "appliedByUserIds": {
             "$in": [
               "1"
             ]
           }
         }
       }
     }
     ```
   * **Fourth example:** Find spans labeled as “GEO” that have been reviewed by the reviewer with user ID 2. The value `"2"` is the unique user ID of the contributor who reviewed the label. See the guide below for how to obtain a user ID.

     ```mongodb
     {
       "spanLabels": {
         "$elemMatch": {
           "labelClassName": {
             "$regex": "^GEO$",
             "$options": "i"
           },
           "reviewedByUserIds": {
             "$in": [
               "2"
             ]
           }
         }
       }
     }
     ```

   💡 To determine a contributor's user ID, use the **Logic builder**:

   * Navigate to the **Configure conditions via logic builder** section.
   * Add a sub-condition that filters by **Labeler** or **Reviewer**.
   * Select the desired user from the dropdown list.
   * Save the conditions. The corresponding user ID will be automatically populated in the generated query.
3. **Metadata condition** — Searches for key-value pairs in the metadata attached to each line.
   * **Example:** Find metadata where the key is "category" and the value is "education".

     ```mongodb
     {
       "cellFragment.metadata": {
         "$elemMatch": {
           "key": {
             "$regex": "category",
             "$options": "i"
           },
           "value": {
             "$regex": "education",
             "$options": "i"
           }
         }
       }
     }
     ```
4. **Logical OR condition** — Matches if any of the conditions are true.
   * **Example:** Find text containing either "France" or "John".

     ```mongodb
     {
       "$or": [
         {
           "cellFragment.content": {
             "$regex": "France",
             "$options": "i"
           }
         },
         {
           "cellFragment.content": {
             "$regex": "John",
             "$options": "i"
           }
         }
       ]
     }
     ```
5. **Logical AND condition** — Matches only if all conditions are true.
   * **Example:** Find text containing both "France" and "John".

     ```mongodb
     {
       "$and": [
         {
           "cellFragment.content": {
             "$regex": "France",
             "$options": "i"
           }
         },
         {
           "cellFragment.content": {
             "$regex": "John",
             "$options": "i"
           }
         }
       ]
     }
     ```
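
Writing these queries by hand is error-prone, so it can help to generate them. The sketch below builds the text, label, and metadata conditions shown above as Python dictionaries and prints JSON that could be pasted into the **Query** editor. The helper names are hypothetical; the field names (`cellFragment.content`, `spanLabels`, `labelClassName`, `appliedByUserIds`, `cellFragment.metadata`) are taken from the examples above.

```python
import json
from typing import Optional


def _regex(pattern: str) -> dict:
    """Case-insensitive regex clause, matching the "$options": "i" examples."""
    return {"$regex": pattern, "$options": "i"}


def text_condition(keyword: str, negate: bool = False) -> dict:
    """Text contains (or, with negate=True, does not contain) the keyword."""
    clause = _regex(keyword)
    return {"cellFragment.content": {"$not": clause} if negate else clause}


def label_condition(label: str, applied_by: Optional[list[str]] = None) -> dict:
    """Label is exactly `label`, optionally applied by one of the given user IDs."""
    match: dict = {"labelClassName": _regex(f"^{label}$")}
    if applied_by:
        match["appliedByUserIds"] = {"$in": applied_by}
    return {"spanLabels": {"$elemMatch": match}}


def metadata_condition(key: str, value: str) -> dict:
    """Metadata entry whose key and value both contain the given strings."""
    return {
        "cellFragment.metadata": {
            "$elemMatch": {"key": _regex(key), "value": _regex(value)}
        }
    }


# Keywords are used as regex patterns verbatim, so metacharacters are not escaped.
query = {"$and": [text_condition("France"), label_condition("GEO", applied_by=["1"])]}
print(json.dumps(query, indent=2))
```

Keeping one helper per condition type keeps each generated fragment identical in shape to the hand-written examples, which makes the output easy to compare against them.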

#### **Search result**

The search operates at the **line level**, meaning it evaluates each line individually against the list of specified conditions.

In reviewer mode, search results include not only **reviewed labels** (accepted and consensus), but also **conflicted** and **rejected** labels.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-516cddaeee8b0a33bbc149584fc37be0dc350df8%2FExtension%20-%20Search%20-%20advanced%20-%20search%20result%20negative%20operators.png?alt=media" alt=""><figcaption></figcaption></figure>

For conditions with negative operators (<mark style="color:red;">`is not`</mark>, <mark style="color:red;">`does not contain`</mark>), only the lines that meet the specified conditions will be displayed in the results.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-7be3b1245da6f4c08bfce2fe34124710fd61b6f6%2FExtension%20-%20Search%20-%20advanced%20-%20search%20result%20negative%20operators%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>

To improve readability, users can enable **Show only matching lines in the text viewer** from the three-dot menu in the top-right corner of the extension. When enabled, non-matching lines are hidden, allowing users to focus only on relevant results in the text viewer.
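
To make the line-level behavior concrete, the sketch below evaluates a query against each line of a small in-memory dataset. It is an illustrative model only: the line structure is hypothetical, built from the field names used in the query examples, and the evaluator covers just the operators that appear in this document with simplified MongoDB semantics. Datasaur's actual evaluation may differ.

```python
import re


def get_path(doc, path: str):
    """Resolve a dotted path such as "cellFragment.content"; None if absent."""
    for part in path.split("."):
        if not isinstance(doc, dict) or part not in doc:
            return None
        doc = doc[part]
    return doc


def match_value(value, spec: dict) -> bool:
    """Check one field value against an operator clause."""
    if "$not" in spec:
        return not match_value(value, spec["$not"])
    if "$regex" in spec:
        flags = re.IGNORECASE if "i" in spec.get("$options", "") else 0
        return isinstance(value, str) and re.search(spec["$regex"], value, flags) is not None
    if "$in" in spec:
        values = value if isinstance(value, list) else [value]
        return any(v in spec["$in"] for v in values)
    if "$elemMatch" in spec:
        return isinstance(value, list) and any(matches(item, spec["$elemMatch"]) for item in value)
    raise ValueError(f"unsupported clause: {spec}")


def matches(doc: dict, query: dict) -> bool:
    """Return True when the document satisfies every entry of the query."""
    for key, spec in query.items():
        if key == "$and":
            ok = all(matches(doc, sub) for sub in spec)
        elif key == "$or":
            ok = any(matches(doc, sub) for sub in spec)
        else:
            ok = match_value(get_path(doc, key), spec)
        if not ok:
            return False
    return True


# Hypothetical lines; the real storage format is not documented here.
lines = [
    {"cellFragment": {"content": "This is used to train data."}, "spanLabels": []},
    {"cellFragment": {"content": "Labeling the data will be done in Datasaur."},
     "spanLabels": [{"labelClassName": "ORG", "appliedByUserIds": ["1"]}]},
]

negative = {"cellFragment.content": {"$not": {"$regex": "is", "$options": "i"}}}
hits = [line["cellFragment"]["content"] for line in lines if matches(line, negative)]
print(hits)
```

The `$not` query returns only the whole lines that do not contain "is", which mirrors how negative operators display just the lines that satisfy the condition.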

### Filtering search results

After running a search, you can narrow down the results by applying filters. This is especially useful when working with large result sets.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-3d00e92372569255ee426f146e83eff830db5653%2FExtension%20-%20Search%20-%20advanced%20-%20filter.png?alt=media" alt=""><figcaption></figcaption></figure>

You can currently filter search results using the following criteria:

1. **Label status** — Use this filter to show results based on their review status. Options include:

   * Accepted: The label is accepted by the reviewer or manually applied by the reviewer.
   * Conflicted: The label has unresolved disagreements.
   * Rejected: The label is rejected by the reviewer.

   **Notes:**

   * *Status filters are available only in reviewer mode.*
   * *The **Rejected** option appears only when the project setting **Show rejected labels in Review Mode** is enabled.*
2. **Label class** — Use this filter to show results based on the label applied. Options include:
   1. Unlabeled: The text does not have any label assigned.
   2. Any label class that appears in the current search results.

### Saved search

Saved Search enables storing custom search configurations for reuse. Once saved, the same search can be applied without the need to manually reconfigure the conditions.

#### Saving a search

{% hint style="info" %}
Saving a search can be done in both labeler mode and reviewer mode. A search saved in one mode will also be accessible in the other.
{% endhint %}

Once you have a search configured, open the **Search** extension menu. You should see an option called **Save search configuration**.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-13bdc24cdfe5468bbe3c9445447f613924fb6ebf%2FExtension%20-%20Search%20-%20More%20icon.png?alt=media" alt=""><figcaption></figcaption></figure>

Clicking this option will open a dialog where you can enter a name and description for the saved search. The dialog also shows a preview of the current search configuration.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-54c46d022efd1267897a5e12c8048741cd5a1bdb%2FExtension%20-%20Search%20-%20save%20search%20configuration%20dialog.png?alt=media" alt=""><figcaption></figcaption></figure>

#### Using a saved search <a href="#using-a-saved-search" id="using-a-saved-search"></a>

There are two ways to access the saved search:

1. Open the extension menu and select **Manage saved searches**.
2. If the **Advanced** search type is selected, open the **Configure** menu and select **Use existing saved search**.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-7005cec4936d6392e500c5f76888b02819205285%2FExtension%20-%20Search%20-%20advanced%20-%20search%20result%20negative%20operators-1.png?alt=media" alt=""><figcaption></figcaption></figure>

Both methods open a dialog displaying all saved searches. Selecting one shows its details on the right, and searches can also be filtered by name using the search bar. Click **Use saved search** to apply the selected configuration to the current session.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-90a0af75b82af56135370b1221ca3930eef346cd%2FExtension%20-%20Search%20-%20Manage%20saved%20searches.png?alt=media" alt=""><figcaption></figcaption></figure>

#### Editing or deleting a saved search <a href="#editing-or-deleting-a-saved-search" id="editing-or-deleting-a-saved-search"></a>

In the **Manage saved searches** dialog, select the saved search you want to edit or delete. Two buttons, “Edit” and “Delete”, will appear next to the saved search name.

**Edit**

{% hint style="info" %}
While editing a saved search, you cannot navigate to or preview another saved search.
{% endhint %}

Clicking the “Edit” button will open an interface where you can update the name and description. Once the changes are made, click the **Save changes** button to apply them.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-0c213d28be2ce9c7b831233efe45baeb354dc00a%2FExtension%20-%20Search%20-%20Manage%20saved%20searches%20-%20edit.png?alt=media" alt=""><figcaption></figcaption></figure>

**Delete**

Clicking the “Delete” button will open a confirmation dialog. Confirm the action to permanently delete the saved search.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-5379c9d86785c9f98df2aaebed906e621d716d55%2FExtension%20-%20Search%20-%20Manage%20saved%20searches%20-%20delete.png?alt=media" alt=""><figcaption></figcaption></figure>

### Label all / bulk labeling for span labels

The **Label all** option allows users to quickly apply a label to all matching results in a project.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-2303cce0125447efd6922801560b032f9c05cec1%2FExtension%20-%20Search%20-%20Basic%20-%20Label%20all.png?alt=media" alt=""><figcaption></figcaption></figure>

For example, searching for the text <mark style="color:red;">`Holmes`</mark> will show all its instances in the document. Selecting <mark style="color:red;">`Person`</mark> from the dropdown and clicking **Label all** applies the <mark style="color:red;">`Person`</mark> label to every occurrence of <mark style="color:red;">`Holmes`</mark>.

This feature helps speed up bulk labeling and improves consistency, especially in projects requiring detailed text analysis.
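
Conceptually, **Label all** turns every search hit into a labeled span. The following sketch models that idea with character offsets. The `Span` structure and `label_all` function are hypothetical and do not reflect Datasaur's internal data model, and whole-word matching is an assumption made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # offset of the first character
    end: int    # offset just past the last character
    label: str


def label_all(text: str, keyword: str, label: str) -> list[Span]:
    """Create one labeled span for every whole-word occurrence of `keyword`."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return [Span(m.start(), m.end(), label) for m in re.finditer(pattern, text)]


text = "Holmes smiled. 'Elementary,' said Holmes."
spans = label_all(text, "Holmes", "Person")
for span in spans:
    print(span, text[span.start:span.end])
```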

{% hint style="info" %}
**Tips & Tricks:** Use the **Up** and **Down arrow** keys to move between results.
{% endhint %}

### Delete search result labels

The **Delete search result labels** option allows users to remove all matching labeled results in a project.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-3e74be2be17ae89dedca676fed529d04f6d93fab%2FExtension%20-%20Search%20-%20Basic%20-%20Delete%20search%20results.png?alt=media" alt=""><figcaption></figcaption></figure>

For example, searching for <mark style="color:red;">`Shakespeare`</mark> will display all instances in the document along with any labels applied to the text. Clicking **Delete search result labels** will remove all labels associated with <mark style="color:red;">`Shakespeare`</mark>.

This feature is useful for bulk deletions, making the cleanup process faster and more efficient. It’s especially helpful for projects that involve large datasets, ensuring data accuracy and consistency while saving time.

### Bulk answering for line questions

{% hint style="info" %}
* Only available when Line Labeling is enabled in a Span project.
* Supported in Advanced Search only.
{% endhint %}

The **Answer matching lines in line labeling extension** option allows users to apply the same answer to all matching lines from the search results in a project.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-eccf77e9ec432792af24329b015c7e6d00d7367b%2FExtension%20-%20Search%20-%20Basic%20-%20Label%20all%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>

Clicking it will automatically select all matching lines in the editor based on the search results and direct you to the Line labeling extension. You can then modify the answer, and upon clicking Submit, the answer will be applied to all selected lines.

If an answer for a question is not modified, its existing value will remain unchanged. For example:

**Before**

```
line 1
question: intent → answer: inquiry
question: priority → answer: low
question: status → answer: open

line 2
question: intent → answer: complaint
question: priority → answer: high
question: status → answer: open
```

If both lines are selected and only **question: priority** is updated to **answer: medium**, then after submitting:

```
line 1
question: intent → answer: inquiry
question: priority → answer: medium
question: status → answer: open

line 2
question: intent → answer: complaint
question: priority → answer: medium
question: status → answer: open
```
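
The same rule can be expressed as a dictionary merge: only the questions that were modified overwrite the existing answers. This is a minimal sketch of the behavior described above, using a hypothetical `apply_bulk_answers` function and a plain dictionary per line rather than Datasaur's actual data model.

```python
def apply_bulk_answers(
    lines: dict[str, dict[str, str]], changes: dict[str, str]
) -> dict[str, dict[str, str]]:
    """Return new answers where only the changed questions are overwritten."""
    return {name: {**answers, **changes} for name, answers in lines.items()}


before = {
    "line 1": {"intent": "inquiry", "priority": "low", "status": "open"},
    "line 2": {"intent": "complaint", "priority": "high", "status": "open"},
}

# Only "priority" was modified, so "intent" and "status" keep their values.
after = apply_bulk_answers(before, {"priority": "medium"})
for name, answers in after.items():
    print(name, answers)
```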

## Row Labeling

In a row labeling project, users can search the table data to quickly find specific information across multiple rows and columns by specifying the search target and word filter, then entering a keyword.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-218e60c55092d2256991ef281f405af1a25e135b%2FExtension%20-%20Search%20-%20standard%20-%20row%20labeling%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>

**Search target**

This option allows users to specify the focus of the search. The available options are:

* <mark style="color:red;">**`Text`**</mark>: Matches the words or content in the data column.
* <mark style="color:red;">**`Label`**</mark>: Matches the words or content in the answer column.

**Word filter**

This option lets users define how their search keywords are matched to results. The available options are:

* **Contains any word:** Matches results that contain any of the specified words, including as part of a longer word.
  * Example: Searching for <mark style="color:red;">`men`</mark> will match <mark style="color:red;">`men`</mark>, <mark style="color:red;">`mentioned`</mark>, and <mark style="color:red;">`abandonment`</mark>.
* **Exact word:** Displays only exact matches for the search keyword.
  * Example: Searching for <mark style="color:red;">`men`</mark> will match <mark style="color:red;">`men`</mark> but not <mark style="color:red;">`mentioned`</mark>.
* **Regex:** Allows users to search using regular expressions for advanced pattern matching.
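  * Example: Searching for <mark style="color:red;">`\bmen\w*`</mark> will match whole words starting with <mark style="color:red;">`men`</mark>, such as <mark style="color:red;">`men`</mark> and <mark style="color:red;">`mentioned`</mark>. Note that <mark style="color:red;">`men*`</mark> on its own means <mark style="color:red;">`me`</mark> followed by zero or more <mark style="color:red;">`n`</mark> characters, so it would also match inside words such as <mark style="color:red;">`home`</mark>.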

## Search all files

The **Search all files** option allows users to search across all files within a project. When this option is **checked**, the search will include results from every file in the project. If the option is **unchecked**, the search will be limited to the current file only.

This is useful for users who want to either perform a broad search across all files or focus on a specific file within the project.
