
# Inter-Annotator Agreement for Data Programming

**Data programming** evaluates the performance of labeling functions using [inter-annotator agreement](/workspace-management/analytics/inter-annotator-agreement.md) (IAA).

## Steps

To evaluate the performance of labeling functions:

1. Activate **Data programming** through the **Manage extensions** dialog.

   <figure><img src="/files/JaxpgZx59TPXbr2oJ9em" alt=""><figcaption></figcaption></figure>
2. Create labeling functions for the selected question. At least two labeling functions are required to calculate IAA.

   <figure><img src="/files/gOMNVUiH9PkKbpfGBX5f" alt=""><figcaption></figcaption></figure>
3. Click **Predict labels**. You can now view the final labels generated by the labeling functions for your selected question.

   <figure><img src="/files/h1E23xtNfRu5NMgip5DX" alt=""><figcaption></figcaption></figure>

You can review the predicted labels and view the IAA score in the **Manage Functions** dialog. An IAA score above 80% indicates good agreement.

<figure><img src="/files/dGwJLaAjYBP7hcGJdI7U" alt=""><figcaption></figcaption></figure>
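To make the agreement score more concrete, the sketch below compares the outputs of two labeling functions on the same rows. This page does not say which agreement metric produces the score shown in the dialog, so the sketch assumes Cohen's kappa, a common chance-corrected measure. The function name `cohen_kappa` and the sample label lists are hypothetical and only illustrate the idea.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of rows where both functions return the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each function's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both functions always return the same single label.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical outputs of two labeling functions on the same ten rows.
lf_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
lf_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohen_kappa(lf_a, lf_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the two functions disagree on one row out of ten, so raw agreement is 90%, while the chance-corrected value is lower because half of that agreement would be expected by chance.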

## Pre-labeled columns as model representatives

If you use pre-labeled columns to represent your models' outputs, you can create labeling functions that assign labels based on the values in those columns.

* If you use the Snorkel provider, use the following code:

  ```python
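  import re

  # LABELS, ABSTAIN, and the decorator are assumed to come from the labeling function template.
  @labeling_function()
  def labeling_function(x) -> int:
    # Read the pre-labeled column; replace 'column_name' with your column's name.
    text = x.columns[x.column_name_to_index['column_name']]
    # Return the label of the first key in LABELS that matches the column value.
    for key, value in LABELS.items():
      if re.search(key, text, re.IGNORECASE):
        return value

    # No key matched, so abstain.
    return ABSTAIN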
  ```
* If you use the Stegosaurus provider, activate the **Multiple-label template** and use the following code:

  ```python
  import re

  # Define constants before the decorator; a decorator must be directly followed by `def`.
  ABSTAIN = -1

  @target_label()
  def label_function(sample):
    # Read the content of one column; replace 'column_name' with your pre-labeled column's name.
    text = sample['column_name']

    # Implement your logic here.
    # Keywords to look for in that column, keyed by label name.
    DICT_KEYWORDS = {
      'positive': ['positive'],
      'negative': ['negative']
    }
    for label, target_keywords in DICT_KEYWORDS.items():
      for keyword in target_keywords:
        if re.search(keyword, text, re.IGNORECASE):
          return LABELS[label]
    # Return False instead of ABSTAIN to produce an empty result.
    return ABSTAIN
  ```
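The keyword-matching logic shared by both templates can be tried outside the product before you paste it into a labeling function. The following is a minimal sketch in plain Python: the `LABELS` mapping, the `model_a` column name, the helper `label_from_column`, and the sample rows are all hypothetical stand-ins, and rows are modeled as simple dictionaries rather than the provider-specific objects.

```python
import re

# Hypothetical label map and abstain value, standing in for the template's constants.
LABELS = {"positive": 0, "negative": 1}
ABSTAIN = -1

# Keywords to look for in the pre-labeled column, keyed by label name.
KEYWORDS = {"positive": ["positive"], "negative": ["negative"]}


def label_from_column(row, column_name, keywords_by_label):
    """Return the label whose keyword appears in the given column, else ABSTAIN."""
    text = str(row.get(column_name, ""))
    for label, keywords in keywords_by_label.items():
        for keyword in keywords:
            # re.escape treats the keyword as literal text rather than a pattern.
            if re.search(re.escape(keyword), text, re.IGNORECASE):
                return LABELS[label]
    return ABSTAIN


# Hypothetical rows whose 'model_a' column holds a model's prediction.
rows = [{"model_a": "Positive"}, {"model_a": "NEGATIVE"}, {"model_a": "neutral"}]
predictions = [label_from_column(row, "model_a", KEYWORDS) for row in rows]
print(predictions)
```

Matching is case-insensitive, so `Positive` and `NEGATIVE` both resolve to a label, while a value with no matching keyword falls through to `ABSTAIN`.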


