# Cohen's Kappa Calculation

[Cohen's Kappa](https://en.wikipedia.org/wiki/Cohen's_kappa) is one of the algorithms Datasaur supports for calculating agreement while taking the possibility of chance agreement into account. This page dives into how Datasaur collects all labels from labelers and reviewers in a project and processes them into an Inter-annotator Agreement matrix.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-bea8837efc14f8a6e0715436e867be5be426b9b3%2Fimage%20(302).png?alt=media" alt="" width="194"><figcaption></figcaption></figure>

## Sample Data

Suppose there are 2 labelers—Labeler A and Labeler B—who labeled the same sentences.

![Labeler A](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-f98a1c9e971049dca18a1c1fb0fcf75f3621bdf3%2FAnalytics%20-%20IAA%20-%20sample%20data%201%20-%20labeler%20A.png?alt=media)

![Labeler B](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-a1468926f060db9584a9c536e761b74134536aa9%2FAnalytics%20-%20IAA%20-%20sample%20data%201%20-%20labeler%20B.png?alt=media)

There is also a reviewer who labeled the same sentences.

![Reviewer](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-b78c707c9f04d2c7ec0e8b921b87ee3c5e719c3a%2FAnalytics%20-%20IAA%20-%20sample%20data%201%20-%20reviewer.png?alt=media)

## Calculating the Data

### Agreement Records

Based on the screenshots above, we map those labels into the agreement records below:

| Position in sentence  | Labeler A | Labeler B | Reviewer |
| --------------------- | --------- | --------- | -------- |
| The Tragedy of Hamlet | EVE       | TITLE     | TITLE    |
| Prince of Denmark     | PER       | \<EMPTY>  | \<EMPTY> |
| Hamlet                | PER       | TITLE     | PER      |
| William Shakespeare   | PER       | PER       | PER      |
| 1599                  | YEAR      | YEAR      | YEAR     |
| 1601                  | YEAR      | YEAR      | YEAR     |
| Shakespeare           | ORG       | ORG       | PER      |
| 30,557                | \<EMPTY>  | \<EMPTY>  | QTY      |
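
Before building the matrix, it helps to see the records as data. Below is an illustrative Python sketch (the `records` list and `pairwise` helper are our own names, not Datasaur's API). It encodes the table above and shows how a pair of annotators is compared, assuming rows where neither annotator applied a label are dropped, which is consistent with the record count stated in the next section:

```python
# <EMPTY> marks a position where the annotator applied no label.
EMPTY = "<EMPTY>"

records = [
    # (position, labeler_a, labeler_b, reviewer)
    ("The Tragedy of Hamlet", "EVE", "TITLE", "TITLE"),
    ("Prince of Denmark", "PER", EMPTY, EMPTY),
    ("Hamlet", "PER", "TITLE", "PER"),
    ("William Shakespeare", "PER", "PER", "PER"),
    ("1599", "YEAR", "YEAR", "YEAR"),
    ("1601", "YEAR", "YEAR", "YEAR"),
    ("Shakespeare", "ORG", "ORG", "PER"),
    ("30,557", EMPTY, EMPTY, "QTY"),
]

def pairwise(records, i, j):
    """Keep only rows where at least one of the two annotators applied a label."""
    return [(r[i], r[j]) for r in records if not (r[i] == EMPTY and r[j] == EMPTY)]

pairs_ab = pairwise(records, 1, 2)  # Labeler A vs Labeler B
agreements = sum(a == b for a, b in pairs_ab)
print(len(pairs_ab), agreements)  # 7 records, 4 agreements
```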

### **Agreement Table / Confusion Matrix**

Next, we arrange the records into an agreement table (also known as a confusion matrix). For this simulation, we use the data from Labeler A and Labeler B.

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-25dba8fcf1df7b2a2842e811dcbef6dca643abc0%2FAnalytics%20-%20IAA%20-%20Cohen's%20Kappa%20calculation%20-%20Kappa%20for%20Labeler%20A%20and%20Labeler%20B.png?alt=media)

### Calculating the Kappa

From the table above, there are **7** records with **4** agreements. (The `30,557` row is excluded because neither Labeler A nor Labeler B applied a label to it.)

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-49d6725e3e65eb85dade8e25c2f258cd14354ead%2Fimage%20\(43\).png?alt=media)

The observed proportionate agreement is:

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-d2d31a387e67ae981b11ce13a4572f087a3d60ed%2Fimage%20\(56\).png?alt=media)
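
For the sample data, with 4 agreements among 7 records, this works out to:

```latex
p_o = \frac{4}{7} \approx 0.571
```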

To calculate the probability of random agreement, we note that:

* Labeler A labeled `EVE` once and Labeler B didn't label `EVE`. Therefore, the probability of random agreement on the label `EVE` is:

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-282721f40af4a31d1c215d2ba022c26832f6ea09%2Fimage%20\(17\)%20\(1\)%20\(1\)%20\(1\).png?alt=media)

* Compute the probability of random agreement for all labels:

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-088f9c28017f74355bf9453a7320fb6371a79690%2Fimage%20\(133\).png?alt=media)

The full random agreement probability is the sum of the probability of random agreement for all labels:

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-f9e508aa652c32982f004b75467551f54ecea3e5%2Fimage%20\(127\).png?alt=media)
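
Worked out for the sample data: Labeler A's label counts (out of 7 records) are EVE 1, PER 3, YEAR 2, ORG 1, and Labeler B's are TITLE 2, PER 1, YEAR 2, ORG 1, \<EMPTY> 1. Labels used by only one of the two annotators contribute zero, so only PER, YEAR, and ORG remain in the sum:

```latex
p_e = \underbrace{\tfrac{3}{7}\cdot\tfrac{1}{7}}_{\text{PER}}
    + \underbrace{\tfrac{2}{7}\cdot\tfrac{2}{7}}_{\text{YEAR}}
    + \underbrace{\tfrac{1}{7}\cdot\tfrac{1}{7}}_{\text{ORG}}
    = \frac{8}{49} \approx 0.163
```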

Finally, we can calculate Cohen's Kappa:

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-3ee1a97e2a5b612425f311ca84c2f3918d04d650%2Fimage%20\(115\).png?alt=media)
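
Plugging the values derived above into the standard Cohen's Kappa formula, the result for Labeler A and Labeler B works out to:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
       = \frac{\tfrac{4}{7} - \tfrac{8}{49}}{1 - \tfrac{8}{49}}
       = \frac{20}{41} \approx 0.488
```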

#### **Kappa for Labeler A and Reviewer**

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-90057cdb9873e606d6c27322cd9585fdafe7215d%2FAnalytics%20-%20IAA%20-%20Cohen's%20Kappa%20calculation%20-%20Kappa%20for%20Labeler%20A%20and%20Reviewer.png?alt=media)

#### **Kappa for Labeler B and Reviewer**

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-d9d48426e72979d5aaa91f957cf1ef5d1845d88b%2FAnalytics%20-%20IAA%20-%20Cohen's%20Kappa%20calculation%20-%20Kappa%20for%20Labeler%20B%20and%20Reviewer.png?alt=media)

![](https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-9d5c68a920882e3784b7b34ec64bc49ed211353e%2FAnalytics%20-%20IAA%20-%20Cohen's%20Kappa%20-%20sample%20data%201%20-%20results.png?alt=media)
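
The same calculation applies to every annotator pair. The Python sketch below (an illustration, not Datasaur's implementation) computes Cohen's Kappa for all three pairs from the agreement records, assuming rows where neither annotator in the pair applied a label are dropped and `<EMPTY>` otherwise counts as a regular label:

```python
from collections import Counter

EMPTY = "<EMPTY>"

# Agreement records from the table above: (Labeler A, Labeler B, Reviewer).
records = [
    ("EVE", "TITLE", "TITLE"),
    ("PER", EMPTY, EMPTY),
    ("PER", "TITLE", "PER"),
    ("PER", "PER", "PER"),
    ("YEAR", "YEAR", "YEAR"),
    ("YEAR", "YEAR", "YEAR"),
    ("ORG", "ORG", "PER"),
    (EMPTY, EMPTY, "QTY"),
]

def cohens_kappa(pairs):
    # Drop rows where neither annotator applied a label.
    pairs = [(a, b) for a, b in pairs if not (a == EMPTY and b == EMPTY)]
    n = len(pairs)
    p_o = sum(a == b for a, b in pairs) / n  # observed agreement
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    # Chance agreement: sum over labels of the product of marginal proportions.
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

kappa_ab = cohens_kappa([(a, b) for a, b, _ in records])
kappa_ar = cohens_kappa([(a, r) for a, _, r in records])
kappa_br = cohens_kappa([(b, r) for _, b, r in records])
print(round(kappa_ab, 3), round(kappa_ar, 3), round(kappa_br, 3))
```

Note that the reviewer-pair values follow from our assumed exclusion rule and from treating `<EMPTY>` as an ordinary label in the chance-agreement term; Datasaur's reported numbers may differ if its implementation makes different choices.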

## Summary

* We apply the same calculation for agreement between labelers and between the reviewer and labelers.
* Missing labels from a single labeler will be counted as having applied empty labels.
* The percentage of chance agreement will vary depending on:
  * The number of labels in a project.
  * The number of label options.
* When both labelers agree but the reviewer rejects the labels:
  * The agreement between the two labelers increases.
  * The agreement between the labelers and the reviewer decreases.

