Krippendorff's Alpha Calculation

This page explains how Datasaur implements the Krippendorff's Alpha algorithm.


Krippendorff's Alpha is one of the algorithms Datasaur supports for calculating agreement while taking the possibility of chance agreement into account. This page walks through how Datasaur collects all labels from labelers and reviewers in a project and processes them into an Inter-Annotator Agreement matrix.

Sample Data

Suppose two labelers and one reviewer (Labeler A, Labeler B, and the Reviewer) labeled the same spans. Labeler A's work is shown in Image 1, Labeler B's work in Image 2, and the Reviewer's work in Image 3.

Calculating the Agreement

In this section, we walk through the calculation in detail for Labeler A and the Reviewer.

1. Arranging the data

First, we arrange the sample data into Table 1 for easier reference.

Table 1. Sample Data

| Span | Labeler A | Reviewer |
| --- | --- | --- |
| The Tragedy of Hamlet | EVE | TITLE |
| Prince of Denmark | PER | |
| Hamlet | PER | PER |
| William Shakespeare | PER | PER |
| 1599 | YEAR | YEAR |
| 1601 | YEAR | YEAR |
| Shakespeare | ORG | PER |
| 30557 | | QTY |

2. Cleaning the data

Second, we remove spans that have only one label, i.e. Prince of Denmark and 30557. Spans with a single label would introduce a calculation error (for such a span $r_i - 1 = 0$, so the denominator in Formula 7 becomes zero), and removing them still yields a result that reflects the agreement level between the two annotators. The cleaned data is shown in Table 2.

Table 2. Cleaned Data

| Span | Labeler A | Reviewer |
| --- | --- | --- |
| The Tragedy of Hamlet | EVE | TITLE |
| Hamlet | PER | PER |
| William Shakespeare | PER | PER |
| 1599 | YEAR | YEAR |
| 1601 | YEAR | YEAR |
| Shakespeare | ORG | PER |

3. Creating the agreement table

Third, we need to create an agreement table based on the cleaned data. The table is visualized in Table 3.
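To make the construction concrete, here is a minimal Python sketch (illustrative only, not Datasaur's internal implementation) that applies the cleaning rule from step 2 and builds the count table. The `labeler_a` and `reviewer` dictionaries are a hypothetical encoding of the annotators' work, and the assignment of the two single-label spans to a specific annotator is an assumption.

```python
from collections import Counter

# Hypothetical encoding of the sample data (Table 1); which annotator
# labeled the two single-label spans is an assumption.
labeler_a = {
    "The Tragedy of Hamlet": "EVE", "Prince of Denmark": "PER",
    "Hamlet": "PER", "William Shakespeare": "PER",
    "1599": "YEAR", "1601": "YEAR", "Shakespeare": "ORG",
}
reviewer = {
    "The Tragedy of Hamlet": "TITLE", "Hamlet": "PER",
    "William Shakespeare": "PER", "1599": "YEAR", "1601": "YEAR",
    "Shakespeare": "PER", "30557": "QTY",
}

# Step 2: keep only spans labeled by both annotators.
spans = [s for s in labeler_a if s in reviewer]

# Step 3: the agreement table r_ik counts how many annotators assigned
# label k to span i (the content of Table 3).
r_ik = {s: Counter([labeler_a[s], reviewer[s]]) for s in spans}
# e.g. r_ik["Hamlet"]["PER"] == 2 and r_ik["The Tragedy of Hamlet"]["EVE"] == 1
```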

Based on the table, five values are calculated: $n$, $r_i$, $r_k$, $r$, and $r'$.

Total spans in the data

  • $n$ is the total number of spans in the data.

    • Here, $n=6$ because there are 6 spans.

Total labels in each span

$$r_i=\sum\limits_{k=1}^{m}r_{ik} \qquad (1)$$

  • $r_i$ is the total number of labels that span $i$ has.

  • $m$ is the total number of labels.

    • Here, $m=5$ because there are 5 labels.

  • $r_{ik}$ is the count of label $k$ in span $i$.

Here is the calculation result.

  • $r_1=r_{1,EVE}+r_{1,ORG}+r_{1,PER}+r_{1,TITLE}+r_{1,YEAR}=1+0+0+1+0=2$

  • $r_2=r_{2,EVE}+r_{2,ORG}+r_{2,PER}+r_{2,TITLE}+r_{2,YEAR}=0+0+2+0+0=2$

  • $r_3=r_{3,EVE}+r_{3,ORG}+r_{3,PER}+r_{3,TITLE}+r_{3,YEAR}=0+0+2+0+0=2$

  • $r_4=r_{4,EVE}+r_{4,ORG}+r_{4,PER}+r_{4,TITLE}+r_{4,YEAR}=0+0+0+0+2=2$

  • $r_5=r_{5,EVE}+r_{5,ORG}+r_{5,PER}+r_{5,TITLE}+r_{5,YEAR}=0+0+0+0+2=2$

  • $r_6=r_{6,EVE}+r_{6,ORG}+r_{6,PER}+r_{6,TITLE}+r_{6,YEAR}=0+1+1+0+0=2$

Total of each label

$$r_k=\sum\limits_{i=1}^{n}r_{ik} \qquad (2)$$

  • $r_k$ is the total count of label $k$ in the data.

  • $n$ is the total number of spans in the data.

  • $r_{ik}$ is the count of label $k$ in span $i$.

Here is the calculation result.

  • $r_{EVE}=r_{1,EVE}+r_{2,EVE}+r_{3,EVE}+r_{4,EVE}+r_{5,EVE}+r_{6,EVE}=1+0+0+0+0+0=1$

  • $r_{ORG}=r_{1,ORG}+r_{2,ORG}+r_{3,ORG}+r_{4,ORG}+r_{5,ORG}+r_{6,ORG}=0+0+0+0+0+1=1$

  • $r_{PER}=r_{1,PER}+r_{2,PER}+r_{3,PER}+r_{4,PER}+r_{5,PER}+r_{6,PER}=0+2+2+0+0+1=5$

  • $r_{TITLE}=r_{1,TITLE}+r_{2,TITLE}+r_{3,TITLE}+r_{4,TITLE}+r_{5,TITLE}+r_{6,TITLE}=1+0+0+0+0+0=1$

  • $r_{YEAR}=r_{1,YEAR}+r_{2,YEAR}+r_{3,YEAR}+r_{4,YEAR}+r_{5,YEAR}+r_{6,YEAR}=0+0+0+2+2+0=4$

Total labels in the data

$$r=\sum\limits_{i=1}^{n}r_i \qquad (3)$$

  • $r$ is the total number of labels in the data.

  • $n$ is the total number of spans in the data.

  • $r_i$ is the total number of labels that span $i$ has.

Here is the calculation result.

  • $r=r_1+r_2+r_3+r_4+r_5+r_6=12$

Average number of labels per span

$$r'=\frac{r}{n} \qquad (4)$$

  • $r'$ is the average number of labels per span.

  • $n$ is the total number of spans in the data.

Here is the calculation result.

  • $r'=\frac{r}{n}=\frac{12}{6}=2$
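The same five values can be reproduced with a short sketch over the count matrix; the matrix below hard-codes the agreement table, with rows for the six cleaned spans and columns for EVE, ORG, PER, TITLE, and YEAR.

```python
# Agreement table (Table 3): rows = spans 1..6, columns = EVE, ORG, PER, TITLE, YEAR.
R = [
    [1, 0, 0, 1, 0],  # The Tragedy of Hamlet
    [0, 0, 2, 0, 0],  # Hamlet
    [0, 0, 2, 0, 0],  # William Shakespeare
    [0, 0, 0, 0, 2],  # 1599
    [0, 0, 0, 0, 2],  # 1601
    [0, 1, 1, 0, 0],  # Shakespeare
]

n = len(R)                           # total spans              -> 6
r_i = [sum(row) for row in R]        # labels per span (1)      -> [2, 2, 2, 2, 2, 2]
r_k = [sum(col) for col in zip(*R)]  # totals per label (2)     -> [1, 1, 5, 1, 4]
r = sum(r_i)                         # total labels (3)         -> 12
r_avg = r / n                        # r', labels per span (4)  -> 2.0
```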

4. Choosing the weight function

Fourth, we need a weight function to weight the labels. Every label is treated equally because no label is more important than another, so we use identity weights, stated in Formula (5): a pair of labels gets full weight when the categories are the same and zero weight otherwise.

$$w_{kl}=\begin{cases}1 & \text{if } k=l\\0 & \text{otherwise}\end{cases} \qquad (5)$$

  • $w_{kl}$ is the weight between label $k$ and label $l$.

With these weights, the weighted count of each label in a span is simply its raw count $r_{ik}$.
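In code, these identity weights are simply a diagonal matrix over the label set (an illustrative sketch, not a required representation).

```python
labels = ["EVE", "ORG", "PER", "TITLE", "YEAR"]

# Identity weights: full credit when two labels are the same category, none otherwise.
w = [[1 if k == l else 0 for l in labels] for k in labels]
```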

5. Calculating Pa

Fifth, the observed weighted percent agreement is calculated.

Weighted number of labels

We start by calculating the weighted number of labels using Formula (6).

$$r_{ik+}=\sum\limits_{l=1}^{m} w_{kl}r_{il} \qquad (6)$$

  • $r_{ik+}$ is the weighted count of label $k$ in span $i$.

  • $m$ is the total number of labels.

  • $w_{kl}$ is the weight between label $k$ and label $l$ (Formula 5).

  • $r_{il}$ is the count of label $l$ in span $i$.

For example, we can apply Formula (6) to calculate the weighted count of the EVE label in span 1.

$$r_{1,EVE+}=\sum\limits_{l=1}^{5} w_{EVE,l}\,r_{1,l}=1\cdot1+0\cdot0+0\cdot0+0\cdot1+0\cdot0=1$$

We repeat this for every span and label combination; the complete result is shown in Table 4.
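A sketch of Formula (6) applied to every span and label pair is shown below; because the weights form an identity matrix, the weighted counts reproduce the raw counts, which is exactly the content of Table 4.

```python
# Count matrix from Table 3 (rows = spans, columns = EVE, ORG, PER, TITLE, YEAR).
R = [
    [1, 0, 0, 1, 0], [0, 0, 2, 0, 0], [0, 0, 2, 0, 0],
    [0, 0, 0, 0, 2], [0, 0, 0, 0, 2], [0, 1, 1, 0, 0],
]
m = 5
w = [[1 if k == l else 0 for l in range(m)] for k in range(m)]  # identity weights

# Formula (6): weighted count of label k in span i.
R_plus = [[sum(w[k][l] * R[i][l] for l in range(m)) for k in range(m)]
          for i in range(len(R))]
assert R_plus == R  # identity weights leave the counts unchanged (Table 4)
```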

Agreement percentage

Once we have the weighted counts, we calculate the agreement percentage for a single span and label using Formula (7).

$$p_{a|ik}=\frac{r_{ik}(r_{ik+}-1)}{r'(r_i-1)} \qquad (7)$$

  • $p_{a|ik}$ is the agreement percentage of label $k$ in span $i$.

  • $r_{ik}$ is the count of label $k$ in span $i$.

  • $r_{ik+}$ is the weighted count of label $k$ in span $i$.

  • $r'$ is the average number of labels per span.

  • $r_i$ is the total number of labels that span $i$ has.

For example, we can apply Formula (7) to calculate the agreement percentage of the EVE label in span 1.

$$p_{a|1,EVE}=\frac{r_{1,EVE}(r_{1,EVE+}-1)}{r'(r_1-1)}=\frac{1(1-1)}{2(2-1)}=0$$

We repeat this for every span and label combination; the complete result is shown in Table 5.

Agreement percentage of a single span

We can aggregate these values into the agreement percentage of a single span using Formula (8).

$$p_{a|i}=\sum\limits_{k=1}^{m} p_{a|ik} \qquad (8)$$

  • $p_{a|i}$ is the agreement percentage of span $i$.

  • $m$ is the total number of labels.

  • $p_{a|ik}$ is the agreement percentage of label $k$ in span $i$.

For example, we can apply Formula (8) to calculate the agreement percentage of span 1.

$$p_{a|1}=\sum\limits_{k=1}^{5} p_{a|1,k}=0+0+0+0+0=0$$

We repeat this for every span; the complete result is shown in Table 6.
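Tables 5 and 6 can be reproduced by applying Formulas (7) and (8) over the same hypothetical count matrix; a minimal sketch:

```python
R = [
    [1, 0, 0, 1, 0], [0, 0, 2, 0, 0], [0, 0, 2, 0, 0],
    [0, 0, 0, 0, 2], [0, 0, 0, 0, 2], [0, 1, 1, 0, 0],
]
R_plus = R                     # identity weights: weighted counts equal raw counts
r_i = [sum(row) for row in R]  # [2, 2, 2, 2, 2, 2]
r_avg = sum(r_i) / len(R)      # r' = 2.0

# Formula (7): agreement per span/label, then Formula (8): agreement per span.
p_a_ik = [[R[i][k] * (R_plus[i][k] - 1) / (r_avg * (r_i[i] - 1))
           for k in range(5)] for i in range(len(R))]
p_a_i = [sum(row) for row in p_a_ik]
print(p_a_i)  # [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]  (Table 6)
```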

Average agreement percentage

From the previous calculation, we can calculate the average agreement percentage using Formula (9).

$$p_a'=\frac{1}{n}\sum\limits_{i=1}^{n}p_{a|i} \qquad (9)$$

  • $p_a'$ is the average agreement percentage.

  • $n$ is the total number of spans in the data.

  • $p_{a|i}$ is the agreement percentage of span $i$.

We can apply Formula (9) to calculate the average agreement percentage.

$$p_a'=\frac{1}{6}\sum\limits_{i=1}^{6}p_{a|i}=\frac{1}{6}(0+1+1+1+1+0)=0.6666$$

Calculating Pa

Finally, the observed weighted percent agreement is calculated using Formula (10).

$$p_a=p_a'\left(1-\frac{1}{nr'}\right)+\frac{1}{nr'} \qquad (10)$$

  • $p_a$ is the observed weighted percent agreement.

  • $p_a'$ is the average agreement percentage.

  • $n$ is the total number of spans in the data.

  • $r'$ is the average number of labels per span.

We can apply Formula (10) to calculate the observed weighted agreement percentage.

$$p_a=p_a'\left(1-\frac{1}{nr'}\right)+\frac{1}{nr'}=0.6666\left(1-\frac{1}{6\times2}\right)+\frac{1}{6\times2}=0.6944$$
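Starting from the per-span values in Table 6, Formulas (9) and (10) reduce to a couple of lines; this sketch just re-derives the numbers from the worked example above.

```python
p_a_i = [0, 1, 1, 1, 1, 0]  # per-span agreement percentages (Table 6)
n, r_avg = 6, 2

p_a_prime = sum(p_a_i) / n                                 # Formula (9)  -> 0.666...
p_a = p_a_prime * (1 - 1 / (n * r_avg)) + 1 / (n * r_avg)  # Formula (10) -> 0.694...
```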

6. Calculating Pe

Sixth, the chance weighted percent agreement is calculated.

Classification probability

We start by calculating the classification probability for each label using Formula (11).

$$\pi_k=\frac{r_k}{r} \qquad (11)$$

  • $\pi_k$ is the classification probability of label $k$.

  • $r_k$ is the total count of label $k$ in the data.

  • $r$ is the total number of labels in the data.

Here is the calculation result.

  • $\pi_{EVE}=\frac{r_{EVE}}{r}=\frac{1}{12}=0.0833$

  • $\pi_{ORG}=\frac{r_{ORG}}{r}=\frac{1}{12}=0.0833$

  • $\pi_{PER}=\frac{r_{PER}}{r}=\frac{5}{12}=0.4166$

  • $\pi_{TITLE}=\frac{r_{TITLE}}{r}=\frac{1}{12}=0.0833$

  • $\pi_{YEAR}=\frac{r_{YEAR}}{r}=\frac{4}{12}=0.3333$

Calculating Pe

To calculate the chance weighted percent agreement, the probabilities from Formula (11) are plugged into Formula (12).

$$p_e=\sum\limits_{k=1}^{m}{\pi_k}^2 \qquad (12)$$

  • $p_e$ is the chance weighted percent agreement.

  • $m$ is the total number of labels.

  • $\pi_k$ is the classification probability of label $k$.

Here is the chance weighted percent agreement calculation.

$$p_e=\sum\limits_{k=1}^{m}{\pi_k}^2={\pi_{EVE}}^2+{\pi_{ORG}}^2+{\pi_{PER}}^2+{\pi_{TITLE}}^2+{\pi_{YEAR}}^2$$

$$p_e=0.0833^2+0.0833^2+0.4166^2+0.0833^2+0.3333^2=0.3055$$
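The chance agreement can be re-derived the same way from the per-label totals; a minimal sketch:

```python
r_k = [1, 1, 5, 1, 4]  # per-label totals (EVE, ORG, PER, TITLE, YEAR)
r = sum(r_k)           # 12

pi = [count / r for count in r_k]  # Formula (11): classification probabilities
p_e = sum(p ** 2 for p in pi)      # Formula (12) -> 0.3055...
```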

7. Calculating the Alpha

Finally, Krippendorff's alpha is calculated using Formula (13).

$$\alpha=\frac{p_a-p_e}{1-p_e} \qquad (13)$$

  • $\alpha$ is the Krippendorff's alpha between Labeler A and the Reviewer.

  • $p_a$ is the observed weighted percent agreement.

  • $p_e$ is the chance weighted percent agreement.

We can get $\alpha$ by applying $p_a$ and $p_e$ to Formula (13).

$$\alpha=\frac{p_a-p_e}{1-p_e}=\frac{0.6944-0.3055}{1-0.3055}=0.56$$
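Putting the two quantities together, the alpha for this labeler/reviewer pair follows directly from Formula (13); the sketch below uses the rounded values from the worked example.

```python
p_a, p_e = 0.6944, 0.3055  # from steps 5 and 6

alpha = (p_a - p_e) / (1 - p_e)  # Formula (13) -> ~0.56
```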

Summary

  • We apply the same calculation to agreement between labelers and to agreement between the reviewer and each labeler.

  • Spans that are missing a label from one of the annotators are removed, as in step 2.

  • The percentage of chance agreement will vary depending on:

    • The number of labels in a project.

    • The number of label options.

  • When both labelers agree but the reviewer rejects the labels:

    • The agreement between the two labelers increases.

    • The agreement between the labelers and the reviewer decreases.

Image 1. Labeler A Work
Image 2. Labeler B Work
Image 3. Reviewer Work
Table 3. Agreement Table
Table 4. Weighted Number of Label
Table 5. Agreement Percentage
Table 6. Agreement Percentage of Each Span