
Analytics


Datasaur offers several ways to view and analyze your data efficiently. The Analytics pages are accessible only to Administrators. For convenient access to the data, you can also export it and have it delivered to your email.

Custom Report Builder

We also provide a way to customize your own report so you can get exactly the Analytics data you want. For more detailed information, see the Custom Report Builder page.

Analytics Extension on a Project

The other Analytics pages offer comprehensive performance information focused on the results of labeling work. To cover all the bases, Datasaur also makes it easy to track each labeler's progress during an ongoing project. You can quickly see how many labels have been produced, how many questions have been answered, and how many documents are left for each labeler.

For more details, visit the Analytics extension page.

Charts

Here are some tips for interpreting the charts below.

  1. Higher values on the charts indicate better performance: they mean your team is consistently improving its speed, accuracy, and overall efficiency.

  2. The total labels shown on the Throughput chart may not always match the total labels on the Quality chart. A higher value on the Throughput chart indicates that labels were applied manually in Reviewer Mode, since the Quality chart only counts labels from Labeler Mode. A significant difference can also point to a potential issue, as it may mean the Reviewer had to manually label a large amount of data (see the worked example after this list).

  3. In some cases, you may observe high Efficiency values while both Throughput and Quality values are low. This typically happens in projects with a lot of pre-labeled data, because the Throughput and Quality calculations only consider manually applied labels and exclude pre-labeled data.
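
As a simple illustration of the second point (with made-up numbers): if labelers apply 80 labels and reviewers then manually apply 20 more, the Throughput chart counts all 100 labels, while the Quality chart considers only the 80 labels applied in Labeler Mode; the 20-label difference is exactly the manual reviewer work.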

Overall Projects

Displays the current distribution of your projects by status. Note that this is a point-in-time statistic, not time-series data.

Remaining Files

Displays the files remaining in uncompleted projects, broken down by project status. This is also a point-in-time statistic, not time-series data.

Throughput

It demonstrates the speed at which your team produces annotations (labels and answers). It is calculated by summing the following:

  1. Total labels applied by each labeler.

  2. Total labels applied in Reviewer Mode. This count excludes labels that are automatically accepted through consensus; it covers only labels applied manually by reviewers, including those applied while resolving conflicts.
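
As a rough illustration of this sum, here is a minimal sketch using hypothetical label records; the field names are assumptions for illustration, not Datasaur's actual data model or code.

```python
# Hypothetical label records; field names are illustrative only.
labels = [
    {"mode": "labeler", "auto_accepted_by_consensus": False},
    {"mode": "labeler", "auto_accepted_by_consensus": False},
    {"mode": "reviewer", "auto_accepted_by_consensus": False},  # applied manually by a reviewer
    {"mode": "reviewer", "auto_accepted_by_consensus": True},   # accepted via consensus, excluded
]

labeler_labels = sum(1 for label in labels if label["mode"] == "labeler")
manual_reviewer_labels = sum(
    1
    for label in labels
    if label["mode"] == "reviewer" and not label["auto_accepted_by_consensus"]
)

throughput = labeler_labels + manual_reviewer_labels
print(throughput)  # 3: two labeler labels plus one manual reviewer label
```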

Efficiency

It illustrates the effectiveness of your labeling process in generating accepted labels per minute.

It is calculated by dividing the total number of accepted labels from Reviewer Mode (including labels resolved manually, applied directly, accepted through consensus, and pre-labeled data) by the total time spent by all team members, broken down by day.
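
For intuition, here is a minimal sketch of that division using hypothetical daily totals; the numbers and field names are made up for illustration.

```python
# Hypothetical per-day totals across all team members.
daily_totals = {
    "2024-01-01": {"accepted_labels": 1200, "minutes_spent": 480},
    "2024-01-02": {"accepted_labels": 900, "minutes_spent": 300},
}

efficiency_per_day = {
    day: totals["accepted_labels"] / totals["minutes_spent"]
    for day, totals in daily_totals.items()
}
print(efficiency_per_day)  # {'2024-01-01': 2.5, '2024-01-02': 3.0} accepted labels per minute
```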

Quality

It breaks down the labels applied in Labeler Mode into three categories:

  1. Total accepted labels. The sum of all labels applied by each labeler that have been accepted, whether through consensus or manual review.

  2. Total rejected labels. The sum of all labels applied by each labeler that have been rejected, whether through consensus or manual review.

  3. Total unresolved conflicts. The sum of all labels applied by each labeler that have not yet been accepted (either manually or through consensus) or rejected.
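
A minimal sketch of this breakdown, again with hypothetical records (the status values are assumptions used purely for illustration):

```python
# Hypothetical Labeler Mode labels with their review outcome.
labels = [
    {"status": "accepted"},    # accepted through consensus or manual review
    {"status": "accepted"},
    {"status": "rejected"},    # rejected through consensus or manual review
    {"status": "unresolved"},  # conflict not yet accepted or rejected
]

quality = {"accepted": 0, "rejected": 0, "unresolved": 0}
for label in labels:
    quality[label["status"]] += 1

print(quality)  # {'accepted': 2, 'rejected': 1, 'unresolved': 1}
```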

Cumulative Time Spent

It shows the total time spent each day, categorized by the labeler and reviewer roles. This metric is only counted while team members have the project open in their active browser tab; if they switch to another tab while working on the project, that time is not factored into the calculation.

A project is considered idle when there is no mouse, keyboard, or scroll activity for two minutes. While a project is open and active, time spent is counted by sending a request to the backend every 60 seconds. If the workspace becomes idle before 60 seconds have passed, the elapsed time since the last request is sent to the backend instead.
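
To make those rules concrete, here is a minimal sketch of the same heartbeat and idle logic; it is a hypothetical helper for illustration, not Datasaur's frontend code.

```python
HEARTBEAT_SECONDS = 60        # a request is sent to the backend every 60 seconds while active
IDLE_THRESHOLD_SECONDS = 120  # two minutes without mouse, keyboard, or scroll activity

def seconds_to_report(seconds_since_last_request, seconds_since_last_activity):
    """Return how many seconds of work to report to the backend right now."""
    if seconds_since_last_activity >= IDLE_THRESHOLD_SECONDS:
        # The project just went idle: flush the partial interval since the last request.
        return seconds_since_last_request
    if seconds_since_last_request >= HEARTBEAT_SECONDS:
        # Still active: send a regular heartbeat covering the last 60 seconds.
        return HEARTBEAT_SECONDS
    return 0  # keep accumulating; nothing to send yet

print(seconds_to_report(60, 10))   # 60 -> regular heartbeat
print(seconds_to_report(42, 120))  # 42 -> went idle before the next heartbeat
```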

Multiple Ways to View the Data

Overview

Gain a high-level understanding of your data within the Workspace. This page calculates metrics from all your projects and provides comprehensive overall insights, including the Inter-Annotator Agreement (IAA). Access the Overview page by selecting Analytics in the sidebar.

You can use the project tag filter to view detailed analytics for specific projects. You can select multiple tags, and the filter uses OR logic: any project containing at least one of the selected tags will be included in the filter results.
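
As an illustration of the OR logic, here is a minimal sketch with hypothetical project data; it is not an actual Datasaur API call.

```python
# Hypothetical projects and their tags.
projects = [
    {"name": "Project A", "tags": {"finance", "q3"}},
    {"name": "Project B", "tags": {"legal"}},
    {"name": "Project C", "tags": {"finance", "legal"}},
]

selected_tags = {"finance", "q3"}

# OR logic: keep any project that carries at least one of the selected tags.
filtered = [p for p in projects if p["tags"] & selected_tags]
print([p["name"] for p in filtered])  # ['Project A', 'Project C']
```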

Project

For an in-depth analysis of a specific project, click the triple dots next to the project and select View project analytics. This detailed page includes a summary of your labeling work in the report tab, plus project-specific charts covering all assignees and comparing Reviewer Mode with each Labeler Mode.

Quickly access essential project information through the Name column, including statistics like total files, total time spent, token count (Span Labeling), and row count (Row Labeling). Gain additional insights by hovering over the avatar icon in the Labelers column, or over the Status column (only for In Review and Labeling in Progress), to get a snapshot of the labeling progress.

Team Member

Explore a specific team member's data and insights for a comprehensive view of their contributions and performance based on their role. The same charts as above are used here, filtered to the selected team member, and you can also see overview information for each project assigned to them.

The table displays the latest statistics and does not depend on the selected date range. Note that the data in the table is specific to either Labeler or Reviewer Mode, which can be toggled using the tab at the top of the page.

Access this report by going to your Members page, clicking on the three dots corresponding to your teammate, and then choosing View Member Details.

Evaluation Metrics

It provides a breakdown of the conflicts that occurred among labelers and how they were handled by the reviewers. This metric can also give you insight into the level of agreement among labelers (especially when combined with the IAA) and into how effectively the reviewers resolve conflicts. Note that conflicts do not occur for Bounding Box Labeling and LLM projects.

To assist you in evaluating the labeling, Datasaur calculates evaluation metrics, which are available specifically for each completed project. For more detailed information on how the data is categorized, see the Evaluation Metrics page.

