# Multi-application evaluation

### Overview

This feature allows you to compare and evaluate the performance of multiple models using metrics from well-known evaluators such as Ragas, LangChain, and DeepEval. By streamlining the assessment process, you can gain insight into the strengths and weaknesses of different applications and make data-driven decisions to optimize your workflows.

### Get started

To evaluate multiple models:

1. Navigate to the **Evaluation** page under the LLM Labs menu.
2. Click the **Create evaluation project** button, choose the **Automated evaluation** project type, then click **Continue**.

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-aebfa48377bb382a86cde19afd25715f0180d801%2FEvaluation%20-%20Create%20evaluation%20project%20dialog.png?alt=media" alt=""><figcaption></figcaption></figure>
3. Configure your evaluation by selecting the models to evaluate and choosing a dataset from the library. If you don’t have one, you can upload a dataset in CSV format containing two columns: `prompt` and `expected completion`.

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-995df2ed176b7c4839291945a99f1f2681e88b10%2FAutomated%20evaluation%20-%20PCW%20-%20Step%201.png?alt=media" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
If you can’t find your model in the list, go to the [Sandbox](https://docs.datasaur.ai/llm-projects/sandbox) where your model was created, and [deploy](https://docs.datasaur.ai/llm-projects/sandbox#deploying-the-llm) it or save it to the library. Only deployed or saved models can be evaluated.
{% endhint %}
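If you are preparing your own dataset for step 3, a minimal sketch of generating a compatible CSV is shown below. The file name and row contents are illustrative placeholders; the only requirement stated above is the two column headers, `prompt` and `expected completion`.

```python
import csv

# Illustrative rows; replace with your own prompts and reference answers.
rows = [
    {"prompt": "What is the capital of France?",
     "expected completion": "Paris"},
    {"prompt": "Translate 'hello' to Spanish.",
     "expected completion": "hola"},
]

# Write the CSV with exactly the two required column headers.
with open("evaluation_dataset.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["prompt", "expected completion"])
    writer.writeheader()
    writer.writerows(rows)
```

Using `csv.DictWriter` ensures the header row is written exactly once and every data row stays aligned with the two expected columns.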

4. Select the metric, the provider, and the evaluator model to use for the evaluation. [Learn more about the evaluators and metrics](https://docs.datasaur.ai/llm-projects/evaluation/automated-evaluation#evaluators).

   <figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-395aa9074c5a5e7da924c38475b7d72a6bb85117%2FAutomated%20evaluation%20-%20PCW%20-%20Step%202.png?alt=media" alt=""><figcaption></figcaption></figure>
5. Click **Create evaluation project** and wait for the evaluation process to finish.

### Analyze the evaluation results

Once the evaluation process completes, you can analyze the results.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-67fc3b78e27b6a086863b3154a299d9d5e037fe0%2FAutomated%20evaluation%20-%20Project%20-%20Multiple%20applications.png?alt=media" alt=""><figcaption></figcaption></figure>

#### **Summary of the evaluation**

Here you can view the total cost, the time taken to generate completions, and the overall performance score given by the evaluator.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-f0328ba8c8f24a297fe67259b86e5d9e9ad6444d%2FAutomated%20evaluation%20-%20Project%20-%20Multiple%20applications%20-%20Summary.png?alt=media" alt=""><figcaption></figcaption></figure>

#### **Result and score from each model**

Here you can view the generated completion from each model, along with its score and processing time.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-fb57e33301ce0a8bcd8fb5159d6220262948a283%2FAutomated%20evaluation%20-%20Project%20-%20Multiple%20applications%20-%20Result%20table.png?alt=media" alt=""><figcaption></figcaption></figure>

#### Evaluation details

To view the evaluation details of a completion, click the **More** icon (three dots) at the far right of the row, then select **View details**.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-4dc415a138a4748e244f29d78a977fdfc81a662d%2FAutomated%20evaluation%20-%20Project%20-%20Multiple%20applications%20-%20Evaluation%20details%20dialog%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>
