# Ranking (RLHF)

## Overview

The **Ranking** evaluation project helps you assess the quality of your LLM completions using human judgment by comparing multiple completions for the same prompt. You rank the completions from best to worst, providing insight into which outputs align most closely with your expectations.

## Prerequisites

In **Ranking** projects, you can evaluate two types of completions:

1. Pre-generated completions
2. Completions generated by models from Sandbox

### **Evaluate pre-generated completions**

1. Prepare a dataset as a CSV file with a `prompt` column and at least two completion columns: `completion_1`, `completion_2`, `completion_3`, and so on up to `completion_xx` (see the example below).

{% file src="/files/3ssA2gveoDaBX5E2xnLB" %}
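
For illustration, a minimal pre-generated completions dataset could look like this (the prompts and completions below are placeholder examples; you can include more completion columns and rows):

```
prompt,completion_1,completion_2,completion_3
"Summarize the benefits of unit testing in one sentence.","Unit tests catch regressions early and document expected behavior.","Testing is useful.","Unit tests verify small pieces of code, making bugs easier to find before release."
"Explain RLHF in one sentence.","RLHF fine-tunes a model using human preference rankings.","RLHF is a training method.","Reinforcement learning from human feedback aligns model outputs with human judgments."
```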

### **Evaluate models from Sandbox**

1. Ensure the LLM application is deployed.
2. Prepare a dataset in a CSV file with one column: `prompt` (see the example below).

{% file src="/files/kV4uhIM4zyuJmW3JWnav" %}
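
For illustration, a prompt-only dataset could look like this (the prompts below are placeholder examples):

```
prompt
"Summarize the benefits of unit testing in one sentence."
"Explain RLHF in one sentence."
"Write a short product description for a reusable water bottle."
```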

## Create project

To create a Ranking evaluation project:

1. Navigate to the **Evaluation** page under the LLM Labs menu.
2. Click **Create evaluation project**, select **Ranking**, then click **Continue**.

   <figure><img src="/files/L6IGw6IF6SDjHHK5edLd" alt=""><figcaption></figcaption></figure>
3. Set up your project. Choose what you want to evaluate with Ranking:
   1. **Evaluate pre-generated completions**

      1. Upload the dataset as a CSV file with a `prompt` column and at least two completion columns: `completion_1`, `completion_2`, `completion_3`, and so on up to `completion_xx`.

      <figure><img src="/files/C0GQZemnDeQyhDioBGJt" alt=""><figcaption><p>Ranking evaluation project with pre-generated completion creation</p></figcaption></figure>
   2. **Evaluate models from Sandbox**

      1. Upload the dataset in a CSV file with one column: `prompt`.
      2. Select the model that you want to use to generate completions. If you can’t find your model in the list, go to the [Sandbox](https://docs.datasaur.ai/llm-projects/sandbox) where your model was created, and [deploy](https://docs.datasaur.ai/llm-projects/sandbox#deploying-the-llm) it or save it to the library. You can only evaluate deployed or saved models.

      <figure><img src="/files/83eS6vDAz9KXsRcvaMZU" alt=""><figcaption><p>Ranking evaluation project with LLM application creation</p></figcaption></figure>
4. Click **Create evaluation project**.

## Evaluate completions

Open the project to start evaluating completions. Each prompt includes at least two completions. Rank them from best to worst by dragging to reorder, then submit your answer to move on to the next prompt.

{% embed url="https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjXHclk9fELydcAslG%2Fuploads%2F1AHWlbbSXRVmr1evfZql%2FDemo%20Ranking%20(compressed).mp4?alt=media&token=db15e434-712e-4946-94d0-b6210277cfa6" %}

## View evaluation results

After evaluating all completions, mark the evaluation as complete from the app bar. Click the current status **Evaluation in progress** and change it to **Evaluation completed**.

<figure><img src="/files/rdPRirehp8LuxPRQMc4l" alt=""><figcaption></figcaption></figure>

After the evaluation is marked as complete, you can view the summary of the evaluation. For evaluations of models from the Sandbox, you can see:

* Average cost and processing time for generating completions
* Evaluation results in a table view

<figure><img src="/files/0BMTBynjPPTkhg88BnA8" alt=""><figcaption></figcaption></figure>

For pre-generated completion evaluations, you can also see the evaluation results in a table view.

<figure><img src="/files/6CToWiuINQoYJ5oi4Cb9" alt=""><figcaption></figcaption></figure>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.datasaur.ai/llm-projects/evaluation/ranking-rlhf.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question, along with relevant excerpts and sources from the documentation.
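
For example, with `curl` you can send the question as a URL-encoded `ask` parameter (the question below is only an illustration):

```
curl --get "https://docs.datasaur.ai/llm-projects/evaluation/ranking-rlhf.md" \
  --data-urlencode "ask=What columns does a Ranking evaluation dataset need?"
```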

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
