Performance Monitoring
Performance monitoring extends Datasaur's Automated Evaluation for LLM Applications with scheduled, recurring evaluations of your LLM applications, so you can track their performance consistently and make timely improvements. Once the schedule is set, evaluations run automatically without the need to open LLM Labs, and you can check the evaluation results or receive notifications about them.
To begin using Performance monitoring:
Navigate to the Performance monitoring page under the Evaluation menu.
Click the Create performance monitoring button.
Configure your evaluation by selecting the applications that you want to evaluate and uploading the ground truth dataset in CSV format containing two columns: prompt and expected completion.
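As a rough illustration of the dataset shape described above, the following sketch checks a CSV locally before upload. It assumes the header row is spelled exactly `prompt,expected completion`; the exact spelling Datasaur accepts is not stated here, and `validate_ground_truth` is a hypothetical helper, not part of any Datasaur tooling.

```python
import csv
import io

# Assumed header spelling, taken from the column names mentioned in the text.
REQUIRED_COLUMNS = ["prompt", "expected completion"]


def validate_ground_truth(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text and check for the two expected columns and no empty cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = list(reader.fieldnames or [])
    if header != REQUIRED_COLUMNS:
        raise ValueError(f"expected columns {REQUIRED_COLUMNS}, got {header}")

    rows = []
    for row_number, row in enumerate(reader, start=1):
        # DictReader fills missing trailing cells with None, so normalize first.
        prompt = (row["prompt"] or "").strip()
        expected = (row["expected completion"] or "").strip()
        if not prompt or not expected:
            raise ValueError(f"data row {row_number} has an empty cell")
        rows.append({"prompt": prompt, "expected completion": expected})
    return rows


# Small in-memory example; a real file could be read with open(path, newline="").
sample = (
    "prompt,expected completion\n"
    '"What is 2 + 2?",4\n'
    '"What is the capital of France?",Paris\n'
)
rows = validate_ground_truth(sample)
```

Running a check like this before uploading catches a misspelled header or a blank cell early, rather than after a scheduled evaluation has already run.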
In the final step, set the schedule for the evaluation process. You will need to configure:
Recurrence: You can choose the frequency of your evaluation. The options include Daily, Weekly on Sunday, Monthly on day 1, or Custom.
Daily at 12:00 AM: Your evaluation will be performed on a daily basis at midnight.
Weekly on Sunday at 12:00 AM: Your evaluation will be performed weekly on Sunday at midnight.
Monthly on day 1 at 12:00 AM: Your evaluation will be performed monthly on the first day of the month at midnight.
Custom: You can set and configure your own evaluation frequency.
Monitor performance drift: You can get notified of LLM performance drift over time. Datasaur will notify you via email when any generated completion deviates beyond a specified threshold during scheduled evaluations, indicating potential performance deterioration.
Run immediately: Evaluate your LLM application once after creating the project, regardless of the recurrence settings.
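To make the preset recurrence options concrete, here is a minimal sketch that computes when the next run would fall for each preset. The `next_run` function and the recurrence keys are hypothetical names chosen for illustration, and the sketch uses naive datetimes because the time zone Datasaur applies to "12:00 AM" is not stated here.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, recurrence: str) -> datetime:
    """Return the first scheduled midnight strictly after `now` for a preset."""
    # The earliest possible candidate is the midnight that starts tomorrow.
    candidate = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    if recurrence == "daily":
        return candidate
    if recurrence == "weekly_sunday":
        # datetime.weekday(): Monday is 0 and Sunday is 6.
        days_until_sunday = (6 - candidate.weekday()) % 7
        return candidate + timedelta(days=days_until_sunday)
    if recurrence == "monthly_day_1":
        if candidate.day == 1:
            return candidate
        # Otherwise roll forward to the first day of the following month.
        if candidate.month == 12:
            return candidate.replace(year=candidate.year + 1, month=1, day=1)
        return candidate.replace(month=candidate.month + 1, day=1)
    raise ValueError(f"unknown recurrence preset: {recurrence!r}")


now = datetime(2024, 5, 15, 10, 30)  # a Wednesday
daily = next_run(now, "daily")            # 2024-05-16 00:00
weekly = next_run(now, "weekly_sunday")   # 2024-05-19 00:00
monthly = next_run(now, "monthly_day_1")  # 2024-06-01 00:00
```

The Custom option and the Run immediately setting are not modeled here; the sketch only covers the three fixed presets.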
Click the Create evaluation project button, and your scheduled automated evaluation will be created.
You can click the Run now button to manually start the evaluation process.
Once the evaluation process has started, you will need to wait until it is completed. You'll receive an email once it's finished, or you can refresh the page to see the latest update.
After the evaluation process is completed, you can view and analyze the results.
In the summary section, you can see the cost and the processing time of the evaluation process. You can also see the average evaluator score and the performance result.
In the results section, you can see the completions generated by your LLM, along with their scores for the selected metric, reasons behind the scores, and overall performance.
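The relationship between per-completion scores, the average evaluator score, and the drift threshold can be sketched as follows. This is an illustration only: it assumes scores are numbers where higher is better and that "deviating beyond the threshold" means a score falling below it. How Datasaur actually measures deviation is not specified here, and `summarize` and the row fields are hypothetical.

```python
from statistics import mean


def summarize(results: list[dict], threshold: float) -> dict:
    """Average the scores and list prompts whose score falls below the threshold."""
    if not results:
        raise ValueError("no evaluation results to summarize")
    scores = [row["score"] for row in results]
    # Assumption: a completion counts as drifting when its score is under the threshold.
    flagged = [row["prompt"] for row in results if row["score"] < threshold]
    return {
        "average_score": mean(scores),
        "flagged_prompts": flagged,
        "drift_detected": bool(flagged),
    }


# Hypothetical rows shaped like the results table: prompt, score, reason.
results = [
    {"prompt": "What is 2 + 2?", "score": 0.75, "reason": "matches expected"},
    {"prompt": "Capital of France?", "score": 0.5, "reason": "partially correct"},
    {"prompt": "Summarize the memo.", "score": 0.25, "reason": "misses key points"},
]
summary = summarize(results, threshold=0.4)
```

With these sample rows, only the third prompt is flagged, so a single low-scoring completion is enough to mark the run as drifting under this interpretation.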
You can view the evaluation details from your application by clicking the three-dots icon on the right side of the table.
Here you can view the detailed evaluation result.
You can also use your existing dataset that you have already uploaded to the page.
Select the Metric, Provider, and the Evaluator model you want to use for evaluation.