Managing benchmarks in Validate
A benchmark is a set of questions that can optionally include the expected answers. A Tonic Validate benchmark is one way to provide the questions for a Validate run.
A Validate run assesses how your RAG system answers the benchmark questions. If your benchmark includes answers, then Validate compares the answers from the benchmark with the answers from your RAG system.
To create and update benchmarks, you can use either the Validate application or the Validate SDK.
Managing benchmarks
Displaying the list of benchmarks
To display your list of benchmarks, in the Validate navigation menu, click Benchmarks.
For each benchmark, the Benchmarks page displays:
The name of the benchmark
The number of questions in the benchmark
Creating a benchmark
You create benchmarks from the Benchmarks page.
To create a benchmark:
Click Create A New Benchmark.
In the Name field, enter a name for the benchmark.
Click Save.
Updating a benchmark
You can update the name and questions for an existing benchmark.
To update a benchmark:
On the Benchmarks page, either:
Click the benchmark name.
Click the options menu for the benchmark, then click Edit.
On the Edit Benchmark panel, to change the benchmark name, in the Name field, enter the new name.
You can also add, update, and delete the benchmark questions, as described in Configuring benchmark questions.
To save the changes, click Save.
Deleting a benchmark
To delete a benchmark, on the Benchmarks page:
Click the options menu for the benchmark.
In the options menu, click Delete.
Configuring benchmark questions
Adding a question to a benchmark
A benchmark consists of a set of questions. For each question, you can optionally provide the expected response.
To add a question to a benchmark:
Click Add Q&A.
In the Question field, type the text of the question.
Optionally, in the Answer field, type the text of the expected answer. If you do not provide an answer, then Validate cannot calculate an answer similarity score for the question.
Click Finish Editing.
Updating a benchmark question
To update an existing question:
Click the edit icon for the question.
Update the Question and Answer fields.
Click Finish Editing.
Deleting questions from a benchmark
To delete a question from a benchmark, click the delete icon for the question.
To delete all of the questions, click Clear All.
Using benchmarks from the UI
To use a benchmark that you created in the Validate application from the Validate SDK, call get_benchmark.
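The following is a minimal sketch of that call, not a definitive reference. It assumes that the tonic_validate package is installed, that ValidateApi is constructed with an API key, and that get_benchmark takes the benchmark's ID. The helper name fetch_benchmark and its parameters are hypothetical. Check the exact signatures against the SDK reference.

```python
def fetch_benchmark(benchmark_id, api_key):
    """Retrieve a benchmark that was created in the Validate application.

    Hypothetical helper: benchmark_id and api_key are placeholders that
    you supply from your own Validate account.
    """
    # Imported inside the function so that this module loads even when
    # the SDK is not installed.
    from tonic_validate import ValidateApi

    # Assumption: ValidateApi accepts the API key as its first argument.
    validate_api = ValidateApi(api_key)

    # Assumption: get_benchmark looks the benchmark up by its ID.
    return validate_api.get_benchmark(benchmark_id)
```

The returned benchmark can then supply the questions for a Validate run.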
Using the Validate SDK to manage benchmarks
You can use the Validate SDK to create a benchmark from a list of questions and answers.
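As an illustration, the sketch below builds a benchmark from question and answer pairs. It assumes that the tonic_validate package exposes a Benchmark class that accepts questions and answers lists as keyword arguments. The helper name build_benchmark and the sample pair are hypothetical.

```python
def build_benchmark(qa_pairs):
    """Build a benchmark from (question, expected answer) pairs.

    Hypothetical helper: qa_pairs is a list of 2-tuples of strings.
    """
    # Imported inside the function so that this module loads even when
    # the SDK is not installed.
    from tonic_validate import Benchmark

    # Split the pairs into the two parallel lists.
    questions = [question for question, _ in qa_pairs]
    answers = [answer for _, answer in qa_pairs]

    # Assumption: Benchmark takes `questions` and `answers` keyword arguments.
    return Benchmark(questions=questions, answers=answers)


# Example usage (requires the SDK to be installed):
# benchmark = build_benchmark([("What is the capital of France?", "Paris")])
```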
To upload the benchmark to the UI, use the new_benchmark method of ValidateApi.
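A sketch of the upload step follows. It assumes that ValidateApi is constructed with an API key, and that new_benchmark takes the benchmark object followed by a display name and returns an identifier for the new benchmark. The helper name upload_benchmark and its parameters are hypothetical, so confirm the signature in the SDK reference before you rely on it.

```python
def upload_benchmark(benchmark, name, api_key):
    """Upload an SDK-created benchmark so that it appears in the Validate UI.

    Hypothetical helper: name is the benchmark name to display, and
    api_key is a placeholder for your own Validate API key.
    """
    # Imported inside the function so that this module loads even when
    # the SDK is not installed.
    from tonic_validate import ValidateApi

    # Assumption: ValidateApi accepts the API key as its first argument.
    validate_api = ValidateApi(api_key)

    # Assumption: new_benchmark(benchmark, name) returns the new benchmark's ID.
    return validate_api.new_benchmark(benchmark, name)
```

After the upload, the benchmark should be listed on the Benchmarks page under the name that you provided.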