
Managing benchmarks in Validate


A benchmark is a set of questions that can optionally include the expected answers. For a Tonic Validate development project, a benchmark is one way to provide the questions for a run.

A run assesses how your RAG system answers the benchmark questions. If your benchmark includes answers, then Validate compares the answers from the benchmark with the answers from your RAG system.
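
For example, here is a minimal sketch of how a benchmark drives a run with the Validate SDK. It assumes a ValidateScorer with its default metrics, and get_rag_response is a hypothetical stand-in for a call to your own RAG system:

from tonic_validate import Benchmark, ValidateScorer

# Hypothetical stand-in for your RAG system. The callback receives each
# benchmark question and returns the system's answer and retrieved context.
def get_rag_response(question):
    return {
        "llm_answer": "Paris",
        "llm_context_list": ["Paris is the capital of France."]
    }

benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"]
)

# Score a run. Validate asks your RAG system each benchmark question and,
# because this benchmark includes answers, compares them with the answers
# from your RAG system.
scorer = ValidateScorer()
run = scorer.score(benchmark, get_rag_response)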

To create and update benchmarks, you can use either the Validate application or the Validate SDK.

Managing benchmarks

Displaying the list of benchmarks

To display your list of benchmarks, in the Validate navigation menu, click Benchmarks.

For each benchmark, the Benchmarks page displays:

  • The name of the benchmark

  • The number of questions in the benchmark

Creating a benchmark

To create a benchmark from the Benchmarks page:

  1. Click Create A New Benchmark.

  2. In the Name field, enter a name for the benchmark.

  3. Click Save.

Updating a benchmark

You can update the name and questions for an existing benchmark.

To update a benchmark:

  1. On the Benchmarks page, either:

    • Click the benchmark name.

    • Click the options menu for the benchmark, then click Edit.

  2. On the Edit Benchmark panel, to change the benchmark name, in the Name field, enter the new name.

  3. You can also add, update, and delete benchmark questions. For details, see Configuring benchmark questions below.

  4. To save the changes, click Save.

Deleting a benchmark

To delete a benchmark, on the Benchmarks page:

  1. Click the options menu for the benchmark.

  2. In the options menu, click Delete.

Configuring benchmark questions

Adding a question to a benchmark

A benchmark consists of a set of questions. For each question, you can optionally provide the expected response.

To add a question to a benchmark:

  1. Click Add Q&A.

  2. In the Question field, type the text of the question.

  3. Optionally, in the Answer field, type the text of the expected answer. If you do not provide an answer, then Validate cannot calculate an answer similarity score for the question.

  4. Click Finish Editing.

Updating a benchmark question

To update an existing question:

  1. Click the edit icon for the question.

  2. Update the Question and Answer fields.

  3. Click Finish Editing.

Deleting questions from a benchmark

To delete a question from a benchmark, click the delete icon for the question.

To delete all of the questions, click Clear All.

Using benchmarks from the UI

To use a benchmark that you created in the Validate application from the SDK, call the get_benchmark method:

from tonic_validate import ValidateApi

# Connect with your Validate API key, then retrieve the benchmark by its ID.
validate_api = ValidateApi("your-api-key")
benchmark = validate_api.get_benchmark("benchmark_id")
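
The returned benchmark works the same way as one that you build in code. For example, here is a minimal sketch of scoring a run against it and sending the results back to the Validate application; it reuses the scorer and the hypothetical get_rag_response callback from the sketch above, and "your-project-id" is a placeholder for a real project identifier:

# Score the retrieved benchmark against your RAG system, then upload the run
# to a project so that it appears in the Validate application.
run = scorer.score(benchmark, get_rag_response)
validate_api.upload_run("your-project-id", run)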

Using the Validate SDK to manage benchmarks

You can use the Validate SDK to create a benchmark from a list of questions and answers.

from tonic_validate import Benchmark
benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"]
)

To upload this benchmark to the Validate application, use the new_benchmark method of ValidateApi:

from tonic_validate import Benchmark, ValidateApi

benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"]
)

# Connect with your Validate API key, then upload the benchmark under a name
# that identifies it in the Validate application.
validate_api = ValidateApi("your-api-key")
validate_api.new_benchmark(benchmark, "benchmark_name")
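
As in the UI, the expected answers appear to be optional when you build a benchmark in code. Here is a minimal sketch that assumes the Benchmark constructor accepts questions without answers; as with the Answer field in the UI, Validate then cannot calculate an answer similarity score for those questions:

from tonic_validate import Benchmark

# A benchmark with questions only. Without expected answers, Validate cannot
# calculate an answer similarity score for these questions.
benchmark = Benchmark(
    questions=[
        "What is the capital of France?",
        "What is the capital of Spain?"
    ]
)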

