Validate workflows

Development project workflow

To evaluate your RAG system with a Tonic Validate development project, you complete the following steps:

Overview diagram of a Validate development project workflow

Create your benchmark (optional)

A Validate run analyzes a RAG system's performance against a set of questions and, optionally, ideal answers.

One way to provide the questions and answers is to configure a benchmark in Validate.

You can use the Validate application or SDK to add the benchmark to Validate.
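As a rough illustration of what a benchmark holds, the following sketch pairs questions with optional ideal answers. It is plain Python and does not use the Validate SDK; `BenchmarkItem` and `build_benchmark` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BenchmarkItem:
    """One benchmark entry: a question and an optional ideal answer."""
    question: str
    ideal_answer: Optional[str] = None


def build_benchmark(
    questions: List[str], answers: Optional[List[str]] = None
) -> List[BenchmarkItem]:
    """Pair each question with its ideal answer, if answers are supplied."""
    if answers is None:
        # Ideal answers are optional, so fill in None for every question.
        answers = [None] * len(questions)
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    return [BenchmarkItem(q, a) for q, a in zip(questions, answers)]


benchmark = build_benchmark(
    questions=["What is the capital of France?", "Who wrote Hamlet?"],
    answers=["Paris", "William Shakespeare"],
)
```

Keeping each answer next to its question makes it harder for the two lists to drift out of step when you add or remove entries.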

Create your project

Next, use the Validate application to create a development project.

Create a run

Use the Validate SDK to create a run for the project.

The run configuration includes:

  • The project

  • The questions to use to analyze the RAG system's performance. A Validate benchmark is one way to provide the question data.

  • Any metadata about the RAG system, such as the type of LLM, the embedder, or the retrieval algorithm

  • The metrics to calculate
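The four parts of the run configuration listed above can be pictured as a single record. This is a minimal sketch in plain Python, not the Validate SDK's own interface; the `RunConfig` class, its field names, the metric name, and the metadata values are all placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RunConfig:
    """Hypothetical container for the four parts of a run configuration."""
    project_id: str
    questions: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)
    metrics: List[str] = field(default_factory=list)

    def check(self) -> None:
        """Raise ValueError if a required part is missing."""
        if not self.project_id:
            raise ValueError("a run needs a project")
        if not self.questions:
            raise ValueError("a run needs at least one question")
        if not self.metrics:
            raise ValueError("a run needs at least one metric to calculate")


config = RunConfig(
    project_id="my-project-id",  # placeholder, not a real project ID
    questions=["What is the capital of France?"],
    metadata={  # free-form description of the RAG system under test
        "llm": "example-llm",
        "embedder": "example-embedder",
        "retrieval": "top-k",
    },
    metrics=["answer_similarity"],  # placeholder metric name
)
config.check()
```

Recording the metadata with every run is what later lets you tell which LLM, embedder, or retrieval change produced a given set of scores.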

Review the run results

From the Validate application, review the scores and metrics from the run.

Update and iterate

Based on the run results, update the RAG system, then create another run.

Compare the results of the two runs to see whether your changes improved the quality of the answers.
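One way to think about the comparison is as a per-metric difference between the average scores of two runs. The sketch below assumes that each run is available as a list of per-question score dictionaries; that data shape, the function names, and the metric name are assumptions for illustration, not the format that Validate uses.

```python
from statistics import mean
from typing import Dict, List

Scores = List[Dict[str, float]]  # one {metric: score} dict per question


def average_scores(run: Scores) -> Dict[str, float]:
    """Average each metric over all questions in a run."""
    if not run:
        raise ValueError("a run must contain at least one scored question")
    return {metric: mean(row[metric] for row in run) for metric in run[0]}


def compare_runs(baseline: Scores, candidate: Scores) -> Dict[str, float]:
    """Return candidate average minus baseline average for each metric."""
    base = average_scores(baseline)
    cand = average_scores(candidate)
    return {metric: cand[metric] - base[metric] for metric in base}


before = [{"answer_similarity": 3.0}, {"answer_similarity": 4.0}]
after = [{"answer_similarity": 4.0}, {"answer_similarity": 5.0}]
delta = compare_runs(before, after)
```

A positive difference for a metric suggests that the change helped on that metric, provided that both runs used the same questions.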

Production monitoring project workflow

After you release your RAG system, you can use a Validate production monitoring project to track how well it answers user questions.

Overview diagram of a Validate production monitoring project workflow

Create your project

Use the Validate application to create a production monitoring project.

Configure your RAG system to send questions to the project

In your RAG system, add a call to the Validate SDK that sends the following to the production monitoring project:

  • Each question that a user asked

  • The answer that the RAG system provided

  • The context that the RAG system used
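The three items above can be sketched as one record that is sent each time the RAG system answers a question. In this example, `send` is a hypothetical stand-in for the Validate SDK call, and `rag` is a stand-in for your own RAG system; neither name comes from the SDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LoggedResponse:
    """The three items sent for every user question."""
    question: str
    answer: str
    context: List[str]


def answer_and_log(
    question: str,
    rag: Callable[[str], Tuple[str, List[str]]],
    send: Callable[[LoggedResponse], None],
) -> str:
    """Answer a question, then forward the question, answer, and context."""
    answer, context = rag(question)
    # `send` stands in for the SDK call that reaches the monitoring project.
    send(LoggedResponse(question=question, answer=answer, context=context))
    return answer


def fake_rag(question: str) -> Tuple[str, List[str]]:
    """Toy RAG system that returns a fixed answer and context."""
    return "Paris", ["Paris is the capital of France."]


sent: List[LoggedResponse] = []
reply = answer_and_log("What is the capital of France?", fake_rag, sent.append)
```

Passing the sender in as a parameter keeps the logging step easy to swap out or disable in tests.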

View the results

As it receives the questions, Validate generates metric scores.

In the Validate application, you can view a timeline of the average scores for the questions that Validate received from the RAG system.

You can also view and filter the list of questions.
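To make the idea of a score timeline concrete, the sketch below averages scores per calendar day and filters questions by score. The daily granularity, the record layout, and the threshold filter are assumptions made for this example; the Validate application may group and filter differently.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean
from typing import Dict, List


def daily_average(records: List[dict]) -> Dict[date, float]:
    """Group records by calendar day and average the scores in each day."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record["timestamp"].date()].append(record["score"])
    # Sort by day so that the result reads as a timeline.
    return {day: mean(scores) for day, scores in sorted(buckets.items())}


def low_scoring(records: List[dict], threshold: float) -> List[str]:
    """Return the questions whose score is below the threshold."""
    return [r["question"] for r in records if r["score"] < threshold]


records = [
    {"timestamp": datetime(2024, 1, 1, 9), "question": "Q1", "score": 4.0},
    {"timestamp": datetime(2024, 1, 1, 17), "question": "Q2", "score": 2.0},
    {"timestamp": datetime(2024, 1, 2, 8), "question": "Q3", "score": 5.0},
]
timeline = daily_average(records)
needs_review = low_scoring(records, threshold=3.0)
```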
