Validate workflows
The overall process for using a Tonic Validate development project to evaluate your RAG system consists of the following steps:
A Validate run analyzes a RAG system's performance against a set of questions and optional ideal answers.
One way to provide the questions and answers is to configure a benchmark in Validate.
You can use the Validate application or SDK to add the benchmark to Validate.
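A benchmark is a list of questions, each with an optional ideal answer. The following is a minimal sketch of creating a benchmark from the SDK. The Benchmark and ValidateApi names come from the tonic_validate Python package; the new_benchmark call and its arguments are an assumption about the SDK surface, so check the SDK reference for the exact method.

```python
# Minimal sketch: build a benchmark of questions and ideal answers,
# then add it to Validate. Requires a Validate API key.
from tonic_validate import Benchmark, ValidateApi

# Questions plus optional ideal answers
benchmark = Benchmark(
    questions=["What is the capital of France?", "Who wrote Hamlet?"],
    answers=["Paris", "William Shakespeare"],
)

validate_api = ValidateApi("your-validate-api-key")

# Upload the benchmark so that it can be selected when you configure a run.
# The method name and signature are assumptions; see the SDK reference.
validate_api.new_benchmark(benchmark, "example-benchmark")
```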
Next, use the Validate application to create a development project.
Use the Validate SDK to create a run for the project, as sketched in the example after the following list.
The run configuration includes:
The project
The questions to use to analyze the RAG performance. A Validate benchmark is one way to provide the question data.
Any metadata about the RAG system, such as the type of LLM, the embedder, or the retrieval algorithm
The metrics to calculate
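The following sketch pulls these pieces together, assuming the tonic_validate Python package. The callback signature (a dictionary with llm_answer and llm_context_list keys), the metric class names, and the run_metadata argument are assumptions based on the SDK and should be checked against the SDK reference.

```python
# Minimal sketch: score a benchmark against your RAG system and upload
# the run to a development project, with metadata about the RAG setup.
from tonic_validate import Benchmark, ValidateApi, ValidateScorer
from tonic_validate.metrics import AnswerSimilarityMetric, RetrievalPrecisionMetric

def get_rag_response(question: str) -> dict:
    # Replace this stub with a call to your RAG system. It should return
    # the generated answer and the retrieved context for the question.
    return {
        "llm_answer": "Paris",
        "llm_context_list": ["Paris is the capital of France."],
    }

benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"],
)

# The metrics to calculate for the run
scorer = ValidateScorer(metrics=[AnswerSimilarityMetric(), RetrievalPrecisionMetric()])
run = scorer.score(benchmark, get_rag_response)

# Upload the run to the development project, along with metadata about
# the RAG configuration (argument name is an assumption).
validate_api = ValidateApi("your-validate-api-key")
validate_api.upload_run(
    "your-project-id",
    run,
    run_metadata={"llm": "gpt-4o", "embedder": "text-embedding-3-small", "retriever": "bm25"},
)
```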
From the Validate application, review the scores and metrics from the run.
Based on the run results, you update the RAG system to improve its answers, then create another run.
You compare the run results to see if your changes improved the quality of the answers.
After you release your RAG system, you can use a Validate production monitoring project to track how well it answers user questions.
Use the Validate application to create a production monitoring project.
In your RAG system, you add a call to the Validate SDK to send the following to the production monitoring project (see the sketch after this list):
Each question that a user asked
The answer that the RAG system provided
The context that the RAG system used
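The following is a hedged sketch of how that logging call might sit inside a RAG system's answer path. The log method name and its arguments are hypothetical, not confirmed SDK calls; consult the SDK reference for the actual production monitoring API.

```python
# Hedged sketch: after the RAG system answers a user question, send the
# question, answer, and context to the production monitoring project.
from tonic_validate import ValidateApi

validate_api = ValidateApi("your-validate-api-key")
MONITORING_PROJECT_ID = "your-monitoring-project-id"

def answer_question(question: str) -> str:
    # Replace with your RAG system's real answer and retrieved context
    answer = "Paris"
    context_list = ["Paris is the capital of France."]

    # Hypothetical logging call; the method name and signature are
    # assumptions, so check the SDK reference for the actual API.
    validate_api.log(
        MONITORING_PROJECT_ID,
        question=question,
        answer=answer,
        context_list=context_list,
    )
    return answer
```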
As it receives the questions, Validate generates metric scores.
In the Validate application, you can view a timeline of the average scores for the questions that Validate received from the RAG system.
You can also view and filter the list of questions.