Validate components and tools

Validate components

Projects

A development project is designed for use while you develop a RAG system. It is a collection of runs that lets you see how performance on a given set of questions changes over time.

A production monitoring project allows you to monitor the performance of a production RAG system over time. You configure the RAG system to automatically send the questions that your users ask, the answers that the RAG system provides, and the associated context to the production monitoring project.
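As a rough illustration of the three pieces of information that a production RAG system sends, the sketch below models one logged interaction and serializes it to JSON. The class name `MonitoringRecord`, its field names, and the `to_payload` helper are hypothetical and are not part of the Validate SDK or its actual payload schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class MonitoringRecord:
    """One logged interaction (hypothetical shape, not the Validate schema)."""
    question: str       # the question that the user asked
    answer: str         # the answer that the RAG system provided
    context: List[str]  # the context that the RAG system retrieved


def to_payload(record: MonitoringRecord) -> str:
    """Serialize a record to a JSON string."""
    return json.dumps(asdict(record))


record = MonitoringRecord(
    question="What is the refund window?",
    answer="Refunds are accepted within 30 days.",
    context=["Our policy allows refunds within 30 days of purchase."],
)
payload = to_payload(record)
```

In a real deployment, the RAG system would send this information through the Validate SDK rather than a hand-built payload.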

For more information, go to Managing projects in Validate.

Metrics

Metrics are used to score the RAG system responses to questions.

For a development project, Validate calculates metric scores for the benchmark questions that are provided for the project.

For a production monitoring project, Validate calculates metric scores for the questions that users ask the RAG system. The RAG system sends the questions to Validate.

Validate calculates several metrics, each of which measures a different aspect of RAG system performance. For more information about metrics, go to the metrics section.

Runs

For a Validate development project, a run represents an assessment of the RAG responses to a set of questions based on the RAG system configuration at a given point in time.

For each response, the run includes:

The question and, optionally, the corresponding ideal answer. A benchmark is one option for providing the questions.
The LLM's response and the context that the RAG system retrieved.
Metadata in the form of key-value pairs that you specify. For example, "Model": "GPT-4".
Scores for the response, calculated using your chosen metrics.

The run also includes overall scores for the given metrics.
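The contents of a run described above can be pictured as a small data structure. The following is a minimal sketch, not the SDK's own classes: `RunItem`, `Run`, the field names, and the metric name `example_metric` are all hypothetical, and averaging the per-response scores to get the overall score is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class RunItem:
    """Data stored for one response (hypothetical field names)."""
    question: str
    llm_response: str
    retrieved_context: List[str]
    scores: Dict[str, float]            # metric name -> score for this response
    ideal_answer: Optional[str] = None  # optional reference answer


@dataclass
class Run:
    """A set of scored responses plus user-specified metadata."""
    items: List[RunItem]
    metadata: Dict[str, str] = field(default_factory=dict)

    def overall_scores(self) -> Dict[str, float]:
        """Average each metric across the items that have it (an assumption)."""
        names = sorted({name for item in self.items for name in item.scores})
        return {
            name: mean(i.scores[name] for i in self.items if name in i.scores)
            for name in names
        }


run = Run(
    items=[
        RunItem("Q1", "A1", ["ctx1"], {"example_metric": 1.0}, ideal_answer="A1"),
        RunItem("Q2", "A2", ["ctx2"], {"example_metric": 0.5}),
    ],
    metadata={"Model": "GPT-4"},
)
overall = run.overall_scores()
```

Keeping the metadata on the run rather than on each response reflects the idea that a run captures one RAG system configuration at a point in time.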

For more information, go to Viewing and managing runs.

Benchmarks

For a Validate development project, a benchmark is a collection of questions with or without responses. The responses represent the ideal answers to the given questions.

A benchmark is one way to provide the questions for Validate to use to evaluate your RAG system.
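To make the "with or without responses" distinction concrete, here is a small sketch of a benchmark as a list of questions with an optional, parallel list of ideal answers. The class `BenchmarkSketch` and its methods are hypothetical stand-ins for illustration and do not reproduce the SDK's benchmark class.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BenchmarkSketch:
    """Questions with optional ideal answers (hypothetical, not the SDK class)."""
    questions: List[str]
    ideal_answers: Optional[List[str]] = None

    def __post_init__(self) -> None:
        # When answers are given, there must be exactly one per question.
        if self.ideal_answers is not None and len(self.ideal_answers) != len(self.questions):
            raise ValueError("ideal_answers must match questions in length")

    def pairs(self) -> List[Tuple[str, Optional[str]]]:
        """Return (question, ideal answer or None) pairs."""
        if self.ideal_answers is None:
            return [(q, None) for q in self.questions]
        return list(zip(self.questions, self.ideal_answers))


with_answers = BenchmarkSketch(["What is RAG?"], ["Retrieval-augmented generation."])
questions_only = BenchmarkSketch(["What is RAG?"])
```

A benchmark without ideal answers can only support scoring that does not need a reference answer, which is why the answers are modeled as optional here.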

For more information, go to Managing benchmarks in Validate.

Validate tools

Validate SDK (tonic-validate)

You must use the Validate SDK to:

You can also use the SDK to:

Manage projects
Manage benchmarks for a development project
Calculate RAG metrics outside the context of a Validate project

Validate application

You can use the Validate application to manage benchmarks and projects.

You must use the Validate application to view:

Last updated 1 year ago
