# RAG metrics summary


RAG metrics include the following types of scores:

| Score name | Input | Formula | What does it measure? | Evaluated components |
| --- | --- | --- | --- | --- |
| Answer consistency or Answer consistency binary | Retrieved context; LLM answer | (Count of main points in the answer that can be attributed to the context) / (Count of main points in the answer) | Whether the LLM answer contains information that does not come from the context. | Prompt builder; LLM |
| Answer contains PII | LLM answer; List of PII types | Calculated by Tonic Textual | Whether the LLM answer contains personally identifiable information (PII) of the specified types. Requires a Tonic Textual API key. | Prompt builder; LLM |
| Answer match | LLM answer; Text string; Case-sensitivity flag | Compares the LLM answer to the text string | Whether the answer matches the provided text string. | LLM |
| Answer similarity score | Question; Reference answer; LLM answer | Score between 0 and 5 | How well the reference answer matches the LLM answer. Cannot be used for production monitoring projects. | All components |
| Augmentation accuracy | Retrieved context; LLM answer | (Count of retrieved context items in the LLM answer) / (Count of retrieved context items) | Whether all of the context is in the LLM answer. | Prompt builder; LLM |
| Augmentation precision | Question; Retrieved context; LLM answer | (Count of relevant retrieved context items in the LLM answer) / (Count of relevant retrieved context items) | Whether the relevant context is in the LLM answer. | Prompt builder; LLM |
| Binary | Callback | User-defined | Returns a true or false value based on a callback function that you provide. Cannot be used for production monitoring projects. | User-defined |
| Contains text | LLM answer; Text string | (Text string) in (LLM answer) | Whether the response contains the provided text string. | LLM |
| Context contains PII | Retrieved context; List of PII types | Calculated by Tonic Textual | Whether the context used for the response contains PII of the specified types. Requires a Tonic Textual API key. | Prompt builder |
| Context length | Retrieved context; Minimum length; Maximum length | (Minimum length) <= len(Context) <= (Maximum length) | Whether the length of a context item falls within the specified range. | Prompt builder |
| Duplication | LLM answer | Returns 1 or 0 based on whether there is duplicate information | Whether the response contains duplicate information. | LLM |
| Hate speech content | LLM answer | Returns 1 or 0 based on whether there is hate speech | Whether the response contains hate speech. | LLM |
| Latency | Target length of time | (Run time) <= (Target time) | Whether the response takes longer than the provided target time. | Entire system |
| Regex | LLM answer; Regular expression; Expected number of matches | Runs a regex search, then counts the matches. Returns true if the number of matches equals the expected match count. | Whether the response contains the expected number of matches for the provided regular expression. | LLM |
| Response length | LLM answer; Minimum length; Maximum length | (Minimum length) <= len(LLM answer) <= (Maximum length) | Whether the response length falls within the specified range. | LLM |
| Retrieval precision | Question; Retrieved context | (Count of relevant retrieved context items) / (Count of retrieved context items) | Whether the retrieved context is relevant to answer the given question. | Chunker; Embedder; Retriever |
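To compute a subset of these scores with the Validate SDK, pass the corresponding metric classes to `ValidateScorer`. The following is a minimal sketch based on the quickstart pattern: `get_rag_response` is a placeholder for your own RAG system, and the exact metric class names and the `overall_scores` attribute are assumptions based on the SDK's naming conventions, so check the metrics reference for your installed version.

```python
from tonic_validate import Benchmark, ValidateScorer
from tonic_validate.metrics import (
    AnswerConsistencyMetric,
    AugmentationPrecisionMetric,
    RetrievalPrecisionMetric,
)

# Placeholder callback: replace with a call into your own RAG system.
# The scorer expects the LLM answer plus the list of retrieved context chunks.
def get_rag_response(question: str) -> dict:
    return {
        "llm_answer": "Paris is the capital of France.",
        "llm_context_list": ["Paris is the capital of France."],
    }

# A benchmark pairs questions with reference answers.
benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"],
)

# Score only the metrics you need; each class corresponds to a row in the
# table above. LLM-assisted metrics call out to an LLM judge, so the usual
# LLM credentials (for example, an OpenAI API key) must be configured.
scorer = ValidateScorer([
    AnswerConsistencyMetric(),
    AugmentationPrecisionMetric(),
    RetrievalPrecisionMetric(),
])
run = scorer.score(benchmark, get_rag_response)

print(run.overall_scores)
```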

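The Binary score row describes a fully user-defined check. A sketch of how that might look is below; the `BinaryMetric(name, callback)` signature and the `llm_answer`/`llm_context_list` attributes on the logged response are assumptions based on the table's description, not a confirmed API.

```python
from tonic_validate import ValidateScorer
from tonic_validate.metrics import BinaryMetric

# Assumed callback shape: receives the logged response object and returns
# True or False. Here it checks that the answer quotes at least one
# retrieved context chunk verbatim.
def cites_context(llm_response) -> bool:
    return any(
        chunk in llm_response.llm_answer
        for chunk in llm_response.llm_context_list
    )

# The score appears in run results under the name given here.
scorer = ValidateScorer([BinaryMetric("cites_context", cites_context)])
```

As the table notes, Binary scores cannot be used for production monitoring projects.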