RAG metrics summary

RAG metrics include the following types of scores:

| Score name | Input | Formula | What does it measure? | Evaluated components |
|---|---|---|---|---|
| | Question, reference answer, LLM answer | Score between 0 and 5 | How well the reference answer matches the LLM answer. | All components |
| | Question, retrieved context | (Count of relevant retrieved context) / (Count of retrieved context) | Whether the context retrieved is relevant to answer the given question. | Chunker, embedder, retriever |
| | Question, retrieved context, LLM answer | (Count of relevant retrieved context in LLM answer) / (Count of relevant retrieved context) | Whether the relevant context is in the LLM answer. | Prompt builder, LLM |
| | Retrieved context, LLM answer | (Count of retrieved context in LLM answer) / (Count of retrieved context) | Whether all of the context is in the LLM answer. | Prompt builder, LLM |
| | Retrieved context, LLM answer | (Count of main points in answer that can be attributed to context) / (Count of main points in the answer) | Whether the LLM answer contains information that does not come from the context. | Prompt builder, LLM |
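
The four ratio formulas above can be illustrated with a minimal Python sketch. It assumes the per-context judgments (is a context relevant, does it appear in the LLM answer, can a main point be attributed to the context) have already been produced elsewhere, for example by an LLM judge or a human reviewer. The class, field, and function names are hypothetical and chosen only for illustration; they are not the names of the scores or of any library API. Returning 0.0 for an empty denominator is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextChunk:
    """One retrieved context with hypothetical, pre-computed judgments."""
    text: str
    relevant: bool        # judged relevant to the question
    used_in_answer: bool  # judged to appear in the LLM answer


def ratio(numerator: int, denominator: int) -> float:
    # Assumption: an empty denominator yields 0.0 instead of raising.
    return numerator / denominator if denominator else 0.0


def relevant_context_ratio(chunks: List[ContextChunk]) -> float:
    # (Count of relevant retrieved context) / (Count of retrieved context)
    return ratio(sum(c.relevant for c in chunks), len(chunks))


def relevant_context_in_answer_ratio(chunks: List[ContextChunk]) -> float:
    # (Count of relevant retrieved context in LLM answer)
    #   / (Count of relevant retrieved context)
    relevant = [c for c in chunks if c.relevant]
    return ratio(sum(c.used_in_answer for c in relevant), len(relevant))


def context_in_answer_ratio(chunks: List[ContextChunk]) -> float:
    # (Count of retrieved context in LLM answer) / (Count of retrieved context)
    return ratio(sum(c.used_in_answer for c in chunks), len(chunks))


def attributed_points_ratio(point_is_attributed: List[bool]) -> float:
    # (Count of main points in answer that can be attributed to context)
    #   / (Count of main points in the answer)
    return ratio(sum(point_is_attributed), len(point_is_attributed))


if __name__ == "__main__":
    chunks = [
        ContextChunk("A", relevant=True, used_in_answer=True),
        ContextChunk("B", relevant=True, used_in_answer=True),
        ContextChunk("C", relevant=True, used_in_answer=False),
        ContextChunk("D", relevant=False, used_in_answer=False),
    ]
    print(relevant_context_ratio(chunks))                      # 0.75
    print(round(relevant_context_in_answer_ratio(chunks), 2))  # 0.67
    print(context_in_answer_ratio(chunks))                     # 0.5
    print(attributed_points_ratio([True, True, True, False, True]))  # 0.8
```

In this example, three of the four retrieved contexts are relevant (0.75), two of those three appear in the answer (about 0.67), two of all four contexts appear in the answer (0.5), and four of the five main points in the answer can be attributed to the context (0.8). The 0 to 5 score in the first row is not a ratio and is not covered by this sketch.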
