End-to-end example using LlamaIndex

This development project example uses:

  • The data in the examples/paul_graham_essays folder of the tonic_validate SDK repository

  • The list of questions and reference answers in examples/question_and_answer_list.json

We use six Paul Graham essays about startup founders, taken from his blog. With these essays, we build a RAG system that uses the simplest default LlamaIndex model.
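
Both the default LlamaIndex model and the Validate scorer call the OpenAI API, so make sure that your OpenAI API key is available before you run the example. A minimal sketch, assuming that the key is provided through the standard OPENAI_API_KEY environment variable (the value shown is a placeholder):

import os

# Both LlamaIndex's default OpenAI models and Tonic Validate's scorer
# read the OpenAI API key from this environment variable
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

With the key in place, we build the index and query engine: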

from llama_index import VectorStoreIndex, SimpleDirectoryReader

# Load the essays and build a vector index with the default settings
documents = SimpleDirectoryReader("./paul_graham_essays").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Get the answer and the retrieved context from LlamaIndex
def get_llama_response(prompt):
    response = query_engine.query(prompt)
    context = [x.text for x in response.source_nodes]
    return {
        "llm_answer": response.response,
        "llm_context_list": context
    }
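
To spot-check the pipeline before scoring it, you can call get_llama_response directly. The question below is an arbitrary illustration, not part of the benchmark:

# Hypothetical sanity check with an arbitrary question about the essays
sample = get_llama_response("What does Paul Graham say about determination?")
print(sample["llm_answer"])
print(len(sample["llm_context_list"]), "context passages retrieved")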

We load the first ten pairs from the question and answer list and use them to create a Tonic Validate benchmark.

import json
from tonic_validate import Benchmark

# Keep the first ten question-answer pairs for the benchmark
qa_pairs = []
with open("question_and_answer_list.json", "r") as qa_file:
    qa_pairs = json.load(qa_file)[:10]

question_list = [qa_pair['question'] for qa_pair in qa_pairs]
answer_list = [qa_pair['answer'] for qa_pair in qa_pairs]

benchmark = Benchmark(questions=question_list, answers=answer_list)
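
For reference, the code above expects question_and_answer_list.json to contain a JSON array of objects with question and answer keys. The entry below is only an illustration of that shape, not the actual file contents:

# Illustrative shape of question_and_answer_list.json after json.load
qa_pairs_example = [
    {
        "question": "What does Paul Graham look for in startup founders?",
        "answer": "Determination."
    },
    # ...one object per benchmark question
]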

Next, we connect to Validate using an API key that we generated from the Validate application. This connection is used to upload the benchmark run to a development project.

from tonic_validate import ValidateApi

# Connect to Validate with your API key
validate_api = ValidateApi("api-key-here")

Finally, we can create a run and score it.

from tonic_validate import ValidateScorer

# Score the responses from the RAG system against the benchmark
scorer = ValidateScorer()
run = scorer.score(benchmark, get_llama_response)
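
If you want to inspect the aggregate results locally before uploading, recent versions of the SDK expose them on the run object; a minimal sketch:

# Average score for each metric across the whole benchmark
print(run.overall_scores)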

After you execute this code, you can upload your results to the Validate application and view them there.

from tonic_validate import ValidateApi
# Upload the run
validate_api = ValidateApi("your-api-key")
validate_api.upload_run("your-project-id", run)

The metrics are automatically calculated and logged to Validate. The distribution of the scores across the benchmark is also graphed.
