A RAG system includes the following components:
| Component | Definition | Examples |
|---|---|---|
| Document store | Where the textual data is stored. | Google Docs, Notion, Word documents |
| Chunker | How each document is broken into pieces (or chunks) that are then embedded. | LlamaHub |
| Embedder | How each document chunk is transformed into a vector that stores its semantic meaning. | ada-002, sentence-transformers |
| Retriever | The algorithm that retrieves the chunks of text relevant to the user query; those chunks are used as context to answer it. | Take the top cosine similarity scores between the embedding of the user query and the embedded document chunks |
| Prompt builder | How the user query, along with conversation history and retrieved document chunks, is put into the context window to prompt the LLM for an answer to the user query. | "Here's a user query {user_query} and here's a list of context that may be helpful to answer the user's query: {context_1}, {context_2}. Answer the user's query using the given context." |
| LLM | The large language model that receives the prompt from the prompt builder and returns an answer to the user's query. | gpt-3.5-turbo, gpt-4, llama 2, claude |
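To make the flow between these components concrete, here is a minimal sketch of the chunker, embedder, retriever, and prompt builder working together. It assumes the sentence-transformers library for the embedder; the model name, chunk size, top-k value, and document contents are illustrative choices, not prescribed by the table above.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Chunker: naive fixed-size splitter (real chunkers usually respect sentence or section boundaries)
def chunk(text: str, size: int = 200) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

# Embedder: a sentence-transformers model (ada-002 would work the same way via an API call)
model = SentenceTransformer("all-MiniLM-L6-v2")

# Document store stand-in: plain strings pulled from wherever the documents live
documents = ["...contents of a Google Doc...", "...contents of a Notion page..."]
chunks = [c for doc in documents for c in chunk(doc)]
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

# Retriever: take the top cosine similarities between the query embedding and the chunk embeddings
def retrieve(query: str, k: int = 2) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ q  # dot product equals cosine similarity for normalized vectors
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

# Prompt builder: place the query and the retrieved chunks into the context window
def build_prompt(user_query: str) -> str:
    context = retrieve(user_query)
    return (
        f"Here's a user query: {user_query}\n"
        "Here's a list of context that may be helpful to answer the user's query: "
        f"{context[0]}, {context[1]}\n"
        "Answer the user's query using the given context."
    )

# The resulting string is what gets sent to the LLM (e.g. gpt-3.5-turbo or claude).
print(build_prompt("How do I reset my password?"))
```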