A RAG system includes the following components:
| Component | Definition | Examples |
|---|---|---|
| Document store | Where the textual data is stored. | Google Docs, Notion, Word documents |
| Chunker | How each document is broken into pieces (or chunks) that are then embedded. | LlamaHub |
| Embedder | How each document chunk is transformed into a vector that stores its semantic meaning. | ada-002, sentence-transformers |
| Retriever | The algorithm that retrieves the chunks of text relevant to the user query; those chunks are used as context to answer it. | Take the top cosine similarity scores between the embedding of the user query and the embedded document chunks |
| Prompt builder | How the user query, along with conversation history and retrieved document chunks, is put into the context window to prompt the LLM for an answer to the user query. | "Here's a user query {user_query} and here's a list of context that may be helpful to answer the user's query: {context_1}, {context_2}. Answer the user's query using the given context." |
| LLM | The large language model that receives the prompt from the prompt builder and returns an answer to the user's query. | gpt-3.5-turbo, gpt-4, llama 2, claude |
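To make the flow between these components concrete, here is a minimal sketch of the chunker, embedder, retriever, and prompt builder working together. It assumes the sentence-transformers library for the embedder; the model name, chunk size, top-k value, and document contents are illustrative choices, not prescribed by the table above.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Chunker: naive fixed-size splitter (real chunkers usually respect sentence or section boundaries)
def chunk(text: str, size: int = 200) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

# Embedder: a sentence-transformers model (ada-002 would work the same way via an API call)
model = SentenceTransformer("all-MiniLM-L6-v2")

# Document store stand-in: plain strings pulled from wherever the documents live
documents = ["...contents of a Google Doc...", "...contents of a Notion page..."]
chunks = [c for doc in documents for c in chunk(doc)]
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

# Retriever: take the top cosine similarities between the query embedding and the chunk embeddings
def retrieve(query: str, k: int = 2) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ q  # dot product equals cosine similarity for normalized vectors
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

# Prompt builder: place the query and the retrieved chunks into the context window
def build_prompt(user_query: str) -> str:
    context = retrieve(user_query)
    return (
        f"Here's a user query: {user_query}\n"
        "Here's a list of context that may be helpful to answer the user's query: "
        f"{context[0]}, {context[1]}\n"
        "Answer the user's query using the given context."
    )

# The resulting string is what gets sent to the LLM (e.g. gpt-3.5-turbo or claude).
print(build_prompt("How do I reset my password?"))
```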