RAGv2: Next-Gen Retrieval System
🌟 Welcome to RAGv2!
If you’ve ever used ChatGPT and wished it knew about your private documents or today’s news without making things up, you’re looking for RAG (Retrieval-Augmented Generation).
🧐 What is RAG? (The “Open-Book Exam” Analogy)
Imagine a brilliant student taking an exam.
- Standard AI: The student answers from memory. If they forgot a detail, they might guess (this is called a “hallucination”).
- RAG System: The student is allowed to take the exam with their textbooks open:
  - When asked a question, they first search the textbook for the right page.
  - They read the information on that page.
  - They write an answer based only on what they just read.
RAGv2 is that student, and your PDFs/Text files are the textbooks.
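The open-book steps above can be sketched in a few lines of plain Python. This is only a toy illustration, not RAGv2's actual code: the `TEXTBOOK` dictionary, the helper names, and the word-overlap scoring are all assumptions made for the example. A real system would search with embeddings and send the prompt to a language model.

```python
import re

# Hypothetical "textbook": page title -> page text (made up for this sketch).
TEXTBOOK = {
    "Photosynthesis": "Plants turn sunlight, water, and carbon dioxide into sugar and oxygen.",
    "Gravity": "Gravity pulls objects toward each other; on Earth it makes things fall.",
    "Volcanoes": "A volcano erupts when magma rises through the crust and escapes as lava.",
}

def words(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search(question, pages):
    """Step 1: pick the page that shares the most words with the question."""
    q = words(question)
    return max(pages, key=lambda title: len(q & words(title + " " + pages[title])))

def build_prompt(question, page_text):
    """Steps 2 and 3: give the model the page and ask it to answer only from it."""
    return (
        "Answer using only the context below.\n"
        f"Context: {page_text}\n"
        f"Question: {question}"
    )

question = "Why does a volcano erupt?"
best_page = search(question, TEXTBOOK)
prompt = build_prompt(question, TEXTBOOK[best_page])
print(best_page)
print(prompt)
```

The key idea is the last step: the model is told to answer from the retrieved page rather than from memory, which is what reduces guessing.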
🏗️ Core Architecture
RAGv2 uses a Parent-Child Chunking strategy. Each document is split into large sections (Parents), and each Parent is split into small snippets (Children). Instead of feeding the AI isolated snippets of text, we search the Children to find the specific needle in the haystack, then give the AI that Child’s whole Parent so it understands the “big picture.”
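Here is a minimal sketch of the Parent-Child idea, assuming chunks are split by word count and matched by simple word overlap. Both choices, along with the function names and chunk sizes, are assumptions for illustration; the real pipeline may split and search differently (for example, with embeddings in a vector index).

```python
import re

def split_words(text, size):
    """Split text into consecutive chunks of at most `size` words."""
    tokens = text.split()
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens), size)]

def build_index(document, parent_size, child_size):
    """Return (parents, children); each child remembers its parent's index."""
    parents = split_words(document, parent_size)
    children = []
    for parent_id, parent in enumerate(parents):
        for child in split_words(parent, child_size):
            children.append((child, parent_id))
    return parents, children

def words(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_parent(question, parents, children):
    """Match against the small children, but return the larger parent."""
    q = words(question)
    best_child, parent_id = max(children, key=lambda c: len(q & words(c[0])))
    return best_child, parents[parent_id]

# Hypothetical document, invented for this example.
document = (
    "The warranty lasts two years from purchase. "
    "It covers motor failures and battery defects. "
    "Returns are always accepted within thirty days. "
    "Refunds go to the original payment method."
)

parents, children = build_index(document, parent_size=14, child_size=7)
best_child, parent = retrieve_parent("Does it cover battery defects?", parents, children)
print("Matched child:", best_child)
print("Parent sent to the AI:", parent)
```

The matched Child only says what is covered. The Parent adds the sentence before it, so the AI also learns that “it” refers to the warranty.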
📚 Quick Start Navigation
- The Concept: system-architecture - Why we use Parent-Child chunking.
- The Engine: all-in-one-wrapper - Run the project with a single command.
- The Data: data-pipeline - How to feed your documents to the system.
- The Brain: chat-engine - How the AI “thinks” and answers.
- The Settings: hyperparameters - Tuning for speed and accuracy.
🛠️ The Beginner’s Tech Stack
We use specific tools to make this run on your local computer (no expensive cloud required!):
| Tool | Analogy | Purpose |
|---|---|---|
| llama_cpp | The Brain | Runs the actual AI model that speaks and thinks. |
| faiss | The Memory | A specialized index that finds the right page in your “textbook” instantly. |
| pypdf | The Eye | Reads your PDF files and turns them into text. |
| GGUF | The Format | The model file format used by llama_cpp. It stores quantized (compressed) models so they fit in the memory of consumer hardware (CPU or GPU). |
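To see why quantization matters for running locally, here is a rough back-of-the-envelope estimate. It assumes a hypothetical 7-billion-parameter model and counts only the raw weights. Real GGUF files come out somewhat larger because they also store scaling factors and metadata, so treat the numbers as approximations.

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical 7B-parameter model; the bit widths are illustrative.
full = model_size_gb(7e9, 16)      # 16-bit weights
quantized = model_size_gb(7e9, 4)  # 4-bit quantized weights
print(f"16-bit: {full:.1f} GB, 4-bit: {quantized:.1f} GB")
```

Under these assumptions, the weights shrink from about 14 GB to about 3.5 GB, which is the difference between needing a high-end GPU and fitting on an ordinary laptop.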
Last Updated: 2026-05-01