Vectorless RAG
What Is Vectorless RAG?
Traditional RAG converts documents into vector embeddings and retrieves by similarity search. Vectorless RAG eliminates embeddings entirely — it navigates document structure using LLM reasoning, similar to how a human reads a table of contents to find the right section.
Traditional RAG vs Vectorless RAG
```
Traditional RAG          Vectorless RAG
───────────────          ──────────────
Document                 Document
    ↓                        ↓
Chunking                 Structured Index (tree)
    ↓                        ↓
Embeddings               Query Routing (LLM reasoning)
    ↓                        ↓
Vector Database          Hierarchical Navigation
    ↓                        ↓
Similarity Search        Precise Section Retrieval
    ↓                        ↓
Top-K Chunks             Relevant Pages/Sections
    ↓                        ↓
LLM Output               LLM Output
```
How It Works
Phase 1 — Indexing
The system builds a hierarchical tree index from the document's natural structure: chapters → sections → subsections. Each node stores a title, summary, and page range. No chunking or embedding is needed.
```
Document
├── Chapter 1: Introduction
│   ├── 1.1 Background (pp. 1–3, summary: "...")
│   └── 1.2 Problem Statement (pp. 4–5, summary: "...")
├── Chapter 2: Methodology
│   ├── 2.1 Data Collection (pp. 6–8, summary: "...")
│   └── 2.2 Analysis (pp. 9–12, summary: "...")
└── Chapter 3: Results (pp. 13–18, summary: "...")
```
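A tree like this can be represented with plain nested dicts. The sketch below is illustrative only — the field names (`node_id`, `title`, `pages`, `summary`) are assumptions for this example, not PageIndex's exact schema:

```python
# Hypothetical hierarchical index, mirroring the example tree above.
# Leaf nodes carry a page range and a summary; no chunks, no embeddings.
doc_tree = {
    "node_id": "root",
    "title": "Document",
    "children": [
        {
            "node_id": "ch1",
            "title": "Chapter 1: Introduction",
            "children": [
                {"node_id": "1.1", "title": "1.1 Background",
                 "pages": (1, 3), "summary": "...", "children": []},
                {"node_id": "1.2", "title": "1.2 Problem Statement",
                 "pages": (4, 5), "summary": "...", "children": []},
            ],
        },
        {
            "node_id": "ch3",
            "title": "Chapter 3: Results",
            "pages": (13, 18), "summary": "...", "children": [],
        },
    ],
}

def find_node(tree, node_id):
    """Depth-first lookup of a node by its id."""
    if tree.get("node_id") == node_id:
        return tree
    for child in tree.get("children", []):
        found = find_node(child, node_id)
        if found is not None:
            return found
    return None
```

Once the index exists, retrieval reduces to deciding *which* node ids to read, which is exactly what the query phase delegates to the LLM.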
Phase 2 — Query (Reasoning-Based Retrieval)
```
User Query
    ↓
LLM reads tree structure (titles + summaries)
    ↓
LLM reasons: "Which branches answer this question?"
    ↓
Navigate to relevant nodes
    ↓
Extract full text from those sections
    ↓
LLM generates answer with page references
```
The LLM mimics how a human expert reads a document:
- Look at the table of contents
- Identify relevant sections by reasoning over summaries
- Read the relevant content
- Answer the question with traceable references
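The "read the table of contents" step can be sketched as a flattening pass: render the tree as indented id/title/summary lines that fit into an LLM prompt. Field names here are illustrative, not a fixed schema:

```python
def toc_lines(node, depth=0):
    """Render a node and its descendants as indented table-of-contents
    lines that an LLM can scan and reason over."""
    line = "  " * depth + f'[{node["node_id"]}] {node["title"]}: {node["summary"]}'
    lines = [line]
    for child in node.get("children", []):
        lines.extend(toc_lines(child, depth + 1))
    return lines

# Hypothetical subtree, mirroring the Phase 1 example.
tree = {
    "node_id": "ch2", "title": "Chapter 2: Methodology",
    "summary": "how data was gathered and analysed",
    "children": [
        {"node_id": "2.1", "title": "2.1 Data Collection",
         "summary": "survey design and sampling", "children": []},
        {"node_id": "2.2", "title": "2.2 Analysis",
         "summary": "statistical methods used", "children": []},
    ],
}

print("\n".join(toc_lines(tree)))
```

The resulting listing is what the retrieval prompt in the PageIndex example below operates on: the LLM picks node ids from it rather than ranking vectors.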
Key Differences
| Aspect | Traditional RAG | Vectorless RAG |
|---|---|---|
| Retrieval method | Vector similarity (ANN) | LLM reasoning over tree structure |
| Document processing | Chunking + embedding | Hierarchical index (no chunking) |
| Infrastructure | Vector DB + embedding model | Document tree + LLM only |
| Context preservation | Chunks lose surrounding context | Full section context preserved |
| Explainability | Opaque similarity scores | Traceable reasoning path + page refs |
| Latency | Fast (vector lookup) | Slower (LLM reasoning per query) |
| Scale | Billions of vectors | Best for focused document sets |
When to Use Each
| Use Case | Best Approach |
|---|---|
| Large unstructured corpora (research papers, web content) | Traditional RAG |
| Semantic search across many independent documents | Traditional RAG |
| Real-time retrieval over very large datasets | Traditional RAG |
| Long structured documents (legal, financial, technical) | Vectorless RAG |
| Compliance/audit (explainability required) | Vectorless RAG |
| Documents with clear hierarchy (manuals, reports) | Vectorless RAG |
| Mixed document types and query patterns | Hybrid (vector for broad, reasoning for precision) |
PageIndex — Reference Implementation
PageIndex (MIT licensed) is a prominent open-source framework for vectorless RAG:
```python
from pageindex import PageIndexClient

# Submit a PDF and retrieve its hierarchical tree index with per-node
# summaries (exact client API may differ between PageIndex versions).
client = PageIndexClient(api_key="YOUR_KEY")
doc_id = client.submit_document("report.pdf")["doc_id"]
tree = client.get_tree(doc_id, node_summary=True)["result"]
```
Tree search via LLM reasoning:
```python
import json

# Placeholders: `query` is the user's question, `call_llm` is your async
# LLM call, and `node_map` maps node_id -> node (with full section text).
search_prompt = f"""
You are given a question and a document tree structure.
Each node has a node_id, title, and summary.
Find all nodes likely to contain the answer.

Question: {query}

Document tree: {json.dumps(tree, indent=2)}

Reply as JSON: {{"thinking": "...", "node_list": ["id1", "id2"]}}
"""

result = await call_llm(search_prompt)
node_ids = json.loads(result)["node_list"]

context = "\n\n".join(node_map[nid]["text"] for nid in node_ids)
answer = await call_llm(f"Answer based on context:\n{context}\n\nQ: {query}")
```
Hybrid Approach
Production systems increasingly combine both:
- Vector search for broad retrieval across many documents
- Reasoning-based navigation for precision within selected documents
```
Query → Vector DB (find relevant documents)
    ↓
Top documents → Tree index (navigate to exact sections)
    ↓
Precise context → LLM → Answer with page references
```
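A minimal end-to-end sketch of that hybrid pipeline, with both stages stubbed: a keyword score stands in for the vector DB, and a summary match stands in for the LLM's reasoning step. All names and data here are hypothetical:

```python
def vector_search(query, top_k=2):
    """Stage 1: broad retrieval across documents.
    Stubbed as keyword scoring over document summaries; a real system
    would query an embedding index here."""
    docs = {
        "report-2023": "annual financial report with revenue tables",
        "manual-v2":   "installation and troubleshooting manual",
    }
    scored = [(doc_id, sum(w in text for w in query.lower().split()))
              for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

def select_sections(query, nodes):
    """Stage 2: precision navigation within a document's tree.
    Stubbed as a summary match; a real system would ask an LLM to pick
    node ids by reasoning over titles and summaries."""
    return [n for n in nodes if any(w in n["summary"]
                                    for w in query.lower().split())]

# Hypothetical per-document tree indexes (flattened to leaf nodes).
tree_index = {
    "report-2023": [
        {"node_id": "r1", "summary": "revenue by quarter", "pages": (3, 7)},
        {"node_id": "r2", "summary": "risk factors", "pages": (8, 12)},
    ],
    "manual-v2": [
        {"node_id": "m1", "summary": "installation steps", "pages": (1, 4)},
    ],
}

query = "revenue growth"
docs = vector_search(query)
sections = [s for d in docs for s in select_sections(query, tree_index.get(d, []))]
```

The design point is the hand-off: the vector stage only narrows *which documents* to open, so the reasoning stage stays cheap enough to run per query.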