DocForge - Self-validating multi-agent RAG system that fact-checks its own answers

Pitch

DocForge is a multi-agent RAG system built with LangGraph in which four specialized agents route, retrieve, synthesize, and validate answers from your documents. Unlike basic RAG pipelines, DocForge fact-checks responses against the source documents: hallucinations are caught and corrected automatically, and retrieval is retried when validation fails. It also features Redis caching, adaptive retrieval, and dual LLM support.

Description

DocForge is a Multi-Agent Retrieval-Augmented Generation (RAG) system that ingests technical documentation and answers questions about it. Built on LangGraph, it combines query routing, adaptive document retrieval, and built-in fact-checking so that answers stay grounded in the source documents.

Key Features

Multi-Agent Architecture

Routing Agent: Classifies query complexity (from simple lookups to complex reasoning) and rewrites the query into an optimized search query for the vector database.
Retrieval Agent: Retrieves between 3 and 10 relevant documents depending on query complexity, and relaxes the relevance threshold on retries.
Analysis Agent: Combines insights from multiple sources to synthesize coherent, well-cited answers using a chain-of-thought reasoning methodology.
Validation Agent: Verifies each claim against original documents to identify inaccuracies and rectify them when necessary.
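
As a rough sketch of how the Retrieval Agent's adaptive sizing could work, the snippet below maps a query complexity label to a document count in the 3-10 range and widens the search on a retry. Only the 3-10 range and the 50% increase come from the description; the complexity labels, the per-label counts, the threshold values, and the names `RetrievalPlan` and `plan_retrieval` are illustrative assumptions, not DocForge's actual code.

```python
from dataclasses import dataclass

# Assumed labels and counts; the source only states a 3-10 document range.
BASE_TOP_K = {"simple": 3, "moderate": 6, "complex": 10}

BASE_MIN_SCORE = 0.75    # assumed relevance threshold for the first attempt
RETRY_RELAXATION = 0.10  # assumed amount by which the threshold drops on retry


@dataclass(frozen=True)
class RetrievalPlan:
    top_k: int        # number of documents to request
    min_score: float  # minimum relevance score to keep a document


def plan_retrieval(complexity: str, is_retry: bool = False) -> RetrievalPlan:
    """Pick a document count and threshold for one retrieval attempt."""
    top_k = BASE_TOP_K[complexity]
    min_score = BASE_MIN_SCORE
    if is_retry:
        # Integer division keeps the increase at or below 50%.
        top_k = top_k + top_k // 2
        min_score = BASE_MIN_SCORE - RETRY_RELAXATION
    return RetrievalPlan(top_k=top_k, min_score=min_score)
```

Under these assumed values, a simple query asks for 3 documents on the first attempt and 4 on a retry, while a complex query grows from 10 to 15.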

Intelligent Workflow

Confidence-Based Validation Skip: Skips validation when retrieval scores are high and the sources fully support the answer with no information gaps, which shortens response time.
Adaptive Retry Strategy: When validation fails, automatically retries retrieval with up to 50% more documents and a relaxed relevance threshold.
Redis Caching: Caches query results under SHA-256 keys with a 1-hour time-to-live, so repeated queries return straight from the cache.
Dual LLM Provider Support: Switches between OpenAI GPT (via OpenRouter) and Google Gemini, so the provider can be configured per task.
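
The caching scheme described above can be sketched roughly as follows. The SHA-256 key and the 1-hour TTL come from the feature list; the key prefix, the query normalization, the JSON serialization, and the helper names are assumptions made for illustration. The `client` argument is expected to behave like a redis-py client, whose `get` and `setex` methods are the only calls used.

```python
import hashlib
import json

CACHE_TTL_SECONDS = 3600  # 1-hour time-to-live, as stated in the feature list


def cache_key(query: str) -> str:
    """Derive a stable cache key from the query text."""
    # Assumed normalization: lowercase and collapse whitespace so that
    # trivially different spellings of a query share one cache entry.
    normalized = " ".join(query.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"docforge:query:{digest}"  # the prefix is a hypothetical choice


def get_cached(client, query: str):
    """Return the cached result for a query, or None on a cache miss."""
    raw = client.get(cache_key(query))
    return json.loads(raw) if raw is not None else None


def put_cached(client, query: str, result: dict) -> None:
    """Store a JSON-serializable result with the 1-hour expiry."""
    client.setex(cache_key(query), CACHE_TTL_SECONDS, json.dumps(result))
```

Hashing the query keeps keys at a fixed length regardless of how long the question is, and letting Redis expire entries avoids a separate cleanup job.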

Production-Ready

Exposes a FastAPI REST API with performance tracking.
Handles errors throughout the pipeline and degrades gracefully when a component fails.
Tracks token usage and latency for each query.
Includes a complete ETL pipeline for ingesting PDF documents, with an in-memory embedding cache to minimize embedding API calls.
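
An in-memory embedding cache of the kind mentioned in the list above might look like the sketch below. The class name, the hashing of chunk text into a key, and the hit/miss counters are all assumptions; `embed_fn` stands in for whatever embedding API call the pipeline actually makes.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Memoize embeddings so identical text is only sent to the API once."""

    def __init__(self, embed_fn: Callable[[str], List[float]]):
        self._embed_fn = embed_fn  # hypothetical wrapper around the real API
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    def embed(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: call the embedding function and remember the result.
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        return vector
```

Because the cache lives in process memory, it only helps within a single ingestion run; repeated boilerplate such as headers or license text is where it would save the most calls.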

Architecture Overview

The diagram below traces a query through DocForge, from the cache check to the final answer:

User Query
    |
    v
+-----------------+
|   Redis Cache   | <-- Check cache first
+--------+--------+
         | (cache miss)
         v
+-----------------+
|  Routing Agent  | <-- Classify complexity, optimize search query
+--------+--------+
         |
         v
+-----------------+
| Retrieval Agent | <-- Fetch 3-10 docs from Pinecone
+--------+--------+     (50% more on retry, relaxed threshold)
         |
         v
+-----------------+
| Analysis Agent  | <-- Synthesize cited answer (chain-of-thought)
+--------+--------+
         |
         v
    Confidence Check:
    |
    +-- High confidence --> Skip validation --> Return & Cache
    |
    +-- Otherwise:
         |
         v
    +-----------------+
    |Validation Agent | <-- Fact-check every claim
    +--------+--------+
             |
             v
        Decision:
        +-- Valid            --> Return & Cache
        +-- Invalid (< 3)   --> Retry from Retrieval (adaptive)
        +-- Invalid (>= 3)  --> Return corrected answer & Cache
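
The branching in the diagram can be expressed as two small decision functions, sketched here under stated assumptions. The diagram does not say what the `< 3` and `>= 3` bounds count, so the code assumes they are retry attempts; the score threshold and all function and label names are likewise hypothetical.

```python
from typing import Sequence

MAX_RETRIES = 3             # assumption: "< 3" / ">= 3" count retry attempts
SKIP_SCORE_THRESHOLD = 0.85 # assumed cutoff for "high" retrieval scores


def can_skip_validation(scores: Sequence[float], has_information_gaps: bool) -> bool:
    """High confidence: every retrieved document scores well and nothing is missing."""
    if not scores or has_information_gaps:
        return False
    return min(scores) >= SKIP_SCORE_THRESHOLD


def after_analysis(scores: Sequence[float], has_information_gaps: bool) -> str:
    """Choose the step that follows the Analysis Agent."""
    if can_skip_validation(scores, has_information_gaps):
        return "return_and_cache"
    return "validate"


def after_validation(is_valid: bool, retries_so_far: int) -> str:
    """Choose the step that follows the Validation Agent."""
    if is_valid:
        return "return_and_cache"
    if retries_so_far < MAX_RETRIES:
        return "retry_retrieval"
    # Retries exhausted: fall back to the validator's corrected answer.
    return "return_corrected_and_cache"
```

In a LangGraph workflow, functions like these would typically serve as the routing callbacks for conditional edges, with the returned labels mapped to node names.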

Usage Examples

Basic Query

from backend.agents.graph import run_graph

result = run_graph("What is LangGraph?")

print(result["fact_checked_answer"])
print(f"Validation: {result['validation_passed']}")
print(f"Documents used: {len(result['retrieved_chunks'])}")
print(f"Query type: {result['query_type']}")
print(f"Latency: {result['latency_ms']:.0f}ms")
print(f"Tokens used: {result['total_tokens_used']}")

Ingest PDF Documents

from backend.ingestion.pipeline import ingest_documents

stats = ingest_documents("./documents/", chunk_size=1000, chunk_overlap=200)

print(f"Loaded: {stats['documents_loaded']} documents")
print(f"Created: {stats['chunks_created']} chunks")
print(f"Uploaded: {stats['chunks_uploaded']} vectors")
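
To illustrate what `chunk_size=1000` and `chunk_overlap=200` mean, here is a minimal sliding-window splitter. It assumes the sizes are measured in characters, which the source does not state; the actual pipeline may count tokens or split on sentence boundaries. The function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the default settings, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.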

API Query

curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is LangGraph?"}'
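
The same request can be made from Python using only the standard library. This is a sketch: the endpoint and payload mirror the curl command, while the helper names, the timeout, and the assumption that the API responds with a JSON body are not confirmed by the source.

```python
import json
import urllib.request


def build_query_request(query: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request that the curl command sends."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(query: str, base_url: str = "http://localhost:8000", timeout: float = 30.0) -> dict:
    """Send the query and decode the response (assumed to be JSON)."""
    request = build_query_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send_query("What is LangGraph?")` requires the DocForge API to be running on localhost port 8000.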

DocForge is intended for professionals who need fast, accurate answers from complex technical documentation.
