MCP Local RAG is a privacy-first document search server that runs entirely on your machine. Built on the Model Context Protocol (MCP), it plugs into clients such as Cursor, Codex, and Claude Code and lets you run semantic searches over local documents with no API keys, no cloud services, and no data leaving your device, sidestepping both the privacy concerns and the recurring costs of hosted embedding solutions.
Key Features
- Privacy-First Search: Provides an efficient way to search technical specifications, research papers, internal documentation, and meeting notes without sending data to external services.
- Cost-Effective: Avoids the per-request fees of third-party embedding APIs, which add up quickly with large document sets or frequent searches.
- Offline Accessibility: Enables document searches without needing internet connectivity, as all processing happens locally.
Functionality
The MCP Local RAG server exposes several core tools:
- Document Ingestion: Ingests PDF, DOCX, TXT, and Markdown files, extracts the text, splits it into searchable chunks, generates embeddings with a local model, and stores everything in a local vector database (a minimal chunking sketch follows this list).
- Semantic Search: Matches natural-language queries against the meaning of the indexed text rather than exact keywords, so relevant results surface even when a query uses different terminology than the documents.
- File Management: Keeps track of ingested documents, their chunk count, and indexing status.
- System Status Monitoring: Provides insights into document count, chunks generated, and memory usage to help with performance evaluation.
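To make the ingestion step concrete, here is a minimal chunking sketch in TypeScript. The chunk size, overlap, and record shape are illustrative assumptions, not the server's actual settings.

```typescript
// Split extracted text into overlapping, fixed-size segments so each one can be
// embedded and retrieved independently. Values here are illustrative defaults.
interface Chunk {
  source: string; // originating file path
  index: number;  // position of the chunk within the document
  text: string;   // text that will be embedded
}

function chunkText(source: string, text: string, chunkSize = 1000, overlap = 200): Chunk[] {
  const chunks: Chunk[] = [];
  let start = 0;
  let index = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({ source, index: index++, text: text.slice(start, end) });
    if (end === text.length) break;
    start = end - overlap; // overlap keeps sentences that straddle a boundary searchable
  }
  return chunks;
}
```

Each chunk is then embedded and written to the vector store, so every match can be traced back to its source file and position.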
Performance
Despite running entirely locally, the server remains responsive:
- Typical query response times are under 3 seconds, even with thousands of indexed document chunks.
- Document ingestion is efficient, capable of processing a 10MB PDF in approximately 45 seconds.
How It Works
The server relies on:
- LanceDB, an embedded vector database, for vector storage without running or administering a separate database server.
- Transformers.js for generating embeddings directly in Node.js, with no Python runtime or external inference service.
- The all-MiniLM-L6-v2 model, which produces compact 384-dimensional vectors that capture the semantic meaning of each chunk; the sketch below shows how these pieces fit together.
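The following sketch shows how embedding and search could be wired together with the Transformers.js feature-extraction pipeline and the LanceDB JavaScript SDK. Package names, the table schema, and the sample data are assumptions for illustration, and exact method names vary between LanceDB SDK versions.

```typescript
import { pipeline } from '@xenova/transformers';
import * as lancedb from '@lancedb/lancedb';

async function main() {
  // Load the local embedding model; weights are downloaded once and cached on disk.
  const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

  // Turn a piece of text into a 384-dimensional vector.
  const toVector = async (text: string): Promise<number[]> => {
    const output = await embed(text, { pooling: 'mean', normalize: true });
    return Array.from(output.data as Float32Array);
  };

  // Store chunk embeddings in a local LanceDB table: a directory on disk, no server process.
  const db = await lancedb.connect('./data/vectors');
  const table = await db.createTable('chunks', [
    {
      vector: await toVector('Rotate signing keys every 90 days.'),
      text: 'Rotate signing keys every 90 days.',
      source: 'security-policy.md',
    },
  ]);

  // Semantic search: embed the query and retrieve the nearest chunks by vector similarity.
  const hits = await table.search(await toVector('How often do we rotate keys?')).limit(5).toArray();
  console.log(hits.map((h: any) => h.text));
}

main();
```

Because both the model weights and the database live on local disk, the same code path works with no network connection at all.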
Getting Started
The setup involves minimal configuration:
- For each client (Cursor, Codex, Claude Code), register the local RAG server in that client's MCP configuration file (an illustrative entry is shown after this list).
- Once the server is registered, you can start ingesting and searching documents straight away with simple natural-language commands.
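As an illustration, a client entry typically follows the common mcpServers shape shown below. The actual file location, format, and launch command depend on the client and on how MCP Local RAG is distributed, so the package name here is a placeholder.

```json
{
  "mcpServers": {
    "local-rag": {
      "command": "npx",
      "args": ["-y", "<mcp-local-rag-package>"]
    }
  }
}
```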
For developers and other professionals who want a responsive, AI-driven way to search their own documents, this local server delivers that capability while leaving them in full control of their data: sensitive information stays on the machine and is accessible only to them.