Git-Native RAG
Manage RAG context easily with a privacy-first CLI tool.
Pitch

Git-Native RAG is a lightweight, privacy-focused CLI for managing Retrieval-Augmented Generation (RAG) context. It tracks files through Git metadata instead of committing bulky databases or opaque binary data, which keeps Git history clean, and it runs AI processing locally with no cloud infrastructure.

Description

Git-Native RAG is a lightweight, privacy-first, and deterministic command-line interface (CLI) tool designed for managing Retrieval-Augmented Generation (RAG) context using Git metadata. This tool allows developers to manage their AI workflows locally, maintaining complete independence from any cloud infrastructure.

Why Choose Git-Native RAG?

Traditional RAG systems often rely on complex databases and large binary files that clutter Git history. Git-Native RAG avoids this by splitting the work into a Two-Stage Architecture:

  1. The Tracker (Zero-AI): Records file hashes without involving any AI model, so it is fast, deterministic, and Git-friendly.
  2. The Indexer (Local AI): Generates embeddings locally using Ollama and stores the vectors in a compact SQLite database.
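
To make the Tracker stage concrete, here is a minimal sketch of deterministic file hashing. It is an illustration under assumptions, not the tool's actual implementation: the function names (`hash_file`, `build_manifest`, `serialize_manifest`), the choice of SHA-256, and the JSON layout are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes (hash choice is an assumption)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in blocks so large files are not loaded into memory at once.
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its content hash."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip Git's own metadata directory.
        if path.is_file() and ".git" not in relative.parts:
            manifest[relative.as_posix()] = hash_file(path)
    return manifest


def serialize_manifest(manifest: dict[str, str]) -> str:
    """Serialize with sorted keys so identical trees produce identical text."""
    return json.dumps(manifest, indent=2, sort_keys=True) + "\n"
```

Because the output depends only on file paths and contents, two runs over the same tree produce byte-identical text, which is what keeps a committed manifest free of spurious diffs.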

Key Benefits

  • Clean Git History: Avoids bloating pull requests with large binary vector files.
  • Privacy Focused: Ensures that sensitive code remains local to your machine.
  • Native Agent Support: Provides guidance to AI agents through AGENTS.md.
  • Efficiency: Uses Int8 quantization to keep the vector database small and fast.
  • MCP Compatibility: Exposes your RAG index as Model Context Protocol (MCP) tools for AI agents.
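
The Int8 quantization mentioned above can be sketched as follows. This shows one common scheme (symmetric quantization with a single scale per vector); the source does not say which scheme the tool uses, so both the scheme and the function names are assumptions.

```python
def quantize_int8(vector: list[float]) -> tuple[list[int], float]:
    """Symmetric quantization: the largest magnitude maps to +/-127."""
    peak = max((abs(x) for x in vector), default=0.0)
    if peak == 0.0:
        # An all-zero (or empty) vector needs no scaling.
        return [0] * len(vector), 1.0
    scale = peak / 127.0
    # Clamp to guard against floating-point rounding at the extremes.
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dequantize_int8(codes: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the int8 codes."""
    return [code * scale for code in codes]
```

Each dimension then needs one byte instead of the four used by a float32, at the cost of a rounding error of at most about half of `scale` per dimension.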

Features

  • Pluggable Providers: Integrates easily with Ollama and supports various embedding models.
  • AGENTS.md Integration: Automatically generates guidance files for AI collaborators.
  • .ragignore Support: Excludes files using customizable glob patterns.
  • Thematic Clustering: Groups content with K-Means to reflect the architectural organization of the codebase.
  • Lazy Indexing: Re-indexes only modified files, avoiding redundant work.
  • Web Dashboard: Provides a visual interface for metrics, searching, and cluster exploration.
  • Commit Analysis: Delivers semantic indexing for Git commit histories.
  • Matryoshka Support: Supports variable embedding dimensions to reduce storage.
  • AST Chunking: Splits TypeScript/JavaScript code along its syntactic structure.
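
As an illustration of how lazy indexing and .ragignore filtering could work together, the sketch below compares two hash manifests and returns only the paths that need attention. The function names and the manifest shape (a path-to-hash mapping) are hypothetical, and the matching uses Python's `fnmatch` rules, which are simpler than full gitignore semantics; the tool's real pattern handling may differ.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """True if the path matches any glob pattern (fnmatch rules, not gitignore rules)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def plan_reindex(
    previous: dict[str, str],
    current: dict[str, str],
    patterns: list[str],
) -> tuple[list[str], list[str]]:
    """Return (paths to embed again, paths to drop from the index)."""
    # Ignored files never enter the index.
    tracked = {p: h for p, h in current.items() if not is_ignored(p, patterns)}
    # A file is stale if it is new or its hash changed since the last run.
    stale = sorted(p for p, h in tracked.items() if previous.get(p) != h)
    # A file is removed if it was indexed before but is no longer tracked.
    removed = sorted(p for p in previous if p not in tracked)
    return stale, removed
```

Unchanged files appear in neither list, so an indexer built this way would only call the embedding model for files whose contents actually differ.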

Architecture

The architecture separates two artifacts: a manifest of file hashes, which is committed to Git, and the vector database, which stays gitignored to protect privacy and keep the repository small.
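
The local vector store could look something like the following sketch, which keeps Int8 vectors as BLOBs in SQLite. The table name, columns, and helper functions are hypothetical and chosen only for illustration; the project's actual schema is not described here.

```python
import sqlite3
from array import array


def open_index(db_path: str) -> sqlite3.Connection:
    """Open (or create) the local vector database; this file would be gitignored."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " path TEXT PRIMARY KEY,"
        " file_hash TEXT NOT NULL,"
        " scale REAL NOT NULL,"
        " vector BLOB NOT NULL)"
    )
    return conn


def store_vector(conn, path: str, file_hash: str, codes: list[int], scale: float) -> None:
    """Pack int8 codes into one byte per dimension and upsert the row."""
    blob = array("b", codes).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?, ?)",
        (path, file_hash, scale, blob),
    )
    conn.commit()


def load_vector(conn, path: str):
    """Return (codes, scale) for a path, or None if it is not indexed."""
    row = conn.execute(
        "SELECT scale, vector FROM chunks WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        return None
    scale, blob = row
    codes = array("b")
    codes.frombytes(blob)
    return list(codes), scale
```

Storing the file hash next to each vector lets an indexer check the committed manifest against the local database and rebuild only the rows that are out of date.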

Testing and Development

The project includes support for both unit tests and integration tests.

Roadmap

Planned enhancements include VS Code extension support and OpenAI provider integration.

For those interested in exploring the implications of AI in code management and retrieval systems, Git-Native RAG stands out as a vital, efficient, and privacy-oriented solution.
