Efficiently captures and retrieves context with local-first memory.
Pitch

Aingram rethinks agent memory: multiple retrieval methods in one SQLite file, with no cloud dependencies. Memory stays private on disk while keyword, semantic, and graph search combine to find the right context with high accuracy.

Description

Aingram (Lite) is a local-first agent memory system built for efficient, precise context retrieval. It combines full-text search (FTS5), semantic vector search via sqlite-vec, and knowledge graph traversal, all operating from a single SQLite database file with no cloud dependencies or external APIs.

Key Features

  • Multi-Signal Retrieval: Aingram combines three retrieval signals (keyword, semantic, and graph) and merges their results with Reciprocal Rank Fusion, so the most relevant context surfaces even for precise, domain-specific queries.

  • Local Operation: Everything runs locally against a single SQLite database file, keeping memory content private and under your control. Backup and transfer are as simple as copying a file, with no external database systems to manage.

  • High Performance: On the LongMemEval benchmark, Aingram finds the correct context in the top 3 results for 100% of queries with available evidence, and achieves 95.5% recall on real-world noisy conversation histories.

How It Works

Agent query:

```
  ┌─▶ FTS5 (keyword)                       ─┐
──┼─▶ sqlite-vec + QJL two-pass (semantic) ─┼─▶ RRF fusion ──▶ ranked results
  └─▶ Knowledge graph (entity)             ─┘
```

  • FTS5 full-text search — SQLite's native full-text index. Fast, no embedding required, excellent for exact terminology and technical strings.

  • sqlite-vec vector search — Dense semantic retrieval using nomic-embed-text-v1.5 running locally via ONNX. 768-dimensional embeddings, CPU or GPU. No external API.

  • QJL two-pass vector search — At larger corpus sizes, vector search dominates retrieval latency. Aingram uses a Quantized Johnson-Lindenstrauss (QJL) two-pass approach: a fast first pass over compressed quantized vectors narrows the candidate pool, then a precise second pass over full float32 vectors reranks the survivors. This trades a small fraction of recall for significantly lower latency at scale — the break-even point is around 30K entries, above which QJL is faster than brute-force float32 search with no meaningful quality loss.

  • Knowledge graph traversal — Entities and relationships extracted from memory entries. Multi-hop queries are resolved via recursive CTEs. "What did Alice decide about auth?" finds the entity, traverses its relationships, and returns relevant entries, even if the query didn't match the entry text verbatim.

  • Reciprocal Rank Fusion — Results from all three signals are combined and re-ranked. Each signal's rank position, not raw score, contributes to the final order. This makes the fusion robust to scale differences between signals.
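The rank-based fusion described above can be sketched in a few lines. This is a minimal illustration, not Aingram's actual code; the constant k=60 comes from the original RRF paper and is assumed here, and the document IDs and signal lists are made up for the example:

```python
# Minimal Reciprocal Rank Fusion sketch (illustrative, not Aingram's code).
# Each signal contributes 1/(k + rank) per document; only rank positions
# matter, so score scales across signals never need to be calibrated.

def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score ranks first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

fts    = ["doc3", "doc1", "doc7"]   # keyword signal
vector = ["doc1", "doc3", "doc9"]   # semantic signal
graph  = ["doc1", "doc7"]           # entity signal
print(rrf_fuse([fts, vector, graph]))  # doc1 wins: top-ranked in two signals
```

Because only rank positions enter the formula, a signal that emits BM25 scores and one that emits cosine similarities can be fused without any normalization step.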
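The two-pass QJL idea from the list above can also be sketched. The sketch below is a simplified stand-in under stated assumptions: it uses 1-bit sign quantization after a random Johnson-Lindenstrauss projection, a toy 64-dimensional corpus, and a shortlist of 50, whereas Aingram works with 768-dimensional nomic-embed-text-v1.5 vectors and its own quantization scheme:

```python
# Illustrative two-pass retrieval in the spirit of QJL (not Aingram's
# implementation). Pass 1 scores cheap sign-quantized projections to build
# a candidate shortlist; pass 2 reranks the shortlist with exact float32
# cosine similarity, so the expensive math touches only a few vectors.
import numpy as np

rng = np.random.default_rng(0)
dim, n = 64, 1000
corpus = rng.standard_normal((n, dim)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

# Johnson-Lindenstrauss-style random projection, then 1-bit quantization.
proj = rng.standard_normal((dim, 32)).astype(np.float32)
corpus_bits = np.sign(corpus @ proj)  # compressed index: +-1 per projected dim

def search(query, shortlist=50, k=5):
    q = query / np.linalg.norm(query)
    # Pass 1: approximate scores on the quantized projections (cheap).
    approx = corpus_bits @ np.sign(q @ proj)
    cand = np.argpartition(-approx, shortlist)[:shortlist]
    # Pass 2: exact cosine rerank over the shortlist only (precise).
    exact = corpus[cand] @ q
    return cand[np.argsort(-exact)][:k]

query = corpus[42] + 0.05 * rng.standard_normal(dim).astype(np.float32)
print(search(query))  # entry 42 should surface near the top
```

The trade the document describes falls out of this structure: the first pass can sacrifice a little recall because the second pass repairs the ordering, and the cost of the exact pass is capped by the shortlist size rather than the corpus size.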

Example Usage

Store and recall memories using the simple Python API:

```python
from aingram import MemoryStore

with MemoryStore('./agent_memory.db') as mem:
    mem.remember('The API rate limit is 100 req/min.')
    mem.remember('Deployment takes ~3 min.')

    results = mem.recall('what do I need to know before deploying?', limit=5)
    for r in results:
        print(r.score, r.entry.content)
```

Knowledge Graph Integration

Aingram extracts entities from stored memories to build a knowledge graph, so queries can traverse relationships between entries rather than relying on textual matches alone:

```python
results = mem.recall('what did Alice decide?', limit=5)
```
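Under the hood, this kind of query relies on the recursive CTE traversal mentioned earlier. The sketch below shows the general mechanism with SQLite's standard WITH RECURSIVE syntax; the table names, schema, and sample data are illustrative assumptions, not Aingram's actual internal layout:

```python
# Hypothetical multi-hop graph traversal with a recursive CTE.
# Schema (entities, relations) is an illustrative assumption.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE relations (src INTEGER, dst INTEGER, kind TEXT);
    INSERT INTO entities VALUES (1,'Alice'), (2,'auth service'), (3,'JWT rotation');
    INSERT INTO relations VALUES (1, 2, 'decided_on'), (2, 3, 'uses');
""")

# Walk outward from 'Alice', up to 2 hops, collecting reachable entities.
rows = con.execute("""
    WITH RECURSIVE hops(id, depth) AS (
        SELECT id, 0 FROM entities WHERE name = 'Alice'
        UNION
        SELECT r.dst, h.depth + 1
        FROM relations r JOIN hops h ON r.src = h.id
        WHERE h.depth < 2
    )
    SELECT e.name FROM hops JOIN entities e ON e.id = hops.id
    ORDER BY hops.depth
""").fetchall()
print([name for (name,) in rows])  # ['Alice', 'auth service', 'JWT rotation']
```

The UNION (rather than UNION ALL) deduplicates revisited nodes, which keeps the traversal from looping on cyclic relationship graphs.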

MCP Server Option

Aingram can also run as an MCP server, allowing any MCP-compatible agent to use it for context storage and recall through a standard interface.

Conclusion

Aingram (Lite) provides a robust foundation for building intelligent agents that maintain context-rich memory with reliable performance, all without cloud reliance. A Pro version with more advanced agent-memory features is currently in development.

For more information, visit the official website or join the community on Discord.
