OctaneDB - A Python library designed for fast, efficient vector database solutions.

Pitch

OctaneDB is a lightweight, high-performance vector database library written in Python. In the project's own benchmarks it inserts and searches roughly 10x faster than ChromaDB, which makes it a candidate for AI and ML applications. It combines HNSW and flat indexing, built-in text embedding, and flexible storage options to deliver efficient similarity search and lower memory usage.

Description

OctaneDB is a lightweight vector database library written in Python and built for speed. The project's benchmarks compare it with Pinecone, ChromaDB, and Qdrant and report roughly 10x faster inserts and searches than ChromaDB, sub-millisecond query response times, and an insertion rate above 3,000 vectors per second. It is aimed at AI/ML applications that need fast similarity search and efficient data management.

Key Features

Performance Optimizations

Reports roughly 10x faster inserts and searches than ChromaDB in the project's benchmarks.
Answers queries in under a millisecond.
Inserts more than 3,000 vectors per second.
Uses HDF5 compression to reduce memory usage.

Advanced Indexing Capabilities

Implements HNSW (Hierarchical Navigable Small World) for fast approximate nearest-neighbor search.
Supports FlatIndex for exact similarity search.
Exposes index parameters for performance tuning.
Optimizes indexes automatically.
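
To make the difference between exact and approximate search concrete, here is a minimal sketch of what a flat (exact) index does: it keeps every vector and scans all of them for each query. The class name `ToyFlatIndex` and its methods are hypothetical and written for illustration only; they are not OctaneDB's FlatIndex API.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity, so 0.0 means 'same direction'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention chosen for this sketch when a vector is all zeros
    return 1.0 - dot / (norm_a * norm_b)


class ToyFlatIndex:
    """Hypothetical exact index: stores every vector and scans them all."""

    def __init__(self) -> None:
        self._ids: List[str] = []
        self._vectors: List[List[float]] = []

    def add(self, vector_id: str, vector: Sequence[float]) -> None:
        self._ids.append(vector_id)
        self._vectors.append(list(vector))

    def search(self, query: Sequence[float], k: int) -> List[Tuple[str, float]]:
        # Score every stored vector, then keep the k closest.
        scored = [
            (vector_id, cosine_distance(query, vector))
            for vector_id, vector in zip(self._ids, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]


index = ToyFlatIndex()
index.add("a", [1.0, 0.0])
index.add("b", [0.0, 1.0])
index.add("c", [1.0, 1.0])
top = index.search([1.0, 0.1], k=2)
print([vector_id for vector_id, _ in top])
```

A full scan costs time proportional to the number of stored vectors, which is why graph-based indexes such as HNSW trade a little accuracy for much faster queries on large collections.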

Text Embedding Support

Provides a ChromaDB-compatible API for easy migration.
Converts text to vectors automatically through sentence-transformers.
Supports multiple embedding models, including all-MiniLM-L6-v2 and all-mpnet-base-v2, with GPU acceleration (CUDA).
Processes text in batches for better throughput.

Flexible Storage Options

Offers in-memory storage for maximum speed and persistent file-based storage.
Provides a hybrid mode that combines the advantages of both.
Stores data in HDF5 format with compression.

Versatile Search Features

Supports multiple distance metrics: Cosine, Euclidean, Dot Product, Manhattan, Chebyshev, and Jaccard.
Supports metadata filtering with logical operators.
Runs batch search operations for higher throughput.
Includes text-based search with automatic embedding.
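
As a reference for the metrics listed above, the following sketch gives their textbook definitions in plain Python. It is not OctaneDB's implementation. One assumption to note: Jaccard distance is shown on sets, since the source does not say how the library applies it to dense vectors.

```python
import math
from typing import Sequence, Set


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Sequence[float], b: Sequence[float]) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # 1 - cosine similarity; assumes neither vector is all zeros.
    norms = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / norms


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    # Assumption: set-based definition, 1 - |A ∩ B| / |A ∪ B|.
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]
print("dot:", dot_product(u, v))
print("euclidean:", euclidean(u, v))
print("manhattan:", manhattan(u, v))
print("chebyshev:", chebyshev(u, v))
print("cosine:", cosine_distance([1.0, 0.0], [0.0, 1.0]))
print("jaccard:", jaccard_distance({1, 2, 3}, {2, 3, 4}))
```

Cosine ignores vector length, while Euclidean, Manhattan, and Chebyshev depend on it, so the right choice usually follows from how the embedding model was trained.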

Developer-Friendly Experience

Offers a simple, intuitive API similar to ChromaDB.
Ships with comprehensive documentation and practical examples.
Provides type hints throughout the code.
Includes an extensive test suite.

Example Usage

The following example creates a collection, adds two text documents with metadata, and runs a filtered text search:

from octanedb import OctaneDB

db = OctaneDB(dimensions=384, embedding_model="all-MiniLM-L6-v2")  # Initialize with a text embedding model
collection = db.create_collection("documents")  # Create a new collection

# Add two documents; their text is embedded automatically
result = db.add(
    ids=["doc1", "doc2"],
    documents=["This is a document about pineapple", "This is a document about oranges"],
    metadatas=[{"category": "tropical", "color": "yellow"}, {"category": "citrus", "color": "orange"}]
)

# Search by text, restricted to documents whose metadata matches the filter
results = db.search_text(query_text="fruit", k=2, filter="category == 'tropical'", include_metadata=True)
for doc_id, distance, metadata in results:
    print(f"Document: {db.get_document(doc_id)}")
    print(f"Distance: {distance:.4f}")
    print(f"Metadata: {metadata}")
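
The `filter` string in the example above restricts results by metadata. The source does not show how OctaneDB parses or evaluates such expressions, so the sketch below only illustrates the general idea of metadata filtering with logical operators, using plain Python predicates. The helper names (`eq`, `all_of`, `any_of`, `negate`, `filter_ids`) are hypothetical and not part of the OctaneDB API.

```python
from typing import Any, Callable, Dict, List

Metadata = Dict[str, Any]
Predicate = Callable[[Metadata], bool]


def eq(field: str, value: Any) -> Predicate:
    """Match records whose metadata field equals the given value."""
    return lambda meta: meta.get(field) == value


def all_of(*predicates: Predicate) -> Predicate:
    """Logical AND over predicates."""
    return lambda meta: all(p(meta) for p in predicates)


def any_of(*predicates: Predicate) -> Predicate:
    """Logical OR over predicates."""
    return lambda meta: any(p(meta) for p in predicates)


def negate(predicate: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda meta: not predicate(meta)


def filter_ids(records: Dict[str, Metadata], predicate: Predicate) -> List[str]:
    """Return the ids whose metadata satisfies the predicate."""
    return [doc_id for doc_id, meta in records.items() if predicate(meta)]


# Same metadata as in the example above.
records = {
    "doc1": {"category": "tropical", "color": "yellow"},
    "doc2": {"category": "citrus", "color": "orange"},
}

tropical = filter_ids(records, eq("category", "tropical"))
citrus_or_yellow = filter_ids(
    records, any_of(eq("category", "citrus"), eq("color", "yellow"))
)
orange_not_tropical = filter_ids(
    records, all_of(eq("color", "orange"), negate(eq("category", "tropical")))
)
print(tropical, citrus_or_yellow, orange_not_tropical)
```

In a vector search, a filter like this is typically applied either before scoring (to shrink the candidate set) or after it (to prune the ranked results); which approach OctaneDB takes is not stated in the source.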

Performance Benchmarks

The project reports the following benchmark results against ChromaDB, Pinecone, and Qdrant (hardware, dataset size, and vector dimension are not stated):

Operation	OctaneDB	ChromaDB	Pinecone	Qdrant
Insert (vectors/sec)	3,200	320	280	450
Search (ms)	0.8	8.2	15.1	12.3
Memory Usage	1.2GB	2.8GB	3.1GB	2.5GB
Index Build Time	45s	180s	120s	95s
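
To see how the table relates to the "10x" headline, the short script below computes OctaneDB's ratio against each competitor for every row. The numbers are copied from the table; the dictionary keys and the function name are chosen here for illustration.

```python
from typing import Dict

# Figures copied from the benchmark table above.
benchmarks: Dict[str, Dict[str, float]] = {
    "insert_vectors_per_sec": {"OctaneDB": 3200, "ChromaDB": 320, "Pinecone": 280, "Qdrant": 450},
    "search_ms": {"OctaneDB": 0.8, "ChromaDB": 8.2, "Pinecone": 15.1, "Qdrant": 12.3},
    "memory_gb": {"OctaneDB": 1.2, "ChromaDB": 2.8, "Pinecone": 3.1, "Qdrant": 2.5},
    "index_build_s": {"OctaneDB": 45, "ChromaDB": 180, "Pinecone": 120, "Qdrant": 95},
}

# Throughput is better when higher; the other rows are better when lower.
HIGHER_IS_BETTER = {"insert_vectors_per_sec"}


def ratio_vs_octanedb(metric: str, values: Dict[str, float]) -> Dict[str, float]:
    """Return how many times better OctaneDB is than each other system."""
    base = values["OctaneDB"]
    ratios = {}
    for name, value in values.items():
        if name == "OctaneDB":
            continue
        ratios[name] = base / value if metric in HIGHER_IS_BETTER else value / base
    return ratios


speedups = {metric: ratio_vs_octanedb(metric, values) for metric, values in benchmarks.items()}
for metric, ratios in speedups.items():
    print(metric, {name: f"{ratio:.1f}x" for name, ratio in ratios.items()})
```

On these figures the ratio against ChromaDB is about 10x for inserts and searches, while the memory and index-build ratios fall between roughly 2x and 4x.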

OctaneDB targets AI/ML applications such as document search, recommendation systems, image search, and NLP tasks.

To get started with OctaneDB, see the documentation and examples in the repository.
