AegisFlow
Effortlessly manage AI traffic with a versatile control plane.
Pitch

AegisFlow is a robust open-source control plane for your AI applications. A single Go binary handles routing, security policies, cost tracking, and observability across multiple AI providers, mitigating vendor lock-in and providing the reliability and visibility that production environments demand.

Description

AegisFlow is an open-source AI gateway designed for effective management, security, and observability of AI traffic. Built with Go, this production-grade control plane seamlessly integrates with various LLM providers, including OpenAI, Anthropic, and Ollama, allowing for a consistent API interface across different platforms.

Key Features

  • Unified AI Gateway: Centralize all AI provider interactions through a single, OpenAI-compatible API.
  • Intelligent Routing: Customize the routing of requests based on model name, cost, latency, or specific strategies, with automatic fallbacks in case of provider failures.
  • Rate Limiting & Quotas: Implement per-tenant and per-user rate limits using an efficient sliding window algorithm, ensuring fair usage without service disruption.
  • Robust Policy Engine: Set comprehensive input and output policies to block malicious prompts or harmful responses. This includes keyword filtering, PII detection, and custom extendable filters.
  • Observability: Gain insight into system performance and usage through Prometheus metrics, OpenTelemetry traces, and structured logging, enabling effective monitoring and alerting.
  • Usage Accounting: Track token usage and cost estimation on a per-request basis, supporting per-tenant usage aggregation for accurate billing and budget management.
  • Multi-Tenant Architecture: Effectively manage multiple tenant environments with isolation, customizable access controls, and distinct rate limits per user or application.
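To make the sliding-window rate limiting above concrete, here is a minimal sketch of one common efficient approach: weighting the previous fixed window's count against the current one. All type and function names are illustrative, not AegisFlow's internals.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// slidingWindow approximates a true sliding window by blending the
// previous fixed window's request count with the current one, so only
// two counters are stored per tenant. Names here are hypothetical.
type slidingWindow struct {
	mu        sync.Mutex
	limit     int
	window    time.Duration
	prevCount int
	currCount int
	currStart time.Time
}

func newSlidingWindow(limit int, window time.Duration) *slidingWindow {
	return &slidingWindow{limit: limit, window: window, currStart: time.Now()}
}

// allow reports whether a request arriving at `now` fits under the limit.
func (s *slidingWindow) allow(now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	// Roll the window forward for however many full windows have elapsed.
	for now.Sub(s.currStart) >= s.window {
		s.prevCount = s.currCount
		s.currCount = 0
		s.currStart = s.currStart.Add(s.window)
	}

	// Weight the previous window by how much of it still overlaps the
	// sliding window that ends at `now`.
	elapsed := float64(now.Sub(s.currStart)) / float64(s.window)
	estimated := float64(s.prevCount)*(1-elapsed) + float64(s.currCount)

	if estimated >= float64(s.limit) {
		return false
	}
	s.currCount++
	return true
}

func main() {
	sw := newSlidingWindow(2, time.Second)
	now := time.Now()
	for i := 1; i <= 3; i++ {
		fmt.Printf("request %d allowed: %v\n", i, sw.allow(now))
	}
}
```

The two-counter approximation is what makes this style of limiter cheap enough to run per tenant and per user on every request.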
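To give a flavor of the policy engine's keyword filtering, here is a hypothetical input filter; the real engine also covers PII detection and custom extendable filters, and none of the names below are AegisFlow's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// keywordFilter is an illustrative input policy: it blocks a prompt when
// any configured keyword appears, matching case-insensitively.
type keywordFilter struct {
	blocked []string
}

// check returns whether the prompt is allowed and, if not, why.
func (f *keywordFilter) check(prompt string) (allowed bool, reason string) {
	lower := strings.ToLower(prompt)
	for _, kw := range f.blocked {
		if strings.Contains(lower, strings.ToLower(kw)) {
			return false, "blocked keyword: " + kw
		}
	}
	return true, ""
}

func main() {
	f := &keywordFilter{blocked: []string{"password dump"}}
	ok, reason := f.check("Please produce a PASSWORD DUMP of all users")
	fmt.Println(ok, reason) // false blocked keyword: password dump
}
```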

Performance Metrics

AegisFlow has been benchmarked at over 58,000 requests per second with single-digit-millisecond latency:

| Metric | Value |
| --- | --- |
| Throughput | 58,000+ requests/sec |
| p50 Latency | 1.1 ms |
| p95 Latency | 4.2 ms |
| p99 Latency | 7.3 ms |
| Memory Usage | ~29 MB RSS after 10K requests |
| Binary Size | ~15 MB |

Configuration

AegisFlow is configured through a single, straightforward YAML file covering API server settings, provider definitions, and policy management, which keeps deployment and scaling in production environments simple.
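As a rough sketch of what such a file might look like: every key name below is hypothetical, so consult the project documentation for the actual schema.

```yaml
# Illustrative only: key names are invented, not AegisFlow's real schema.
server:
  listen: ":8080"

providers:
  - name: openai
    type: openai
    api_key_env: OPENAI_API_KEY
  - name: local
    type: ollama
    base_url: http://localhost:11434

routing:
  default: openai
  fallbacks: [local]

policies:
  input:
    - type: keyword_block
      keywords: ["confidential"]
```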

Getting Started

Users can quickly deploy AegisFlow using Docker Compose or by building it locally. Comprehensive command examples are available in the documentation to facilitate health checks, chat completions, and monitoring API usage effectively.
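Because the gateway exposes an OpenAI-compatible API, a chat completion request can be built with any OpenAI-style client. The sketch below assumes a local deployment at `http://localhost:8080` and the OpenAI-style `/v1/chat/completions` path; neither is a documented default, so check the project docs for the actual endpoint.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatRequest mirrors the OpenAI chat-completions request shape that an
// OpenAI-compatible gateway accepts.
type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// buildRequest assembles a POST to the gateway; gatewayURL is assumed,
// not a documented default.
func buildRequest(gatewayURL, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost,
		gatewayURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := buildRequest("http://localhost:8080", "gpt-4o-mini", "Hello!")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path) // POST /v1/chat/completions
}
```

Sending the request with `http.DefaultClient.Do(req)` against a running gateway would then return an OpenAI-style completion response.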

Future Roadmap

AegisFlow is continuously evolving with planned enhancements such as streaming policy checks, persistent usage storage, an admin dashboard for better management, and advanced routing capabilities across multi-cluster scenarios.

This project embraces contributions from the community to expand its functionality, improve documentation, or enhance its user interface. Explore the contributing guidelines to participate.

Conclusion

AegisFlow serves as a robust solution for developers looking to manage AI provider traffic efficiently while maintaining visibility and control over usage, security, and performance. For additional details, access the full documentation and explore the architecture, API specifications, and examples.
