Nyquest: Semantic Compression Proxy for LLMs
Nyquest is a semantic compression proxy designed for Large Language Models (LLMs). It reduces token usage by 15–75% without sacrificing the intent or meaning of your prompts.
Key Features
- Drop-in proxy for integration with existing LLMs, featuring over 350 compiled rules and a local LLM semantic condensation stage using Qwen 2.5 (1.5B model).
- A simple one-shot installer that configures environment dependencies, builds the necessary components, and sets up a systemd service.
- Works with models from Anthropic, OpenAI, Gemini, xAI, and OpenRouter.
- A system preflight validation tool that checks hardware and software compatibility and recommends the processing tier best suited to your machine.
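The preflight idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the tier names and the core-count threshold are invented for this example, not Nyquest's actual criteria.

```python
import os
import platform

def preflight() -> dict:
    """Gather basic host facts and pick a processing tier.

    The tier names and the 4-core threshold below are illustrative
    assumptions, not Nyquest's actual preflight criteria.
    """
    cores = os.cpu_count() or 1
    return {
        "cpu_cores": cores,
        "machine": platform.machine(),
        "python": platform.python_version(),
        # Few cores: stick to the fast rules-only path; otherwise
        # also enable the local-LLM semantic stage.
        "tier": "rules-only" if cores < 4 else "rules+semantic",
    }

print(preflight())
```

A real preflight would also probe RAM and GPU VRAM before recommending the semantic stage; the sketch stays stdlib-only to remain self-contained.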
Key Benefits
- Significant Token Savings: Regex-based rules restructure prompts intelligently while preserving context and meaning.
- Performance: Rules-only processing adds under 2 ms of latency, and the proxy sustains up to 1,408 requests per second with efficient use of system resources.
- Flexibility: Offers tiered hardware recommendations allowing users to optimize for different capabilities, from basic rule applications to advanced semantic compression leveraging dedicated GPU resources.
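As a toy illustration of the rules-only idea, a compression rule is just a pattern plus a meaning-preserving replacement. The patterns below are invented for this sketch and are not part of Nyquest's actual 350+ rule set.

```python
import re

# Illustrative (invented) rules: verbose phrasing -> terse equivalent.
RULES = [
    (re.compile(r"\bin order to\b", re.IGNORECASE), "to"),
    (re.compile(r"\bplease make sure that\b", re.IGNORECASE), "ensure"),
    (re.compile(r"\s{2,}"), " "),  # collapse runs of whitespace
]

def compress(prompt: str) -> str:
    """Apply each rule in sequence; rewrites are meaning-preserving."""
    for pattern, replacement in RULES:
        prompt = pattern.sub(replacement, prompt)
    return prompt.strip()

before = "Please make sure that  you respond in order to help the user."
after = compress(before)
print(after)  # -> "ensure you respond to help the user."
```

Every character removed this way is a token the upstream model never has to read, which is where the savings at the rules-only tier come from.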
How It Works
Nyquest utilizes a six-stage pipeline for processing prompts:
- Normalizer: Resolves duplicate and conflicting instructions, optimizing prompt clarity.
- OpenClaw Agent Mode: Implements 7-strategy optimization to enhance autonomous agent performance.
- Cache Reorder: Enhances efficiency by optimizing for provider-side caching.
- Compression Engine: Applies over 350 rules across 18 categories to compress the prompts.
- Semantic LLM Stage: Condenses and refines prompts further using the Qwen 2.5 model.
- Auto-Scale + Forward: Dynamically adapts compression levels based on usage patterns for optimal performance.
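The six stages above amount to a chain of prompt transformations. The sketch below mirrors that structure with stubbed stage bodies; only the stage names come from the list above, and the toy logic inside each stub is an assumption for illustration.

```python
from typing import Callable, List

Stage = Callable[[str], str]

def normalizer(p: str) -> str:
    # Toy stand-in for conflict resolution: drop duplicate sentences.
    seen = dict.fromkeys(s.strip() for s in p.split(".") if s.strip())
    return ". ".join(seen) + "."

# The remaining stages are pass-through stubs in this sketch.
def agent_mode(p: str) -> str: return p          # 7-strategy optimization
def cache_reorder(p: str) -> str: return p       # reorder for provider caching
def compression_engine(p: str) -> str:
    return " ".join(p.split())                   # toy stand-in for 350+ rules
def semantic_stage(p: str) -> str: return p      # local-LLM condensation
def autoscale_forward(p: str) -> str: return p   # adapt level, then forward

PIPELINE: List[Stage] = [
    normalizer, agent_mode, cache_reorder,
    compression_engine, semantic_stage, autoscale_forward,
]

def run(prompt: str) -> str:
    for stage in PIPELINE:
        prompt = stage(prompt)
    return prompt

print(run("Be concise. Be concise.  Answer in English."))
# -> "Be concise. Answer in English."
```

Keeping each stage as a plain string-to-string function makes the pipeline easy to reorder or extend, which matches the staged design described above.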
Integration & API
Nyquest works behind the scenes, communicating with LLM APIs through the OpenAI-compatible standard. Additional request headers let users set the compression level and routing parameters, enabling seamless integration into existing workflows.
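Because the proxy speaks the OpenAI-compatible wire format, integration typically means pointing an existing client at it and adding headers. In the sketch below, the localhost URL and the `X-Nyquest-Compression` header name are assumptions invented for illustration; the request is built but never sent, to keep the example self-contained.

```python
import json
import urllib.request

# Hypothetical proxy endpoint -- the port and path are assumptions.
NYQUEST_URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt: str, level: str = "balanced") -> urllib.request.Request:
    """Build an OpenAI-compatible chat request routed through the proxy.

    The X-Nyquest-Compression header name is an assumption for this
    sketch, not a documented Nyquest header.
    """
    body = json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        NYQUEST_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer $UPSTREAM_API_KEY",  # placeholder
            "X-Nyquest-Compression": level,               # assumed header
        },
        method="POST",
    )

req = build_request("Summarize the attached report in three bullet points.")
print(req.full_url)
```

Sending it with `urllib.request.urlopen(req)` would forward the compressed prompt to the configured upstream provider.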
Metrics & Statistics
Nyquest's robust metrics dashboard provides real-time insights into token savings, request counts, and compression effectiveness. Users can monitor the engine's performance, analyze rule application frequencies, and ensure optimal operation within desired parameters.
For detailed metrics and capabilities, users can access the metrics endpoint, which tracks every rule category employed across requests.
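A savings calculation over such metrics might look like the sketch below. The payload shape and field names are assumptions, and a canned JSON document stands in for a live call to the metrics endpoint so the example is self-contained.

```python
import json

# Canned payload shaped like a savings summary; the field names (and the
# metrics endpoint it would come from) are assumptions for this sketch.
sample = json.loads("""
{
  "requests": 1200,
  "tokens_in": 480000,
  "tokens_out": 312000,
  "rules": {"whitespace": 900, "filler-phrases": 640}
}
""")

def savings_pct(m: dict) -> float:
    """Token savings as a percentage of input tokens."""
    return 100.0 * (m["tokens_in"] - m["tokens_out"]) / m["tokens_in"]

print(f"{savings_pct(sample):.1f}% saved over {sample['requests']} requests")
# -> 35.0% saved over 1200 requests
```

Tracking per-rule counts (as in the `rules` map above) is what lets you see which rule categories are doing the most work across requests.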
Conclusion
Nyquest stands out as a powerful solution for optimizing LLM token usage, offering substantial savings in both processing time and expenses. Harness its capabilities to refine your interactions with LLMs and enhance operational efficiency.
For further information, please visit nyquest.ai or refer to the documentation.