Embershard is a macOS chat application with its own native large language model (LLM) inference engine, designed specifically for Apple Silicon. The engine keeps complete control over graph construction and execution for the Llama and Qwen architectures, with the aim of staying simple and fast.
Key Features
- Native Inference Engine: Embershard runs inference without depending on other libraries, which keeps model processing focused and efficient. It supports the `llama` and `qwen2` families, including Llama 3.x, Mistral, and Qwen 2.5.
- Optimized Performance: The design includes a resident Key-Value (KV) cache and custom byte-level BPE and SentencePiece tokenizers. According to the project, throughput matches or exceeds other implementations, measured relative to memory bandwidth.
- Proven Accuracy: The engine achieves logit parity with existing systems, giving consistent outputs, and it keeps context across multi-turn conversations, even during prolonged interactions.
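To make the logit parity claim concrete, here is a minimal sketch of how such a check could be done for a single decoding step. It is not Embershard's own test harness: the function name `logit_parity` and the tolerance `atol=1e-3` are assumptions chosen for illustration, and the two input lists stand in for logits dumped from a reference engine and from the engine under test.

```python
def logit_parity(reference, candidate, atol=1e-3):
    """Compare two logit vectors produced for the same prompt position.

    Returns (max_abs_diff, same_argmax, within_tolerance).
    `atol` is an illustrative tolerance, not a value taken from the project.
    """
    if len(reference) != len(candidate):
        raise ValueError("logit vectors must cover the same vocabulary")
    if not reference:
        raise ValueError("logit vectors must not be empty")

    # Largest element-wise deviation between the two engines.
    max_abs_diff = max(abs(a - b) for a, b in zip(reference, candidate))

    # Greedy decoding only agrees if both vectors pick the same token.
    ref_top = max(range(len(reference)), key=reference.__getitem__)
    cand_top = max(range(len(candidate)), key=candidate.__getitem__)

    return max_abs_diff, ref_top == cand_top, max_abs_diff <= atol
```

In practice the comparison would be repeated over many prompts and positions, since small numerical differences can accumulate over a long generation.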
Efficient Tokenization and Model Loading
Embershard loads models from GGUF files, including sharded GGUFs, and checks each model against the available system resources before loading it. Models that are not suitable for the machine are therefore not loaded.
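The loading steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Embershard's loader: it reads the fixed GGUF header (version 2 or later, little-endian), assumes shards follow the `name-00001-of-00003.gguf` naming used by llama.cpp's split tool, and uses a hypothetical `headroom` factor of 0.75 to decide whether the files fit in a memory budget supplied by the caller.

```python
import re
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"
# Assumed shard naming: <stem>-<index>-of-<total>.gguf with 5-digit numbers.
SHARD_RE = re.compile(r"^(?P<stem>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.gguf$")


def read_gguf_header(path):
    """Read the fixed 24-byte GGUF header: magic, version, two counts."""
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} is not a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


def shard_paths(any_shard):
    """Expand one shard path into the full ordered list of shard paths."""
    p = Path(any_shard)
    m = SHARD_RE.match(p.name)
    if m is None:
        return [p]  # a single, unsharded file
    total = int(m["total"])
    return [
        p.with_name(f"{m['stem']}-{i:05d}-of-{total:05d}.gguf")
        for i in range(1, total + 1)
    ]


def fits_in_memory(paths, memory_bytes, headroom=0.75):
    """Hypothetical check: total file size must fit in a share of memory."""
    needed = sum(Path(p).stat().st_size for p in paths)
    return needed <= memory_bytes * headroom
```

A real compatibility check would also need to account for the KV cache and activation buffers, which depend on context length and are not covered by file size alone.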
User-Friendly Interface
Built using SwiftUI, the Embershard application provides a contemporary interface featuring multiple chat modes such as Standard, Agentic (for multi-step tasks), and Arena (for concurrent responses from multiple models). Each chat session can be customized with specific skills, making it versatile and adaptable to different use cases.
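As a rough illustration of the Arena idea, the sketch below sends one prompt to several models at the same time and collects the replies by model name. The application itself is built with SwiftUI, so this Python version is only a conceptual stand-in: `arena_round` and the `generate` callables are hypothetical names, not part of Embershard.

```python
from concurrent.futures import ThreadPoolExecutor


def arena_round(prompt, models):
    """Run one prompt against several models concurrently.

    `models` maps a display name to a callable that takes the prompt and
    returns the generated text. The callables are placeholders for
    whatever actually drives each loaded model.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        # Submit every model first so they all run side by side.
        futures = {name: pool.submit(generate, prompt)
                   for name, generate in models.items()}
        # Collect results in the same order the models were given.
        return {name: future.result() for name, future in futures.items()}
```

Keeping the results keyed by model name makes it straightforward to show the answers next to each other for comparison.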
Example Usage
To run the engine and generate responses, no additional libraries are needed. Here’s a quick example:
./build/gen_gx /path/to/model.gguf "My name is Alice." "What is my name?"
Conclusion
As a project rooted in the principles of independence and modularity, Embershard represents a powerful tool for developers and enthusiasts looking to explore LLM capabilities in a macOS environment. Explore the potential of language models with Embershard, where performance meets simplicity.