compact-dict offers an innovative approach to dictionary data structures in Rust, focusing on customizable layouts and cache-friendly performance. By utilizing linear probing and ensuring contiguous memory layout, it enhances CPU cache efficiency while minimizing pointer chasing, making it ideal for high-performance applications.
compact-dict is an experimental, highly customizable, open-addressing dictionary implemented in Rust. It is designed to optimize performance and memory usage, heavily inspired by the principles of Mojo's Dict. This dictionary utilizes linear probing and maintains a continuous memory layout for string keys, significantly enhancing speed and cache performance for suitable workloads.
Key Advantages of compact-dict
Modern processors tend to perform poorly when faced with pointer chasing, which can lead to cache misses. By ensuring that data remains contiguous, compact-dict minimizes fragmentation and enhances retrieval efficiency.
Features
- Customizable Layout: Offers generic integer types for key-count (
KC) and key-offset (KO) sizes, allowing users to optimize memory based on specific requirements. - Cache Friendly: A linear probing mechanism coupled with optional cached hashes enhances performance.
- SIMD Preparation: Ready for SIMD-accelerated probing using
portable_simd. - Configurable Mutability: Supports optional destructive operations through a constant generic parameter for efficient deletion masking.
- Performant Hash Function: Utilizes a high-performance internal hasher based on aHash by default.
Usage Example
Here’s how to create and use a compact-dict instance:
use compact_dict::dict::Dict;
fn main() {
let mut map: Dict<u32> = Dict::new(8);
// Inserting key-value pairs
map.put("hello", 42);
map.put("world", 100);
// Retrieving values
assert_eq!(map.get_or("hello", 0), 42);
assert_eq!(map.get_or("world", 0), 100);
assert_eq!(map.get_or("missing", 0), 0);
// Checking the count of elements
assert_eq!(map.contains("hello"), true);
assert_eq!(map.len(), 2);
map.clear();
assert_eq!(map.len(), 0);
}
Performance Analysis
compact-dict excels in scenarios where rapid string ingestion is required, such as data analysis or short-lived workloads. Its design is specifically optimized for high throughput, particularly in read-heavy environments.
Strengths
- Continuous Memory Management: It minimizes heap allocations for each unique key by utilizing a single dense
Vec<u8>for string storage. - Optimized Lookups: Leverages SIMD capabilities for effective hash comparisons.
- Ideal for Static Datasets: Preserves speed in append-only or static workloads, making it less suited for applications that require frequent deletions.
Trade-offs
While compact-dict outperforms many other hashing implementations in specific contexts, it also comes with certain constraints:
- Limited Deletion Capability: Designed primarily for append-only operations, making it less suitable for cases requiring regular deletions.
- Generic Key Limitations: Focused on string keys (
&str), which may not be compatible with all data types. - Nightly Rust Dependency: Requires the nightly Rust toolchain for certain features and optimizations, which may limit accessibility for some users.
In summary, compact-dict is engineered for high performance in specialized scenarios and serves as a compelling alternative in the Rust ecosystem for developers seeking an efficient dictionary solution.
No comments yet.
Sign in to be the first to comment.