AV Chaos Monkey is a distributed chaos engineering platform for testing video conferencing systems. It simulates over 1500 WebRTC participants and applies network conditions such as packet loss and jitter to evaluate how the system under test holds up. Chaos events are managed through a REST API.
AV Chaos Monkey is a distributed chaos engineering platform designed for load testing audio and video conferencing systems, such as WebRTC applications. It can simulate over 1500 WebRTC participants, each sending H.264 video and Opus audio streams, while injecting several types of network chaos to validate system resilience under adverse conditions.
Key Features
Media Processing Pipeline
- FFmpeg converts the input video into H.264 Annex-B and Ogg/Opus once, at startup.
- NAL Reader parses the H.264 stream, while Opus Reader extracts audio frames from the Ogg container. Frames are cached in memory and shared, which cuts CPU load by approximately 90% compared to per-participant encoding.
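As an illustration of what the NAL Reader step involves, the sketch below splits an H.264 Annex-B byte stream into NAL units by scanning for start codes (`00 00 01` or `00 00 00 01`) and keeps the result in a list that could be shared across participants. The function names `iter_nal_units` and `nal_type` are hypothetical and are not taken from the project.

```python
from typing import Iterator

START_CODE = b"\x00\x00\x01"  # a 4-byte start code is this with one extra leading zero


def iter_nal_units(data: bytes) -> Iterator[bytes]:
    """Yield NAL units (without start codes) from an Annex-B byte stream."""
    pos = data.find(START_CODE)
    while pos != -1:
        start = pos + len(START_CODE)
        nxt = data.find(START_CODE, start)
        end = len(data) if nxt == -1 else nxt
        # Zero bytes before the next start code are padding or the leading
        # zero of a 4-byte start code, not part of the NAL unit.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            yield nal
        pos = nxt


def nal_type(nal: bytes) -> int:
    """Return nal_unit_type: the low 5 bits of the first header byte."""
    return nal[0] & 0x1F


# Parse once, then let every simulated participant read from the same cache.
stream = b"\x00\x00\x00\x01\x67\xaa\x00\x00\x01\x68\xbb\x00\x00\x00\x01\x65\xcc"
frame_cache = list(iter_nal_units(stream))
```

In this sample stream the three units are an SPS (type 7), a PPS (type 8), and an IDR slice (type 5).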
Control Plane
The control plane features an HTTP server that manages the lifecycle of tests via a REST API, including a Spike Scheduler that distributes chaos events such as:
- Packet loss (1-25%)
- Jitter (10-50ms)
- Bitrate reduction (30-80%)
- Frame drops (10-60%)
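To make the scheduling idea concrete, here is a minimal sketch of a scheduler that assigns randomly chosen chaos events to participants, with each value drawn from the ranges listed above. The `Spike` type, the `schedule_spikes` function, and the key names are assumptions made for illustration; the project's actual scheduler and API payloads may look quite different.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence

# Value ranges taken from the list above; the key names are hypothetical.
SPIKE_RANGES = {
    "packet_loss_pct": (1, 25),
    "jitter_ms": (10, 50),
    "bitrate_reduction_pct": (30, 80),
    "frame_drop_pct": (10, 60),
}


@dataclass(frozen=True)
class Spike:
    participant_id: int
    kind: str
    value: int


def schedule_spikes(participant_ids: Sequence[int], count: int, seed: int = 0) -> List[Spike]:
    """Pick `count` random spikes, each targeting one participant."""
    rng = random.Random(seed)  # seeded so a test run can be reproduced
    kinds = sorted(SPIKE_RANGES)
    spikes = []
    for _ in range(count):
        kind = rng.choice(kinds)
        low, high = SPIKE_RANGES[kind]
        spikes.append(Spike(rng.choice(participant_ids), kind, rng.randint(low, high)))
    return spikes
```

Seeding the generator is a design choice worth keeping in any real scheduler: a failure seen under one chaos schedule can then be replayed exactly.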
Participant Pool
Each participant generates its own RTP streams. Participants are automatically partitioned across pods, so the same test scales from a local setup to a Kubernetes deployment handling up to 1500 participants.
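One simple way to partition participants across pods is to give each pod a contiguous block of participant IDs, with any remainder spread over the first pods. The sketch below assumes this scheme purely for illustration; the source does not say how the project assigns participants, and `partition_participants` is a hypothetical name.

```python
from typing import List


def partition_participants(total: int, pod_count: int) -> List[range]:
    """Split participant IDs 0..total-1 into one contiguous range per pod."""
    if pod_count <= 0:
        raise ValueError("pod_count must be positive")
    base, extra = divmod(total, pod_count)
    ranges = []
    start = 0
    for pod in range(pod_count):
        # The first `extra` pods each take one additional participant.
        size = base + (1 if pod < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

For example, 1500 participants over 4 pods gives 375 participants per pod, and 10 participants over 3 pods gives blocks of 4, 3, and 3.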
Chaos Injection Strategies
Inject chaos using five different spike types:
- Packet Loss: Drops RTP packets to mimic real-world conditions.
- Network Jitter: Introduces latency variation for a more realistic test environment.
- Bitrate Reduction: Simulates varying network bandwidth by throttling video encoding.
- Frame Drops: Skips video frames randomly to test resilience to data loss.
- Bandwidth Limiting: Imposes caps on the overall throughput.
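Two of these spike types, packet loss and jitter, can be sketched in a few lines. The helpers below are hypothetical and only show the general technique: dropping each packet independently with a given probability, and adding a random delay to each send time. They are not the project's implementation.

```python
import random
from typing import List, Sequence


def apply_packet_loss(packets: Sequence[bytes], loss_pct: float, rng: random.Random) -> List[bytes]:
    """Keep each packet with probability (100 - loss_pct) percent."""
    return [p for p in packets if rng.random() * 100 >= loss_pct]


def apply_jitter(send_times_ms: Sequence[float], max_jitter_ms: float, rng: random.Random) -> List[float]:
    """Delay each send time by a random amount between 0 and max_jitter_ms."""
    return [t + rng.uniform(0, max_jitter_ms) for t in send_times_ms]


rng = random.Random(1)
packets = [bytes([i]) for i in range(100)]
survivors = apply_packet_loss(packets, loss_pct=25, rng=rng)
delayed = apply_jitter([0.0, 20.0, 40.0], max_jitter_ms=50, rng=rng)
```

Because each delay is drawn independently, jittered packets can arrive out of order, which is exactly what a receiver's jitter buffer has to absorb.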
Observability Stack
Prometheus collects real-time metrics, including participant counts, packet statistics, and network conditions, and Grafana visualizes them. Together they give detailed insight into the performance and health of the test environment.
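For readers unfamiliar with how Prometheus ingests data, the sketch below renders a few values in the Prometheus text exposition format, which a scrape endpoint would return. The metric names and the `render_metrics` helper are invented for this example; the project's real metric names are not given here.

```python
from typing import Dict, Tuple

# name -> (metric type, help text, current value); all names are hypothetical.
Metrics = Dict[str, Tuple[str, str, float]]


def render_metrics(metrics: Metrics) -> str:
    """Render metrics in the Prometheus text exposition format."""
    lines = []
    for name, (kind, help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


body = render_metrics({
    "participants_active": ("gauge", "Active simulated participants", 1500),
    "rtp_packets_sent_total": ("counter", "RTP packets sent", 120000),
})
```

In practice an official Prometheus client library would normally produce this output; the hand-written version is only meant to show what the scraped text looks like.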
Client Integration
AV Chaos Monkey integrates with existing video call systems through two receivers: a UDP Receiver for aggregated RTP streams and a WebRTC Receiver for direct WebRTC connections. Either option can be used to test Selective Forwarding Unit (SFU), Multipoint Control Unit (MCU), or mesh architectures.
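On the receiving side, a client reading aggregated RTP over UDP needs to decode at least the fixed 12-byte RTP header of each datagram. The sketch below does that with the standard `struct` module; `RtpHeader` and `parse_rtp_header` are hypothetical names, and any framing the project adds around the RTP packets is not modeled.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(datagram: bytes) -> RtpHeader:
    """Decode the fixed 12-byte RTP header at the start of a datagram."""
    if len(datagram) < 12:
        raise ValueError("datagram too short for an RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    return RtpHeader(
        version=b0 >> 6,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

Tracking gaps in `sequence_number` per `ssrc` is a straightforward way for a receiver to confirm that injected packet loss actually reached it.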
Deployment Options
AV Chaos Monkey can be run in various environments:
- Local Development: Perfect for development and debugging small-scale tests (1-100 participants).
- Docker Compose: Ideal for CI/CD and medium-scale tests (100-500 participants).
- Kubernetes: Best for large-scale tests, supporting 500-1500 participants with efficient horizontal scaling.
With AV Chaos Monkey, teams can test their video conferencing systems under a range of degraded network conditions before those conditions occur in production. For detailed API references, configuration options, and examples of chaos test execution, consult the repository README.