Bouncy is a minimal headless browser built in Rust for web scraping. It ships as a single binary with no external runtime, executes JavaScript only when a page needs it, and can be driven from the command line, over the Model Context Protocol, or through a Chrome DevTools Protocol server.
Overview
Bouncy is a lightweight, headless web browser built in Rust, designed to facilitate scraping with minimal setup requirements. This performance-oriented tool allows users to easily extract HTML, text, and links from web pages, effectively handling JavaScript rendering as needed. It operates as a single executable binary, eliminating the need for additional runtimes like Node.js, Chrome, or Python.
Key Features
- Multiple Modes of Operation: Execute commands via the CLI with bouncy fetch or bouncy scrape; use bouncy-mcp for Model Context Protocol (MCP) integration with applications like Claude; or run a Chrome DevTools Protocol (CDP) server with bouncy serve. All binaries are included in the same release.
- Minimal Overhead: With approximately 10–21 MB of memory usage per page and a total binary size of around 40 MB including V8, Bouncy keeps resource usage low.
- Lazy Loading of V8: V8, the JavaScript engine, only activates when necessary, resulting in impressive cold start times ranging from 3–6 ms for static pages, and 30–80 ms for JavaScript-heavy pages.
- Built-in Stealth Features: Automatically masks browser signatures and randomizes fingerprints across sessions, enhancing scraping stealth without needing configuration.
- Cross-Platform Compatibility: Available binaries for Linux x86_64, macOS (Apple Silicon and Intel), and Windows, ensuring accessibility across various operating systems.
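The lazy V8 loading described in the list above can be pictured with a small sketch. This is not Bouncy's actual code: Page and Engine are hypothetical stand-ins, and the sketch only assumes that the expensive engine is created the first time a script is evaluated and reused afterwards.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for an expensive JavaScript engine such as V8.
struct Engine;

impl Engine {
    /// Pretend this is the costly startup step.
    fn start() -> Engine {
        Engine
    }

    /// Placeholder evaluation: echoes the script instead of running it.
    fn eval(&self, script: &str) -> String {
        format!("evaluated: {script}")
    }
}

/// Hypothetical page handle that owns a lazily created engine.
struct Page {
    html: String,
    engine: OnceLock<Engine>,
}

impl Page {
    fn new(html: &str) -> Page {
        Page { html: html.to_string(), engine: OnceLock::new() }
    }

    /// Static path: returns the HTML without touching the engine.
    fn html(&self) -> &str {
        &self.html
    }

    /// Script path: starts the engine on first use, then reuses it.
    fn eval(&self, script: &str) -> String {
        self.engine.get_or_init(Engine::start).eval(script)
    }

    /// Reports whether the engine has been created yet.
    fn engine_started(&self) -> bool {
        self.engine.get().is_some()
    }
}

fn main() {
    let page = Page::new("<title>Example Domain</title>");
    println!("{}", page.html());
    println!("engine started: {}", page.engine_started());
    println!("{}", page.eval("document.title"));
    println!("engine started: {}", page.engine_started());
}
```

With this pattern a static page never pays the engine startup cost, which is consistent with the gap between the static and JavaScript-heavy cold start figures quoted above.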
Performance Comparisons
When compared to Playwright and similar tools, Bouncy stands out with faster cold start times (3–6 ms), significantly lower memory consumption (10–21 MB per page), and an environment free from heavy runtimes. While Playwright provides extensive layout and rendering capabilities, Bouncy excels in scenarios where a lighter DOM + JS approach suffices.
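To put the per-page memory figure in context, the following sketch does the arithmetic for a batch of pages. It assumes memory grows linearly with the number of open pages and ignores any shared overhead, neither of which the figures above confirm; page_memory_range_mb is a name introduced here for illustration.

```rust
/// Per-page memory range quoted above, in megabytes.
const PAGE_MB_LOW: u64 = 10;
const PAGE_MB_HIGH: u64 = 21;

/// Estimated (low, high) memory use in MB for `pages` open pages,
/// assuming simple linear scaling (an assumption, not a measurement).
fn page_memory_range_mb(pages: u64) -> (u64, u64) {
    (pages * PAGE_MB_LOW, pages * PAGE_MB_HIGH)
}

fn main() {
    for pages in [1, 10, 100] {
        let (low, high) = page_memory_range_mb(pages);
        println!("{pages} pages: roughly {low}-{high} MB");
    }
}
```

Under that assumption, 100 concurrent pages would land somewhere between about 1 GB and 2.1 GB.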
Usage Examples
Fetching a Page
bouncy fetch https://example.com --dump html
bouncy fetch https://example.com --dump links
bouncy fetch https://example.com --dump text
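The same commands can be driven from a program. The sketch below shells out to the CLI using only the Rust standard library. It assumes a bouncy binary is on the PATH and that --dump links prints one link per line; the output format is an assumption, and fetch_args and non_empty_lines are helper names introduced here.

```rust
use std::process::Command;

/// Build the argument list for `bouncy fetch <url> --dump <what>`.
fn fetch_args(url: &str, dump: &str) -> Vec<String> {
    vec![
        "fetch".to_string(),
        url.to_string(),
        "--dump".to_string(),
        dump.to_string(),
    ]
}

/// Split captured output into trimmed, non-empty lines.
/// Assumption: `--dump links` prints one link per line.
fn non_empty_lines(stdout: &str) -> Vec<String> {
    stdout
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect()
}

fn main() {
    let args = fetch_args("https://example.com", "links");
    match Command::new("bouncy").args(&args).output() {
        Ok(out) if out.status.success() => {
            let text = String::from_utf8_lossy(&out.stdout);
            for link in non_empty_lines(&text) {
                println!("{link}");
            }
        }
        // The binary ran but reported a failure.
        Ok(out) => eprintln!("bouncy exited with {}", out.status),
        // The binary could not be started, for example because it is not installed.
        Err(err) => eprintln!("could not start bouncy: {err}"),
    }
}
```

Keeping the argument building and output parsing in separate functions makes both easy to check without the binary installed.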
Running JavaScript
bouncy fetch https://news.example.com --eval "document.title"
Sending a POST Request with Cookies
bouncy fetch https://api.example.com/x -X POST -H 'Authorization: Bearer …' --body '{"hello":"world"}' --cookie-jar ./jar.json
Utilizing Bouncy in a Project
Developers can incorporate Bouncy's functionalities into their Rust applications by including specific crates as dependencies. This modular approach allows for targeted usage without unnecessary overhead:
[dependencies]
bouncy-fetch = "0.1"
bouncy-extract = "0.1"
bouncy-js = "0.1"
bouncy-cdp = "0.1"
bouncy-dom = "0.1"
A simple example of fetching a page title would look like this:
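use bouncy_fetch::Fetcher;
use bouncy_extract::extract_title;
// The lines below belong inside an async function that returns a Result,
// since they use both `.await` and the `?` operator.
let fetcher = Fetcher::new()?;
let resp = fetcher.get("https://example.com").await?;
let title = extract_title(&resp.body)?;
println!("{:?}", title); // Some("Example Domain")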
Conclusion
Bouncy is a good fit for developers and data scientists who want an efficient, versatile web scraping tool with a simple installation process. For workloads that need full layout and rendering capabilities, a tool like Playwright may be more suitable.