dbslice extracts minimal, referentially intact subsets from production databases for debugging. Instead of copying an entire database, it retrieves only the records needed to reproduce a bug and follows their relationships so the subset stays consistent.
dbslice
dbslice helps developers extract minimal, referentially intact database subsets for local development and debugging. It targets bugs that can only be reproduced with specific production data, without the cost of copying the whole database.
The Problem
Reproducing a bug often requires the exact records that triggered it, and those records are hard to obtain from a large production database. dbslice extracts only the rows that are needed by following foreign key relationships, so the resulting subset keeps its referential integrity.

Quick Start
To get started:
# Extract an order and all related records
dbslice extract postgres://localhost/myapp --seed "orders.id=12345" > subset.sql
# Import into local database
psql -d localdb < subset.sql
Key Features
- Zero-Configuration Setup: dbslice introspects the database schema automatically, so no data model file is needed.
- Effortless Data Extraction: A single command is all it takes to extract complete data subsets.
- Sensitive Data Handling: Automatically detects and anonymizes sensitive fields such as emails, phone numbers, and social security numbers by default.
- Multiple Output Formats: Supports exporting data in SQL, JSON, and CSV formats, catering to various use cases.
- Efficient Streaming: Streams rows during extraction to limit memory use on large datasets (100K+ rows).
- Virtual Foreign Keys: Supports Django GenericForeignKeys and other implicit relationships through configuration.
- Configurable Extractions: Optional YAML configuration files make extractions repeatable.
- Data Validation: Checks the referential integrity of the extracted dataset.
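To make the Data Validation feature concrete, here is a minimal sketch of what a referential integrity check on an extracted subset could look like. It is not dbslice's implementation: the in-memory layout (`rows`, `foreign_keys`) and the function name `find_dangling_references` are assumptions chosen for illustration.

```python
# Hypothetical in-memory subset: table -> {primary key -> row}.
rows = {
    "users": {1: {"id": 1}},
    "orders": {
        100: {"id": 100, "user_id": 1},
        101: {"id": 101, "user_id": 2},  # user 2 is missing from the subset
    },
}

# Hypothetical schema description: child table -> [(fk_column, parent_table)].
foreign_keys = {
    "users": [],
    "orders": [("user_id", "users")],
}


def find_dangling_references(rows, foreign_keys):
    """Return (table, pk, column, missing_parent_pk) for every broken reference."""
    problems = []
    for table, fks in foreign_keys.items():
        for pk, row in rows.get(table, {}).items():
            for column, parent in fks:
                value = row.get(column)
                # NULL foreign keys are allowed; anything else must resolve.
                if value is not None and value not in rows.get(parent, {}):
                    problems.append((table, pk, column, value))
    return problems


problems = find_dangling_references(rows, foreign_keys)
```

An empty result means every foreign key in the subset points at a row that was also extracted, which is the property that lets the subset be imported without constraint violations.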
Database Support
| Database | Status |
|---|---|
| PostgreSQL | Fully supported |
| MySQL | Planned (not yet implemented) |
| SQLite | Planned (not yet implemented) |
Example Usages
Basic Extraction:
# Extract by primary key
dbslice extract postgres://user:pass@host:5432/db --seed "orders.id=12345"
# Extract with WHERE clause
dbslice extract postgres://localhost/db --seed "orders:status='failed' AND created_at > '2024-01-01'"
Anonymization Example:
# Auto-anonymize detected sensitive fields
dbslice extract postgres://... --seed "users.id=1" --anonymize
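The idea behind anonymizing detected fields can be sketched in a few lines of Python. This is an illustration only: the column-name patterns, the replacement formats, and the function names are assumptions, not dbslice's actual detection rules or output.

```python
import hashlib
import re

# Assumed column-name heuristics; dbslice's real detection rules may differ.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"e[-_]?mail", re.IGNORECASE),
    "phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "ssn": re.compile(r"ssn|social_security", re.IGNORECASE),
}


def classify_column(column_name):
    """Return the kind of sensitive data a column name suggests, or None."""
    for kind, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(column_name):
            return kind
    return None


def fake_value(kind, original):
    """Derive a replacement from a hash so equal inputs map to equal outputs."""
    token = hashlib.sha256(str(original).encode("utf-8")).hexdigest()
    if kind == "email":
        return f"user_{token[:10]}@example.com"
    digits = str(int(token, 16)).zfill(10)[-10:]
    if kind == "phone":
        return f"555-{digits[:3]}-{digits[3:7]}"
    return f"000-{digits[:2]}-{digits[2:6]}"  # remaining kind: ssn


def anonymize_row(row):
    """Return a copy of the row with sensitive-looking columns replaced."""
    cleaned = {}
    for column, value in row.items():
        kind = classify_column(column)
        cleaned[column] = fake_value(kind, value) if kind and value is not None else value
    return cleaned
```

Deriving the replacement from a hash keeps the mapping deterministic, so the same email appearing in two tables is replaced by the same fake address and joins on that column still line up. An unsalted hash is used here only for brevity; a production tool would likely mix in a secret.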
Output Formats:
# SQL (default)
dbslice extract postgres://... --seed "orders.id=1" --output sql
# JSON fixtures
dbslice extract postgres://... --seed "orders.id=1" --output json --out-file fixtures/
How It Works
- Introspection: Reads the database schema to discover tables and foreign key relationships.
- Traversal: Begins extraction from designated seed records, following foreign key relationships.
- Data Extraction: Collects all identified records.
- Sorting: Orders the tables so that parent rows are inserted before the rows that reference them.
- Output Generation: Writes the final dataset in the requested format.
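The traversal and sorting steps above can be sketched on a toy in-memory schema. This is a simplified illustration under stated assumptions, not dbslice's code: `FOREIGN_KEYS`, `ROWS`, `collect_subset`, and `insertion_order` are hypothetical names, and the rule of walking down to child rows only from the seed (while always walking up to parents) is one possible policy for keeping the subset small.

```python
from collections import deque
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical schema: child table -> [(fk_column, parent_table)].
FOREIGN_KEYS = {
    "users": [],
    "products": [],
    "orders": [("user_id", "users")],
    "order_items": [("order_id", "orders"), ("product_id", "products")],
}

# Hypothetical data: table -> {primary key -> row}.
ROWS = {
    "users": {1: {"id": 1}, 2: {"id": 2}},
    "products": {10: {"id": 10}, 11: {"id": 11}},
    "orders": {100: {"id": 100, "user_id": 1}, 101: {"id": 101, "user_id": 2}},
    "order_items": {
        1000: {"id": 1000, "order_id": 100, "product_id": 10},
        1001: {"id": 1001, "order_id": 101, "product_id": 11},
    },
}


def collect_subset(seed_table, seed_id):
    """Breadth-first walk: always fetch parents, descend to children from the seed."""
    selected = {table: set() for table in ROWS}
    descended = set()
    queue = deque([(seed_table, seed_id, True)])
    while queue:
        table, pk, descend = queue.popleft()
        if pk not in selected[table]:
            selected[table].add(pk)
            row = ROWS[table][pk]
            # Parents are required for referential integrity.
            for column, parent in FOREIGN_KEYS[table]:
                if row.get(column) is not None:
                    queue.append((parent, row[column], False))
        if descend and (table, pk) not in descended:
            descended.add((table, pk))
            # Children: rows in other tables whose foreign key points here.
            for child, fks in FOREIGN_KEYS.items():
                for column, parent in fks:
                    if parent != table:
                        continue
                    for child_pk, child_row in ROWS[child].items():
                        if child_row.get(column) == pk:
                            queue.append((child, child_pk, True))
    return selected


def insertion_order(tables):
    """Topologically sort tables so every parent precedes its children."""
    graph = {t: {p for _, p in FOREIGN_KEYS[t] if p != t} for t in tables}
    return list(TopologicalSorter(graph).static_order())


subset = collect_subset("orders", 100)
order = insertion_order(subset)
```

Here the seed `orders.id=100` pulls in its user, its order items, and the products those items reference, while order 101 and its rows are left out. `TopologicalSorter` raises `CycleError` on circular foreign keys, which a real tool would need to handle separately, for example by deferring constraints.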
With dbslice, preparing a debugging dataset becomes a single repeatable command, so developers can spend their time on the bug rather than on data extraction.