S3 File Viewer
Easily browse and query S3 files through a web interface.
Pitch

S3 File Viewer is a web application for browsing S3 buckets and querying their files with SQL. It supports .csv and .txt files and lets users inspect data structures, run queries with DuckDB, and navigate through a simple interface. It suits anyone who needs quick insights without a complex setup.

Description

S3 File Viewer is a web application for browsing Amazon S3 buckets and running SQL queries on the text files they contain. It is built with Go, HTMX, the AWS SDK, and DuckDB, and is aimed at users who need a straightforward tool for data inspection.

Key Features

  • S3 Bucket Navigation: Explore a bucket through folder-style navigation based on key prefixes.
  • File Access: Only .csv and .txt files can be opened for viewing.
  • Column Inspection: Column names and types are displayed automatically for tabular files, making the data structure easy to understand.
  • SQL Query Execution: Run SQL queries directly against a file from the data view, using the DuckDB integration.
  • Raw Text Preview: Unstructured files fall back to a raw text preview, so their content can be inspected without SQL.
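
Folder-style navigation over S3 usually rests on key prefixes: listing with a "/" delimiter groups keys into pseudo-folders. As a minimal sketch of the path handling such a browser needs, the Go code below turns a prefix into breadcrumb links and computes the parent prefix for an "up" link. The Crumb type and both function names are hypothetical and are not taken from the project.

```go
package main

import "strings"

// Crumb is one clickable segment of a breadcrumb trail (hypothetical type).
type Crumb struct {
	Name   string // segment label, e.g. "2024"
	Prefix string // full prefix up to and including this segment, e.g. "logs/2024/"
}

// Breadcrumbs splits an S3 prefix such as "logs/2024/01/" into cumulative segments.
// Empty segments (from leading, trailing, or doubled slashes) are skipped.
func Breadcrumbs(prefix string) []Crumb {
	var crumbs []Crumb
	acc := ""
	for _, part := range strings.Split(prefix, "/") {
		if part == "" {
			continue
		}
		acc += part + "/"
		crumbs = append(crumbs, Crumb{Name: part, Prefix: acc})
	}
	return crumbs
}

// ParentPrefix returns the prefix one level up, or "" at the bucket root.
func ParentPrefix(prefix string) string {
	trimmed := strings.TrimSuffix(prefix, "/")
	i := strings.LastIndex(trimmed, "/")
	if i < 0 {
		return ""
	}
	return trimmed[:i+1]
}
```

For example, Breadcrumbs("logs/2024/01/") yields three crumbs whose prefixes are "logs/", "logs/2024/", and "logs/2024/01/", and ParentPrefix("logs/2024/01/") is "logs/2024/".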

Configuration Options

S3 File Viewer is configured through the following environment variables:

Variable            Default      Description
AWS_REGION          (required)   AWS region used for S3 access.
PORT                8080         HTTP listening port.
MAX_QUERY_ROWS      1000         Maximum number of rows returned per SQL query.
QUERY_TIMEOUT_SEC   30           SQL query timeout, in seconds.
BUCKET_ALLOWLIST    (empty)      Allowed bucket names; leave empty to allow all buckets.
RAW_PREVIEW_BYTES   262144       Maximum number of bytes shown in the raw text preview (256 KiB).
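
To illustrate how these variables and their defaults might be read at startup, here is a minimal Go sketch. The Config struct, the helper names, and the comma-separated format for BUCKET_ALLOWLIST are assumptions made for the example; the project's own internal/config package may differ.

```go
package main

import (
	"errors"
	"os"
	"strconv"
	"strings"
	"time"
)

// Config mirrors the documented environment variables (hypothetical struct).
type Config struct {
	AWSRegion       string
	Port            int
	MaxQueryRows    int
	QueryTimeout    time.Duration
	BucketAllowlist []string // empty means every bucket is allowed
	RawPreviewBytes int
}

// intFromEnv returns the positive integer stored under key, or def when unset.
func intFromEnv(getenv func(string) string, key string, def int) (int, error) {
	raw := strings.TrimSpace(getenv(key))
	if raw == "" {
		return def, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return 0, errors.New(key + " must be a positive integer")
	}
	return n, nil
}

// LoadConfig reads settings through getenv so it can be tested without
// touching the real process environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{AWSRegion: strings.TrimSpace(getenv("AWS_REGION"))}
	if cfg.AWSRegion == "" {
		return Config{}, errors.New("AWS_REGION is required")
	}
	var err error
	if cfg.Port, err = intFromEnv(getenv, "PORT", 8080); err != nil {
		return Config{}, err
	}
	if cfg.MaxQueryRows, err = intFromEnv(getenv, "MAX_QUERY_ROWS", 1000); err != nil {
		return Config{}, err
	}
	if cfg.RawPreviewBytes, err = intFromEnv(getenv, "RAW_PREVIEW_BYTES", 262144); err != nil {
		return Config{}, err
	}
	secs, err := intFromEnv(getenv, "QUERY_TIMEOUT_SEC", 30)
	if err != nil {
		return Config{}, err
	}
	cfg.QueryTimeout = time.Duration(secs) * time.Second
	// Assumption: bucket names are separated by commas.
	for _, name := range strings.Split(getenv("BUCKET_ALLOWLIST"), ",") {
		if name = strings.TrimSpace(name); name != "" {
			cfg.BucketAllowlist = append(cfg.BucketAllowlist, name)
		}
	}
	return cfg, nil
}

// LoadConfigFromEnv reads the real process environment.
func LoadConfigFromEnv() (Config, error) {
	return LoadConfig(os.Getenv)
}
```

Passing the lookup function as a parameter is a design choice for testability: a test can supply a map-backed function instead of mutating the process environment.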

Usage Instructions

A typical session moves through the following steps:

  1. Home: Enter the S3 bucket name and an optional prefix to begin browsing.

  2. Browse: Navigate through folders and select .csv or .txt files to view content.

  3. File View: For tabular files, the interface features:

    • A Columns pane that displays schema info.
    • A SQL Query pane allowing users to run queries like SELECT * FROM data LIMIT 100.
    • A Results pane to view outputs from executed queries.

    For unstructured text files, a raw preview is provided instead of the SQL interface.
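
A raw preview limited to RAW_PREVIEW_BYTES has to cut the file somewhere, and a cut at an arbitrary byte can split a multi-byte UTF-8 character. The sketch below shows one way to handle that in Go; the function names are hypothetical and the project's actual preview logic is not shown in this description.

```go
package main

import (
	"io"
	"unicode/utf8"
)

// TruncatePreview returns at most maxBytes of data as a string and reports
// whether anything was cut off. If the cut lands inside a multi-byte UTF-8
// character, the incomplete trailing bytes are dropped.
func TruncatePreview(data []byte, maxBytes int) (string, bool) {
	if maxBytes < 0 {
		maxBytes = 0
	}
	if len(data) <= maxBytes {
		return string(data), false
	}
	cut := data[:maxBytes]
	// A partial character is at most UTFMax-1 bytes long, so bound the loop.
	for i := 0; i < utf8.UTFMax-1 && len(cut) > 0; i++ {
		r, size := utf8.DecodeLastRune(cut)
		if r != utf8.RuneError || size != 1 {
			break // the buffer ends on a complete character
		}
		cut = cut[:len(cut)-1]
	}
	return string(cut), true
}

// ReadPreview reads one byte more than the limit so that truncation can be
// detected without downloading the whole object.
func ReadPreview(src io.Reader, maxBytes int) (string, bool, error) {
	if maxBytes < 0 {
		maxBytes = 0
	}
	buf, err := io.ReadAll(io.LimitReader(src, int64(maxBytes)+1))
	if err != nil {
		return "", false, err
	}
	text, truncated := TruncatePreview(buf, maxBytes)
	return text, truncated, nil
}
```

With the default limit, a caller would pass 262144 as maxBytes and could show a "preview truncated" notice whenever the boolean result is true.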

SQL Execution Safety

S3 File Viewer enforces safety measures during SQL execution:

  • Only SELECT queries are permitted.
  • Multi-statement queries are not allowed.
  • Destructive commands such as DROP, INSERT, or CREATE are blocked.
  • Results are capped at the defined MAX_QUERY_ROWS limit.
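
The rules above can be approximated with a small pre-execution check. The following Go sketch is one possible approach, not the project's actual implementation: ValidateSelect and CapRows are hypothetical names, and the keyword scan is a deliberately conservative heuristic that also rejects queries containing a blocked word inside a string literal.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

var (
	startsWithSelect = regexp.MustCompile(`(?i)^select\b`)
	// The three commands named in the list above; extend as needed.
	blockedKeyword = regexp.MustCompile(`(?i)\b(drop|insert|create)\b`)
)

// ValidateSelect returns the cleaned query if it is a single SELECT statement.
func ValidateSelect(query string) (string, error) {
	q := strings.TrimSpace(query)
	q = strings.TrimSpace(strings.TrimSuffix(q, ";")) // allow one trailing semicolon
	if q == "" {
		return "", errors.New("empty query")
	}
	if strings.Contains(q, ";") {
		return "", errors.New("multi-statement queries are not allowed")
	}
	if !startsWithSelect.MatchString(q) {
		return "", errors.New("only SELECT queries are permitted")
	}
	if kw := blockedKeyword.FindString(q); kw != "" {
		return "", errors.New("blocked keyword: " + strings.ToUpper(kw))
	}
	return q, nil
}

// CapRows wraps a validated query so that at most maxRows rows come back,
// whatever LIMIT the user wrote.
func CapRows(q string, maxRows int) string {
	return fmt.Sprintf("SELECT * FROM (%s) AS capped LIMIT %d", q, maxRows)
}
```

Wrapping the query in an outer SELECT with a LIMIT enforces the row cap in the database rather than by discarding rows after they are fetched. A string-level check like this is only a first line of defense and is best combined with a database connection that cannot modify anything.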

Project Structure

The code is organized into distinct packages, including:

cmd/server/          HTTP server entrypoint
internal/config/     Environment configuration
internal/s3/         AWS S3 listing and object reads
internal/duckdb/     DuckDB pool, schema, and query execution
internal/files/      Text file detection and reader selection
internal/handlers/   HTTP handlers, templates, and static assets

S3 File Viewer serves as a robust foundation for efficiently browsing and querying files stored in Amazon S3, aiding data professionals in their analysis tasks.
