Distributed File Storage Service
Self-hosted, S3-compatible file storage with robust features.
Pitch

This project is an S3-compatible file storage service that runs on your own hardware. It includes erasure coding, JWT authentication, and Kubernetes support, so you get an S3-style API without cloud storage fees or vendor lock-in, and you keep full control of your data.

Description

The Distributed File Storage Service is an S3-compatible object store that runs on your own hardware, so you avoid the recurring fees of public cloud storage. It combines erasure coding and replication with a PostgreSQL-backed metadata store for efficient, reliable data management.

Key Features

  • S3-Compatible API: Supports core AWS S3 operations, including buckets, objects, multipart uploads, and range requests.
  • Robust Authentication: Secure JWT login with refresh tokens and RBAC (Role-Based Access Control) for effective access management.
  • Advanced Storage Solutions: Erasure coding (Reed-Solomon), file chunking, SHA-256 deduplication, and data replication to protect data integrity and availability.
  • Performance Monitoring: Exposes Prometheus metrics, with structured logging and audit trails for observability.
  • Seamless Deployment Options: Easily deployable via Docker Compose, Kubernetes manifests, or local development setups.
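
The chunking and SHA-256 deduplication listed above can be pictured with a minimal Python sketch. It is not the project's implementation: the chunk size, the in-memory dictionary, and the names `ChunkStore`, `put`, and `get` are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; kept tiny so the example is easy to follow


class ChunkStore:
    """In-memory content-addressed store: one copy per distinct chunk."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def put(self, data: bytes) -> list:
        """Split data into fixed-size chunks and return their digests in order."""
        manifest = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Identical chunks share a digest, so they are stored only once.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def get(self, manifest: list) -> bytes:
        """Reassemble an object from its ordered list of chunk digests."""
        return b"".join(self.chunks[digest] for digest in manifest)


store = ChunkStore()
manifest = store.put(b"AAAABBBBAAAA")
print(len(manifest), "chunks referenced,", len(store.chunks), "stored")
# prints: 3 chunks referenced, 2 stored
```

The first and third chunks are identical, so the object references three chunks while only two are kept. A real store would persist chunks to disk and keep the manifest in the metadata database.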

Cost-Effective Alternative to AWS S3

Moving to self-hosted storage can significantly reduce costs while providing full control over data:

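| Cost Comparison      | AWS S3                                  | Distributed File Storage Service       |
|----------------------|-----------------------------------------|----------------------------------------|
| Monthly Cost         | ~$23 per TB (storage only)              | No service fee (uses existing hardware) |
| Vendor Lock-in       | Yes                                     | No                                     |
| Erasure Coding       | Not user-configurable                   | Yes (Reed-Solomon)                     |
| Pricing Transparency | Usage-based (storage, requests, egress) | Your own hardware costs                |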
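
How much raw disk self-hosting needs depends on the redundancy scheme. The arithmetic below is a sketch with hypothetical parameters (4 data shards plus 2 parity shards, versus 3 full replicas); the service's actual defaults are not stated here.

```python
def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per logical byte with k data + m parity shards."""
    return (data_shards + parity_shards) / data_shards


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per logical byte when keeping full copies."""
    return float(replicas)


logical_tb = 1.0
rs_factor = erasure_overhead(4, 2)       # survives the loss of any 2 shards
rep_factor = replication_overhead(3)     # survives the loss of any 2 copies

print(f"Reed-Solomon 4+2: {logical_tb * rs_factor:.2f} TB raw")
print(f"3x replication:   {logical_tb * rep_factor:.2f} TB raw")
```

With these example parameters both schemes tolerate two failures, but erasure coding needs 1.5 TB of raw disk per logical terabyte where triple replication needs 3 TB.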

Quick Start

To set up and validate the service, run the following commands. They assume Docker with the Compose plugin, curl, and jq are installed:

# 1. Clone the repository
git clone https://github.com/aman179102/distributed-file-storage-service.git
cd distributed-file-storage-service

# 2. Start services
docker compose up -d

# 3. Verify the service health
curl http://localhost:8080/health

# 4. Log in (the response contains the JWT used in the next steps)
curl -s -X POST http://localhost:8080/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"admin@example.com","password":"admin123"}' | jq .

# 5. Create a new bucket (set TOKEN to the JWT returned by step 4)
TOKEN="<your-jwt-token>"
curl -s -X POST http://localhost:8080/api/v1/buckets \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name":"my-first-bucket"}' | jq .

# 6. Upload a file (the URL assumes the new bucket has ID 1)
echo "Hello S3!" > test.txt
curl -s -X PUT "http://localhost:8080/api/v1/buckets/1/objects/test.txt" \
  -H "Authorization: Bearer $TOKEN" \
  --data-binary @test.txt | jq .

# 7. Download the file
curl -s "http://localhost:8080/api/v1/buckets/1/objects/test.txt" \
  -H "Authorization: Bearer $TOKEN"

# 8. Fetch the Prometheus metrics
curl http://localhost:8080/metrics | head -20
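
The same flow can be scripted. The sketch below builds the requests from steps 4, 6, and 7 with Python's standard library without sending them, so it runs without a live server. The paths mirror the curl commands above; the helper names are hypothetical, and the `Range` header on download is an assumption (range requests are listed as a feature, but whether this endpoint honors them is not confirmed here).

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl examples


def login_request(email: str, password: str) -> urllib.request.Request:
    """Step 4: POST credentials as JSON."""
    body = json.dumps({"email": email, "password": password}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def object_url(bucket_id: int, key: str) -> str:
    """Object path used by steps 6 and 7, with the key URL-encoded."""
    return f"{BASE_URL}/api/v1/buckets/{bucket_id}/objects/{urllib.parse.quote(key)}"


def upload_request(token: str, bucket_id: int, key: str, payload: bytes):
    """Step 6: PUT the raw bytes with a bearer token."""
    return urllib.request.Request(
        object_url(bucket_id, key),
        data=payload,
        headers={"Authorization": f"Bearer {token}"},
        method="PUT",
    )


def download_request(token: str, bucket_id: int, key: str, byte_range=None):
    """Step 7: GET the object, optionally asking for an inclusive byte range."""
    headers = {"Authorization": f"Bearer {token}"}
    if byte_range is not None:
        first, last = byte_range
        headers["Range"] = f"bytes={first}-{last}"  # assumed to be supported
    return urllib.request.Request(
        object_url(bucket_id, key), headers=headers, method="GET"
    )


req = download_request("<your-jwt-token>", 1, "test.txt", byte_range=(0, 4))
print(req.get_method(), req.full_url, req.get_header("Range"))
```

To send any of these against a running instance, pass the request to `urllib.request.urlopen(req)` and read the response body.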

Architecture Overview

The service is organized in layers, from the HTTP API at the top down to storage, infrastructure, and observability:

┌──────────────────────────────────────────┐
│            HTTP/REST API                 │
│       (S3-Compatible Endpoints)          │
├──────────────────────────────────────────┤
│      Auth & Authorization Layer          │
│       JWT + RBAC + IAM Policies          │
├──────────────────────────────────────────┤
│        Business Logic Layer              │
│  File Service · Bucket Service · Policy  │
├──────────────────────────────────────────┤
│           Storage Layer                  │
│  Chunking · Dedup · Erasure · Replication│
├──────────────────────────────────────────┤
│     Infrastructure                       │
│  PostgreSQL │ Redis │ Local Disk Store   │
├──────────────────────────────────────────┤
│     Observability                        │
│  Metrics │ JSON Logs │ Audit Trails      │
└──────────────────────────────────────────┘
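
To make the storage layer's erasure step concrete, here is a deliberately simplified sketch. It uses a single XOR parity shard, which can rebuild exactly one lost shard; the service itself uses Reed-Solomon, which extends the same idea to several parity shards. All function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal shards (zero-padded) plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def recover(shards: list) -> list:
    """Rebuild at most one missing shard (marked None) by XOR-ing the others."""
    missing = [i for i, shard in enumerate(shards) if shard is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost shard")
    if missing:
        present = [shard for shard in shards if shard is not None]
        shards[missing[0]] = reduce(xor_bytes, present)
    return shards


shards = encode(b"Hello S3!", k=3)   # 3 data shards + 1 parity shard
shards[1] = None                     # simulate a failed disk
restored = recover(shards)
print(b"".join(restored[:3]))
```

Because the parity shard is the XOR of all data shards, XOR-ing the surviving shards reproduces the missing one. A real system would also record the original length so that padding can be removed safely.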

Conclusion

This Distributed File Storage Service suits teams that want reliable, cost-effective file storage with S3 compatibility. It builds on open-source components such as PostgreSQL, Redis, and Prometheus, and pairs them with erasure coding and replication to provide a production-grade setup on hardware you control.
