LeakCTL is an innovative platform designed to automatically capture and analyze .txt files shared in Telegram channels. By integrating Telegram monitoring with Elasticsearch, it offers powerful search capabilities and a secure web interface tailored for security researchers and SOC teams to efficiently track and manage data leaks.
LeakCTL: Automated Data Leak Detection and Analysis Platform
LeakCTL is a monitoring system that automatically captures `.txt` files shared in Telegram channels. It extracts the file contents and indexes them into Elasticsearch, and it provides a web interface for analysis and full-text search to support data leak detection.
Designed specifically for security researchers and Security Operation Center (SOC) teams, LeakCTL streamlines the process of tracking and analyzing data leaks in a scalable, automated, and centralized manner.
Key Features
- Telegram Agent (Userbot)
  - Monitors Telegram groups and channels for `.txt` file leaks.
  - Automatically downloads, parses, and indexes contents into Elasticsearch.
  - Avoids duplication by skipping already-processed files.
  - Supports batch uploads with progress tracking and retry logic.
  - Includes Telegram-based logging and alerting features.
- Web Interface
  - Secure user login system with session management and rate limiting.
  - Full-text wildcard search across indexed leak contents via Elasticsearch.
  - One-click export of search results as `.txt` files.
  - Live pagination for navigating large result sets.
  - Streamlined user interface with role-based access restrictions.
- Elasticsearch Integration
  - Manages custom Index Lifecycle Management (ILM) policies and index templates.
  - Features daily index rollover and automatic deletion of outdated indices.
  - Optimized for high-performance ingestion and search, even with millions of indexed entries.
- Docker-Ready Deployment
  - Easily set up with `docker-compose`, including installation scripts for Elasticsearch and Docker.
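The daily rollover and automatic deletion listed under Elasticsearch Integration could be expressed roughly as follows. This is a minimal sketch, not the project's actual configuration: the policy name `leaks-policy`, the `leaks-*` index pattern, the `leaks` rollover alias, and the 30-day retention period are all assumptions made for illustration. The dictionaries follow the request-body shape of the Elasticsearch ILM policy API and the composable index template API.

```python
import json

POLICY_NAME = "leaks-policy"   # hypothetical policy name, not taken from the project
ROLLOVER_ALIAS = "leaks"       # assumed write alias used for rollover
RETENTION = "30d"              # assumed retention; the README does not state one


def build_ilm_policy(retention: str = RETENTION) -> dict:
    """Request body for PUT _ilm/policy/<name>: roll over daily, delete later."""
    return {
        "policy": {
            "phases": {
                # Start a new backing index once the current one is a day old.
                "hot": {"actions": {"rollover": {"max_age": "1d"}}},
                # Delete an index once `retention` has passed since its rollover.
                "delete": {"min_age": retention, "actions": {"delete": {}}},
            }
        }
    }


def build_index_template() -> dict:
    """Request body for PUT _index_template/<name>: attach the policy to new indices."""
    return {
        "index_patterns": [f"{ROLLOVER_ALIAS}-*"],
        "template": {
            "settings": {
                "index.lifecycle.name": POLICY_NAME,
                "index.lifecycle.rollover_alias": ROLLOVER_ALIAS,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
    print(json.dumps(build_index_template(), indent=2))
```

Keeping the policy and template as plain dictionaries makes them easy to review and to send with any HTTP client or Elasticsearch client during setup.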
System Architecture
LeakCTL consists of three primary components that work cohesively for automated data leak detection:
- Telegram Userbot (`user_bot.py`)
  - Operates as a headless user on Telegram, scanning target chats for any `.txt` file leaks.
  - Downloads and parses each file line-by-line, preparing it for upload.
  - Uploads the organized data to an Elasticsearch index named `leaks`.
- Elasticsearch Backend
  - Stores all extracted lines together with relevant metadata such as `file_name`, `line_number`, and `@timestamp`.
  - Implements ILM to automate daily index rollover and deletion of obsolete indices, keeping full-text search performant.
- Flask Web Interface (`app.py`)
  - Provides a secure frontend for analysts to search and view leaked contents in detail.
  - Supports full-text wildcard search and pagination to manage extensive datasets efficiently.
  - Exports results as downloadable `.txt` files, with rate limiting and session expiry as built-in protections against spam.
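The data flow described above (one document per line of a file, then a paginated wildcard search over those documents) can be sketched without a live cluster. This is an illustration under stated assumptions, not the code in `user_bot.py` or `app.py`: the function names and the `content` field holding the line text are hypothetical, while `file_name`, `line_number`, `@timestamp`, and the `leaks` index come from the description above.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator

INDEX_NAME = "leaks"  # index name from the architecture description


def build_bulk_actions(file_name: str, lines: Iterable[str]) -> Iterator[dict]:
    """Yield one index action per non-empty line of a downloaded file."""
    timestamp = datetime.now(timezone.utc).isoformat()
    for line_number, raw in enumerate(lines, start=1):
        content = raw.strip()
        if not content:
            continue  # blank lines are skipped; numbering still follows the file
        yield {
            "_index": INDEX_NAME,
            "_source": {
                "file_name": file_name,
                "line_number": line_number,
                "content": content,  # hypothetical field name for the line text
                "@timestamp": timestamp,
            },
        }


def build_search_body(pattern: str, page: int = 1, page_size: int = 50) -> dict:
    """Build a search request body for a wildcard pattern and a 1-based page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"wildcard": {"content": {"value": pattern}}},
        "from": (page - 1) * page_size,  # offset of the first hit on this page
        "size": page_size,
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


if __name__ == "__main__":
    sample = ["alice@example.com:secret\n", "\n", "bob@example.org:hunter2\n"]
    for action in build_bulk_actions("dump.txt", sample):
        print(action["_source"]["line_number"], action["_source"]["content"])
    print(build_search_body("*@example.com*", page=2, page_size=20))
```

Action dictionaries in this `_index`/`_source` shape are accepted by the `elasticsearch.helpers.bulk` helper of the official Python client. Note that `from`/`size` pagination is capped by the `index.max_result_window` setting (10,000 hits by default), so very deep result sets would need a different paging strategy such as `search_after`.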
UML Diagram Summary
- `uml_use_case_diagram.png` illustrates user interactions with the system, including authentication and search operations.
- `uml_class_diagram.png` outlines the class architecture of key components, such as `User`, the search logic, and file processing mechanisms.
- `uml_class_academic_diagram.png` offers a higher-level overview of the modular interactions between components.
For complete architectural diagrams, refer to `/uml_*_diagram.png` in the repository.