MyOCR is a Python package for building efficient, production-ready Optical Character Recognition (OCR) systems. It offers an end-to-end workflow that integrates detection and recognition while remaining modular and extensible, and it lets developers train, customize, and deploy deep learning models as OCR pipelines for real-world applications. ONNX Runtime support provides fast inference.
Key Features
- End-to-End OCR Workflow: Seamlessly integrate detection, recognition, and other OCR models in a single workflow.
- Modular & Extensible: Easily mix and match components to suit your project's needs, including model swaps and customizable input-output converters.
- Optimized for Production: Use ONNX Runtime support for fast inference on both CPU and GPU.
- Smart Structured Outputs: Convert raw OCR results into structured data for documents such as invoices and forms.
- Developer-Centric: Access clear Python APIs, pre-built pipelines, and straightforward custom training options.
Recent Updates
The alpha version of MyOCR was released on April 24, 2025. It includes image detection, classification, and recognition models, with all components functional.
Quick Start Guide
Basic OCR usage:
from myocr.pipelines import CommonOCRPipeline
# Initialize common OCR pipeline (using GPU)
pipeline = CommonOCRPipeline("cuda:0") # Use "cpu" for CPU mode
# Perform OCR recognition on an image
result = pipeline("path/to/your/image.jpg")
print(result)
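The single-image call above extends naturally to a folder of images. The following is a minimal sketch, not part of MyOCR: `ocr_folder` and `IMAGE_SUFFIXES` are hypothetical names, and the only assumption about the pipeline is that it is a callable taking an image path, as in the example above.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable

# Hypothetical helper, not part of MyOCR: file extensions treated as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def ocr_folder(
    pipeline: Callable[[str], Any],
    folder: str,
    suffixes: Iterable[str] = IMAGE_SUFFIXES,
) -> Dict[str, Any]:
    """Run `pipeline` on every image in `folder`, keyed by file name."""
    allowed = {s.lower() for s in suffixes}
    results: Dict[str, Any] = {}
    # Sort for a deterministic processing order.
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in allowed:
            results[path.name] = pipeline(str(path))
    return results


# Usage with MyOCR (requires the package and its models to be installed):
#   from myocr.pipelines import CommonOCRPipeline
#   results = ocr_folder(CommonOCRPipeline("cpu"), "path/to/images")
```

Passing the pipeline in as an argument means it is initialized once and reused for every image, and the helper can be tried out with any stand-in callable.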
For structured OCR output, such as extracting invoice information, first configure the chat model that the pipeline uses:
chat_bot:
  model: qwen2.5:14b
  base_url: http://127.0.0.1:11434/v1
  api_key: 'key'
Then define the output model and run the structured pipeline:
from pydantic import BaseModel, Field
from myocr.pipelines import StructuredOutputOCRPipeline
# Define output data model, refer to:
from myocr.pipelines.response_format import InvoiceModel
# Initialize structured OCR pipeline
pipeline = StructuredOutputOCRPipeline("cuda:0", InvoiceModel)
# Process image and obtain structured data
result = pipeline("path/to/invoice.jpg")
print(result.to_dict())
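Structured results are often stored for later processing. The sketch below writes a result dictionary to a JSON file using only the standard library. `save_result` is a hypothetical helper, not part of MyOCR, and it assumes that `result.to_dict()` returns a plain Python dictionary; the actual field names depend on the output model.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_result(data: Dict[str, Any], out_path: str) -> Path:
    """Write a result dictionary to a UTF-8 JSON file and return its path."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps non-Latin text (common in OCR output) readable.
    # default=str is a fallback for values json cannot encode, such as dates.
    path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2, default=str),
        encoding="utf-8",
    )
    return path


# Usage with the structured pipeline above (assumes to_dict() returns a dict):
#   save_result(result.to_dict(), "output/invoice.json")
```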
MyOCR also ships with a REST API service for integration with web applications. Start it with:
# Start the service (default port: 5000)
python main.py
Docker Support
MyOCR supports Docker deployment. The provided scripts build and run images for both CPU and GPU setups.