PaddleOCR: Optical character recognition and document parsing
Multilingual OCR toolkit with document structure extraction.
Learn more about PaddleOCR
PaddleOCR is an optical character recognition system implemented in Python using the PaddlePaddle deep learning framework. It combines text detection and recognition models to process document images end-to-end, extracting both raw text and structured layout information. The toolkit includes pre-trained models for multiple languages, handwriting detection, and document structure analysis (tables, forms, key-value pairs). Common deployment scenarios include document digitization pipelines, PDF extraction for RAG systems, and integration with language models for document understanding tasks.
Multi-Language Pre-Trained Models
Ships with production-ready models for 100+ languages including CJK, Arabic, and Latin scripts. Eliminates cold-start training and dataset collection for most deployment scenarios.
Modular Detection-Recognition Pipeline
Decouples text localization from character recognition into swappable components. Enables per-region model selection and independent optimization of detection versus recognition accuracy.
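The decoupling described above can be pictured with a minimal sketch. It does not use PaddleOCR's own classes: the names `Region`, `Detector`, `Recognizer`, and `run_pipeline` are hypothetical, and the stub components stand in for real detection and recognition models.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Region:
    """A detected text region (hypothetical type, not a PaddleOCR class)."""
    box: tuple[int, int, int, int]  # x0, y0, x1, y1
    script: str                     # script hint, e.g. "latin" or "cjk"


class Detector(Protocol):
    def detect(self, image: bytes) -> list[Region]: ...


class Recognizer(Protocol):
    def recognize(self, image: bytes, region: Region) -> tuple[str, float]: ...


def run_pipeline(
    image: bytes,
    detector: Detector,
    pick_recognizer: Callable[[Region], Recognizer],
) -> list[tuple[Region, str, float]]:
    """Run detection once, then choose a recognizer separately for each region."""
    results = []
    for region in detector.detect(image):
        recognizer = pick_recognizer(region)
        text, confidence = recognizer.recognize(image, region)
        results.append((region, text, confidence))
    return results


class StubDetector:
    """Stand-in detector that always returns two fixed regions."""
    def detect(self, image: bytes) -> list[Region]:
        return [Region((0, 0, 10, 10), "latin"), Region((0, 20, 10, 30), "cjk")]


class StubRecognizer:
    """Stand-in recognizer that returns a labelled placeholder string."""
    def __init__(self, label: str) -> None:
        self.label = label

    def recognize(self, image: bytes, region: Region) -> tuple[str, float]:
        return f"{self.label}-text", 0.9


recognizers = {"latin": StubRecognizer("latin"), "cjk": StubRecognizer("cjk")}
demo = run_pipeline(b"", StubDetector(), lambda region: recognizers[region.script])
```

Because the detector and the recognizers only meet through `run_pipeline`, either side can be swapped or tuned without touching the other, which is the property the feature description refers to.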
Document Structure Extraction
Parses tables, forms, and key-value pairs beyond raw text output. Produces structured JSON suitable for direct ingestion into RAG pipelines or database workflows.
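As a rough illustration of preparing OCR output for ingestion, the sketch below converts recognized lines into JSON records. It covers only flat text lines, not table or form structure, and assumes each line has the shape `[box, (text, confidence)]`, consistent with the `line[1][0]` and `line[1][1]` indexing in the quick-start code. The function name `lines_to_records`, the record fields, and the `min_confidence` threshold are illustrative choices, not part of PaddleOCR.

```python
import json


def lines_to_records(lines, source, min_confidence=0.5):
    """Turn OCR lines into JSON-serializable dicts, dropping low-confidence lines."""
    records = []
    for index, (box, (text, confidence)) in enumerate(lines or []):
        if confidence < min_confidence:
            continue
        records.append({
            "source": source,
            "line_index": index,
            "text": text,
            "confidence": round(float(confidence), 4),
            # Box is assumed to be a list of (x, y) corner points.
            "box": [[float(x), float(y)] for x, y in box],
        })
    return records


# Hand-written sample data in the assumed shape (not real OCR output).
sample_lines = [
    [[[10, 10], [200, 10], [200, 40], [10, 40]], ("Invoice #1042", 0.98)],
    [[[10, 60], [180, 60], [180, 90], [10, 90]], ("T0tal: 5O.00", 0.31)],
]

records = lines_to_records(sample_lines, source="invoice.jpg")
payload = json.dumps(records, ensure_ascii=False, indent=2)
```

Keeping the original line index and box in each record lets a downstream retriever point back to the exact location in the source image.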
from paddleocr import PaddleOCR

# This uses the 2.x-style interface; check the docs for your installed version.
ocr = PaddleOCR(use_angle_cls=True, lang='en')
result = ocr.ocr('invoice.jpg', cls=True)

# result[0] holds the lines for the first image; it can be None when no text is found.
for line in result[0] or []:
    text = line[1][0]
    confidence = line[1][1]
    print(f"{text} (confidence: {confidence:.2f})")

PaddleOCR-VL-1.5 is a new iterative version of the PaddleOCR-VL series. Based on comprehensive optimization of the core capabilities of version 1.0, …
- Release the PaddleOCR-VL-1.5 complex document parsing solution.
- Add support for calling MLX-VLM inference services.
- PaddleOCR-VL now supports cross-page table merging and multi-level heading reconstruction.
- PP-StructureV3 adds support for the `format_block_content` and `markdown_ignore_labels` parameters.
- Fixed an issue where accessing the `/docs` endpoint in the official PaddleOCR-VL image resulted in an error.
- PaddleOCR-VL now supports specifying custom model names and API keys, and can integrate with inference services from third-party platforms such as SiliconFlow and Novita AI.
- The PP-StructureV3 MCP Server supports using hosted services on the Qianfan platform as the underlying inference engine.
- The documentation for PP-OCRv5 and PaddleOCR-VL has been revised throughout, with known errors fixed to improve readability and accuracy.
- Added support for inference on Muxi GPUs, expanding hardware compatibility and deployment flexibility.
- 2025.11.13: v3.3.2 released. Full changelog: https://github.com/PaddlePaddle/PaddleOCR/compare/v3.3.1...v3.3.2