Tesseract OCR: Open source optical character recognition engine

LSTM-based OCR engine supporting 100+ languages.

Rankings (live, 10:20 AM, steady): overall #180, AI & ML #64
Stars: 72.6K (+156 in 7 days)
Forks: 10.5K (+14 in 7 days)

Learn more about Tesseract

Tesseract is an open-source optical character recognition (OCR) engine that converts images containing text into machine-readable characters. Its primary recognizer is a Long Short-Term Memory (LSTM) neural network that works on images of whole text lines, using both character shapes and linguistic context to produce a transcription. Recognition is the last stage of a pipeline: image preprocessing, layout analysis to detect text regions, line segmentation, and then character recognition by the network.

The design is modular. Language support comes from trained data models covering more than 100 languages, input is accepted in standard image formats, and output can be written in several document formats, including searchable PDFs and XML-based representations. Tesseract was originally developed at Hewlett-Packard and later maintained by Google. It aims to balance recognition accuracy against processing speed by combining statistical language models with neural network predictions.
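Tesseract's preprocessing includes reducing the page to a black-and-white image before layout analysis. To illustrate what such a binarization step does, the sketch below implements Otsu's global threshold in plain Python on a flat list of grayscale values. It is a simplified stand-in and not Tesseract's own code; the function names `otsu_threshold` and `binarize` are made up for this example.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance.

    `pixels` is a flat list of integer gray values. Hypothetical helper,
    written for illustration only.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray values of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue              # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                 # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (above the threshold) or 0 (at or below it)."""
    return [1 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Synthetic data: half dark ink (10), half light paper (200).
    sample = [10] * 50 + [200] * 50
    t = otsu_threshold(sample)
    print(t, binarize([10, 200], t))
```

For real images, a library such as Pillow or OpenCV would normally handle this step, and Tesseract applies its own thresholding internally when given a grayscale or color input.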

1. Dual Recognition Engines

Ships both the LSTM neural network engine and the legacy pattern recognition engine, selectable at runtime with the --oem flag. This gives access to the modern LSTM recognizer while keeping compatibility with older trained models and specialized use cases.
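Engine selection can be sketched as a small wrapper around the `tesseract` command line. The example assumes the `tesseract` binary is on the PATH and uses the documented `--oem` values (0 = legacy only, 1 = LSTM only, 2 = legacy + LSTM, 3 = default based on what is available). The helper names `build_ocr_command` and `run_ocr` are hypothetical.

```python
import shutil
import subprocess

# Documented values for the --oem option.
OEM_MODES = {
    "legacy": 0,       # legacy engine only
    "lstm": 1,         # LSTM neural network only
    "legacy+lstm": 2,  # both engines combined
    "default": 3,      # whatever the installed data supports
}


def build_ocr_command(image_path, engine="default", lang="eng"):
    """Build the argument list for one OCR run that prints text to stdout."""
    if engine not in OEM_MODES:
        raise ValueError(f"unknown engine {engine!r}; expected one of {sorted(OEM_MODES)}")
    return [
        "tesseract", image_path, "stdout",
        "--oem", str(OEM_MODES[engine]),
        "-l", lang,
    ]


def run_ocr(image_path, engine="default", lang="eng"):
    """Run Tesseract and return the recognized text (needs a local install)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_ocr_command(image_path, engine, lang),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; call run_ocr() to execute it.
    print(build_ocr_command("document.png", engine="lstm"))
```

The legacy modes only work when the installed language data contains legacy models. LSTM-only data sets such as tessdata_fast and tessdata_best do not, so a request for `--oem 0` against them is expected to fail.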

2. 100+ Language Support

Recognizes text in over 100 languages using pre-trained data files. Custom training is supported through a documented pipeline for specialized fonts, domains, or historical scripts.
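Which languages are usable depends on the trained data files installed locally. A minimal sketch of checking them and of building a multi-language `-l` argument follows. It assumes a recent Tesseract that prints the `--list-langs` listing to stdout with a one-line header; the helper names and the sample listing are invented for illustration.

```python
import subprocess


def parse_list_langs(output):
    """Extract language codes from `tesseract --list-langs` output.

    The first non-empty line is a header, so it is skipped.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return sorted(lines[1:])


def lang_argument(languages):
    """Join language codes with '+', the form accepted by the -l option."""
    if not languages:
        raise ValueError("at least one language code is required")
    return "+".join(languages)


def installed_languages():
    """Ask the local Tesseract install which languages it can load."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    return parse_list_langs(result.stdout)


# Hypothetical listing; the real path and language set vary per machine.
SAMPLE_OUTPUT = (
    'List of available languages in "/usr/share/tessdata/" (3):\n'
    "eng\n"
    "deu\n"
    "osd\n"
)

if __name__ == "__main__":
    print(parse_list_langs(SAMPLE_OUTPUT))
    print(lang_argument(["eng", "deu"]))
```

With pytesseract, the same combined code can be passed as `pytesseract.image_to_string(image, lang='eng+deu')`.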

3. Multiple Output Formats

Generates plain text, hOCR with positioning data, searchable PDFs, TSV structured output, and PAGE/ALTO XML. Integrates directly into document processing workflows without format conversion layers.
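As an example of consuming one of these formats, the sketch below parses TSV output into word records with bounding boxes and confidence values. It relies on the standard TSV column layout (`level` through `text`, with level 5 marking word rows and `conf` set to -1 on non-word rows). The function name `words_from_tsv` and the sample rows are hypothetical.

```python
import csv
import io

TSV_HEADER = [
    "level", "page_num", "block_num", "par_num", "line_num", "word_num",
    "left", "top", "width", "height", "conf", "text",
]


def words_from_tsv(tsv_text, min_conf=0.0):
    """Return word-level rows from Tesseract TSV output.

    Each result holds the word text, its confidence, and a
    (left, top, width, height) box in pixels.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        if row["level"] != "5":          # keep only word rows
            continue
        conf = float(row["conf"])
        text = (row["text"] or "").strip()
        if conf < min_conf or not text:
            continue
        box = tuple(int(row[k]) for k in ("left", "top", "width", "height"))
        words.append({"text": text, "conf": conf, "box": box})
    return words


# Invented sample: one page row and two word rows.
SAMPLE_ROWS = [
    TSV_HEADER,
    ["1", "1", "0", "0", "0", "0", "0", "0", "640", "480", "-1", ""],
    ["5", "1", "1", "1", "1", "1", "36", "92", "60", "24", "96.5", "Hello"],
    ["5", "1", "1", "1", "1", "2", "104", "92", "70", "24", "41.2", "wor1d"],
]
SAMPLE_TSV = "\n".join("\t".join(r) for r in SAMPLE_ROWS) + "\n"

if __name__ == "__main__":
    for word in words_from_tsv(SAMPLE_TSV, min_conf=60):
        print(word)
```

Filtering on confidence is a design choice for this sketch; a downstream workflow might instead keep low-confidence words and flag them for review.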


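# Requires the pytesseract and Pillow packages plus a local Tesseract install.
import pytesseract
from PIL import Image

# Load the page image and run OCR with the default settings.
image = Image.open('document.png')
text = pytesseract.image_to_string(image)
print(text)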

v5.5.2

Code simplification, CMake build fixes, and CMake optimizations.

  • Simplify code for osdetect
  • Fix and improve configuration for CMake builds
  • Modernize some for loops and fix some signed/unsigned issues
  • CMake optimization with warp2

v5.5.1

Fixed the random number generator, templated the list classes, and extended the CLI.

  • Fix linear congruential random number generator
  • Make list classes templated
  • Add CLI -c parameter(s) to init vectors
  • Handle colormaps correctly
  • Use constexpr for kDawgMagicNumber

v5.5.0

Fixed a static-linking error, standardized install directories, and corrected PAGE XML and hOCR output.

  • Fix TARGETPDBFILE error for static linking
  • Make regular usage of CMAKE_INSTALL_LIBDIR and GNUInstallDirs
  • Ignore illegal TESSDATA_PREFIX (not existing filesystem entry)
  • Fix confidence output for the PAGE XML renderer
  • Set hOCR capabilities ocr_pdir and ocr_plang unconditionally

