PyMuPDF: Python library for PDF document processing
Python bindings for the MuPDF document processing library.
import fitz
doc = fitz.open("document.pdf")
print(doc.page_count)
MuPDF-based architecture
Built as Python bindings to MuPDF, a lightweight C toolkit maintained by Artifex Software. This approach provides direct access to a mature rendering engine without reimplementing PDF parsing logic in Python.
Multi-format support
Handles PDF, XPS, EPUB, and other document formats through a single API. The underlying MuPDF engine provides native support for these formats rather than relying on format-specific libraries.
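As a rough illustration of the single-API idea, the sketch below opens any of the named formats with the same `pymupdf.open()` call and reads `page_count`. The helper names `count_pages` and `is_named_format` and the `NAMED_EXTENSIONS` set are hypothetical, introduced only for this example; the set lists just the three formats named above, not everything MuPDF can read.

```python
from pathlib import Path

# Only the formats the text names explicitly; "other formats" are left out
# because the text does not list them (assumption of this sketch).
NAMED_EXTENSIONS = {".pdf", ".xps", ".epub"}


def is_named_format(path):
    """Return True if the file extension is one of the formats named above."""
    return Path(path).suffix.lower() in NAMED_EXTENSIONS


def count_pages(path):
    """Open a document of any supported format and return its page count."""
    # Imported lazily so is_named_format() stays usable without PyMuPDF.
    import pymupdf

    # The same open() call is used regardless of the file format.
    with pymupdf.open(path) as doc:
        return doc.page_count
```

A caller might filter a directory listing with `is_named_format()` and pass the remaining paths to `count_pages()`, without branching on the format.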
Optional feature extensibility
Core functionality requires no external dependencies, while optional features like font subsetting (fontTools) and OCR (Tesseract) can be added independently. This allows users to install only the capabilities they need.
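One way to see which optional capabilities are available is to probe for them before use. The sketch below is a heuristic, not PyMuPDF's own detection logic: it assumes font subsetting depends on the `fontTools` package being importable, and that OCR is likely usable when either the `TESSDATA_PREFIX` environment variable is set or a `tesseract` executable is on the PATH. The function name `optional_features` is hypothetical.

```python
import importlib.util
import os
import shutil


def optional_features():
    """Report which optional extras appear to be installed (heuristic)."""
    # fontTools is a regular Python package, so an import-spec lookup suffices.
    has_fonttools = importlib.util.find_spec("fontTools") is not None

    # Assumption: Tesseract support is discoverable through an environment
    # variable pointing at its language data or through the executable itself.
    has_tesseract = bool(os.environ.get("TESSDATA_PREFIX")) or (
        shutil.which("tesseract") is not None
    )

    return {"font_subsetting": has_fonttools, "ocr": has_tesseract}


if __name__ == "__main__":
    for name, available in optional_features().items():
        print(f"{name}: {'available' if available else 'not installed'}")
```

Because nothing here imports PyMuPDF, the check can run before installation to decide which extras to add.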
import pymupdf
doc = pymupdf.open("report.pdf")
page = doc[0]
text = page.get_text()
print(text)
doc.close()
Drops Python 3.9 support; minimum version is now 3.10. Upgrades to MuPDF 1.26.11 and fixes five reported issues.
- Pin Python ≥3.10 before upgrading; 3.9 is no longer supported.
- Review issues #4699, #4712, #4720, #4742, #4746 if you encountered related bugs.
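The Python version requirement in this release can be checked before upgrading. The following is a minimal sketch; the function name `python_is_supported` and the constant `MIN_PYTHON` are hypothetical, and the 3.10 floor is taken from the release summary above.

```python
import sys

# Minimum interpreter version stated in the release summary above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=None):
    """Return True if the given (or running) interpreter meets the minimum."""
    # Default to the running interpreter; compare major and minor only.
    current = tuple((version_info or sys.version_info)[:2])
    return current >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("Python 3.10 or newer is required for this release.")
```

Passing an explicit tuple such as `(3, 9)` makes the check easy to exercise in CI without switching interpreters.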
Drops Python 3.8, adds 3.14 support, removes duplicate Shape class; upgrades to MuPDF 1.26.10 with five bug fixes.
- Pin Python ≥3.9 and ≤3.14; Python 3.8 is no longer supported.
- Remove references to pymupdf.utils.Shape (duplicate removed); use pymupdf.Shape directly.
Maintenance release upgrading to MuPDF 1.26.7 and fixing 11 reported issues; no breaking changes noted.
- Upgrade to MuPDF 1.26.7 and apply fixes for issues #3806, #4388, #4457, #4462, #4533, #4565, #4571, #4590, #4614, #4639.
- Use the new Page.clip_to_rect() method for clipping operations; the release also includes experimental Graal support and an improved Tesseract data search.