Paperless-ngx: Document management system with OCR

Self-hosted OCR document archive with ML classification.

Rankings: #46 overall, #27 in AI & ML
Stars: 37.0K (+324 in 7 days)
Forks: 2.3K (+20 in 7 days)

Learn more about Paperless-ngx

Paperless-ngx is a self-hosted document management system built with a Django backend and an Angular frontend. It runs scanned documents and PDFs through an OCR pipeline to extract text and metadata, then stores the indexed documents in a searchable database. The system supports document classification through machine learning models, custom tagging schemes, and metadata assignment. It is typically deployed with Docker Compose and can be integrated with document scanners for automated ingestion workflows.

1. Full-Text OCR Search

Documents are processed through OCR pipelines that extract and index all of their text content. This makes it possible to search document bodies, not just filenames or manually entered metadata, so even large archives can be queried by content.
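A minimal sketch of how such a full-text search might be issued over the REST API. It assumes a local instance at http://localhost:8000, token authentication, and the `query` parameter on the documents list endpoint; the helper names `build_search_url` and `search_documents` are hypothetical and not part of Paperless-ngx.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Return the documents-list URL with a full-text `query` parameter."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/documents/?{params}"


def search_documents(base_url: str, token: str, query: str) -> list:
    """Run a full-text search and return the list of matching documents."""
    request = urllib.request.Request(
        build_search_url(base_url, query),
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the paginated response carries matches under "results".
    return payload.get("results", [])
```

Calling `search_documents("http://localhost:8000", "YOUR_API_TOKEN", "electricity invoice")` against a running instance should return the documents whose OCR text matches the query.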

2. Docker-First Deployment

Ships with Docker Compose configurations for self-hosted deployment with minimal setup. Includes automated migration paths from legacy Paperless versions and supports multiple installation methods.

3. Community-Maintained Fork

Actively developed by an open community after the original project ended maintenance. Regular releases incorporate user contributions, security patches, and feature requests through collaborative governance.


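import requests

# Assumes a local instance on port 8000 and a valid API token.
url = "http://localhost:8000/api/documents/post_document/"
headers = {"Authorization": "Token YOUR_API_TOKEN"}

# Open the file in a context manager so the handle is closed after upload.
with open("invoice.pdf", "rb") as f:
    response = requests.post(url, files={"document": f}, headers=headers)

response.raise_for_status()
# The endpoint queues the document for consumption and returns a task UUID,
# not the id of the stored document.
print(f"Upload queued, task: {response.json()}")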


v2.20.6

Security fixes and improvements for document management, tagging performance, and date calculations.

  • Fix: extract all ids for nested tags
  • Fix: prevent note deletion outside doc
  • Performance: improve treenode inefficiencies
  • Fix: change date calculation for 'this year' to include future documents
  • Fix: Running management scripts under rootless could fail

v2.20.5

Fixes UI display issues with tag names and workflow action ordering.

  • Fix: ensure horizontal scroll for long tag names in list, wrap tags without parent
  • Fix: use explicit order field for workflow actions

v2.20.4

Security fixes and improvements for metadata handling, database validation, and workflow functionality.

  • Fix: propagate metadata override created value
  • Fix: support ordering by storage path name
  • Fix: validate cf integer values within PostgreSQL range
  • Fix: add error handling and retry when opening index
  • Fix: fix recurring workflow to respect latest run time
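
The custom-field integer validation mentioned above can be illustrated with a small range check. This is only a sketch, not the project's actual implementation: it assumes the value is stored in PostgreSQL's 4-byte signed `integer` type, and the name `validate_pg_integer` is hypothetical.

```python
# Bounds of PostgreSQL's 4-byte signed `integer` type.
PG_INT_MIN = -2_147_483_648
PG_INT_MAX = 2_147_483_647


def validate_pg_integer(value: int) -> int:
    """Return `value` unchanged, or raise ValueError if it cannot be stored."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"expected an integer, got {type(value).__name__}")
    if not PG_INT_MIN <= value <= PG_INT_MAX:
        raise ValueError(f"{value} is outside the PostgreSQL integer range")
    return value
```

Rejecting out-of-range values before they reach the database turns what would otherwise be a database-level error into an ordinary validation error.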
