Overview
Discover what makes Papermerge powerful
Papermerge DMS is a fully self-hosted document management system built for scanned archives. For developers, it exposes a RESTful API, described by an OpenAPI specification, that mirrors the UI features: hierarchical folders, tagging, custom metadata fields, and full-text search powered by OCR. Internally the application is split into a Python/Starlette backend and a Vite-powered Vue.js frontend, so teams can extend or replace either layer independently. The core data model is relational: PostgreSQL stores document metadata, version history, and user permissions, while the file system (or an object store) holds the actual PDF, TIFF, PNG, and JPEG files.
Architecture
- Backend – Python 3.11 with FastAPI/Starlette, using SQLAlchemy as the ORM and Alembic for migrations. The server runs asynchronously under uvicorn and hands OCR off to Celery workers, with RabbitMQ or Redis as the message broker, so the OCR pipeline can scale horizontally.
- Frontend – Vue 3 with Vite, TypeScript, and Pinia for state management. The UI talks to the backend through the API described by the OpenAPI spec, and the same spec can be used to generate SDKs in other languages.
- Database – PostgreSQL (default, with optional support for SQLite in development). Indexes on the full-text search (FTS) columns enable fast full-text search.
- Search – Optional integration with Elasticsearch or PostgreSQL's native full-text search; the former is recommended for production workloads.
- Storage – Local file system by default, but the media root can be mapped to any POSIX‑compatible storage or an S3‑compatible bucket via a custom media backend.
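As a rough illustration of the PostgreSQL-native search option listed above, the following sketch builds a parameterized full-text query. The table name `documents` and the columns `id`, `title`, and `fts` are assumptions made for this example and are not taken from Papermerge's schema; only the PostgreSQL functions `plainto_tsquery` and `ts_rank` and the `@@` match operator are standard.

```python
from typing import Any


def build_fts_query(
    terms: str, limit: int = 20, config: str = "english"
) -> tuple[str, dict[str, Any]]:
    """Return a parameterized PostgreSQL full-text query and its bind values.

    Hypothetical schema: a `documents` table with a tsvector column `fts`.
    """
    cleaned = terms.strip()
    if not cleaned:
        raise ValueError("search terms must not be empty")
    sql = (
        "SELECT id, title, ts_rank(fts, query) AS rank "
        "FROM documents, "
        "plainto_tsquery(%(config)s::regconfig, %(terms)s) AS query "
        "WHERE fts @@ query "  # @@ matches a tsvector against a tsquery
        "ORDER BY rank DESC "
        "LIMIT %(limit)s"
    )
    return sql, {"config": config, "terms": cleaned, "limit": limit}
```

The returned pair can be passed to any DB-API driver that accepts named `%(name)s` placeholders, for example `cursor.execute(sql, params)`. Keeping user input in the bind values rather than in the SQL string avoids injection problems.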
Core Capabilities
- Document Versioning – Every edit creates a new immutable version; the API exposes endpoints to list, revert, or delete versions.
- Custom Fields & Metadata – Define per‑document‑type schemas; the API allows CRUD on field definitions and values, making it possible to build domain‑specific search filters.
- Tagging & Hierarchy – Tags are color‑coded, and folders support nesting; both are exposed as first‑class resources in the API.
- OCR & Text Overlay – OCR jobs are queued asynchronously; the resulting text is stored in a separate column and can be retrieved or downloaded as an OCR‑enabled PDF.
- Webhooks – Trigger external services on document events (upload, OCR completion, deletion) via configurable URLs.
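The versioning behavior described above (immutable versions that can be listed, reverted, or deleted) can be modeled with a small in-memory sketch. This is a conceptual illustration only, not Papermerge's actual data model: the class and method names are hypothetical, and treating a revert as a copy appended to the history is an assumption chosen so that existing versions are never modified.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """One immutable snapshot of a document (hypothetical model)."""
    number: int
    content: bytes
    note: str = ""


@dataclass
class Document:
    title: str
    versions: list[Version] = field(default_factory=list)
    _next_number: int = 1  # version numbers are never reused

    def add_version(self, content: bytes, note: str = "") -> Version:
        version = Version(self._next_number, content, note)
        self._next_number += 1
        self.versions.append(version)
        return version

    def current(self) -> Version:
        if not self.versions:
            raise LookupError("document has no versions")
        return self.versions[-1]

    def get(self, number: int) -> Version:
        for version in self.versions:
            if version.number == number:
                return version
        raise KeyError(number)

    def revert_to(self, number: int) -> Version:
        # Reverting appends a copy of the old content as a new version,
        # so the earlier history stays untouched.
        old = self.get(number)
        return self.add_version(old.content, note=f"revert to v{number}")

    def delete_version(self, number: int) -> None:
        self.versions.remove(self.get(number))
```

For example, after adding two versions and calling `revert_to(1)`, the document holds three versions and the newest one carries the content of the first.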
Deployment & Infrastructure
Papermerge ships as a ready‑to‑run Docker image; a minimal docker run command starts the stack with SQLite and no OCR worker. Production deployments typically use Docker Compose or Kubernetes, provisioning:
- A PostgreSQL pod
- One or more Celery workers for OCR and search indexing
- An optional Elasticsearch cluster
The application is stateless aside from the media directory, making horizontal scaling straightforward. The media root can be shared via NFS or an S3 bucket to support multi-instance deployments.
Integration & Extensibility
- Plugin System – The backend exposes a plugin hook that allows developers to register new endpoints, signal handlers, or custom authentication backends.
- SDK Generation – The OpenAPI spec can be used with tools like openapi-generator to produce client libraries in Go, Java, or TypeScript.
- Authentication – Supports OAuth2/OIDC out of the box; custom user stores can be wired via Django‑style backends.
- Custom UI – The Vue frontend is componentized; developers can swap components or build entirely new dashboards while reusing the API.
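Before generating an SDK, it can help to see which operations a spec actually declares. The sketch below walks the `paths` object of an OpenAPI 3 document and lists each operation. The sample spec, including the `/documents` paths and the operation IDs, is invented for illustration and does not reflect Papermerge's real endpoints.

```python
import json

# HTTP methods that OpenAPI 3 allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, operationId) for every operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key, operation in item.items():
            if key.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operations.append((key.upper(), path, operation.get("operationId", "")))
    return sorted(operations, key=lambda op: (op[1], op[0]))


# Hypothetical spec fragment, used only to exercise the function.
SAMPLE_SPEC = json.loads("""
{
  "openapi": "3.0.3",
  "paths": {
    "/documents": {
      "get": {"operationId": "list_documents"},
      "post": {"operationId": "create_document"}
    },
    "/documents/{id}": {
      "parameters": [{"name": "id", "in": "path", "required": true}],
      "get": {"operationId": "get_document"}
    }
  }
}
""")

if __name__ == "__main__":
    for method, path, operation_id in list_operations(SAMPLE_SPEC):
        print(f"{method:6} {path:20} {operation_id}")
```

In practice the spec would be loaded from the JSON document served by the running instance rather than from an inline string.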
Developer Experience
The codebase is heavily documented; each module contains docstrings and the public API surface is clearly typed with Pydantic models. The community maintains an active issue tracker and a dedicated Discord channel for quick support. The Apache 2.0 license permits commercial use, which encourages internal tooling and external integrations.
Use Cases
- Enterprise Document Repositories – Centralized storage for contracts, invoices, and compliance documents with audit‑ready versioning.
- Legal Firms – OCRed PDFs enable clause search across thousands of case files; custom fields capture docket numbers.
- Healthcare Records – Scanned patient forms can be indexed and tagged by department, with strict group ownership for privacy.
- Government Archives – Long‑term preservation of scanned permits and certificates, leveraging the immutable version history.
Advantages
Papermerge offers a modern desktop-like UX while remaining fully open source and self-hosted. It combines OCR, full-text search, and version control in a single stack, giving developers an all-in-one solution. The API is first class, the architecture supports horizontal scaling, and the permissive license avoids vendor lock-in, making it a compelling choice over heavier commercial DMS solutions.