I, Librarian

Self-Hosted

Web‑based PDF and document manager for private collaboration

Stale (63)
311 stars
Updated Jul 18, 2025

Overview

**I, Librarian** is a self‑hosted web application that turns a server into a centralized document repository. It is built on the LAMP stack (Linux, Apache, MySQL, PHP) and exposes RESTful endpoints and a GraphQL‑style query interface, so developers can treat the library as an API‑first service. The application core is a PHP MVC framework (PHP 7.2 or later, 8.x recommended), with Twig templates for the UI and dependencies managed by Composer. Data is stored in MySQL 8 or MariaDB and accessed through an Eloquent‑style ORM, which handles schema migrations and query building.

Architecture

  • Web Server – Apache 2.4+ (or Nginx via reverse proxy) handles HTTP(S) traffic and serves static assets from the public directory.
  • Runtime – PHP 7.2+ (8.x recommended) runs the application code; classes are loaded through PSR‑4 autoloading, and the codebase follows SOLID principles.
  • Database – MySQL 8/MariaDB stores metadata, user accounts, annotations, and project relationships. The schema is versioned with migration files.
  • Search & OCR – Elasticsearch (or OpenSearch) indexes PDF text, metadata, and annotations. Tesseract OCR is invoked for PDFs that have no text layer, such as scans; the recognized text is written to the same index, which enables multilingual full‑text search.
  • Containerization – A docker-compose.yml is included for quick spin‑up. Containers expose standard ports (80/443) and can be orchestrated with Docker Swarm or Kubernetes via Helm charts.
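
The OCR fallback described in the Search & OCR item can be sketched in a few lines. This is a minimal illustration, not the project's code: `text_for_indexing`, `MIN_TEXT_CHARS`, and the two injected callables are hypothetical names standing in for a PDF text extractor and a Tesseract wrapper.

```python
from typing import Callable

# Hypothetical threshold: below this many characters we assume the PDF
# has no usable text layer and fall back to OCR.
MIN_TEXT_CHARS = 20


def text_for_indexing(
    pdf_path: str,
    extract_text: Callable[[str], str],
    run_ocr: Callable[[str], str],
) -> dict:
    """Return the document body to send to the search index.

    extract_text and run_ocr are injected stand-ins for a PDF text
    extractor and a Tesseract wrapper; neither is a real I, Librarian API.
    """
    text = extract_text(pdf_path).strip()
    source = "text-layer"
    if len(text) < MIN_TEXT_CHARS:
        # Little or no embedded text: treat the file as a scan and OCR it.
        text = run_ocr(pdf_path).strip()
        source = "ocr"
    return {"path": pdf_path, "content": text, "source": source}


if __name__ == "__main__":
    doc = text_for_indexing(
        "scan.pdf",
        extract_text=lambda path: "",
        run_ocr=lambda path: "Recognized page text from the scanned document.",
    )
    print(doc["source"])
```

Passing the extractor and OCR step in as callables keeps the decision logic testable without a PDF library or Tesseract installed.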

Core Capabilities

  • Document Import – Supports bulk ingestion of PDFs, Office files (via LibreOffice headless conversion), and ZIP archives. Import jobs are queued with Redis, allowing asynchronous processing.
  • Annotation API – REST endpoints (/api/annotations) allow CRUD operations on PDF annotations, including text highlights, comments, and shapes. Annotations are stored in a dedicated annotations table and synced to the search index.
  • Project Collaboration – Users can create “projects” that group documents and share annotations. Project membership is managed via role‑based access control (RBAC).
  • Webhooks & Events – The application emits JSON events (document.created, annotation.updated) that can be subscribed to via HTTP callbacks, enabling integration with CI/CD pipelines or external analytics.
  • Custom Metadata – Users can define arbitrary key/value pairs for documents; these are indexed and searchable.
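
As an illustration of how a client might map CRUD operations onto the `/api/annotations` endpoint, the sketch below builds (but does not send) the HTTP requests. Only the endpoint path comes from the description above; the host name, the `/{id}` suffix for single annotations, the choice of `PUT` for updates, and the JSON field names are assumptions based on common REST conventions.

```python
import json
import urllib.request
from typing import Optional

# Conventional CRUD-to-HTTP mapping; the real API may differ.
METHODS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def annotation_request(
    base_url: str,
    action: str,
    annotation_id: Optional[int] = None,
    body: Optional[dict] = None,
) -> urllib.request.Request:
    """Build an unsent request for one CRUD action on an annotation."""
    if action not in METHODS:
        raise ValueError(f"unknown action: {action}")
    url = base_url.rstrip("/") + "/api/annotations"
    if action != "create":
        if annotation_id is None:
            raise ValueError(f"{action} needs an annotation_id")
        url += f"/{annotation_id}"  # assumed URL shape for a single annotation
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url, data=data, headers=headers, method=METHODS[action]
    )


if __name__ == "__main__":
    req = annotation_request(
        "https://library.example.org",
        "create",
        # Hypothetical payload fields for a text highlight.
        body={"document_id": 12, "type": "highlight", "page": 3, "text": "key result"},
    )
    print(req.get_method(), req.full_url)
```

To send a request, pass the returned object to `urllib.request.urlopen` along with whatever authentication the instance requires.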

Deployment & Infrastructure

  • Self‑Hosting – The application can run on any Linux distribution with Apache/PHP. Windows users may install WAMP or XAMPP, while macOS can use Homebrew to provision Apache/PHP.
  • Scalability – The application scales horizontally by running several PHP‑FPM application nodes behind a load balancer. The search index can be sharded across an Elasticsearch cluster, and the database can serve read traffic from replicas in high‑traffic scenarios.
  • Backup & Recovery – Database dumps and file system snapshots are supported. The application includes a migration tool that restores the schema and data to a new instance.
  • Security – TLS termination is recommended at the reverse proxy. The application checks CSRF tokens, sanitizes input, and verifies the caller's role on every API call.
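
The per-request checks mentioned under Security can be pictured as a small gate that runs before any handler. The sketch below is an assumption-laden illustration: the role names, permission strings, and the `is_allowed` function are hypothetical and are not taken from the project.

```python
import hmac
from typing import Optional

# Hypothetical role-to-permission table.
ROLE_PERMISSIONS = {
    "viewer": {"document:read"},
    "editor": {"document:read", "annotation:write"},
    "admin": {"document:read", "annotation:write", "project:manage"},
}


def is_allowed(
    session_token: str,
    submitted_token: Optional[str],
    role: str,
    permission: str,
) -> bool:
    """Return True only if the CSRF token matches and the role grants the permission."""
    if not session_token or not submitted_token:
        return False
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the token were correct.
    tokens_match = hmac.compare_digest(
        session_token.encode("utf-8"), submitted_token.encode("utf-8")
    )
    if not tokens_match:
        return False
    return permission in ROLE_PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(is_allowed("tok123", "tok123", "editor", "annotation:write"))
```

Checking the token first means a forged cross-site request is rejected before any role lookup happens.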

Integration & Extensibility

  • Plugin Architecture – Developers can drop PHP modules into the plugins directory; each plugin registers routes, services, and database migrations. The core exposes hooks (onDocumentUpload, beforeAnnotationSave) for custom logic.
  • External Authentication – LDAP, OAuth2, and SAML providers are supported via configuration files. This allows seamless integration with corporate identity systems.
  • CLI Tools – A set of Artisan‑style commands (librarian:import, librarian:index) automates batch jobs and maintenance tasks.
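
To make the hook mechanism concrete, here is a language-neutral sketch of a hook registry, written in Python rather than the project's PHP. The hook names `onDocumentUpload` and `beforeAnnotationSave` come from the description above; the `HookRegistry` class, its methods, and the payload fields are hypothetical.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


class HookRegistry:
    """Keeps an ordered list of handlers per hook name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, hook: str, handler: Handler) -> None:
        """Register a handler; handlers run in registration order."""
        self._handlers.setdefault(hook, []).append(handler)

    def fire(self, hook: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Pass the payload through each handler and return the final result."""
        for handler in self._handlers.get(hook, []):
            payload = handler(payload)
        return payload


def tag_as_unreviewed(document: Dict[str, Any]) -> Dict[str, Any]:
    # Example plugin logic: mark every uploaded document for review.
    tags = list(document.get("tags", [])) + ["unreviewed"]
    return {**document, "tags": tags}


if __name__ == "__main__":
    hooks = HookRegistry()
    hooks.on("onDocumentUpload", tag_as_unreviewed)
    print(hooks.fire("onDocumentUpload", {"title": "Grant proposal", "tags": []}))
```

Each handler returns a new payload instead of mutating its input, so one plugin cannot silently alter data that another plugin still holds.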

Developer Experience

  • Documentation – The project’s GitHub wiki contains detailed API references, architecture diagrams, and migration guides. The source code follows the PSR‑12 coding style.
  • Community – The issue tracker is active, with a dedicated “dev” label for feature requests. Contributors can submit pull requests; the maintainers enforce unit tests (PHPUnit) and code quality checks via GitHub Actions.
  • Configuration – All settings are stored in a single .env file, making it straightforward to adjust database credentials, API keys, and feature flags without code changes.
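
A .env file is a plain list of KEY=VALUE lines. The sketch below shows how such a file can be read into a dictionary; the parser handles only the basic subset (comments, blank lines, optional quotes), and the keys in the sample, such as `DB_HOST` and `FEATURE_OCR`, are hypothetical rather than the project's actual setting names.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # no "=" on this line, so it is not a setting
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


SAMPLE = """
# Database (hypothetical keys)
DB_HOST=127.0.0.1
DB_NAME="librarian"

# Feature flags
FEATURE_OCR=true
"""

if __name__ == "__main__":
    config = parse_env(SAMPLE)
    print(config["DB_HOST"], config["FEATURE_OCR"])
```

All values come back as strings, so a flag like `FEATURE_OCR` still has to be converted to a boolean by the caller.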

Use Cases

  1. Research Groups – Centralize all published papers, grant proposals, and meeting notes in one searchable repository.
  2. Enterprise Knowledge Base – Store SOPs, white papers, and compliance documents with fine‑grained access control.
  3. Open Source Projects – Host design docs, API specifications, and meeting minutes, allowing contributors to annotate directly in the browser.
  4. Academic Institutions – Provide students and faculty with a single portal for lecture notes, assignments, and peer reviews.

Advantages

  • Performance – Redis queues move heavy import jobs off the request path so the UI stays responsive, and PHP 8’s JIT compiler can speed up CPU‑bound work.
  • Flexibility – The plugin system and webhooks let developers extend functionality without touching core code.
  • Licensing – The project is released under the GPL‑3.0 license, which allows free use in commercial environments; distributed modifications must remain under the same license.
  • Low Footprint – A single docker-compose file can spin up a production‑ready instance on modest hardware (2 GB RAM, 1 CPU).

In summary, I, Librarian offers a developer‑friendly

Information

  • Category – other
  • License – GPL-3.0
  • Stars – 311

Technical Specs

  • Pricing – Open Source
  • Docker – None
  • Supported OS – Linux, Windows, macOS
  • Author – mkucej
  • Last Updated – Jul 18, 2025