EPrints

Open-source digital repository platform for research

Overview

EPrints is a mature, open‑source platform for building institutional repositories and digital libraries. From a developer's standpoint it is essentially a content‑management system (CMS) that specializes in scholarly metadata, full‑text ingestion, and metadata harvesting. It exposes a REST/Atom API for CRUD operations on items, collections, and users, and an XML‑based OAI‑PMH interface for metadata harvesting. The core engine is written in Perl, but the system's architecture encourages language‑agnostic integrations via its documented API surface.

Key Features

  • Schema‑driven metadata – Define custom field types (e.g., date, author, file) and validation rules in XML, allowing repositories to model complex scholarly objects.
  • Full‑text indexing – Built on the Sphinx search engine, EPrints can index PDF/Word files and expose faceted search.
  • OpenID & OAuth – Supports authentication against external identity providers, enabling single‑sign‑on for research portals.
  • Export formats – Generates BibTeX, RIS, MODS, MARCXML, and Dublin Core on the fly.
  • Batch import – CSV/Excel ingestion pipelines with mapping templates.
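
The batch-import idea can be illustrated with a minimal Python sketch of a mapping template applied to CSV rows. This is not EPrints code: the column headers, the field names (`title`, `creators`, `date`), the `;` author separator, and the `apply_mapping` helper are all assumptions made for the example.

```python
import csv
import io

# Hypothetical mapping template: CSV column header -> repository field name.
MAPPING = {
    "Title": "title",
    "Authors": "creators",
    "Year": "date",
}


def apply_mapping(csv_text, mapping):
    """Turn CSV rows into metadata records using a column-to-field mapping."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {}
        for column, field in mapping.items():
            value = (row.get(column) or "").strip()
            if not value:
                continue  # skip empty cells rather than storing blanks
            if field == "creators":
                # Assumed convention: multiple authors are separated by ";".
                record[field] = [n.strip() for n in value.split(";") if n.strip()]
            else:
                record[field] = value
        records.append(record)
    return records


SAMPLE = (
    "Title,Authors,Year\n"
    "Open Repositories,Ada Lovelace; Alan Turing,2021\n"
    "Untitled Draft,,2022\n"
)
RECORDS = apply_mapping(SAMPLE, MAPPING)
```

Keeping the mapping as plain data means a new spreadsheet layout only needs a new template, not new import code.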

Technical Stack

| Layer | Technology | Notes |
| --- | --- | --- |
| Application | Perl 5 (object‑oriented) | Core library lives in the EPrints:: namespace; functionality is extended through plugins. |
| Database | MySQL / MariaDB (or PostgreSQL via compatibility layer) | Stores metadata in normalized tables; supports foreign‑key constraints for referential integrity. |
| Search | Sphinx 2.x | Full‑text indexing of PDFs and XML metadata; exposes a query language to the application. |
| Web Server | Apache with mod_perl (Nginx can act as a front‑end proxy) | Requires URL rewriting for pretty URLs; SSL termination handled externally. |
| Deployment | Docker images available (community maintained) | Containerized deployments simplify scaling and CI pipelines. |

EPrints is deliberately lightweight; it does not bundle a full JavaScript framework, instead relying on vanilla JS and minimal dependencies for client‑side interactions. This keeps the codebase small (~120 k lines) and reduces the attack surface.

Core Capabilities & APIs

  • REST/Atom – Endpoints for /items, /collections, /users; supports pagination, filtering, and custom query parameters.
  • Webhooks – Post‑publish notifications can be sent to external services (e.g., Slack or a custom HTTP endpoint).
  • Event hooks – before_save_item and after_delete_collection allow developers to inject logic into the item lifecycle.
  • Template system – Uses Twig‑like syntax; developers can override view templates without modifying core code.
  • CLI utilities – eprints-cli offers commands for bulk operations, cache flushing, and database migrations.
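
As a rough illustration of paginated, filtered access to the /items endpoint, the following Python sketch builds request URLs and walks through pages. The `page` and `per_page` parameter names, the filter names, and the helper functions are assumptions for this example; the real parameter names should be taken from the repository's API documentation.

```python
from urllib.parse import urlencode


def build_items_url(base_url, page=1, per_page=20, **filters):
    """Build a paginated, filtered URL for the /items endpoint.

    The `page` and `per_page` parameter names are assumptions.
    """
    params = {"page": page, "per_page": per_page}
    params.update(sorted(filters.items()))  # stable order for filter params
    return f"{base_url.rstrip('/')}/items?{urlencode(params)}"


def iter_items(fetch_page, per_page=20):
    """Yield items page by page until a short or empty page is returned.

    `fetch_page(page, per_page)` is a caller-supplied function that performs
    the HTTP request and returns a list of items.
    """
    page = 1
    while True:
        items = fetch_page(page, per_page)
        yield from items
        if len(items) < per_page:
            break
        page += 1
```

Passing the HTTP call in as `fetch_page` keeps the paging logic independent of any particular client library and easy to test with canned data.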

Deployment & Infrastructure

EPrints is designed for self‑hosting on conventional LAMP stacks. Minimum requirements are 2 GB RAM and 500 MB of disk for a small repository; production deployments often run on 8–16 GB RAM to accommodate concurrent search queries. The Docker image (eprints/eprints:latest) encapsulates the application runtime, MySQL, and Sphinx, simplifying orchestration with Docker Compose or Kubernetes. For high‑traffic scenarios, a load balancer (e.g., NGINX) can sit in front of multiple EPrints pods that share a MySQL cluster and replicated Sphinx indexes.

Integration & Extensibility

EPrints follows a plugin architecture: plugins are Perl modules in the EPrints::Plugin namespace. Developers can create plugins that add new fields, modify UI components, or hook into the OAI‑PMH feed. The plugin API is versioned; most core plugins (e.g., search, export) are maintained by the community. Custom OAI‑PMH schemas can be defined, and EPrints exposes a SWORD endpoint for deposit from external systems. Webhooks enable real‑time integration with institutional discovery layers or analytics platforms.
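
Because OAI‑PMH is a standard protocol, an external harvester can be written in any language. The Python sketch below builds a `ListRecords` request and parses the response using only the standard library. The `verb`, `metadataPrefix`, and `resumptionToken` arguments and the XML namespaces come from the OAI‑PMH 2.0 specification; the base URL is whatever endpoint the repository exposes, and the helper names are assumptions for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must be omitted.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "titles": [el.text for el in record.iterfind(".//dc:title", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS).strip()
    return records, (token or None)
```

A harvester would fetch `list_records_url(base)`, parse the response, and repeat with the returned token until it comes back as `None`.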

Developer Experience

  • Configuration – Perl configuration files (under the repository's cfg/cfg.d/ directory) allow fine‑grained control over authentication, storage paths, and index settings.
  • Documentation – The official docs include a “Developer Guide” with API references, plugin development tutorials, and migration notes.
  • Community – An active mailing list and GitHub repository provide support; the code is licensed under GPL‑3.0, which permits commercial use with no licensing fees.
  • Testing – Test suites cover core functionality; developers can run them against their fork to validate custom plugins.

Use Cases

  1. University Repository – Store theses, dissertations, and faculty publications; expose OAI‑PMH for institutional discovery.
  2. Research Group Portal – Integrate with GitLab CI to auto‑publish preprints; use webhooks for Slack notifications.
  3. Digital Library Service – Host a multi‑tenant repository with role‑based access; use SWORD to ingest from external publishers.
  4. Data Archival – Leverage full‑text indexing to provide search over large PDF datasets; export metadata in MARCXML for library systems.
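
For the Slack notification in the research-group use case, a small relay could turn a publish event into a Slack incoming-webhook request. The Python sketch below builds the request without sending it. Slack incoming webhooks accept a JSON body with a `text` field; the shape of the `event` dictionary (`title`, `url`) and the helper name are assumptions about what a repository webhook might deliver.

```python
import json
import urllib.request


def build_slack_request(webhook_url, event):
    """Build (but do not send) a Slack incoming-webhook request.

    `event` is a hypothetical publish-event payload; the `title` and `url`
    keys are assumptions for this example.
    """
    text = f"New item published: {event['title']} <{event['url']}>"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending is left to the caller, e.g.:
#   urllib.request.urlopen(build_slack_request(url, event), timeout=10)
```

Separating request construction from sending makes the message format easy to check before any network call is made.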

Advantages

  • Performance – Sphinx indexing is designed to deliver sub‑second search responses even for large corpora.
  • Flexibility – Schema‑driven metadata and plugin hooks allow tailoring to
