CollectiveAccess - Providence

Self-Hosted

Open-source collections cataloguing for museums and archives

Overview

CollectiveAccess Providence is a PHP-based web back end for cataloguing and managing complex cultural heritage collections. It exposes CRUD operations, search, and reporting through REST endpoints and a GraphQL API. The application is built on the Zend Framework ecosystem and uses Doctrine ORM for database abstraction and Elasticsearch (or Solr) for full-text indexing. Data is stored in a relational database such as MySQL 8 or PostgreSQL 15, and media files are served from the local filesystem or an external object store through a pluggable storage driver. The architecture is modular: core logic lives in the `app` namespace, while custom extensions are placed in a dedicated `extensions/` directory and discovered automatically at runtime.

Key Features

Flexible metadata schema – Users can define arbitrary metadata profiles (e.g., Dublin Core, MODS, METS) and attach custom fields to any entity type. The schema engine applies changes through Doctrine migrations, so the data model is versioned without hand-written SQL.
Advanced change tracking – Every update to an object is logged in a separate audit table, with support for provenance, location history, and time‑stamped events. The UI exposes a timeline view that can be queried via GraphQL.
Background job system – Media transcoding, search re‑indexing, and export generation are handled by a lightweight queue (Redis or Beanstalkd). The job runner is bundled with the application, eliminating the need for external cron jobs.
Export & ingestion – Providence can ingest XML feeds (Dublin Core, METS, MODS), SQLite dumps, or custom CSV formats. Exports can be packaged as configurable BagIt bags for handoff to preservation repositories such as Archivematica or LOCKSS.
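
To make the BagIt export mentioned above more concrete, here is a minimal Python sketch of what such a package contains: a `bagit.txt` declaration, a `data/` payload directory, and a SHA-256 manifest. It follows the general BagIt layout and is not Providence's own exporter; the function names `make_bag` and `verify_bag` and the sample file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> Path:
    """Write a minimal bag: bagit.txt, data/ payload, manifest-sha256.txt."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Bag declaration file expected at the top level of every bag.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    manifest_lines = []
    for name, content in sorted(payload.items()):
        target = data_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        # One manifest line per payload file: checksum, then relative path.
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")

    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical export of one object record and one media file.
        bag = make_bag(
            Path(tmp) / "export-bag",
            {"object-001.xml": b"<record/>", "media/object-001.tif": b"\x00\x01"},
        )
        print("bag is valid:", verify_bag(bag))
```

A receiving repository would typically run the same kind of checksum comparison on arrival, which is why the manifest travels inside the package.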

Technical Stack

Layer	Technology
Web Server	Apache 2.4 / Nginx + PHP‑FPM
Language	PHP 8.2/8.3 (compatible back to 7.4)
Framework	Zend Framework 2/3 + Doctrine ORM
Search	Elasticsearch 7.x / Solr 8.x (configurable)
Database	MySQL 8 / PostgreSQL 15
Queue	Redis or Beanstalkd (optional)
API	GraphQL v15, REST fallback

The stack is deliberately lightweight yet extensible; the core can run on a single LAMP/LNMP server, while production deployments often use Docker Compose or Kubernetes to scale the database, queue, and search nodes independently.

Deployment & Infrastructure

Providence is a classic “install once, run anywhere” PHP application. It requires a writable cache directory and the ability to create database tables via the Doctrine CLI tools. For high availability, developers typically containerise each component: a PHP-FPM image for the web tier, a PostgreSQL or MySQL image, an Elasticsearch cluster, and a Redis queue. Kubernetes manifests are available in the community repo and provide horizontal scaling of web workers and rolling upgrades. The application's configuration is driven by a single YAML file (`config.yml`) that can reference environment variables, which suits CI/CD pipelines.
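
As an illustration of environment-driven configuration, the sketch below expands `${VAR}` placeholders in a config text before it is parsed. It is written in Python for brevity and is not the application's loader; the placeholder syntax, the keys in `SAMPLE_CONFIG`, and the variable names are all assumptions made for the example.

```python
import os
from string import Template

# Hypothetical config fragment; key names and ${...} syntax are assumed.
SAMPLE_CONFIG = """\
database:
  host: ${DB_HOST}
  name: ${DB_NAME}
search:
  url: ${SEARCH_URL}
"""


def render_config(text, env=None):
    """Replace ${VAR} placeholders with values from env (default: os.environ)."""
    env = os.environ if env is None else env
    try:
        return Template(text).substitute(env)
    except KeyError as exc:
        # Fail loudly so a pipeline does not deploy a half-filled config.
        raise ValueError(f"missing environment variable: {exc.args[0]}") from exc


if __name__ == "__main__":
    rendered = render_config(
        SAMPLE_CONFIG,
        {"DB_HOST": "db", "DB_NAME": "providence", "SEARCH_URL": "http://search:9200"},
    )
    print(rendered)
```

Failing on a missing variable, rather than leaving the placeholder in place, keeps a misconfigured deployment from starting with an unresolved database host or search URL.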

Integration & Extensibility

Plugin system – Any PHP class implementing the CollectiveAccess\PluginInterface can be dropped into extensions/. The discovery mechanism loads plugins at boot time, allowing developers to add new form widgets, validation rules, or data transformers without touching the core.
Webhooks & API – External services can subscribe to change events via a JSON‑over‑HTTP webhook endpoint. The GraphQL API supports mutations for creating, updating, and deleting objects, making it easy to build custom front‑ends or sync with other CMSs.
Customization – The UI is built on Bootstrap 5 and Twig templates. Developers can override any template in themes/ or inject custom JavaScript via the theme’s asset pipeline. CSS variables expose brand colors, while a set of REST endpoints allows programmatic control over user permissions and workflow rules.
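
A custom front end or sync job would talk to the GraphQL API by posting a JSON body with `query` and `variables` fields. The Python sketch below only builds such a request and does not send it. The endpoint URL, the `updateObject` mutation with its arguments, and the bearer-token header are hypothetical placeholders, not the documented Providence schema; consult the instance's own schema for the real names.

```python
import json
import urllib.request

# Hypothetical mutation: field and argument names are illustrative only.
UPDATE_OBJECT = """
mutation UpdateObject($id: ID!, $title: String!) {
  updateObject(id: $id, title: $title) { id }
}
"""


def build_graphql_request(endpoint, query, variables, token=None):
    """Build (but do not send) a JSON POST carrying a GraphQL operation."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if token:
        # Assumed auth scheme; the real API may authenticate differently.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_graphql_request(
        "https://example.org/graphql",  # placeholder endpoint
        UPDATE_OBJECT,
        {"id": "42", "title": "Bronze figurine"},
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the payload easy to inspect and test.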

Developer Experience

The project's documentation covers installation, schema design, and API usage. A dedicated “Developer Guide” explains the plugin architecture and provides sample code snippets, though it does not include full installation commands. The community is active on GitHub, with issue triage and pull-request reviews typically happening within a week. Licensing under GPLv3 requires that any distributed derivative work remain open source, which is attractive for institutions that need to audit or extend the codebase.

Use Cases

Museum collections – A curator can model artifacts, accession histories, and provenance in a single database, while the public front‑end (Pawtucket2) exposes searchable catalogs with faceted navigation.
Digital archives – Archivists can ingest METS bundles, automatically extract media metadata, and generate BagIt exports for long‑term preservation.
Research consortia – Multiple sites can run synchronized Providence instances, using the replication system to keep data in sync across geographic locations.

Advantages

Performance – Native PHP 8 optimizations and Doctrine’s lazy loading keep API responses under 200 ms for typical queries.
Flexibility – The schema engine supports arbitrary metadata profiles, so developers can avoid costly custom database migrations.
Extensibility – The plugin system and GraphQL API allow rapid feature addition without touching core code.
Open‑source freedom –