Databunker

Self-Hosted

Secure PII/PCI tokenization in 10 minutes

Updated Sep 5, 2025

Overview

Discover what makes Databunker powerful

Databunker is a lightweight, Go‑based vault designed to replace traditional database encryption with tokenization and API‑level protection. Its core idea is simple: never store PII, PHI, or KYC data in plaintext; instead, persist a UUID token that references an encrypted blob. Applications interact with the vault through a NoSQL‑style HTTP API, enabling developers to plug secure data handling into existing stacks without rewriting query logic or database schemas.

Technical Stack

  • Language & Runtime: Go 1.21+, chosen for its compiled performance, minimal runtime footprint, and strong concurrency primitives.
  • Storage Backend: The project ships with a lightweight SQLite engine for local deployments, but the architecture abstracts the persistence layer, allowing any SQL or NoSQL store (e.g., PostgreSQL, MySQL, CockroachDB) via the @databunker/store npm package or custom Go adapters.
  • API Layer: A RESTful interface built with the standard net/http library, augmented by a GraphQL gateway that automatically sanitizes queries to prevent injection attacks.
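  • Encryption & Tokenization: Uses AES‑256 in GCM mode for data-at-rest encryption. GCM is an authenticated mode, so tampering is detected on decryption; HMAC‑SHA256 is listed as an additional integrity mechanism. Tokens are generated as RFC 4122 UUIDv4 strings: random, practically collision‑free identifiers that reveal nothing about the data they reference.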
  • Deployment: Distributed as a single binary or Docker image (securitybunker/databunker). The container exposes health endpoints and supports Kubernetes readiness probes, making it trivial to roll out in cloud or on‑prem environments.
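
The encryption and tokenization step described above can be sketched with the Go standard library alone. This is a minimal illustration under stated assumptions, not Databunker's actual code: the Vault type, its in-memory map, and the choice to bind the token as additional authenticated data are all hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// Vault is a hypothetical in-memory stand-in for the storage backend.
type Vault struct {
	aead    cipher.AEAD
	records map[string][]byte // token -> nonce || ciphertext || tag
}

// NewVault builds a vault from a 32-byte key, which selects AES-256.
func NewVault(key []byte) (*Vault, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &Vault{aead: aead, records: make(map[string][]byte)}, nil
}

// newUUIDv4 returns a random RFC 4122 version 4 UUID string.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// Tokenize encrypts payload and returns the token that references it.
func (v *Vault) Tokenize(payload []byte) (string, error) {
	nonce := make([]byte, v.aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	token, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	// Seal appends ciphertext and tag to the nonce; the token is bound
	// as additional authenticated data (an assumption of this sketch).
	v.records[token] = v.aead.Seal(nonce, nonce, payload, []byte(token))
	return token, nil
}

// Detokenize looks up the blob for token, verifies it, and decrypts it.
func (v *Vault) Detokenize(token string) ([]byte, error) {
	blob, ok := v.records[token]
	if !ok {
		return nil, errors.New("unknown token")
	}
	n := v.aead.NonceSize()
	return v.aead.Open(nil, blob[:n], blob[n:], []byte(token))
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	v, err := NewVault(key)
	if err != nil {
		panic(err)
	}
	token, err := v.Tokenize([]byte(`{"email":"alice@example.com"}`))
	if err != nil {
		panic(err)
	}
	plain, err := v.Detokenize(token)
	if err != nil {
		panic(err)
	}
	fmt.Println(token, string(plain))
}
```

The application database would keep only the returned token; the encrypted blob stays in the vault. A fresh random nonce per record matters here, because reusing a nonce under the same key breaks GCM's guarantees.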

Core Capabilities

  • Tokenization Engine: POST /tokenize accepts arbitrary JSON payloads and returns a UUID token. The payload is encrypted and stored in the backend; only the token is returned to the client.
  • Secure Retrieval: GET /records/{token} decrypts and streams the original data. Access is gated by API keys, JWTs, or mutual TLS, depending on deployment configuration.
  • Hash‑Based Indexing: Fields can be indexed by storing a SHA‑256 hash of the value, enabling efficient equality searches without exposing raw data.
  • Injection Protection: All endpoints validate input against a whitelist of allowed operations. GraphQL queries are parsed with graphql-go and automatically filtered to disallow raw field access.
  • Audit & Rotation: The API supports key rotation via a /rotate endpoint and emits audit logs in JSON‑lines format, ready for ingestion into SIEM tools.
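
Hash‑based indexing, as described in the list above, can be illustrated with a small Go sketch. The HashIndex type, the "field:hash" key layout, and the trim‑and‑lowercase normalization are assumptions made for the example, not Databunker's documented behavior.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// HashIndex is a hypothetical index from "field:sha256(value)" to the
// tokens whose records contain that value.
type HashIndex struct {
	entries map[string][]string
}

// NewHashIndex returns an empty index.
func NewHashIndex() *HashIndex {
	return &HashIndex{entries: make(map[string][]string)}
}

// indexKey normalizes the value (trim and lowercase, an assumption of
// this sketch) and returns the field name joined with its SHA-256 hex digest.
func indexKey(field, value string) string {
	norm := strings.ToLower(strings.TrimSpace(value))
	sum := sha256.Sum256([]byte(norm))
	return field + ":" + hex.EncodeToString(sum[:])
}

// Add records that token holds the given field value.
func (ix *HashIndex) Add(field, value, token string) {
	k := indexKey(field, value)
	ix.entries[k] = append(ix.entries[k], token)
}

// Lookup returns the tokens whose indexed field equals value.
func (ix *HashIndex) Lookup(field, value string) []string {
	return ix.entries[indexKey(field, value)]
}

func main() {
	ix := NewHashIndex()
	ix.Add("email", "Alice@Example.com", "token-1")
	fmt.Println(ix.Lookup("email", "alice@example.com"))
}
```

Only equality searches work with this approach, since a hash preserves no ordering or prefix information. An unkeyed hash of a low‑entropy value such as a phone number can also be guessed by hashing candidate values, so a keyed HMAC is a common hardening step.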

Deployment & Infrastructure

Databunker’s minimal resource profile (≈30 MiB RAM, <5 ms latency for tokenization on a single core) makes it ideal for edge or serverless deployments. It can be run as:

  • Standalone: A single binary on a VM or bare metal.
  • Docker Compose: docker-compose.yml with optional Redis for caching and PostgreSQL as a durable backend.
  • Kubernetes: Helm chart available (not included in the repo) that exposes StatefulSet, Service, and Ingress resources, with support for secrets management via Kubernetes Secrets or HashiCorp Vault.

Scalability is achieved through stateless API servers behind a load balancer, while the storage layer can be scaled horizontally using the clustering features of PostgreSQL or CockroachDB. The architecture also supports sharding by token namespace, so a single deployment can handle millions of records, with capacity growing roughly in proportion to the number of shards.
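
One possible reading of token‑based sharding is simple hash routing: each token deterministically maps to one storage shard. The sketch below assumes that reading; the shardFor helper and the use of FNV‑1a are illustrative choices, not something the project documents.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a token to one of n shards using the FNV-1a hash.
// It returns -1 when n is not positive.
func shardFor(token string, n int) int {
	if n <= 0 {
		return -1
	}
	h := fnv.New32a()
	h.Write([]byte(token)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(n))
}

func main() {
	tokens := []string{
		"3f2b8c1e-0d4a-4b6f-9a7e-1c2d3e4f5a6b",
		"9a8b7c6d-5e4f-4a3b-8c2d-1e0f9a8b7c6d",
	}
	for _, t := range tokens {
		fmt.Printf("%s -> shard %d\n", t, shardFor(t, 4))
	}
}
```

Because the API servers are stateless, any of them can compute the shard for a token without coordination. Plain modulo routing remaps most tokens when the shard count changes, so a deployment that expects to add shards would more likely use consistent hashing or a lookup table.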

Integration & Extensibility

Developers can extend Databunker in several ways:

  • Custom Store Implementations: The @databunker/store package exposes an interface; third‑party storage backends can be plugged in without modifying the core API.
  • Webhooks: The API allows configuring post‑write and pre‑read webhooks, enabling integration with notification services or audit workflows.
  • SDKs: npm packages (@databunker/store, @databunker/session-store) provide typed clients for Node.js and TypeScript.
  • Plugin Hooks: Middleware hooks in the Go server allow injecting custom authentication providers (OAuth2, LDAP) or rate‑limiting logic.
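
Middleware hooks of the kind mentioned above usually follow the standard net/http wrapping pattern. The following Go sketch shows that pattern with an API‑key check; the Middleware type, the Chain helper, and the X-API-Key header name are hypothetical and are not taken from the Databunker codebase.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behaviour.
type Middleware func(http.Handler) http.Handler

// RequireAPIKey rejects requests whose X-API-Key header (a hypothetical
// header name) does not match the expected key.
func RequireAPIKey(want string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			got := r.Header.Get("X-API-Key")
			// Compare without an early exit on the first mismatching byte.
			if subtle.ConstantTimeCompare([]byte(got), []byte(want)) != 1 {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// Chain applies middlewares so that the first one listed runs first.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// okHandler is a placeholder for a real record endpoint.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, `{"status":"ok"}`)
	})
}

// statusFor sends an in-process test request through h and reports the
// resulting HTTP status code.
func statusFor(h http.Handler, apiKey string) int {
	req := httptest.NewRequest(http.MethodGet, "/records/example-token", nil)
	if apiKey != "" {
		req.Header.Set("X-API-Key", apiKey)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := Chain(okHandler(), RequireAPIKey("s3cret"))
	fmt.Println(statusFor(h, "s3cret"), statusFor(h, "wrong")) // prints "200 401"
}
```

A rate limiter or an OAuth2 or LDAP check would slot into the same chain as another Middleware value, which keeps authentication concerns out of the endpoint handlers.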

Developer Experience

The project is well documented on GitHub with clear README sections, example payloads, and a comprehensive test suite that ensures backward compatibility. The MIT license removes any commercial friction, while the community is active on GitHub Discussions and Slack (link provided in docs). Configuration is driven by environment variables, keeping the deployment process declarative and reproducible. The API’s JSON schema is self‑describing, and OpenAPI specs are generated automatically for Swagger UI integration.

Use Cases

  1. SaaS Platforms: Store customer PII (emails, addresses) in a tokenized form while keeping the main database schema unchanged.
  2. Financial Services: Tokenize credit‑card numbers before persisting them, which helps reduce PCI DSS scope without custom encryption logic.
  3. Healthcare Applications: Store PHI records in an encrypted vault, supporting HIPAA compliance and limiting what a SQL injection against the main database can expose.
  4. Compliance Audits: Use the audit logs and key‑rotation features to demonstrate GDPR or CCPA data minimization during audits.

Advantages Over Alternatives

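  • Zero Plaintext Storage: Unlike traditional disk‑block encryption, which exposes plaintext to anything that can read the mounted volume, Databunker encrypts each record before it is written, so sensitive data is never persisted unencrypted.
  • Performance: Go’s compiled binary delivers low‑latency tokenization (under 5 ms on a single core), which compares favorably with many managed encryption services.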
  • Flexibility: The abstraction over storage backends means you can migrate