Overview
Discover what makes Whoogle powerful
Whoogle is a self‑hosted search engine front‑end that scrapes Google’s public web results while stripping ads, JavaScript, AMP links, and tracking cookies. From a technical standpoint it acts as a thin HTTP proxy that normalizes the Google search page into a minimal, privacy‑oriented HTML surface. The core logic is written in Go, leveraging its standard library for HTTP handling and concurrent goroutines to manage request pipelines. The application exposes a single REST endpoint (`/search`) that accepts query parameters and returns a JSON payload of search results, which the front‑end consumes to render a clean UI. This separation allows developers to embed Whoogle as a microservice in larger stacks or expose it behind custom authentication layers.
Language & Runtime
Web Framework
Data Persistence
Caching Layer
Overview
Whoogle is a self‑hosted search engine front‑end that scrapes Google’s public web results while stripping ads, JavaScript, AMP links, and tracking cookies. From a technical standpoint it acts as a thin HTTP proxy that normalizes the Google search page into a minimal, privacy‑oriented HTML surface. The core logic is written in Go, leveraging its standard library for HTTP handling and concurrent goroutines to manage request pipelines. The application exposes a single REST endpoint (/search) that accepts query parameters and returns a JSON payload of search results, which the front‑end consumes to render a clean UI. This separation allows developers to embed Whoogle as a microservice in larger stacks or expose it behind custom authentication layers.
Architecture
Whoogle’s stack is intentionally lightweight:
- Language & Runtime: Go 1.22+, compiled to a statically linked binary, making it suitable for minimal Docker images or even native binaries on ARM and x86 hosts.
- Web Framework: The standard net/httppackage, supplemented by thegorilla/muxrouter for clean path handling.
- Data Persistence: None by default; all state is transient, though developers can hook into external storage for caching or analytics.
- Caching Layer: Optional in‑memory LRU cache (via github.com/hashicorp/golang-lru) to reduce repeated Google requests and mitigate rate‑limit exposure.
- Backend Integration: The project ships with a Mullvad Leta integration (LETA_INTEGRATION.md) that routes queries through Mullvad’s privacy‑focused search service when Google’s public API is blocked. This fallback can be toggled via an environment variable or config file, providing resilience against search‑engine policy changes.
Core Capabilities
- JSON API: Exposes a clean, well‑documented JSON schema for search results, including title, snippet, URL, and favicon data.
- Custom Bangs: Developers can extend the built‑in bang system (e.g., !g,!w) by editing a YAML configuration file, allowing routing to arbitrary URLs or search providers.
- Configuration: A single YAML/JSON config file (whoogle.conf) controls backend choice, cache size, request timeouts, and rate‑limit thresholds.
- Metrics: Built‑in Prometheus endpoint (/metrics) exposes request counts, latency histograms, and error rates for observability.
- Extensibility Hooks: Middleware support via net/httphandlers lets developers inject authentication, logging, or request rewriting logic without modifying core code.
Deployment & Infrastructure
Whoogle ships as a pre‑built Docker image (benbusby/whoogle-search) that runs on any OCI‑compatible runtime. The minimal base image (scratch or alpine) keeps the attack surface small and reduces disk usage to under 10 MB. For Kubernetes, a Helm chart is available (helm repo add whoogle https://benbusby.github.io/whoogle-search) that supports custom resource definitions for scaling, ingress configuration, and persistent caching. The application is stateless, making it horizontally scalable behind a load balancer; developers can simply replicate the pod or use Docker Swarm’s service replication. When running on edge platforms (Fly.io, Render.com), Whoogle can be configured with a single environment variable to enforce HTTPS and auto‑scale based on traffic.
Integration & Extensibility
Whoogle’s API can be consumed by any client that supports HTTP/JSON, making it ideal for embedding in browser extensions, mobile apps, or custom search widgets. The bang system can be extended to route queries to internal APIs (e.g., a local knowledge base) or third‑party search services. Webhooks are not built‑in, but the modular middleware architecture allows developers to publish events (e.g., onResultFetched) via a message broker like NATS or Kafka. Customization of the UI is straightforward: the front‑end is a simple static React bundle that can be swapped out or styled via CSS variables; the API remains unchanged.
Developer Experience
The project’s documentation is concise yet comprehensive, covering environment variables, deployment options, and advanced configuration. The community is active on GitHub Discussions and a dedicated mailing list, ensuring timely support for new integrations. Licensing under MIT allows unrestricted commercial use; the open‑source nature means developers can audit the scraping logic for compliance with Google’s terms of service. The single‑binary deployment model eliminates dependency hell, and the container image is regularly rebuilt with security patches.
Use Cases
- Enterprise Privacy Gateways: Deploy Whoogle behind a corporate proxy to provide employees with ad‑free, cookie‑less Google search without exposing internal IPs.
- Personal VPN/Router Integration: Run Whoogle on a home NAS or router (e.g., OpenWrt) to replace the default search engine in browsers, ensuring all traffic is anonymized.
- Developer Tooling: Use the JSON API to build custom dashboards or integrate search into CI/CD pipelines for documentation lookup.
- Educational Environments: Offer students a clean search interface in school networks, mitigating tracking while preserving Google’s comprehensive results.
Advantages
Whoogle gives developers fine‑grained control over search behavior without sacrificing the breadth of Google’s index. Its lightweight Go implementation offers low latency and high concurrency, outperforming heavier JavaScript‑based scrapers. The built‑in Mullvad Leta fallback provides resilience against policy changes, a
Open SourceReady to get started?
Join the community and start self-hosting Whoogle today
Related Apps in other
Immich
Self‑hosted photo and video manager
Syncthing
Peer‑to‑peer file sync, no central server
Strapi
Open-source headless CMS for modern developers
reveal.js
Create stunning web‑based presentations with HTML, CSS and JavaScript
Stirling-PDF
Local web PDF editor with split, merge, convert and more
MinIO
Fast, S3-compatible object storage for AI and analytics
Weekly Views
Repository Health
Information
Explore More Apps
Koel
Self‑hosted web music streaming for developers
Bracket
Manage tournaments effortlessly
OpenTTD
Build, manage, and expand transport empires
Feeds Fun
AI‑powered RSS reader with smart tags and custom rules
Chirpy
Privacy‑friendly, customizable comment system for modern websites
WeeChat
Lightweight, extensible chat client for multiple protocols
