MCPSERV.CLUB
Whoogle

Whoogle

Self-Hosted

Private, ad‑free Google search engine

Active(100)
11.1kstars
0views
Updated 11 days ago
Whoogle screenshot 1
1 / 3

Overview

Discover what makes Whoogle powerful

Whoogle is a self‑hosted search engine front‑end that scrapes Google’s public web results while stripping ads, JavaScript, AMP links, and tracking cookies. From a technical standpoint it acts as a thin HTTP proxy that normalizes the Google search page into a minimal, privacy‑oriented HTML surface. The core logic is written in Go, leveraging its standard library for HTTP handling and concurrent goroutines to manage request pipelines. The application exposes a single REST endpoint (`/search`) that accepts query parameters and returns a JSON payload of search results, which the front‑end consumes to render a clean UI. This separation allows developers to embed Whoogle as a microservice in larger stacks or expose it behind custom authentication layers.

Language & Runtime

Web Framework

Data Persistence

Caching Layer

Overview

Whoogle is a self‑hosted search engine front‑end that scrapes Google’s public web results while stripping ads, JavaScript, AMP links, and tracking cookies. From a technical standpoint it acts as a thin HTTP proxy that normalizes the Google search page into a minimal, privacy‑oriented HTML surface. The core logic is written in Go, leveraging its standard library for HTTP handling and concurrent goroutines to manage request pipelines. The application exposes a single REST endpoint (/search) that accepts query parameters and returns a JSON payload of search results, which the front‑end consumes to render a clean UI. This separation allows developers to embed Whoogle as a microservice in larger stacks or expose it behind custom authentication layers.

Architecture

Whoogle’s stack is intentionally lightweight:

  • Language & Runtime: Go 1.22+, compiled to a statically linked binary, making it suitable for minimal Docker images or even native binaries on ARM and x86 hosts.
  • Web Framework: The standard net/http package, supplemented by the gorilla/mux router for clean path handling.
  • Data Persistence: None by default; all state is transient, though developers can hook into external storage for caching or analytics.
  • Caching Layer: Optional in‑memory LRU cache (via github.com/hashicorp/golang-lru) to reduce repeated Google requests and mitigate rate‑limit exposure.
  • Backend Integration: The project ships with a Mullvad Leta integration (LETA_INTEGRATION.md) that routes queries through Mullvad’s privacy‑focused search service when Google’s public API is blocked. This fallback can be toggled via an environment variable or config file, providing resilience against search‑engine policy changes.

Core Capabilities

  • JSON API: Exposes a clean, well‑documented JSON schema for search results, including title, snippet, URL, and favicon data.
  • Custom Bangs: Developers can extend the built‑in bang system (e.g., !g, !w) by editing a YAML configuration file, allowing routing to arbitrary URLs or search providers.
  • Configuration: A single YAML/JSON config file (whoogle.conf) controls backend choice, cache size, request timeouts, and rate‑limit thresholds.
  • Metrics: Built‑in Prometheus endpoint (/metrics) exposes request counts, latency histograms, and error rates for observability.
  • Extensibility Hooks: Middleware support via net/http handlers lets developers inject authentication, logging, or request rewriting logic without modifying core code.

Deployment & Infrastructure

Whoogle ships as a pre‑built Docker image (benbusby/whoogle-search) that runs on any OCI‑compatible runtime. The minimal base image (scratch or alpine) keeps the attack surface small and reduces disk usage to under 10 MB. For Kubernetes, a Helm chart is available (helm repo add whoogle https://benbusby.github.io/whoogle-search) that supports custom resource definitions for scaling, ingress configuration, and persistent caching. The application is stateless, making it horizontally scalable behind a load balancer; developers can simply replicate the pod or use Docker Swarm’s service replication. When running on edge platforms (Fly.io, Render.com), Whoogle can be configured with a single environment variable to enforce HTTPS and auto‑scale based on traffic.

Integration & Extensibility

Whoogle’s API can be consumed by any client that supports HTTP/JSON, making it ideal for embedding in browser extensions, mobile apps, or custom search widgets. The bang system can be extended to route queries to internal APIs (e.g., a local knowledge base) or third‑party search services. Webhooks are not built‑in, but the modular middleware architecture allows developers to publish events (e.g., onResultFetched) via a message broker like NATS or Kafka. Customization of the UI is straightforward: the front‑end is a simple static React bundle that can be swapped out or styled via CSS variables; the API remains unchanged.

Developer Experience

The project’s documentation is concise yet comprehensive, covering environment variables, deployment options, and advanced configuration. The community is active on GitHub Discussions and a dedicated mailing list, ensuring timely support for new integrations. Licensing under MIT allows unrestricted commercial use; the open‑source nature means developers can audit the scraping logic for compliance with Google’s terms of service. The single‑binary deployment model eliminates dependency hell, and the container image is regularly rebuilt with security patches.

Use Cases

  • Enterprise Privacy Gateways: Deploy Whoogle behind a corporate proxy to provide employees with ad‑free, cookie‑less Google search without exposing internal IPs.
  • Personal VPN/Router Integration: Run Whoogle on a home NAS or router (e.g., OpenWrt) to replace the default search engine in browsers, ensuring all traffic is anonymized.
  • Developer Tooling: Use the JSON API to build custom dashboards or integrate search into CI/CD pipelines for documentation lookup.
  • Educational Environments: Offer students a clean search interface in school networks, mitigating tracking while preserving Google’s comprehensive results.

Advantages

Whoogle gives developers fine‑grained control over search behavior without sacrificing the breadth of Google’s index. Its lightweight Go implementation offers low latency and high concurrency, outperforming heavier JavaScript‑based scrapers. The built‑in Mullvad Leta fallback provides resilience against policy changes, a

Open SourceReady to get started?

Join the community and start self-hosting Whoogle today

Weekly Views

Loading...
Support Us
Most Popular

Infrastructure Supporter

$5/month

Keep our servers running and help us maintain the best directory for developers

Repository Health

Loading health data...

Information

Category
other
License
MIT
Stars
11.1k
Technical Specs
Pricing
Open Source
Docker
Official
Supported OS
LinuxDocker
Author
benbusby
benbusby
Last Updated
11 days ago