Web Agent Protocol

MCP Server

Record and replay browser interactions seamlessly

About

The Web Agent Protocol (WAP) enables users to capture web interactions via a Chrome extension, convert them into replayable action lists, and serve them as MCP servers for automated browser testing or agent use.

Capabilities

Resources: Access data sources
Tools: Execute functions
Prompts: Pre-built templates
Sampling: AI model interactions

The Web Agent Protocol (WAP) addresses a common pain point for developers building web‑centric AI assistants: reliably capturing, transforming, and replaying user interactions in the browser. Traditional approaches rely on brittle screen‑scraping or ad‑hoc automation scripts that break when page layouts change. WAP separates recording user actions from executing them, so agents work from deterministic action lists rather than improvised scripts, and its smart‑replay mode is intended to tolerate moderate UI drift.

At its core, WAP provides a lightweight HTTP server that receives raw event streams from the OTA‑WAP Chrome extension. These events (clicks, form submissions, navigation steps) are converted into two kinds of action lists. Exact‑replay reproduces every captured event verbatim. Smart‑replay abstracts common patterns (e.g., “click the first button with a given class”) to improve robustness. The server exposes these lists via an MCP interface, so any AI assistant, whether Claude, GPT‑4, or a custom agent, can request and execute them through the WAP‑Replay protocol. Because recording, transformation, and execution are separate components, developers can swap or combine them without re‑engineering their entire stack.
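
The difference between the two conversion modes can be illustrated with a minimal Python sketch. The event fields (type, selector, timestamp, text, value) and the function names to_exact_replay and to_smart_replay are hypothetical, chosen for illustration only; they are not taken from the WAP SDK, and the smart‑replay heuristics shown are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawEvent:
    """Hypothetical shape of one captured browser event (not the WAP schema)."""
    type: str                      # "click", "input", "navigate", ...
    selector: str                  # CSS selector recorded at capture time
    timestamp: float               # seconds since recording started
    text: Optional[str] = None     # visible text of the target, if any
    value: Optional[str] = None    # typed value or URL, if any


def to_exact_replay(events):
    """Keep every event verbatim, in recorded order."""
    return [
        {"action": e.type, "selector": e.selector, "value": e.value, "at": e.timestamp}
        for e in sorted(events, key=lambda e: e.timestamp)
    ]


def to_smart_replay(events):
    """Abstract events into intent-level steps.

    Assumed heuristics: consecutive inputs into the same field collapse to
    the final value, and a target is described by its visible text when
    that is available, otherwise by the recorded selector.
    """
    steps = []
    for e in sorted(events, key=lambda e: e.timestamp):
        target = {"text": e.text} if e.text else {"selector": e.selector}
        step = {"action": e.type, "target": target, "value": e.value}
        last = steps[-1] if steps else None
        if last and e.type == "input" and last["action"] == "input" and last["target"] == target:
            steps[-1] = step   # keep only the final value typed into this field
        else:
            steps.append(step)
    return steps


events = [
    RawEvent("navigate", "", 0.0, value="https://example.com/form"),
    RawEvent("input", "#name", 1.0, value="A"),
    RawEvent("input", "#name", 1.2, value="Ad"),
    RawEvent("input", "#name", 1.4, value="Ada"),
    RawEvent("click", "form > button.btn:nth-child(3)", 2.0, text="Submit"),
]

exact = to_exact_replay(events)
smart = to_smart_replay(events)
print(len(exact), len(smart))  # prints: 5 3
```

In this sketch the exact list keeps all three keystroke events and the brittle positional selector, while the smart list keeps one input step and identifies the button by its visible text.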

Key capabilities include:

Event capture: A Chrome extension streams low‑level DOM events to the server, preserving timestamps and target metadata.
Action conversion: The SDK turns raw events into reusable action objects, supporting both exact and smart replay.
MCP integration: Recorded actions are exposed as MCP resources, enabling agents to discover and invoke them like any other tool.
Replay fidelity: The WAP‑Replay protocol is designed so that replayed actions hit the same elements, even if the page has changed slightly since recording.
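
One way to picture how a replayed action might still find its element after a minor page change is a fallback lookup that tries progressively looser criteria. The sketch below is an assumed strategy for illustration, not the documented WAP‑Replay algorithm; the page is modeled as a plain list of dictionaries, and resolve_target is a hypothetical helper.

```python
def resolve_target(elements, recorded):
    """Return the best match for a recorded target, or None.

    Tries id first, then tag plus visible text, then tag plus CSS classes.
    A criterion is accepted only when it identifies exactly one element,
    so an ambiguous match never triggers an action on the wrong element.
    """
    strategies = [
        lambda el: recorded.get("id") is not None and el.get("id") == recorded["id"],
        lambda el: recorded.get("text") is not None
        and el.get("tag") == recorded.get("tag")
        and el.get("text") == recorded["text"],
        lambda el: bool(recorded.get("classes"))
        and el.get("tag") == recorded.get("tag")
        and set(recorded["classes"]) <= set(el.get("classes", [])),
    ]
    for matches in strategies:
        found = [el for el in elements if matches(el)]
        if len(found) == 1:
            return found[0]
    return None


# Metadata captured at recording time (hypothetical fields).
recorded = {"id": "submit-btn", "tag": "button", "text": "Submit", "classes": ["btn", "primary"]}

# The page after a redesign: the id was renamed, but the text is unchanged.
page = [
    {"id": "cancel", "tag": "button", "text": "Cancel", "classes": ["btn"]},
    {"id": "send", "tag": "button", "text": "Submit", "classes": ["btn", "primary", "large"]},
]

match = resolve_target(page, recorded)
print(match["id"] if match else "no match")  # prints: send
```

Returning None on ambiguity is a deliberate choice in this sketch: a replay that stops is easier to diagnose than one that silently clicks the wrong button.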

Several use cases follow from this design. A content curator could record a sequence that fills out and submits a form, then hand that action list to an AI assistant that needs to batch‑process multiple entries. A QA engineer might record a complex workflow and replay it under different user scenarios to verify consistency. Marketing teams can capture dynamic page interactions (e.g., A/B test toggles) and replay them in controlled experiments.
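
The batch‑processing case can be sketched as a recorded action list whose typed values are replaced with placeholders and then filled in once per entry. The action‑list shape and the bind_actions helper below are hypothetical and only illustrate the idea; WAP's own format may differ.

```python
import copy
from string import Template


def bind_actions(action_list, record):
    """Return a copy of action_list with $placeholders filled from record."""
    bound = copy.deepcopy(action_list)   # never mutate the recorded original
    for step in bound:
        if isinstance(step.get("value"), str):
            # substitute() raises KeyError if the record lacks a placeholder
            step["value"] = Template(step["value"]).substitute(record)
    return bound


# A recorded form submission, with typed values turned into placeholders.
recorded = [
    {"action": "navigate", "value": "https://example.com/submit"},
    {"action": "input", "target": {"text": "Title"}, "value": "$title"},
    {"action": "input", "target": {"text": "URL"}, "value": "$url"},
    {"action": "click", "target": {"text": "Submit"}, "value": None},
]

entries = [
    {"title": "First post", "url": "https://example.com/1"},
    {"title": "Second post", "url": "https://example.com/2"},
]

batch = [bind_actions(recorded, entry) for entry in entries]
print(batch[1][1]["value"])  # prints: Second post
```

Because string.Template treats $ as special, a recorded value that contains a literal dollar sign would need to be escaped as $$ before binding.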

By plugging into existing AI workflows through MCP, WAP lets developers treat browser interactions as first‑class data. Agents can discover recorded actions, compose them with other tools, and orchestrate end‑to‑end automation pipelines, while the action lists themselves serve as an audit trail of what was executed. This abstraction, together with the robustness of smart‑replay, is WAP's main advantage over ad‑hoc scripting and makes it a strong fit for AI systems that need to interact reliably with the web.