Agentic Workflows (MCP)

RegressionBot natively supports the Model Context Protocol (MCP). Expose visual testing tools directly to your AI agents (Claude Code, OpenAI Codex, Gemini, Windsurf, Cursor) via our remote Streamable HTTP MCP server. This allows your agents to autonomously run regression tests, check statuses, and approve baselines without any local installation.

🔑 Prerequisite: API Key Required

To connect your agent to the remote MCP server, you need a valid API key.

Install

Configure your MCP client to connect to RegressionBot via Streamable HTTP using https://mcp.regressionbot.com/ and the x-api-key header.
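For clients that are not listed below, the same two settings apply: the endpoint URL and the x-api-key header. The following is a minimal Python sketch, using only the standard library, of how a first JSON-RPC `initialize` request to a Streamable HTTP endpoint could be built. The helper name `build_initialize_request`, the client name, and the protocol version string are assumptions for illustration; in practice an MCP client library would normally handle this handshake for you.

```python
import json
import urllib.request

MCP_URL = "https://mcp.regressionbot.com/"


def build_initialize_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request over Streamable HTTP."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; use the one your client library targets.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # A Streamable HTTP server may reply with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
            "x-api-key": api_key,
        },
    )


req = build_initialize_request("YOUR_API_KEY")
```

The request is only constructed here, not sent, so the sketch can be inspected without a valid key.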

Claude Code

Agentic IDEs and CLIs (such as Anthropic's Claude Code, OpenAI Codex, Cursor, Windsurf, and Gemini CLI) natively support Streamable HTTP transports and header configuration via command line, config files, or visual settings. RegressionBot's MCP endpoint is Streamable HTTP: configure the remote URL directly and authenticate with the x-api-key header. Do not configure it as a legacy SSE server.

Terminal

claude mcp add --transport http regressionbot https://mcp.regressionbot.com/ --header "x-api-key: YOUR_API_KEY"

OpenAI Codex

Codex supports Streamable HTTP MCP servers in both the CLI and IDE extension. Both clients share the same MCP configuration, so you only need to configure RegressionBot once. RegressionBot uses Streamable HTTP only and does not emit the endpoint event that legacy SSE servers send, so do not configure it as an SSE server.

Add the following to ~/.codex/config.toml. The env_http_headers entry tells Codex to read the key from your environment rather than storing it in the config file.

~/.codex/config.toml

[mcp_servers.regressionbot]
url = "https://mcp.regressionbot.com/"
env_http_headers = { "x-api-key" = "REGRESSIONBOT_API_KEY" }

Alternatively, register the server via the Codex CLI and it will write the [mcp_servers.regressionbot] entry for you. You still need to add env_http_headers manually afterwards as the CLI does not accept header flags.

Terminal (optional)

export REGRESSIONBOT_API_KEY="YOUR_API_KEY"
codex mcp add regressionbot --url https://mcp.regressionbot.com/
# then manually add env_http_headers to ~/.codex/config.toml

In the Codex terminal UI, run /mcp to verify the server is active. In the Codex IDE extension, open MCP settings from the gear menu after editing config.toml to verify that RegressionBot is listed.

Cursor

Configure remote servers directly through Cursor's graphical interface to keep your API keys encrypted.

1. Open Cursor Settings and navigate to Features > MCP Servers.
2. Click + Add New MCP Server.
3. Fill out the configuration:
   - Name: regressionbot
   - Type: http
   - URL: https://mcp.regressionbot.com/
4. Add a header row with x-api-key as the key and your API key as the value.
5. Click Save and verify the green status indicator.

Windsurf

Windsurf handles remote Streamable HTTP MCPs using a structured JSON file. Add the following to your ~/.codeium/windsurf/mcp_config.json:

mcp_config.json

{
  "mcpServers": {
    "regressionbot": {
      "serverUrl": "https://mcp.regressionbot.com/",
      "headers": {
        "x-api-key": "YOUR_API_KEY"
      }
    }
  }
}

Gemini CLI

Gemini CLI natively supports direct Streamable HTTP transport. You can add it via command line or settings file.

Option 1: Command Line

gemini mcp add --transport http regressionbot https://mcp.regressionbot.com/ --header "x-api-key: YOUR_API_KEY"

Option 2: settings.json

{
  "mcpServers": {
    "regressionbot": {
      "httpUrl": "https://mcp.regressionbot.com/",
      "headers": {
        "x-api-key": "YOUR_API_KEY"
      }
    }
  }
}

Agent Examples

These examples show what a real session looks like with an MCP-enabled Agentic IDE or CLI (like Claude Code, Codex, Cursor, Windsurf, etc.) with the RegressionBot MCP connected. The agent handles polling, result interpretation, and image analysis automatically.

Legend: USER> is what the user types, AGENT> is the assistant response, TOOL> is an MCP tool call, and TOOL RESULT> is raw tool output.

Compare a preview URL against production baselines

Agentic IDE — terminal session

USER> run visual regression on https://staging.myapp.com against stored baselines

AGENT> I'll run a visual regression test using the RegressionBot MCP.

TOOL> [run_regression_job]
  testOrigin: "https://staging.myapp.com"
  project:    "myapp"
  devices:    ["Desktop Chrome", "iPhone 12"]

TOOL RESULT> {
    "message": "Job started",
    "jobId": "6ee84bac-40de-44cc-8dca-4760ab108153"
  }

TOOL> [get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153

TOOL RESULT> {
    "jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
    "status": "PROCESSING",
    "isComplete": false,
    "progress": { "total": 6, "completed": 3, "percent": "50.00" }
  }

TOOL> [get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153

TOOL RESULT> {
    "jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
    "status": "COMPLETED",
    "isComplete": true,
    "progress": { "total": 6, "completed": 6, "percent": "100.00" }
  }

TOOL> [get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153

TOOL RESULT> {
    "status": "COMPLETED",
    "isComplete": true,
    "progress": { "total": 6, "completed": 6, "percent": "100.00" },
    "executionTime": 48
  }

TOOL> [get_job_summary] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153, regressionsOnly: true

TOOL RESULT> {
    "regressionCount": 2,
    "matchCount": 4,
    "newBaselineCount": 0,
    "results": [
      {
        "url": "https://staging.myapp.com/pricing",
        "variantName": "Desktop Chrome",
        "status": "SUCCESS",
        "diffPercentage": 3.638671875,
        "score": 96.361328125,
        "isNewBaseline": false,
        "regressionbotSummary": "The pricing table layout changed: the Pro tier
          card is now highlighted with a border and the CTA button colour shifted
          from grey to blue. No structural breakage detected.",
        "diffUrl": "https://screenshots.regressionbot.com/r/token"
      },
      {
        "url": "https://staging.myapp.com/pricing",
        "variantName": "iPhone 12",
        "status": "SUCCESS",
        "diffPercentage": 3.638671875,
        "score": 96.361328125,
        "isNewBaseline": false,
        "regressionbotSummary": "Same pricing change on iPhone 12 — the card
          highlight and button colour update are visible. Font sizes unchanged.",
        "diffUrl": "https://screenshots.regressionbot.com/r/token"
      }
    ]
  }

AGENT> The job completed in 48 seconds. I found 2 regressions:

• /pricing [Desktop Chrome] — 3.64% diff, score 96
  "The Pro tier card is now highlighted with a border and the CTA button
   colour shifted from grey to blue."

• /pricing [iPhone 12] — 3.64% diff, score 96
  "Same change on iPhone 12 — card highlight and button colour visible."

AGENT> The other 4 pages matched their baselines exactly.
AGENT> Would you like me to fetch the diff images and review them before approving?
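The polling behaviour in the session above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the agent's actual implementation: `call_tool` is a hypothetical callable standing in for whatever your MCP client uses to invoke a tool, and the interval and timeout values are arbitrary defaults.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def wait_for_job(call_tool: ToolCaller, job_id: str,
                 interval_s: float = 5.0,
                 timeout_s: float = 600.0) -> Dict[str, Any]:
    """Poll get_job_status until isComplete is true, then return that status."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = call_tool("get_job_status", {"jobId": job_id})
        if status.get("isComplete"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} is still {status.get('status')}")
        time.sleep(interval_s)
```

Checking `isComplete` rather than comparing `status` strings keeps the loop independent of the exact status names.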

Inspect a diff image and approve

Agentic IDE — terminal session (continued)

USER> yes, look at the pricing diff and tell me if the change looks intentional

AGENT> [fetching diffUrl for /pricing Desktop Chrome via HTTP GET...]
AGENT> I can see the 3-panel diff — BEFORE | AFTER | ANNOTATED DIFF.
AGENT> The changes are clearly intentional:
- The Pro plan card now has a visible border and background highlight
- The "Get started" button changed from #6b7280 to #2563eb (grey → blue)
- No layout shifts, broken text, or clipped elements

AGENT> This looks like a deliberate design update. Safe to approve.

USER> approve it

TOOL> [approve_job] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153

TOOL RESULT> {
    "message": "Job approved",
    "approvedCount": 2,
    "failedCount": 0
  }

AGENT> Done. The 2 updated screenshots have been promoted to new baselines.
AGENT> Future runs will compare against the new design.

Run multiple jobs in parallel across devices

Agentic IDE — terminal session

USER> test the PR preview against baselines on desktop, mobile, and tablet

AGENT> I'll start 3 parallel jobs — one per device — then poll all of them.

TOOL> [run_regression_job] project: "myapp", devices: ["Desktop Chrome"]
TOOL RESULT> { "jobId": "31c66aa4-5bd7-4df1-a8e3-1394d77a33f9" }

TOOL> [run_regression_job] project: "myapp", devices: ["iPhone 12"]
TOOL RESULT> { "jobId": "1a597938-b08e-4417-9802-cbb457870c96" }

TOOL> [run_regression_job] project: "myapp", devices: ["tablet"]
TOOL RESULT> { "jobId": "d5452cb0-6b7a-4186-a0b1-0e23204a31e3" }

TOOL> [get_job_status] 31c66aa4
TOOL RESULT> PROCESSING 33%
TOOL> [get_job_status] 1a597938
TOOL RESULT> PROCESSING 50%
TOOL> [get_job_status] d5452cb0
TOOL RESULT> PROCESSING 17%
...
TOOL> [get_job_status] 31c66aa4
TOOL RESULT> COMPLETED ✓
TOOL> [get_job_status] 1a597938
TOOL RESULT> COMPLETED ✓
TOOL> [get_job_status] d5452cb0
TOOL RESULT> COMPLETED ✓

TOOL> [get_job_summary] 31c66aa4, regressionsOnly: true
TOOL RESULT> 1 regression
TOOL> [get_job_summary] 1a597938, regressionsOnly: true
TOOL RESULT> 1 regression
TOOL> [get_job_summary] d5452cb0, regressionsOnly: true
TOOL RESULT> 0 regressions

AGENT> Results across all devices:

• /about  [Desktop Chrome]  — 3.64% diff, score 96
  "Hero section padding increased, CTA button widened."

• /about  [iPhone 12]          — 3.64% diff, score 96
  "Same layout change on iPhone 12 — consistent with desktop update."

AGENT> All tablet pages matched baselines.
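The fan-out pattern in this session (start one job per device, poll them all, then collect summaries) might look like the following Python sketch. As before, `call_tool` is a hypothetical stand-in for your MCP client's tool invocation, and `run_per_device`, the interval, and the round limit are illustrative assumptions.

```python
import time
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_per_device(call_tool: ToolCaller, project: str, devices: List[str],
                   interval_s: float = 5.0,
                   max_rounds: int = 120) -> Dict[str, Dict[str, Any]]:
    """Start one job per device, poll until all complete, return summaries."""
    job_ids: Dict[str, str] = {}
    for device in devices:
        started = call_tool("run_regression_job",
                            {"project": project, "devices": [device]})
        job_ids[device] = started["jobId"]

    pending = dict(job_ids)
    for _ in range(max_rounds):
        # Poll every job that has not finished yet.
        for device, job_id in list(pending.items()):
            status = call_tool("get_job_status", {"jobId": job_id})
            if status.get("isComplete"):
                del pending[device]
        if not pending:
            break
        time.sleep(interval_s)
    else:
        # The loop ran out of rounds without every job completing.
        raise TimeoutError(f"jobs still running for: {sorted(pending)}")

    return {
        device: call_tool("get_job_summary",
                          {"jobId": job_id, "regressionsOnly": True})
        for device, job_id in job_ids.items()
    }
```

The jobs run in parallel on the server side; the client only needs a single sequential loop to track them.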

Recover a previous job by listing recent runs

Agentic IDE — terminal session

USER> what was the result of the last run for the myapp project?

TOOL> [list_jobs] project: "myapp", limit: 10

TOOL RESULT> {
    "jobs": [
      {
        "jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
        "project": "myapp",
        "status": "APPROVED",
        "totalUrls": 6,
        "completedCount": 6,
        "createdAt": "2026-05-19T10:53:26.000Z",
        "finishedAt": "2026-05-19T10:54:14.000Z"
      },
      {
        "jobId": "e69062bb-f140-40e7-a3e8-40f84deadcdb",
        "project": "myapp",
        "status": "COMPLETED",
        "totalUrls": 6,
        "completedCount": 6,
        "createdAt": "2026-05-17T13:22:01.000Z",
        "finishedAt": "2026-05-17T13:22:06.000Z"
      }
    ]
  }

AGENT> Most recent: job 6ee84bac — APPROVED, 6 pages tested, finished ~1 hour ago.
AGENT> The run before that is COMPLETED but not yet approved (2 regressions pending).

Tip: The agent can fetch and visually inspect diff images directly from the shortened, clickable diffUrl in the results. Each diff is a 3-panel image — BEFORE | AFTER | ANNOTATED DIFF — with changed pixels highlighted in red and bounding boxes drawn around clusters of changes. This is the same image used to generate regressionbotSummary.

Projects

A project is a named configuration that locks in your test parameters — testOrigin, devices, paths, and more. The first run_regression_job call with a new project name creates the config automatically. Subsequent runs enforce that the params match, protecting baseline integrity.

How it works

First run — supply all params. RegressionBot creates the project config and captures baselines.
Repeat runs — omit params (or supply identical ones) to reuse the stored config. If you pass params that differ from the stored config, the run is rejected with an error listing the differing fields.
Shorthand — use run_from_project to run a repeat job without supplying any params at all.
Update config — use update_project to update a saved project's configuration and reset its baselines. Example: update_project({ name: "my-project", testOrigin: "https://v2.example.com" })
Approve by name — use approve_project to approve the most recent completed job for a project without needing a job ID. Example: approve_project({ name: "my-project" })
Live-vs-live runs — always accepted regardless of stored config. Passing baseOrigin skips the config check entirely.
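The matching rules above can be modelled with a small Python sketch. It is a client-side illustration of the described behaviour, not RegressionBot's server code: the function name `differing_fields` is hypothetical, and treating a supplied field that is absent from the stored config as a mismatch is an assumption.

```python
from typing import Any, Dict, List


def differing_fields(stored: Dict[str, Any],
                     supplied: Dict[str, Any]) -> List[str]:
    """Return the supplied fields whose values differ from the stored config."""
    # Live-vs-live runs: passing baseOrigin skips the config check entirely.
    if "baseOrigin" in supplied:
        return []
    # Omitted params reuse the stored value, so only supplied keys are compared.
    return sorted(key for key, value in supplied.items()
                  if stored.get(key) != value)


stored_config = {
    "testOrigin": "https://www.myapp.com",
    "devices": ["Desktop Chrome", "iPhone 12"],
}
# An empty list means the run would be accepted under this model.
mismatches = differing_fields(
    stored_config, {"testOrigin": "https://staging.myapp.com"}
)
```

A non-empty result corresponds to the case where the run is rejected with an error listing the differing fields.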

Managed CI workflow example

Agentic IDE — terminal session

USER> set up a new project for myapp and capture baselines

AGENT> I'll create the project on the first run and capture baselines.

TOOL> [run_regression_job]
  project:    "myapp"
  testOrigin: "https://www.myapp.com"
  devices:    ["Desktop Chrome", "iPhone 12"]
  paths:      [{"path":"/"}, {"path":"/pricing"}, {"path":"/docs"}]
  autoApprove: true

TOOL RESULT> { "message": "Job started", "jobId": "a1b2c3d4-..." }

AGENT> Baselines are being captured for 3 pages on desktop and mobile.
AGENT> The config is now saved as project "myapp". Future runs are locked to these params.

---

USER> run regression check for myapp

TOOL> [run_from_project] name: "myapp"

TOOL RESULT> { "message": "Job started", "jobId": "e5f6a7b8-...", "project": "myapp" }

AGENT> Job started using stored config: testOrigin https://www.myapp.com, Desktop Chrome + iPhone 12, 3 paths.

---

USER> what projects do I have?

TOOL> [list_projects]

TOOL RESULT> {
  "projects": [
    {
      "name": "myapp",
      "testOrigin": "https://www.myapp.com",
      "devices": ["Desktop Chrome", "iPhone 12"],
      "lastRunAt": "2026-06-02T10:15:00.000Z",
      "lastJobId": "e5f6a7b8-..."
    }
  ]
}
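In a scripted CI step, the same workflow could be reduced to a pass/fail gate. The Python sketch below is one possible shape for that, assuming a hypothetical `call_tool` callable that invokes MCP tools and returns parsed results; the function name `ci_gate` and the exit-code convention (0 clean, 1 regressions, 2 timeout) are illustrative choices, not part of RegressionBot.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def ci_gate(call_tool: ToolCaller, project: str,
            interval_s: float = 5.0, max_polls: int = 120) -> int:
    """Run the stored project config and return a process exit code."""
    job_id = call_tool("run_from_project", {"name": project})["jobId"]

    for _ in range(max_polls):
        if call_tool("get_job_status", {"jobId": job_id}).get("isComplete"):
            break
        time.sleep(interval_s)
    else:
        return 2  # Timed out before the job completed.

    summary = call_tool("get_job_summary",
                        {"jobId": job_id, "regressionsOnly": True})
    # Fail the build when any regression is reported.
    return 1 if summary.get("regressionCount", 0) > 0 else 0
```

A separate, later step could then call approve_project once a reviewer has accepted the changes.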

Available Tools

Once connected, your agent has access to the following tools. The MCP server provides instructions automatically, so your agent understands the workflow without any additional prompting.

Tool	What it does
run_regression_job	Start a visual regression job. Accepts explicit `paths`, a `sitemapUrl` to crawl, or glob `scans` to filter a sitemap. Supports multiple `devices`, element `masks`, and an optional `baseOrigin` for ad-hoc origin-vs-origin comparison.
get_job_status	Poll progress. Returns `status`, `summaryStatus`, `progress` (total / completed / percent), `executionTime`, and an `isComplete` flag so your agent knows when to stop polling. Status lifecycle: `INITIALIZING → PROCESSING → COMPLETED`. Once `status` is `COMPLETED`, you can download images and view diff metrics immediately, but you must wait for `summaryStatus === 'COMPLETE'` to fetch AI summaries.
get_job_summary	Get full results once the job is complete. Returns `summaryStatus` alongside results. When `summaryStatus` is `'COMPLETE'`, results include the AI-written `regressionbotSummary`; otherwise that field is omitted so the agent never sees incomplete or empty summaries. Each result also includes `diffPercentage`, `score`, `visualMatchScore`, and shortened screenshot redirect URLs (falling back to pre-signed cloud storage URLs on write errors). Pass `regressionsOnly: true` to filter out matching pages.
list_jobs	List recent jobs ordered by most recent first. Filter by `project` or set a `limit`. Useful for recovering a `jobId` or checking the status of recent runs without leaving the agent session.
approve_job	Promote current screenshots to baselines, accepting all visual changes as intentional. This is permanent — existing baselines are overwritten. Only call after reviewing regressions in `get_job_summary`.
run_from_project	Run a job using a saved project config — no need to re-supply `testOrigin`, `devices`, or paths. Ideal for CI repeat runs once a project is established. Accepts optional `autoApprove`.
get_project	Retrieve the stored configuration for a named project — `testOrigin`, `devices`, `paths`, `scans`, and run metadata.
list_projects	List all saved project configs for your org. Returns name, testOrigin, devices, and last run info for each project.
update_project	Update a saved project's configuration and reset its baselines. Pass any subset of `testOrigin`, `devices`, `paths`, `masks`, etc. Triggers baseline invalidation — the next run will re-baseline everything.
approve_project	Approve the most recent completed job for a project, promoting its screenshots to baselines. No job ID required — ideal for CI workflows where the approving step runs separately from the regression check.
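The two-stage completion described for get_job_status (images and diff metrics first, AI summaries once `summaryStatus` is `'COMPLETE'`) can be expressed as a small Python helper. This is a sketch of the documented rule only; the function name `available_outputs` and the output labels are hypothetical.

```python
from typing import Any, Dict, List


def available_outputs(status_payload: Dict[str, Any]) -> List[str]:
    """Map a get_job_status payload to what can be fetched right now."""
    outputs: List[str] = []
    if status_payload.get("isComplete"):
        # Screenshots and diff metrics are ready as soon as the job completes.
        outputs += ["images", "diff_metrics"]
        # AI summaries become available only once summaryStatus is COMPLETE.
        if status_payload.get("summaryStatus") == "COMPLETE":
            outputs.append("ai_summaries")
    return outputs
```

An agent can use a check like this to decide whether to call get_job_summary immediately or keep polling for the summaries.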