Agentic Workflows (MCP)
RegressionBot natively supports the Model Context Protocol (MCP). Expose visual testing tools directly to your AI agents (Claude Code, OpenAI Codex, Gemini, Windsurf, Cursor) via our remote Streamable HTTP MCP server. This allows your agents to autonomously run regression tests, check statuses, and approve baselines without any local installation.
Install
Configure your MCP client to connect to RegressionBot via Streamable HTTP using https://mcp.regressionbot.com/ and the x-api-key header.
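
For readers who want to see what "Streamable HTTP plus an x-api-key header" means on the wire, here is a minimal sketch that builds (but does not send) an MCP initialize request. The endpoint and header name come from the text above. The JSON-RPC body follows the public MCP specification. The protocolVersion value, the clientInfo fields, and the REGRESSIONBOT_API_KEY fallback are illustrative assumptions, and RegressionBot may negotiate a different protocol version.

```python
import json
import os
import urllib.request

MCP_URL = "https://mcp.regressionbot.com/"  # endpoint from the text above


def build_initialize_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) an MCP `initialize` request over Streamable HTTP."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: any protocol revision that defines Streamable HTTP.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
            "x-api-key": api_key,
        },
    )


request = build_initialize_request(
    os.environ.get("REGRESSIONBOT_API_KEY", "YOUR_API_KEY")
)
print(request.get_method(), request.full_url)
```

In practice your MCP client performs this handshake for you; the sketch only shows where the URL and the header end up.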
Claude Code
Agentic IDEs and CLIs (such as Anthropic's Claude Code, OpenAI Codex, Cursor, Windsurf, and Gemini CLI) natively support Streamable HTTP transports and header configuration via command line, config files, or visual settings. RegressionBot's MCP endpoint is Streamable HTTP: configure the remote URL directly and authenticate with the x-api-key header. Do not configure it as a legacy SSE server.
claude mcp add --transport http regressionbot https://mcp.regressionbot.com/ --header "x-api-key: YOUR_API_KEY"
OpenAI Codex
Codex supports Streamable HTTP MCP servers in both the CLI and IDE extension. Both clients share the same MCP configuration, so you only need to configure RegressionBot once. The RegressionBot server does not use a legacy SSE endpoint event.
Add the following to ~/.codex/config.toml. The env_http_headers entry tells Codex to read the key from your environment rather than storing it in the config file.
[mcp_servers.regressionbot]
url = "https://mcp.regressionbot.com/"
env_http_headers = { "x-api-key" = "REGRESSIONBOT_API_KEY" }
Alternatively, register the server via the Codex CLI, which writes the [mcp_servers.regressionbot] entry for you. You still need to add env_http_headers manually afterwards, because the CLI does not accept header flags.
export REGRESSIONBOT_API_KEY="YOUR_API_KEY"
codex mcp add regressionbot --url https://mcp.regressionbot.com/
# then manually add env_http_headers to ~/.codex/config.toml
In the Codex terminal UI, run /mcp to verify the server is active. In the Codex IDE extension, open MCP settings from the gear menu after editing config.toml to verify that RegressionBot is listed.
Cursor
Configure remote servers directly through Cursor's graphical interface to keep your API keys encrypted.
- Open Cursor Settings and navigate to Features > MCP Servers.
- Click + Add New MCP Server.
- Fill out the configuration:
  - Name: regressionbot
  - Type: http
  - URL: https://mcp.regressionbot.com/
- Add a header row with x-api-key as the key and your API key as the value.
- Click Save and verify the green status indicator.
Windsurf
Windsurf handles remote Streamable HTTP MCPs using a structured JSON file. Add the following to your ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"regressionbot": {
"serverUrl": "https://mcp.regressionbot.com/",
"headers": {
"x-api-key": "YOUR_API_KEY"
}
}
}
}
Gemini CLI
Gemini CLI natively supports direct Streamable HTTP transport. You can add it via command line or settings file.
gemini mcp add --transport http regressionbot https://mcp.regressionbot.com/ --header "x-api-key: YOUR_API_KEY"
{
"mcpServers": {
"regressionbot": {
"httpUrl": "https://mcp.regressionbot.com/",
"headers": {
"x-api-key": "YOUR_API_KEY"
}
}
}
}
Agent Examples
These examples show what a real session looks like with an MCP-enabled Agentic IDE or CLI (like Claude Code, Codex, Cursor, Windsurf, etc.) with the RegressionBot MCP connected. The agent handles polling, result interpretation, and image analysis automatically.
Legend: USER> is what the user types, AGENT> is the assistant response, TOOL> is an MCP tool call, and TOOL RESULT> is raw tool output.
Compare a preview URL against production baselines
USER> run visual regression on https://staging.myapp.com against stored baselines
AGENT> I'll run a visual regression test using the RegressionBot MCP.
TOOL> [run_regression_job]
testOrigin: "https://staging.myapp.com"
project: "myapp"
devices: ["Desktop Chrome", "iPhone 12"]
TOOL RESULT> {
"message": "Job started",
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153"
}
TOOL> [get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
TOOL RESULT> {
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
"status": "PROCESSING",
"isComplete": false,
"progress": { "total": 6, "completed": 3, "percent": "50.00" }
}
TOOL> [get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
TOOL RESULT> {
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
"status": "COMPLETED",
"isComplete": true,
"progress": { "total": 6, "completed": 6, "percent": "100.00" }
}
TOOL> [get_job_status] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
TOOL RESULT> {
"status": "COMPLETED",
"isComplete": true,
"progress": { "total": 6, "completed": 6, "percent": "100.00" },
"executionTime": 48
}
TOOL> [get_job_summary] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153, regressionsOnly: true
TOOL RESULT> {
"regressionCount": 2,
"matchCount": 4,
"newBaselineCount": 0,
"results": [
{
"url": "https://staging.myapp.com/pricing",
"variantName": "Desktop Chrome",
"status": "SUCCESS",
"diffPercentage": 3.638671875,
"score": 96.361328125,
"isNewBaseline": false,
"regressionbotSummary": "The pricing table layout changed: the Pro tier
card is now highlighted with a border and the CTA button colour shifted
from grey to blue. No structural breakage detected.",
"diffUrl": "https://screenshots.regressionbot.com/r/token"
},
{
"url": "https://staging.myapp.com/pricing",
"variantName": "iPhone 12",
"status": "SUCCESS",
"diffPercentage": 3.638671875,
"score": 96.361328125,
"isNewBaseline": false,
"regressionbotSummary": "Same pricing change on iPhone 12: the card
highlight and button colour update are visible. Font sizes unchanged.",
"diffUrl": "https://screenshots.regressionbot.com/r/token"
}
]
}
AGENT> The job completed in 48 seconds. I found 2 regressions:
• /pricing [Desktop Chrome]: 3.64% diff, score 96
"The Pro tier card is now highlighted with a border and the CTA button
colour shifted from grey to blue."
• /pricing [iPhone 12]: 3.64% diff, score 96
"Same change on iPhone 12: card highlight and button colour visible."
AGENT> The other 4 pages matched their baselines exactly.
AGENT> Would you like me to fetch the diff images and review them before approving?
Inspect a diff image and approve
USER> yes, look at the pricing diff and tell me if the change looks intentional
AGENT> [fetching diffUrl for /pricing Desktop Chrome via HTTP GET...]
AGENT> I can see the 3-panel diff: BEFORE | AFTER | ANNOTATED DIFF.
AGENT> The changes are clearly intentional:
- The Pro plan card now has a visible border and background highlight
- The "Get started" button changed from #6b7280 to #2563eb (grey → blue)
- No layout shifts, broken text, or clipped elements
AGENT> This looks like a deliberate design update. Safe to approve.
USER> approve it
TOOL> [approve_job] jobId: 6ee84bac-40de-44cc-8dca-4760ab108153
TOOL RESULT> {
"message": "Job approved",
"approvedCount": 2,
"failedCount": 0
}
AGENT> Done. The 2 updated screenshots have been promoted to new baselines.
AGENT> Future runs will compare against the new design.
Run multiple jobs in parallel across devices
USER> test the PR preview against baselines on desktop, mobile, and tablet
AGENT> I'll start 3 parallel jobs, one per device, then poll all of them.
TOOL> [run_regression_job] project: "myapp", devices: ["Desktop Chrome"]
TOOL RESULT> { "jobId": "31c66aa4-5bd7-4df1-a8e3-1394d77a33f9" }
TOOL> [run_regression_job] project: "myapp", devices: ["iPhone 12"]
TOOL RESULT> { "jobId": "1a597938-b08e-4417-9802-cbb457870c96" }
TOOL> [run_regression_job] project: "myapp", devices: ["tablet"]
TOOL RESULT> { "jobId": "d5452cb0-6b7a-4186-a0b1-0e23204a31e3" }
TOOL> [get_job_status] 31c66aa4
TOOL RESULT> PROCESSING 33%
TOOL> [get_job_status] 1a597938
TOOL RESULT> PROCESSING 50%
TOOL> [get_job_status] d5452cb0
TOOL RESULT> PROCESSING 17%
...
TOOL> [get_job_status] 31c66aa4
TOOL RESULT> COMPLETED ✓
TOOL> [get_job_status] 1a597938
TOOL RESULT> COMPLETED ✓
TOOL> [get_job_status] d5452cb0
TOOL RESULT> COMPLETED ✓
TOOL> [get_job_summary] 31c66aa4, regressionsOnly: true
TOOL RESULT> 1 regression
TOOL> [get_job_summary] 1a597938, regressionsOnly: true
TOOL RESULT> 1 regression
TOOL> [get_job_summary] d5452cb0, regressionsOnly: true
TOOL RESULT> 0 regressions
AGENT> Results across all devices:
• /about [Desktop Chrome]: 3.64% diff, score 96
"Hero section padding increased, CTA button widened."
• /about [iPhone 12]: 3.64% diff, score 96
"Same layout change on iPhone 12, consistent with desktop update."
AGENT> • All tablet pages matched baselines.
Recover a previous job by listing recent runs
USER> what was the result of the last run for the myapp project?
TOOL> [list_jobs] project: "myapp", limit: 10
TOOL RESULT> {
"jobs": [
{
"jobId": "6ee84bac-40de-44cc-8dca-4760ab108153",
"project": "myapp",
"status": "APPROVED",
"totalUrls": 6,
"completedCount": 6,
"createdAt": "2026-05-19T10:53:26.000Z",
"finishedAt": "2026-05-19T10:54:14.000Z"
},
{
"jobId": "e69062bb-f140-40e7-a3e8-40f84deadcdb",
"project": "myapp",
"status": "COMPLETED",
"totalUrls": 6,
"completedCount": 6,
"createdAt": "2026-05-17T13:22:01.000Z",
"finishedAt": "2026-05-17T13:22:06.000Z"
}
]
}
AGENT> Most recent: job 6ee84bac is APPROVED, 6 pages tested, finished ~1 hour ago.
AGENT> The run before that is COMPLETED but not yet approved (2 regressions pending).
Tip: The agent can fetch and visually inspect diff images directly from the shortened, clickable diffUrl in the results. Each diff is a 3-panel image (BEFORE | AFTER | ANNOTATED DIFF) with changed pixels highlighted in red and bounding boxes drawn around clusters of changes. This is the same image used to generate regressionbotSummary.
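
As a rough illustration of how the diffUrl values could be collected from a get_job_summary result before fetching the images, here is a small sketch. The field names (results, url, variantName, diffPercentage, diffUrl) are taken from the example output earlier in this section. The diff_links helper and its label format are hypothetical and not part of RegressionBot.

```python
from typing import Any
from urllib.parse import urlparse


def diff_links(summary: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (label, diffUrl) pairs for every result that carries a diffUrl."""
    links = []
    for result in summary.get("results", []):
        diff_url = result.get("diffUrl")
        if not diff_url:
            continue  # nothing to inspect for this result
        path = urlparse(result["url"]).path or "/"
        label = (
            f"{path} [{result['variantName']}]: "
            f"{result['diffPercentage']:.2f}% diff"
        )
        links.append((label, diff_url))
    return links


# Sample shaped like the get_job_summary output shown earlier.
sample = {
    "results": [
        {
            "url": "https://staging.myapp.com/pricing",
            "variantName": "Desktop Chrome",
            "diffPercentage": 3.638671875,
            "diffUrl": "https://screenshots.regressionbot.com/r/token",
        }
    ]
}

for label, link in diff_links(sample):
    print(label, "->", link)
```

When the summary was requested with regressionsOnly: true, every returned result is a regression, so no extra filtering is needed on the client side.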
Projects
A project is a named configuration that locks in your test parameters: testOrigin, devices, paths, and more. The first run_regression_job call with a new project name creates the config automatically. Subsequent runs enforce that the params match, protecting baseline integrity.
How it works
- First run: supply all params. RegressionBot creates the project config and captures baselines.
- Repeat runs: omit params (or supply identical ones) to reuse the stored config. If you pass params that differ from the stored config, the run is rejected with an error listing the differing fields.
- Shorthand: use run_from_project to run a repeat job without supplying any params at all.
- Update config: use update_project to update a saved project's configuration and reset its baselines. Example: update_project({ name: "my-project", testOrigin: "https://v2.example.com" })
- Approve by name: use approve_project to approve the most recent completed job for a project without needing a job ID. Example: approve_project({ name: "my-project" })
- Live-vs-live runs: always accepted regardless of stored config. Passing baseOrigin skips the config check entirely.
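
The matching rules above can be modelled in a few lines. This is only an illustrative sketch of the described behaviour, not RegressionBot's actual server-side check. The differing_fields function is hypothetical, and the sketch assumes params are compared by simple equality.

```python
from typing import Any

LIVE_VS_LIVE_KEY = "baseOrigin"  # passing this skips the config check


def differing_fields(stored: dict[str, Any], supplied: dict[str, Any]) -> list[str]:
    """Return the supplied param names whose values differ from the stored config.

    An empty list means the run would be accepted under the rules above.
    """
    if LIVE_VS_LIVE_KEY in supplied:
        # Live-vs-live runs are always accepted.
        return []
    # Omitted params reuse the stored config, so only supplied keys are compared.
    return sorted(key for key, value in supplied.items() if stored.get(key) != value)


stored_config = {
    "testOrigin": "https://www.myapp.com",
    "devices": ["Desktop Chrome", "iPhone 12"],
}

print(differing_fields(stored_config, {}))                               # []
print(differing_fields(stored_config, {"devices": ["Desktop Chrome"]}))  # ['devices']
print(differing_fields(stored_config, {
    "devices": ["Desktop Chrome"],
    "baseOrigin": "https://www.myapp.com",
}))                                                                      # []
```

A non-empty list corresponds to the rejection case, where the error names the fields that differ.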
Managed CI workflow example
USER> set up a new project for myapp and capture baselines
AGENT> I'll create the project on the first run and capture baselines.
TOOL> [run_regression_job]
project: "myapp"
testOrigin: "https://www.myapp.com"
devices: ["Desktop Chrome", "iPhone 12"]
paths: [{"path":"/"}, {"path":"/pricing"}, {"path":"/docs"}]
autoApprove: true
TOOL RESULT> { "message": "Job started", "jobId": "a1b2c3d4-..." }
AGENT> Baselines are being captured for 3 pages on desktop and mobile.
AGENT> The config is now saved as project "myapp". Future runs are locked to these params.
---
USER> run regression check for myapp
TOOL> [run_from_project] name: "myapp"
TOOL RESULT> { "message": "Job started", "jobId": "e5f6a7b8-...", "project": "myapp" }
AGENT> Job started using stored config: testOrigin myapp.com, Desktop Chrome + iPhone 12, 3 paths.
---
USER> what projects do I have?
TOOL> [list_projects]
TOOL RESULT> {
"projects": [
{
"name": "myapp",
"testOrigin": "https://www.myapp.com",
"devices": ["Desktop Chrome", "iPhone 12"],
"lastRunAt": "2026-06-02T10:15:00.000Z",
"lastJobId": "e5f6a7b8-..."
}
]
}
Available Tools
Once connected, your agent has access to the following tools. The MCP server provides instructions automatically, so your agent understands the workflow without any additional prompting.
| Tool | What it does |
|---|---|
| run_regression_job | Start a visual regression job. Accepts explicit paths, a sitemapUrl to crawl, or glob scans to filter a sitemap. Supports multiple devices, element masks, and an optional baseOrigin for ad-hoc origin-vs-origin comparison. |
| get_job_status | Poll progress. Returns status, summaryStatus, progress (total / completed / percent), executionTime, and an isComplete flag so your agent knows when to stop polling. Status lifecycle: INITIALIZING → PROCESSING → COMPLETED. Once status is COMPLETED, you can download images and view diff metrics immediately, but you must wait for summaryStatus === 'COMPLETE' to fetch AI summaries. |
| get_job_summary | Get full results once the job is complete. Returns summaryStatus alongside results; if 'COMPLETE', results include the AI-written regressionbotSummary, otherwise it is filtered out to avoid incomplete/empty outputs. Each result also includes diffPercentage, score, visualMatchScore, and shortened screenshot redirect URLs (falling back to pre-signed cloud storage URLs on write errors). Pass regressionsOnly: true to filter out matching pages. |
| list_jobs | List recent jobs ordered by most recent first. Filter by project or set a limit. Useful for recovering a jobId or checking the status of recent runs without leaving the agent session. |
| approve_job | Promote current screenshots to baselines, accepting all visual changes as intentional. This is permanent: existing baselines are overwritten. Only call after reviewing regressions in get_job_summary. |
| run_from_project | Run a job using a saved project config, with no need to re-supply testOrigin, devices, or paths. Ideal for CI repeat runs once a project is established. Accepts optional autoApprove. |
| get_project | Retrieve the stored configuration for a named project: testOrigin, devices, paths, scans, and run metadata. |
| list_projects | List all saved project configs for your org. Returns name, testOrigin, devices, and last run info for each project. |
| update_project | Update a saved project's configuration and reset its baselines. Pass any subset of testOrigin, devices, paths, masks, etc. Triggers baseline invalidation: the next run will re-baseline everything. |
| approve_project | Approve the most recent completed job for a project, promoting its screenshots to baselines. No job ID required, which is ideal for CI workflows where the approving step runs separately from the regression check. |
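
The polling contract in the table (stop when isComplete is true, and wait for summaryStatus to be 'COMPLETE' before reading AI summaries) can be sketched as a small loop. The call_tool parameter is a hypothetical stand-in for whatever your MCP client uses to invoke a tool. The poll interval, the poll limit, and the "PENDING" placeholder value in the demo are assumptions, not documented RegressionBot values.

```python
import time
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> parsed JSON result.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def wait_for_regressions(
    call_tool: ToolCaller,
    job_id: str,
    *,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll get_job_status until summaries are ready, then fetch regressions only."""
    for _ in range(max_polls):
        status = call_tool("get_job_status", {"jobId": job_id})
        if status.get("isComplete") and status.get("summaryStatus") == "COMPLETE":
            return call_tool(
                "get_job_summary", {"jobId": job_id, "regressionsOnly": True}
            )
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} was not ready after {max_polls} polls")


# Demo with a scripted fake instead of a real MCP connection.
_scripted = iter([
    {"status": "PROCESSING", "isComplete": False},
    {"status": "COMPLETED", "isComplete": True, "summaryStatus": "PENDING"},
    {"status": "COMPLETED", "isComplete": True, "summaryStatus": "COMPLETE"},
])


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    if name == "get_job_status":
        return next(_scripted)
    return {"regressionCount": 2, "results": []}


summary = wait_for_regressions(fake_call_tool, "job-123", sleep=lambda _s: None)
print(summary["regressionCount"])  # 2
```

Injecting sleep and call_tool keeps the loop testable without network access. If you only need screenshots and diff metrics, the summaryStatus condition can be dropped so the loop returns as soon as isComplete is true.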