Browser Automation in Claude Code: 5 Tools Compared (2026)
Browser-use vs Agent Browser vs Playwright CLI vs Playwright MCP vs DevTools MCP — real-world token usage differs by 10x+. Speed, cost, and stability compared with setup instructions.
Tags: Claude Code, Browser Automation, MCP, Playwright CLI, Playwright MCP, Agent Browser, Browser-use
2026-01-28
Writing code with AI is old news. The real game-changer is having AI control a browser — opening pages, clicking buttons, filling forms, and scraping data — all from a single natural-language prompt.
The Claude Code ecosystem now offers five mainstream browser automation options: Browser-use (an AI-agent-native automation framework), Vercel’s Agent Browser, Microsoft’s Playwright CLI (new in 2026), Microsoft’s Playwright MCP, and Google’s DevTools MCP. Each excels in different scenarios, and picking the wrong one can cost you time and tokens.
This guide provides a deep comparison of all five so you can make the right choice for your workflow.
March 2026 update: Added Browser-use — an AI-agent-first browser automation framework with local, cloud, and real-browser modes plus persistent sessions and cloud parallelism.
February 2026 update: Added Playwright CLI — Microsoft’s officially recommended token-efficient approach, with 4-100x lower token usage than MCP in real-world tests.
Why Browser Automation Matters
The Pain Without It
Imagine asking your AI assistant to:
- Screenshot a competitor’s new feature page
- Log into an internal dashboard and export a report
- Test whether a form you just built submits correctly
- Inspect an API response on a live page to debug an issue
Without browser automation, you have to manually open the browser, take screenshots or copy content, and paste it back to the AI. That workflow breaks down entirely for dynamic content behind logins or rendered by JavaScript.
What Browser Automation Enables
With browser automation, a single instruction does it all:
You say: "Open Amazon, search for 'mechanical keyboard', and list the top 5 prices."
The AI:
1. Launches a browser
2. Navigates to amazon.com
3. Types "mechanical keyboard" in the search box
4. Clicks search
5. Reads the top 5 product prices
6. Returns a formatted table
One sentence from you. Fully automated execution.
Quick Comparison of All Five Tools
Before diving deep, here is the overview:
| Dimension | Browser-use | Agent Browser | Playwright CLI | Playwright MCP | DevTools MCP |
|---|---|---|---|---|---|
| Developer | Browser-use team | Vercel Labs | Microsoft | Microsoft | Google |
| Purpose | AI-agent-native automation | Lightweight AI agent tool | Token-efficient agent automation | General browser automation | Chrome debugging protocol wrapper |
| Integration | Bash CLI / Skill | Bash CLI / Skill | Shell command / Skill | MCP Server | MCP Server + extension |
| Token usage | Very low | Very low (93% reduction) | Very low (75-99% reduction) | High | Medium |
| Browser support | Chromium / real Chrome / cloud | Chromium | Chrome/Firefox/WebKit | Chrome/Firefox/WebKit | Chrome only |
| Key strength | Multi-mode + persistent sessions + cloud parallelism | Fast, low tokens | Low tokens + cross-browser | Stable, full-featured | Deep debugging |
One-line summaries:
- Browser-use: The all-in-one solution — local, cloud, and real-browser modes with session persistence
- Agent Browser: Lightweight and fast — best for everyday browsing
- Playwright CLI: Token-efficient with professional capabilities — the new default for coding agents
- Playwright MCP: Most feature-complete — the stable choice for non-CLI environments
- DevTools MCP: The debugging specialist — best for inspecting and troubleshooting
Deep Dive: What Makes Each Tool Unique
Browser-use: The All-in-One Powerhouse
Browser-use is a browser automation framework built specifically for AI agents. Its core philosophy: let AI agents operate browsers the way humans do, only much faster.
Unlike the other tools, Browser-use is not just a utility — it is a complete AI agent browser automation platform supporting three distinct modes: isolated local browsers, real Chrome profiles, and cloud-hosted remote browsers.
Core Mechanism: State + Index
Browser-use takes a minimalist approach to page representation. The state command returns a list of interactive elements, each assigned a numeric index. Operations reference these indexes directly:
# Get page state
browser-use state
# Example output:
# [0] link "Home"
# [1] link "Products"
# [2] input "Search..."
# [3] button "Sign In"
# Operate by index
browser-use click 3 # Click "Sign In"
browser-use input 2 "iPhone" # Type in the search box
This is similar to Agent Browser’s ref system, but Browser-use goes further: sessions persist across commands, so you never need to relaunch the browser.
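To make the index idea concrete, here is a minimal Python sketch that parses text shaped like the example output above and looks up an element's index by role and label. The `parse_state` and `find_index` helpers are hypothetical names invented for illustration, and the line format is assumed from the sample shown here rather than taken from Browser-use's documentation.

```python
import re
from typing import Optional

# Matches lines shaped like the example output above: [3] button "Sign In"
STATE_LINE = re.compile(r'^\[(\d+)\]\s+(\w+)\s+"(.*)"$')

def parse_state(output: str) -> list:
    """Turn state-style text into a list of {index, role, label} records."""
    elements = []
    for line in output.splitlines():
        match = STATE_LINE.match(line.strip())
        if match:
            index, role, label = match.groups()
            elements.append({"index": int(index), "role": role, "label": label})
    return elements

def find_index(elements: list, role: str, label: str) -> Optional[int]:
    """Return the index of the first element with this role and label, if any."""
    for element in elements:
        if element["role"] == role and element["label"] == label:
            return element["index"]
    return None

# Sample text copied from the example output above (assumed format).
sample = '''[0] link "Home"
[1] link "Products"
[2] input "Search..."
[3] button "Sign In"'''

elements = parse_state(sample)
print(find_index(elements, "button", "Sign In"))  # 3
```

The point is that an agent only has to carry a small integer forward to the next command, not a selector or a DOM fragment.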
Three Browser Modes
This is what sets Browser-use apart:
| Mode | Flag | Characteristics | Best for |
|---|---|---|---|
| chromium | -b chromium | Fast, isolated, headless by default | Automated testing, data scraping |
| real | -b real | Uses your actual Chrome with profiles | Sites requiring login state, extensions, cookies |
| remote | -b remote | Cloud-hosted browser with built-in proxy | Anti-bot bypass, parallel tasks, no local dependencies |
# Isolated mode: fast and clean
browser-use -b chromium open https://example.com
# Real browser: uses your Chrome profile
browser-use -b real --profile "Default" open https://example.com
# Cloud browser: no local resources needed, built-in proxy
browser-use -b remote open https://example.com
Real browser mode means the AI can use your existing login sessions, installed extensions, and saved passwords — no extra authentication setup required.
Cloud Parallelism: Sub-Agent Capability
Browser-use’s cloud mode supports async tasks and parallel agents — a capability no other tool offers:
# Launch a cloud task (async execution)
browser-use -b remote run "Open example.com and extract all product prices"
# Launch multiple tasks simultaneously
browser-use -b remote run "Check competitor A pricing" --session task-a
browser-use -b remote run "Check competitor B pricing" --session task-b
browser-use -b remote run "Check competitor C pricing" --session task-c
# View all task statuses
browser-use task list
# Get a specific task's result
browser-use task status --id <task-id>
Imagine having the AI open 10 competitor websites in parallel, scraping pricing data from all of them in seconds. Serial approaches simply cannot match this.
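If you want to drive that fan-out from a script, a rough Python sketch could look like the following. It assumes the `-b remote run ... --session ...` form shown in the examples above (not verified against the CLI itself); `build_command`, `run_cli`, and `launch_all` are hypothetical helper names. The demo at the bottom is a dry run that only prints the commands, so nothing is actually launched.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_command(prompt: str, session: str) -> list:
    """Build argv from the flags shown in the examples above (assumed, unverified)."""
    return ["browser-use", "-b", "remote", "run", prompt, "--session", session]

def run_cli(argv: list) -> str:
    """Default runner: execute the command and return its stdout."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout

def launch_all(tasks: dict, runner=run_cli) -> dict:
    """Start one command per session in parallel; collect output keyed by session."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        futures = {
            session: pool.submit(runner, build_command(prompt, session))
            for session, prompt in tasks.items()
        }
        return {session: future.result() for session, future in futures.items()}

tasks = {
    "task-a": "Check competitor A pricing",
    "task-b": "Check competitor B pricing",
    "task-c": "Check competitor C pricing",
}

# Dry run: join each argv into a string instead of executing it.
preview = launch_all(tasks, runner=lambda argv: " ".join(argv))
for session, command in preview.items():
    print(session, "->", command)
```

Swapping the `runner` argument back to the default would execute the commands for real; keeping it injectable makes the fan-out logic easy to check without a cloud account.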
Advanced Features
Beyond basic browser operations, Browser-use offers several standout capabilities:
| Feature | Description |
|---|---|
| Python execution | Built-in Python session with cross-command state; direct access to the browser object |
| Profile sync | Sync cookies/profiles between local and cloud |
| Tunnel | Expose localhost:3000 to cloud browsers (browser-use tunnel 3000) |
| Data extraction | get text / get html / eval for direct page data retrieval |
| Smart waits | wait selector / wait text to wait for specific elements or text |
| Session management | Named sessions for parallel multi-browser operation |
Diagnostics
Not sure if your setup is correct? Browser-use includes a dedicated diagnostic command:
browser-use doctor
# Auto-checks: browser installation, dependency versions, network connectivity, cloud API availability
Best Use Cases
| Scenario | Example |
|---|---|
| Automation requiring login state | “Open our internal dashboard using my Chrome profile” |
| Parallel data collection | “Scrape pricing from 10 competitor sites simultaneously” |
| Anti-bot scenarios | “Use a cloud browser with proxy to access this site” |
| Local dev tunneling | “Tunnel localhost:3000 to a cloud browser for testing” |
| Complex Python scripts | “Process page data in batch using Python” |
Installation
# Install (requires Python)
pip install browser-use
# Diagnose your environment
browser-use doctor
# Start using
browser-use open https://example.com --headed
In Claude Code, Browser-use integrates as a Skill, supporting natural-language commands:
"Use browser-use to open example.com, get the page state, and click the sign-in button"
Agent Browser: The Lightweight Speed Demon
Agent Browser is Vercel’s purpose-built browser automation tool for AI agents. Its design philosophy: give the AI the minimum information needed to understand a webpage.
Core Mechanism: Snapshot + Refs
Traditional approaches send the entire DOM tree or accessibility tree to the AI, often consuming tens of thousands of tokens. Agent Browser takes a different approach — it sends a compact “snapshot” with short reference IDs (refs) for each interactive element.
# Agent Browser snapshot format
- button "Sign In" [ref=e1]
- input "Username" [ref=e2]
- input "Password" [ref=e3]
- link "Forgot Password" [ref=e4]
The AI sees this clean structure. To click “Sign In,” it simply says “click e1” — no CSS selectors or XPath needed.
Token Usage Comparison
| Operation | Traditional approach | Agent Browser |
|---|---|---|
| Open a moderately complex page | ~15,000 tokens | ~1,000 tokens |
| Fill out a form | ~8,000 tokens | ~500 tokens |
| Execute a 10-step workflow | ~100,000 tokens | ~7,000 tokens |
A 93% reduction in token usage means:
- Faster responses (less data for the AI to process)
- Lower costs (if billed by token)
- Less risk of hitting context window limits
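As a quick sanity check, the 93% figure can be reproduced from the table above with a few lines of Python. The token counts are the approximate values from the table; `reduction_pct` is just an illustrative helper.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens saved when usage drops from `before` to `after`."""
    return 100 * (before - after) / before

# Approximate token counts from the table above: (traditional, Agent Browser).
rows = {
    "Open a moderately complex page": (15_000, 1_000),
    "Fill out a form": (8_000, 500),
    "Execute a 10-step workflow": (100_000, 7_000),
}

for operation, (traditional, agent_browser) in rows.items():
    print(f"{operation}: {reduction_pct(traditional, agent_browser):.1f}% fewer tokens")
```

All three rows land between 93% and 94%, so the headline number is consistent with the table.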
Best Use Cases
| Scenario | Example |
|---|---|
| Browse a webpage | “Open the competitor’s homepage and check it out” |
| Screenshot comparison | “Take a screenshot to see the updated design” |
| Fill forms | “Enter the test data into the form” |
| Information gathering | “Check the pricing on this page” |
| Simple interactions | “Click that button” |
Installation
# Global install (recommended for best performance)
npm install -g agent-browser
# Install Chromium (required on first setup)
agent-browser install
# Start using
agent-browser open https://example.com
# Or try without installing globally (slower)
npx agent-browser open https://example.com
In Claude Code, Agent Browser typically integrates as a Skill:
"Use Agent Browser to open https://example.com and take a screenshot"
Playwright CLI: The Token-Efficient Specialist (New in 2026)
Playwright CLI is Microsoft’s next-generation browser automation approach, launched in early 2026. If Playwright MCP is the “heavy infantry,” the CLI is a “special forces unit” built specifically for coding agents like Claude Code, Cursor, and Copilot — same firepower, dramatically lower supply costs.
Microsoft explicitly recommends this approach in the Playwright MCP repository:
“Modern coding agents increasingly favor CLI-based workflows exposed as SKILLs over MCP because CLI invocations are more token-efficient.”
Core Mechanism: Data on Disk, Not in Context
The fundamental difference between Playwright CLI and MCP is where data lives:
Playwright MCP approach:
Page snapshot → returned in full to the AI → consumes many tokens
Screenshot → encoded as data in response → consumes even more tokens
Console logs → attached every time → ongoing token cost
Playwright CLI approach:
Page snapshot → saved as YAML file → AI reads only when needed
Screenshot → saved as PNG file → AI views only when needed
Console logs → written to log file → retrieved on demand
Think of it this way: MCP is a verbose assistant who dumps every detail into every report. CLI is an efficient assistant who says “the report is on your desk” and lets you read it when you need to.
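The same contrast can be sketched in a few lines of Python. This is a toy model of the pattern, not how either tool is implemented: `respond_inline` and `respond_with_path` are hypothetical functions, and the snapshot is made-up data.

```python
import tempfile
from pathlib import Path

def respond_inline(snapshot: str) -> str:
    """MCP-style: the whole snapshot travels back in the response."""
    return snapshot

def respond_with_path(snapshot: str, out_dir: Path) -> str:
    """CLI-style: write the snapshot to disk and return only a short pointer."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "page-001.yaml"
    path.write_text(snapshot, encoding="utf-8")
    return f"Snapshot saved to {path}"

# A stand-in snapshot: 500 fake element lines.
snapshot = "\n".join(f'- button "Item {i}" [ref=e{i}]' for i in range(500))

with tempfile.TemporaryDirectory() as tmp:
    inline = respond_inline(snapshot)
    pointer = respond_with_path(snapshot, Path(tmp) / "snapshots")
    saved = (Path(tmp) / "snapshots" / "page-001.yaml").read_text(encoding="utf-8")
    print(len(inline), "characters returned inline")
    print(len(pointer), "characters returned as a pointer")
```

The file on disk holds exactly the same data; the only thing that changes is how much of it lands in the model's context by default.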
Token Usage: Real-World Benchmarks
| Scenario | Playwright MCP | Playwright CLI | Savings |
|---|---|---|---|
| Single page snapshot | ~15,000 tokens | ~200 tokens (file path) | 98.7% |
| 10-step automation | ~114,000 tokens | ~27,000 tokens | 76.3% |
| Test flow with screenshots | ~150,000 tokens | ~5,000 tokens | 96.7% |
| Long sessions (50+ steps) | Context overflow risk | Runs stably | Qualitative leap |
Benchmark data from TestCollab and SupaTest independent reviews.
Why such a dramatic difference? MCP stuffs the full accessibility tree and console messages into context at every step. CLI returns only a file path and a short confirmation. Tokens saved, context window preserved.
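To get a feel for when overflow becomes a risk, here is a back-of-the-envelope Python sketch. It rests on two loud assumptions that are not from the benchmarks: token usage grows linearly per step (taking the 10-step row above as the average), and the context window is 200,000 tokens. Real sessions will vary, so treat the output as an order-of-magnitude estimate only.

```python
import math

def steps_until_overflow(tokens_per_step: float, context_limit: int) -> int:
    """Number of the first step at which cumulative tokens exceed the limit."""
    return math.floor(context_limit / tokens_per_step) + 1

CONTEXT_LIMIT = 200_000          # assumed window size, not from the benchmarks
mcp_per_step = 114_000 / 10      # average from the 10-step row above
cli_per_step = 27_000 / 10       # average from the 10-step row above

print("MCP overflows at step", steps_until_overflow(mcp_per_step, CONTEXT_LIMIT))  # 18
print("CLI overflows at step", steps_until_overflow(cli_per_step, CONTEXT_LIMIT))  # 75
```

Under those assumptions a 50-step session fits comfortably with the CLI but not with MCP, which matches the "Long sessions" row in the table.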
Workflow Example
# 1. Open a page
playwright-cli open https://example.com --headed
# 2. Take a page snapshot (saved as YAML, not stuffed into context)
playwright-cli snapshot
# Output: Snapshot saved to .playwright/snapshots/page-001.yaml
# Each element has a ref ID (e.g., e8, e21, e35)
# 3. Operate elements by ref ID (extremely concise)
playwright-cli fill e8 "user@example.com" # placeholder email
playwright-cli fill e12 "password123"
playwright-cli click e15
# 4. Take a screenshot (saved as file, not converted to tokens)
playwright-cli screenshot
# Output: Screenshot saved to .playwright/screenshots/page-001.png
# 5. Save login state (reusable next time)
playwright-cli state-save login-state.json
Notice how every command response is just a short file path — not thousands of tokens of DOM tree. That is the secret to CLI’s efficiency.
50+ Commands, Full Coverage
Playwright CLI is not a stripped-down MCP. It has complete automation capabilities:
| Category | Commands | Purpose |
|---|---|---|
| Navigation | open, goto, go-back, reload | Page navigation |
| Interaction | click, fill, type, drag, hover | Element operations |
| Snapshots | snapshot | Get compact page structure |
| Screenshots | screenshot, pdf | Visual verification and export |
| State | state-save, state-load, cookie | Login state management |
| Debugging | console, network, tracing, video | Dev debugging |
| Sessions | Named sessions | Parallel multi-browser operation |
Best Use Cases
| Scenario | Example |
|---|---|
| Long automation tasks | “Run screenshot comparisons across 50 pages” |
| In-code test flows | “Test the login → checkout → payment flow end to end” |
| Token budget constraints | “Complete the browser task with minimal token usage” |
| Extending Playwright tests | “Add AI-driven tests on top of the existing test suite” |
Installation
# Install
npm install -g @playwright/cli@latest
# Initialize (auto-installs browsers)
playwright-cli install
# Start using
playwright-cli open https://example.com --headed
In Claude Code, Playwright CLI integrates as a Skill rather than an MCP Server — the approach Microsoft recommends.
CLI vs MCP: Which One?
Microsoft’s guidance is straightforward:
| Condition | Choice |
|---|---|
| Using Claude Code / Cursor / Copilot or similar coding agents | CLI (preferred) |
| Agent has filesystem and shell access | CLI |
| Long-running automation tasks | CLI |
| Sandboxed environment (no shell access) | MCP |
| Need MCP protocol standard for generic agent workflows | MCP |
Bottom line: If you are using Claude Code, CLI should be your default choice in most scenarios.
Playwright MCP: The Battle-Tested Workhorse
Playwright is Microsoft’s established browser automation framework, used by countless companies worldwide for E2E testing. Playwright MCP is its AI extension, purpose-built for tools like Claude Code.
Core Mechanism: Accessibility Tree
Playwright sends the full accessibility tree of a webpage to the AI. This tree contains detailed information about every element: role, name, state, hierarchy, and more.
# Playwright accessibility tree excerpt
- document
  - navigation
    - link "Home"
    - link "Products"
    - link "About Us"
  - main
    - heading "Welcome" [level=1]
    - form
      - textbox "Username" [required]
      - textbox "Password" [required] [type=password]
      - button "Sign In"
More comprehensive information, but higher token consumption.
Unique Strengths: Cross-Browser + Professional Testing
Playwright supports three browser engines:
- Chromium (Chrome, Edge)
- Firefox
- WebKit (Safari)
This means you can test your site across different browsers using the same set of commands.
Additional professional testing features include:
- Auto-wait: Only interacts with elements once they are ready — no race conditions
- Network interception: Mock API responses on the fly
- Multi-tab management: Control multiple pages simultaneously
- Video recording: Automatically record the entire operation sequence
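The auto-wait idea is easy to picture as a polling loop. The Python sketch below is a generic illustration of "wait until ready or time out", not Playwright's actual implementation; `wait_until` is a hypothetical helper and the "element" is simulated with a timer.

```python
import time
from typing import Callable

def wait_until(is_ready: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll is_ready until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(interval)
    return is_ready()  # one last check at the deadline

# Simulated element that becomes clickable 0.2 seconds from now.
ready_at = time.monotonic() + 0.2
became_ready = wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0)

# Simulated element that never appears.
timed_out = wait_until(lambda: False, timeout=0.2)

print(became_ready, timed_out)  # True False
```

Compared with a fixed "sleep 3 seconds", polling returns as soon as the condition holds and fails loudly when it never does.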
Best Use Cases
| Scenario | Example |
|---|---|
| Feature testing | “Test the login flow” |
| User journey validation | “Run through the checkout process” |
| Regression testing | “Verify the fix did not break other features” |
| Multi-step automation | “Sign up → log in → post → log out” |
| Long-running stable execution | “This script needs to run for a while” |
Installation
Add to claude_desktop_config.json or settings.json:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
DevTools MCP: The Debugging Expert
DevTools MCP is Google’s official offering, wrapping the Chrome DevTools Protocol (CDP) directly. If you have ever used Chrome’s developer tools (F12), you know how powerful they are.
Core Mechanism: Chrome DevTools Protocol
CDP is Chrome’s internal protocol, providing access to nearly every aspect of the browser’s internals:
- Console output
- Network requests and responses
- DOM structure and styles
- JavaScript execution environment
- Performance metrics
- And more
DevTools MCP exposes all of this to the AI, turning it into your advanced debugging assistant.
Unique Strength: Unmatched Debugging
While other tools focus on operating the browser, DevTools MCP focuses on understanding what is happening inside it.
You say: "The page is blank. Help me figure out why."
DevTools MCP will:
1. Check the console for errors
2. Inspect network requests for failures
3. Analyze JavaScript execution for exceptions
4. Check whether key elements rendered correctly
5. Provide a diagnostic conclusion
None of the other four tools digs this deep into the browser’s internals.
Best Use Cases
| Scenario | Example |
|---|---|
| Console error inspection | “The page is blank — investigate” |
| Network request debugging | “What did the API return?” |
| Performance analysis | “The page loads too slowly” |
| CSS/DOM inspection | “Why does the layout look wrong?” |
| Variable inspection | “Show me the value of this variable” |
Installation
DevTools MCP requires a Chrome extension:
1. Install the MCP Server:
{
  "mcpServers": {
    "devtools": {
      "command": "npx",
      "args": ["chrome-devtools-mcp@latest"]
    }
  }
}
2. Install the companion extension from the Chrome Web Store
3. Launch Chrome with remote debugging enabled:
# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
# Windows
chrome.exe --remote-debugging-port=9222
Practical Selection Guide
Scenario 1: Quick Page Inspection
Recommended: Agent Browser
You just want the AI to open a page, take a screenshot, or read some content — nothing complex. Agent Browser is the fastest and most token-efficient option.
"Open competitor.com and show me their pricing page"
"Take a screenshot of the homepage"
"Check if this form renders correctly"
Scenario 2: Authenticated Automation or Parallel Scraping
Recommended: Browser-use
Need to use a real Chrome profile (with existing logins and extensions), or run tasks across multiple sites simultaneously? Browser-use is the only option with three browser modes and cloud parallelism.
"Use my Chrome profile to open the internal dashboard and export the monthly report"
"Open 10 competitor sites in parallel and scrape their pricing"
"Use a cloud browser with proxy to access this geo-restricted site"
Scenario 3: Complex User Flow Testing
Recommended: Playwright CLI (in Claude Code) / Playwright MCP (in sandboxed environments)
Sign up, log in, place an order, pay, log out: multi-step flows need stable, reliable execution. The 2026 recommendation is Playwright CLI. It runs the same Playwright engine under the hood but used roughly 4x fewer tokens in the 10-step benchmark (~27,000 vs. ~114,000), which helps avoid context overflow during long flows.
"Test the user registration flow: fill form → verify email → complete profile → redirect to dashboard"
"Run the full checkout flow and save screenshots of each step"
If your agent lacks shell access (e.g., a browser-based AI assistant), stick with Playwright MCP.
Scenario 4: Debugging Page Issues
Recommended: DevTools MCP
Blank pages, API errors, broken layouts: these require deep access to the browser’s internals. DevTools MCP provides the most direct Console, Network, and DOM access of the five tools.
"The page keeps showing a spinner. Find out which API call is hanging."
"This button does nothing when clicked. Check for JavaScript errors."
Scenario 5: Long-Running, High-Volume Browser Operations
Recommended: Playwright CLI
If your task involves 50+ browser operations (batch testing, large-scale scraping), Playwright MCP’s context will gradually bloat until it overflows. CLI’s “data on disk” architecture is naturally suited for long-running tasks.
"Open these 100 URLs one by one, take a snapshot, inspect elements, and save screenshots to the results directory"
Scenario 6: Multiple Capabilities at Once
You can combine them.
These tools are not mutually exclusive. Configure all of them and let the AI pick the best tool for each task.
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    },
    "devtools": {
      "command": "npx",
      "args": ["chrome-devtools-mcp@latest"]
    }
  }
}
Add Agent Browser, Playwright CLI, and Browser-use as Skills, and you have a complete browser automation toolkit. Recommended combo: Agent Browser for everyday browsing, Browser-use for authenticated/parallel tasks, Playwright CLI for testing, DevTools MCP for debugging.
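If you already have an MCP config and want to add servers without clobbering what is there, a small merge helper keeps things safe. The Python sketch below is one way to do it; `merge_mcp_servers` is a hypothetical helper, and the `example-server` entry with its `example-mcp-package` argument is a made-up placeholder, not a real package.

```python
import copy
import json

def merge_mcp_servers(settings: dict, servers: dict) -> dict:
    """Return a copy of settings with servers added under "mcpServers".

    Servers that are already configured are left untouched.
    """
    merged = copy.deepcopy(settings)
    existing = merged.setdefault("mcpServers", {})
    for name, spec in servers.items():
        existing.setdefault(name, copy.deepcopy(spec))
    return merged

current = {
    "mcpServers": {
        "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}
    }
}

# Hypothetical extra server, for illustration only.
extra = {
    "example-server": {"command": "npx", "args": ["example-mcp-package"]},
    "playwright": {"command": "npx", "args": ["something-else"]},
}

merged = merge_mcp_servers(current, extra)
print(json.dumps(merged, indent=2))
```

Because the helper works on a deep copy, the original settings dict is never mutated, and the existing `playwright` entry wins over the conflicting one in `extra`.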
Pro Tips
1. Persist Login State
Many sites require authentication. Instead of logging in every time, have the AI save cookies:
"Open site.com with Agent Browser, let me log in, then save the login state"
Next time, the AI loads the saved cookies automatically — no re-authentication needed.
2. Headless Mode
If you do not need to see the browser window (e.g., running on a server), use headless mode:
"Run the login test flow with Playwright in headless mode"
3. Screenshot Comparison
When developing frontend features, compare before and after:
"Save a screenshot as before.png"
# Make code changes
"Take another screenshot as after.png and compare the differences"
4. Batch Operations
When you need the same operation on multiple pages:
"Open these 10 URLs one by one and save screenshots to the screenshots folder"
FAQ
Q1: Why can’t my Playwright MCP connect?
Check these common issues:
- Ensure Node.js 18+ is installed
- Verify the MCP Server configuration is correct
- Try running npx @playwright/mcp@latest manually to see errors
Q2: DevTools MCP says “Cannot connect to Chrome”?
Make sure:
- Chrome is running with remote debugging enabled on port 9222
- No other process is using that port
- The Chrome extension is installed and active
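A quick way to check the first two points is to probe the port yourself. The Python sketch below uses only the standard library; `is_port_open` and `debug_version_url` are hypothetical helper names. An open port only tells you that something is listening, so if the check passes, opening the `/json/version` URL in a browser is a reasonable next step to confirm it is actually Chrome.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def debug_version_url(port: int = 9222) -> str:
    """URL of Chrome's DevTools HTTP endpoint that reports browser version info."""
    return f"http://127.0.0.1:{port}/json/version"

print("Port 9222 open:", is_port_open("127.0.0.1", 9222))
print("If it is open, check:", debug_version_url())
```

If the port is closed, Chrome was most likely started without the `--remote-debugging-port` flag; if it is open but the URL returns nothing useful, another process may be holding the port.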
Q3: Agent Browser screenshots are blank?
The page likely has not finished loading. Try:
"Wait 3 seconds after opening the page, then take a screenshot"
Q4: Which tool is the most stable?
For stability, Playwright MCP is the best choice. It has robust wait mechanisms and error handling — a production-grade automation framework.
Q5: Is the token difference really that large?
Yes. In real-world tests running the same 10-step operation:
- Playwright MCP: ~114,000 tokens
- DevTools MCP: ~50,000 tokens
- Playwright CLI: ~27,000 tokens
- Agent Browser: ~7,000 tokens
The gap is substantial. Playwright CLI uses roughly 4x fewer tokens than its MCP counterpart — a qualitative difference in long-running tasks where MCP may overflow the context window while CLI runs to completion without issues.
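To see where "roughly 4x" comes from, you can divide each total into the Playwright MCP baseline. This short Python sketch uses the approximate numbers from the list above; `ratios_to_baseline` is an illustrative helper.

```python
# Approximate totals for the same 10-step operation, from the list above.
totals = {
    "Playwright MCP": 114_000,
    "DevTools MCP": 50_000,
    "Playwright CLI": 27_000,
    "Agent Browser": 7_000,
}

def ratios_to_baseline(totals: dict, baseline: str) -> dict:
    """How many times fewer tokens each tool used than the baseline tool."""
    base = totals[baseline]
    return {name: base / tokens for name, tokens in totals.items()}

ratios = ratios_to_baseline(totals, "Playwright MCP")
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.1f}x fewer tokens than Playwright MCP")
```

On these numbers the CLI comes out at a bit over 4x, DevTools MCP at a bit over 2x, and Agent Browser at around 16x.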
Q6: Can I install Playwright CLI and MCP at the same time?
Yes. CLI operates via shell commands while MCP runs as an MCP Server — they do not conflict. You can even have the AI use CLI for simple operations (saving tokens) and switch to MCP when full accessibility tree analysis is needed.
Summary
| If you need… | Choose |
|---|---|
| Quick browsing, screenshots, simple interactions | Agent Browser |
| Authenticated sessions / parallel scraping / anti-bot bypass | Browser-use |
| Testing and automation in Claude Code | Playwright CLI (2026 top pick) |
| Browser automation in sandboxed environments | Playwright MCP |
| Debugging, performance analysis, network inspection | DevTools MCP |
| All of the above | Configure all five — the AI picks the right tool |
Quick reference:
- Browse and fill forms → Agent Browser
- Login state, parallelism, anti-bot → Browser-use
- Test and automate (with shell access) → Playwright CLI
- Test and automate (sandboxed) → Playwright MCP
- Debug and inspect → DevTools MCP
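The quick reference is simple enough to express as a lookup. The Python sketch below encodes it directly; the need labels (`"browse"`, `"test"`, and so on) and the `pick_tool` function are hypothetical names chosen for this example.

```python
# Needs that map to a single tool, taken from the quick reference above.
QUICK_REFERENCE = {
    "browse": "Agent Browser",
    "forms": "Agent Browser",
    "login-state": "Browser-use",
    "parallel": "Browser-use",
    "anti-bot": "Browser-use",
    "debug": "DevTools MCP",
}

def pick_tool(need: str, has_shell_access: bool = True) -> str:
    """Map a need to a tool; testing depends on whether the agent has a shell."""
    if need == "test":
        return "Playwright CLI" if has_shell_access else "Playwright MCP"
    return QUICK_REFERENCE[need]

print(pick_tool("browse"))                        # Agent Browser
print(pick_tool("test"))                          # Playwright CLI
print(pick_tool("test", has_shell_access=False))  # Playwright MCP
print(pick_tool("debug"))                         # DevTools MCP
```

Only the testing case needs a second input, which is why shell access is the one question worth asking before choosing between the two Playwright options.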
2026 recommendation: If you only install one, choose Browser-use — it covers three browser modes, persistent sessions, and cloud parallelism, making it the most versatile option for AI agent browser automation. If you focus on testing workflows, go with Playwright CLI. For the most token-efficient everyday browsing, add Agent Browser.
Now go let your AI assistant truly take the wheel.
References:
- Browser-use GitHub
- Vercel Agent Browser GitHub
- Playwright MCP Official Repository
- Playwright CLI Deep Review - TestCollab
- MCP vs CLI Analysis - SupaTest
- Chrome DevTools Protocol Documentation