This commit is contained in:
2026-04-12 01:06:31 +07:00
commit 10d660cbcb
1066 changed files with 228596 additions and 0 deletions

View File

@@ -0,0 +1,112 @@
# agent-browser vs chrome-devtools
Detailed comparison guide for choosing between browser automation skills.
## Feature Comparison
| Feature | agent-browser | chrome-devtools |
|---------|---------------|-----------------|
| **Engine** | Playwright (via Rust CLI) | Puppeteer |
| **Refs system** | `@e1` inline | `[ref=e1]` YAML |
| **Session persistence** | Named sessions (`--session`) | `.browser-session.json` |
| **Screenshot** | Basic | Auto-compress >5MB (Sharp) |
| **Network intercept** | `route` command | `network.js` script |
| **Console capture** | Basic | With filtering |
| **WebSocket debug** | Limited | Full frames support |
| **Video recording** | Built-in `record` | Not available |
| **PDF export** | Built-in `pdf` | Via Puppeteer API |
| **Auth persistence** | `state save/load` | `inject-auth.js` |
| **Multi-tab** | Full support | Limited |
| **Cloud browsers** | Browserbase native | Manual setup |
| **Performance** | Rust CLI (fast) | Node.js |
| **Custom scripts** | None (CLI only) | 20+ utilities |
## Token Efficiency Benchmarks
| Metric | agent-browser | chrome-devtools | Playwright MCP |
|--------|---------------|-----------------|----------------|
| Homepage snapshot | ~280 chars | ~300-500 chars | ~8,247 chars |
| Context reduction | 93% vs MCP | 90% vs MCP | Baseline |
| Tool definitions | ~2K tokens | 0 (CLI scripts) | ~17K tokens |
**Conclusion:** Both agent-browser and chrome-devtools are similarly efficient. Both dramatically outperform Playwright MCP.
## Use Case Decision Tree
```
Need browser automation?
|
+-- Long autonomous AI session?
| +-- YES --> agent-browser (better context efficiency)
| +-- NO --> Continue
|
+-- Need video recording?
| +-- YES --> agent-browser (built-in)
| +-- NO --> Continue
|
+-- Cloud browser (CI/CD)?
| +-- YES --> agent-browser (Browserbase native)
| +-- NO --> Continue
|
+-- Custom Puppeteer scripts?
| +-- YES --> chrome-devtools (20+ utilities)
| +-- NO --> Continue
|
+-- WebSocket debugging?
| +-- YES --> chrome-devtools (full frames)
| +-- NO --> Continue
|
+-- Screenshot auto-compression?
| +-- YES --> chrome-devtools (Sharp)
| +-- NO --> agent-browser OR chrome-devtools
```
## Parallel Usage Patterns
Both skills can coexist - use the right tool for each task:
```bash
# Quick screenshot with compression -> chrome-devtools
node "$SKILL_DIR/screenshot.js" --url https://example.com --output ss.png
# Long autonomous session -> agent-browser
agent-browser --session test1 open https://example.com
agent-browser snapshot -i
# ... many interactions ...
agent-browser close
```
## Migration Guide
### From chrome-devtools to agent-browser
| chrome-devtools | agent-browser |
|-----------------|---------------|
| `node navigate.js --url X` | `agent-browser open X` |
| `node aria-snapshot.js --url X` | `agent-browser open X && agent-browser snapshot -i` |
| `node select-ref.js --ref e5 --action click` | `agent-browser click @e5` |
| `node fill.js --selector "#email" --value "X"` | `agent-browser fill @e1 "X"` |
| `node screenshot.js --output X.png` | `agent-browser screenshot -o X.png` |
| `node console.js --types error` | No direct equivalent |
| `node network.js` | No direct equivalent |
### Key Differences
1. **Refs format:** `[ref=e5]` vs `@e5`
2. **Session:** File-based vs named sessions
3. **Commands:** Node scripts vs CLI commands
4. **Output:** JSON always vs JSON with `--json` flag
## When to Switch
**Switch to agent-browser when:**
- Starting new long-running automation
- Need video recording capability
- Moving to cloud browsers (Browserbase)
- Want simpler CLI syntax
**Keep chrome-devtools when:**
- Existing workflows depend on custom scripts
- Need WebSocket full-frame debugging
- Need automatic screenshot compression
- Need fine-grained console log filtering

View File

@@ -0,0 +1,161 @@
# Browserbase Cloud Setup
Configure agent-browser to use Browserbase cloud browsers for CI/CD and headless environments.
## Overview
Browserbase provides remote browser infrastructure. Use when:
- Running in CI/CD pipelines
- Local browser not available
- Need consistent browser environment
- Scaling parallel browser sessions
## Account Setup
1. Sign up at [browserbase.com](https://browserbase.com)
2. Create a project
3. Get API key from dashboard
4. Note your project ID
## Environment Variables
```bash
# Required
export BROWSERBASE_API_KEY="bb_live_xxxxxxxxxxxxx"
export BROWSERBASE_PROJECT_ID="proj_xxxxxxxxxxxxx"
# Optional: set provider default
export AGENT_BROWSER_PROVIDER="browserbase"
```
## Usage
### Explicit Provider Flag
```bash
agent-browser -p browserbase open https://example.com
agent-browser snapshot -i
agent-browser click @e1
agent-browser close
```
### With Default Provider (env var)
```bash
# After setting AGENT_BROWSER_PROVIDER=browserbase
agent-browser open https://example.com # Uses Browserbase automatically
```
## CI/CD Integration
### GitHub Actions
```yaml
name: Browser Tests
on: [push]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node
uses: actions/setup-node@v4
with:
node-version: '20'
- name: Install agent-browser
run: npm install -g agent-browser
- name: Run browser tests
env:
BROWSERBASE_API_KEY: ${{ secrets.BROWSERBASE_API_KEY }}
BROWSERBASE_PROJECT_ID: ${{ secrets.BROWSERBASE_PROJECT_ID }}
AGENT_BROWSER_PROVIDER: browserbase
run: |
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser screenshot -o screenshot.png
agent-browser close
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: screenshots
path: screenshot.png
```
### GitLab CI
```yaml
browser-test:
image: node:20
variables:
AGENT_BROWSER_PROVIDER: browserbase
script:
- npm install -g agent-browser
- agent-browser open https://example.com
- agent-browser snapshot -i
- agent-browser close
artifacts:
paths:
- "*.png"
```
## Session Management
Browserbase sessions are managed automatically. Each `open` creates a new session, `close` terminates it.
```bash
# Long-running session
agent-browser -p browserbase open https://example.com
# ... many commands ...
agent-browser close # Terminates Browserbase session
```
## Parallel Sessions
Use named sessions for parallel browser instances:
```bash
# Session 1
agent-browser -p browserbase --session user1 open https://example.com
# Session 2 (separate terminal/process)
agent-browser -p browserbase --session user2 open https://example.com
```
## Debugging
### View Session Logs
Check Browserbase dashboard for:
- Session recordings
- Network logs
- Console output
- Screenshots
### Local Fallback
If Browserbase unavailable, remove provider flag to use local browser:
```bash
agent-browser open https://example.com # Uses local Chromium
```
## Troubleshooting
| Issue | Solution |
|-------|----------|
| Authentication failed | Verify API key is correct and not expired |
| Project not found | Check BROWSERBASE_PROJECT_ID matches dashboard |
| Connection timeout | Check network/firewall allows outbound connections |
| Session limit reached | Upgrade Browserbase plan or wait for sessions to expire |
| Commands hang | Ensure previous session closed properly |
## Pricing Considerations
- Browserbase charges per session minute
- Close sessions promptly with `agent-browser close`
- Use local browser for development, cloud for CI/CD
- Monitor usage in Browserbase dashboard
## Resources
- [Browserbase Documentation](https://docs.browserbase.com/)
- [Browserbase Dashboard](https://browserbase.com/dashboard)
- [agent-browser GitHub](https://github.com/vercel-labs/agent-browser)