---
name: browserman
description: Use when the user wants to automate browser tasks, control a web page, take screenshots, interact with websites, or run social media automation scripts. Supports pre-built scripts (dynamically updated) and low-level browser commands. You MUST call GET /api/scripts before every task to check for matching scripts — if a match exists, use run_script instead of low-level commands.
---

# BrowserMan Browser Automation

Control a real Chrome browser via HTTP API. Supports pre-built platform scripts and low-level browser commands (navigate, click, type, screenshot, etc.).

## Setup

You need the following three values.

| Variable | Description | Example |
|----------|-------------|---------|
| `BROWSERMAN_URL` | Server URL (local or remote) | `https://browserman-server.fly.dev` |
| `BROWSERMAN_API_KEY` | API key (starts with `bm_key_`) | `bm_key_abc123...` |
| `BROWSERMAN_EXTENSION` | Extension key (starts with `bm_ext_`) | `bm_ext_def456...` |

Check environment variables first. If any value is not set, ask the user for it.
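
The lookup order above can be sketched as a small helper. This is a minimal illustration, not part of BrowserMan: `load_settings` is a hypothetical name, and the only assumption is that the three values live in environment variables with the names from the table.

```python
import os
from typing import Mapping

# Variable names taken from the Setup table.
REQUIRED_VARS = ("BROWSERMAN_URL", "BROWSERMAN_API_KEY", "BROWSERMAN_EXTENSION")


def load_settings(env: Mapping[str, str] = os.environ):
    """Return (settings, missing).

    settings: the variables that are set and non-empty.
    missing:  the variable names that still have to be requested from the user.
    """
    settings = {}
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        if value:
            settings[name] = value
    missing = [name for name in REQUIRED_VARS if name not in settings]
    return settings, missing
```

If `missing` is non-empty, ask the user for exactly those names before sending any request.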

## Authentication

All requests require a Bearer token in the Authorization header:

```
Authorization: Bearer <BROWSERMAN_API_KEY>
```

## Core Endpoint

All browser commands go through a single endpoint:

```
POST {BROWSERMAN_URL}/api/command
Content-Type: application/json
Authorization: Bearer <BROWSERMAN_API_KEY>

{
  "extension": "<BROWSERMAN_EXTENSION>",
  "action": "<action_name>",
  "params": { ... },
  "timeout": 30000
}
```

Response: JSON object. On error: `{ "error": "Error message" }`

HTTP status codes: `200` Success, `202` Async started, `400` Bad request, `401` Unauthorized, `403` Forbidden, `404` Not found, `503` Extension offline.
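
As a rough sketch of how a client might assemble this request, the helper below builds the envelope with Python's standard library. The function names `build_command_request` and `send` are hypothetical, and the 35-second HTTP timeout is an assumption chosen to sit slightly above the default 30000 ms command timeout.

```python
import json
import urllib.error
import urllib.request


def build_command_request(base_url, api_key, extension, action, params=None, timeout_ms=30000):
    """Build (without sending) a POST /api/command request."""
    body = {
        "extension": extension,
        "action": action,
        "params": params or {},
        "timeout": timeout_ms,  # milliseconds, as in the request body shown above
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/command",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, http_timeout_s=35.0):
    """Send a request and return (status, body) for success and error statuses alike.

    http_timeout_s is an assumed client-side limit, not a documented value.
    """
    try:
        with urllib.request.urlopen(request, timeout=http_timeout_s) as resp:
            status, raw = resp.status, resp.read()
    except urllib.error.HTTPError as err:  # urllib raises for 4xx and 5xx
        status, raw = err.code, err.read()
    try:
        return status, json.loads(raw)
    except ValueError:  # body was empty or not JSON
        return status, {"error": raw.decode("utf-8", errors="replace")}
```

A call such as `send(build_command_request(url, key, ext, "navigate", {"url": "https://example.com"}))` would then return the status code together with the parsed JSON body.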

## Check Extension Status

**Before running any command**, check whether the browser extension is online by sending a `ping` action to `POST /api/command`:

```json
{
  "extension": "<BROWSERMAN_EXTENSION>",
  "action": "ping"
}
```

Response:
```json
{
  "success": true,
  "online": true,
  "browserInfo": { "name": "Chrome", "version": "146.0.0.0", "platform": "MacIntel" },
  "timestamp": "2026-04-06T13:44:54.728Z"
}
```

- `online: true` → Extension is connected, ready for commands
- `online: false` → Extension is disconnected, ask user to check Chrome

**⚠️ Do NOT use `get_sessions` to check online status.** An empty session list only means there are no active script executions; the extension can still be online and ready.
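
A minimal sketch of the readiness check, assuming the ping response has the shape shown above. The names `ping_body` and `extension_ready` are hypothetical. Because the document lists both a `503` status and an `online: false` field, the sketch treats either one as "offline".

```python
def ping_body(extension):
    """Request body for the ping check (sent to POST /api/command)."""
    return {"extension": extension, "action": "ping"}


def extension_ready(status_code, response):
    """Return True only when the ping succeeded and reports online: true."""
    if status_code != 200:
        # Covers 503 (extension offline) and any auth or lookup failure.
        return False
    return response.get("online") is True
```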

---

## Task Execution Flow (MUST follow this order)

### Step 0: Query Script Catalog (MANDATORY for every task)

Before doing ANYTHING else, call this endpoint to discover available pre-built scripts:

```
GET {BROWSERMAN_URL}/api/scripts
Authorization: Bearer <BROWSERMAN_API_KEY>
```

No extension key needed. Returns a dynamic catalog of platform scripts:
```json
{
  "x.com": {
    "post": { "description": "Create a new tweet", "params": { "text": "required", "mediaUrls": "optional array" } },
    "like": { "description": "Like a tweet", "params": { "url": "required" } }
  },
  "linkedin.com": { ... },
  "reddit.com": { ... },
  "medium.com": { ... }
}
```

**This list updates dynamically. Do NOT rely on memory — always call this endpoint.**

Each action in the catalog includes a `health` field indicating reliability. The sample response above does not show it; the possible values are listed below.

### Script Health Status

| Status | Icon | Meaning | Recommendation |
|--------|------|---------|---------------|
| `healthy` | ✅ | Recently tested and working | Preferred — use confidently |
| `degraded` | ⚠️ | Partially working (intermittent failures) | Use with caution, have a fallback plan |
| `broken` | ❌ | Known broken (recent tests all failed) | Avoid — fall back to low-level commands (Step 1) |
| `untested` | ❓ | Never tested or not tested recently | Use at your own risk |

**Decision priority:**
1. Always prefer `healthy` scripts
2. For `degraded` scripts: try the script, but be ready to retry or fall back
3. For `broken` scripts: skip the script, use low-level commands directly
4. For `untested` scripts: you may try, but treat as experimental

**Public endpoints (no auth required):**
- `GET /api/scripts/catalog` — full catalog with health status
- `GET /api/scripts/health` — health status only

**Decision point:**
- Task matches a `platform + action` in the catalog? → Go to **"Run a Script"** below
- No match? → Go to **Step 1** (low-level commands)
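
The decision point and the health priorities can be sketched as one function. This is an illustration under two assumptions that the document does not confirm: that `health` is a plain string on each action entry, and that an entry without the field should be treated as `untested`. `choose_path` is a hypothetical name.

```python
def choose_path(catalog, platform, action):
    """Return (path, health) where path is "run_script" or "low_level".

    catalog is the parsed JSON from GET /api/scripts.
    """
    entry = catalog.get(platform, {}).get(action)
    if entry is None:
        return "low_level", None  # no matching script: go to Step 1
    # Assumption: health is a string; a missing field is treated as "untested".
    health = entry.get("health", "untested")
    if health == "broken":
        return "low_level", health  # skip the script entirely
    # healthy, degraded, and untested scripts are all attempted;
    # the caller should be ready to fall back for the last two.
    return "run_script", health
```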

---

### Run a Script (when Step 0 finds a match)

Scripts are executed via the same `POST /api/command` endpoint. The `extension` field is required.

```
POST {BROWSERMAN_URL}/api/command
Content-Type: application/json
Authorization: Bearer <BROWSERMAN_API_KEY>

{
  "extension": "<BROWSERMAN_EXTENSION>",
  "action": "run_script",
  "params": {
    "platform": "x.com",
    "action": "post",
    "text": "Hello from BrowserMan!",
    "mediaUrls": ["https://example.com/image.png"]
  }
}
```

Response (HTTP 202):
```json
{ "executionId": "exec_abc123...", "status": "running" }
```
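
Since script arguments sit next to `platform` and `action` inside `params`, it can help to check them against the catalog before sending. The sketch below assumes the catalog marks required parameters with a string beginning with `required`, as in the sample catalog; `build_run_script_params` is a hypothetical helper.

```python
def build_run_script_params(catalog, platform, action, args):
    """Return the params object for a run_script command, or raise ValueError."""
    if platform not in catalog or action not in catalog[platform]:
        raise ValueError(f"no script for {platform}/{action}")
    clash = {"platform", "action"} & set(args)
    if clash:
        # These two names select the script and cannot also be script arguments.
        raise ValueError(f"reserved parameter names used: {sorted(clash)}")
    spec = catalog[platform][action].get("params", {})
    # Assumption: required parameters are described by a string starting with "required".
    missing = [name for name, rule in spec.items()
               if str(rule).startswith("required") and name not in args]
    if missing:
        raise ValueError(f"missing required params: {missing}")
    return {"platform": platform, "action": action, **args}
```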

**Poll for result:**
```
GET {BROWSERMAN_URL}/api/execution/<executionId>
Authorization: Bearer <BROWSERMAN_API_KEY>
```

Response:
```json
{
  "id": "exec_abc123...",
  "status": "completed",
  "currentStep": "Done",
  "result": { "success": true, "url": "https://x.com/user/status/123" },
  "error": null
}
```

`status` values: `"running"`, `"completed"`, `"error"`. Poll every 2-3 seconds and stop polling after 2 minutes.

**Scripts handle login flows, element waits, retries, and multi-step orchestration automatically.** This is why they are preferred over low-level commands.
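
The polling rule can be sketched as a loop with the HTTP call injected, so the logic stays independent of any particular client. `poll_execution` and `fetch_status` are hypothetical names; `fetch_status(execution_id)` is assumed to perform the `GET /api/execution/<executionId>` request and return the parsed JSON.

```python
import time


def poll_execution(fetch_status, execution_id, interval_s=2.5, max_wait_s=120.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the execution is completed or errored, or raise TimeoutError.

    interval_s and max_wait_s follow the guidance above
    (every 2-3 seconds, give up after 2 minutes).
    """
    deadline = clock() + max_wait_s
    while True:
        execution = fetch_status(execution_id)
        if execution.get("status") in ("completed", "error"):
            return execution
        if clock() + interval_s > deadline:
            raise TimeoutError(f"execution {execution_id} still running after {max_wait_s}s")
        sleep(interval_s)
```

The caller still has to inspect the returned `status`: `"error"` ends the loop just as `"completed"` does, with details in the `error` field.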

---

### Step 1: session_init (low-level path)

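Only reach here if Step 0 found no matching script. The examples from here on show only `action` and `params`; send each one through `POST /api/command` with the `extension` field and headers described under Core Endpoint.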

Creates or reuses a browser tab for automation.

```json
{
  "action": "session_init",
  "params": { "sessionId": "my-session" }
}
```

`sessionId` is optional. Use the same `sessionId` across commands to reuse the same tab.

### Step 2: navigate

```json
{
  "action": "navigate",
  "params": { "url": "https://example.com" }
}
```

### Step 3: read_page

Returns the accessibility tree. Each interactive element has a `ref` ID for use with `click_ref` and `form_input`.

```json
{
  "action": "read_page",
  "params": { "filter": "interactive" }
}
```

`filter`: `"interactive"` (buttons, links, inputs — recommended) or `"all"` (full page tree).

**Important:** Always call `read_page` before clicking or typing. Refs change on navigation or DOM updates.

### Step 4: Interact with elements

**click_ref** — Click by ref ID:
```json
{ "action": "click_ref", "params": { "ref": "5" } }
```

**type** — Type text at cursor:
```json
{ "action": "type", "params": { "text": "Hello, world!" } }
```

**form_input** — Set a form field (more reliable than click + type):
```json
{ "action": "form_input", "params": { "ref": "3", "value": "john@example.com", "format": "text" } }
```
Formats: `"text"`, `"select"`, `"checkbox"`, `"radio"`, `"contenteditable"`

**press** — Keyboard key:
```json
{ "action": "press", "params": { "key": "Enter" } }
```
Common keys: `Enter`, `Tab`, `Escape`, `Backspace`, `ArrowDown`, `ArrowUp`

**scroll** — Scroll page:
```json
{ "action": "scroll", "params": { "direction": "down", "pixels": 500 } }
```
Or scroll element into view: `{ "action": "scroll", "params": { "ref": "12" } }`
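
As an illustration of combining these actions, the sketch below builds the command list for filling several fields with `form_input` and then clicking a submit control. The helper names are hypothetical, the refs must come from a fresh `read_page` call, and the type accepted by `value` for non-text formats is not specified in this document, so the value is passed through unchanged.

```python
# Formats listed for form_input above.
FORM_FORMATS = {"text", "select", "checkbox", "radio", "contenteditable"}


def form_input_command(ref, value, fmt="text"):
    """Build one form_input command, rejecting formats the document does not list."""
    if fmt not in FORM_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return {"action": "form_input", "params": {"ref": str(ref), "value": value, "format": fmt}}


def fill_and_click(fields, submit_ref):
    """Build commands for a list of (ref, value, format) fields plus a final click.

    Each command still needs the "extension" field before it is sent.
    """
    commands = [form_input_command(ref, value, fmt) for ref, value, fmt in fields]
    commands.append({"action": "click_ref", "params": {"ref": str(submit_ref)}})
    return commands
```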

### Step 5: Verify and repeat

- Call `read_page` again after interactions (refs may have changed)
- Use `screenshot` to verify visually:
```json
{ "action": "screenshot", "params": {} }
```
Returns `{ "success": true, "data": "<base64 JPEG>", "mimeType": "image/jpeg" }`

- Use `url` to check current page:
```json
{ "action": "url", "params": {} }
```

- Use `evaluate` to run JavaScript:
```json
{ "action": "evaluate", "params": { "expression": "document.title" } }
```
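
The `screenshot` response carries the image as base64 text, so it has to be decoded before it can be viewed or saved. A minimal sketch, assuming the response shape shown above; `decode_screenshot` and `save_screenshot` are hypothetical helpers.

```python
import base64
from pathlib import Path


def decode_screenshot(response):
    """Return the raw JPEG bytes from a screenshot response, or raise ValueError."""
    if not response.get("success") or "data" not in response:
        raise ValueError(response.get("error", "screenshot response has no image data"))
    return base64.b64decode(response["data"])


def save_screenshot(response, path):
    """Write the decoded image to path and return the number of bytes written."""
    return Path(path).write_bytes(decode_screenshot(response))
```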

### Step 6: Clean up

```json
{ "action": "task_end", "params": {} }
```
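
Steps 1 through 6 can be tied together so that `task_end` is always sent, even when a middle step fails. This is a sketch under stated assumptions: `run_session` is a hypothetical helper, `send_command(body)` is assumed to post the body to `/api/command` and raise on failure, and `sessionId` is passed only to `session_init` because that is the only place this document shows it.

```python
def run_session(send_command, extension, url, steps, session_id=None):
    """Run session_init, navigate, the given steps, then task_end.

    steps is a list of (action, params) pairs, for example
    [("read_page", {"filter": "interactive"}), ("click_ref", {"ref": "5"})].
    Returns the responses of the steps in order.
    """
    def call(action, params):
        return send_command({"extension": extension, "action": action, "params": params})

    results = []
    call("session_init", {"sessionId": session_id} if session_id else {})
    try:
        call("navigate", {"url": url})
        for action, params in steps:
            results.append(call(action, params))
    finally:
        call("task_end", {})  # clean up whether or not a step raised
    return results
```

In real use the refs for later steps depend on earlier `read_page` responses, so an interactive loop would replace the fixed `steps` list; the `try`/`finally` structure stays the same.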

### Other commands

**upload_file** — Upload file to a file input:
```json
{ "action": "upload_file", "params": { "ref": "7", "url": "https://example.com/image.png", "fileName": "image.png" } }
```

**tab_new** — Open a new tab:
```json
{ "action": "tab_new", "params": { "url": "https://example.com" } }
```

**tab_activate** — Bring the current session's tab to foreground:
```json
{ "action": "tab_activate", "params": {} }
```

**get_tabs** — List all open Chrome tabs:
```json
{ "action": "get_tabs", "params": {} }
```

**wait** — Wait for a selector to appear:
```json
{ "action": "wait", "params": { "selector": "#my-element", "timeout": 10000 } }
```

**fetch_url** — Fetch a URL via the browser (uses browser cookies):
```json
{ "action": "fetch_url", "params": { "url": "https://api.example.com/data" } }
```

**network_listen** — Listen for network requests matching a pattern:
```json
{ "action": "network_listen", "params": { "urlPattern": "api.example.com" } }
```

**network_intercept** — Intercept and modify network requests:
```json
{ "action": "network_intercept", "params": { "urlPattern": "api.example.com", "action": "block" } }
```

---

## Error Handling

| HTTP Status | Meaning | Action |
|-------------|---------|--------|
| 200 | Success | Process response |
| 202 | Async started | Poll `/api/execution/:id` |
| 400 | Bad request | Check params |
| 401 | Unauthorized | Check API key |
| 403 | Forbidden | Check extension ownership |
| 404 | Not found | Check extension key or execution ID |
| 503 | Extension offline | Ask user to check Chrome extension is running |
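
The table above maps directly onto a lookup. The sketch below is one way a client might turn a status code into a next step; the step labels and the name `next_step` are hypothetical, and any status not in the table is reported as unknown instead of being guessed at.

```python
# Status codes and actions taken from the Error Handling table.
STATUS_ACTIONS = {
    200: "process_response",
    202: "poll_execution",
    400: "check_params",
    401: "check_api_key",
    403: "check_extension_ownership",
    404: "check_extension_key_or_execution_id",
    503: "ask_user_to_check_extension",
}


def next_step(status, body=None):
    """Return (step, error_message) for an HTTP status and parsed response body."""
    step = STATUS_ACTIONS.get(status, "unknown_status")
    message = (body or {}).get("error")  # error responses use { "error": "..." }
    return step, message
```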

### Common issues

- **Extension offline (503):** The Chrome extension must be running and connected. Ask the user to open Chrome and verify the BrowserMan extension is active.
- **Stale refs:** Element refs change when the page updates. Always call `read_page` again after navigation or significant interactions.
- **Timeouts:** Default timeout is 30 seconds. For slow pages, pass `"timeout": 60000` in the request body.
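- **Script execution timeout:** Scripts time out after 5 minutes. Check execution status for error details. Because the polling guidance under "Run a Script" stops at 2 minutes, an execution can still be `running` when polling ends.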

---

## Script Health

Health status is **read-only** for regular API users. Only the platform operator can modify health data.

**Public endpoints (no auth required):**
- `GET /api/scripts/health` — all health statuses
- `GET /api/scripts/catalog` — full catalog with health status

**Admin-only endpoints (require `X-Admin-Key` header):**
- `POST /api/scripts/health/report` — report test result
- `PATCH /api/scripts/health/:platform/:action` — manual override

Regular API keys (`bm_key_`) cannot write to health endpoints. If you encounter a script bug, report it to the platform operator rather than calling the health API directly.

---

## Registration

To use BrowserMan, you need an account. Register via API:

```
POST {BROWSERMAN_URL}/auth/register
Content-Type: application/json

{
  "email": "agent@example.com",
  "password": "securePassword123",
  "name": "My AI Agent"
}
```

Returns: `{ user, token }`

A default Agent key and Browser key are created automatically on registration.

To create additional agents:

```
POST {BROWSERMAN_URL}/api/keys
Content-Type: application/json
Authorization: Bearer <session_token>

{ "name": "my-second-agent" }
```

Returns: `{ id, name, key }`. Save the `key` value; it won't be shown again.

Use the Agent key as `BROWSERMAN_API_KEY` in the Authorization header for all subsequent requests.

### Key Types

BrowserMan uses two types of keys:

| Key Type | Prefix | Purpose | Dashboard Name |
|----------|--------|---------|----------------|
| **Agent Key** | `bm_key_` | Authenticate API requests (scripts, commands, key management) | "My Agents" |
| **Browser Key** | `bm_ext_` | Connect Chrome extension to server (passed in `extension` field) | "My Browsers" |

When you sign up, a default Agent and Browser are created automatically.
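
Because the two key types are easy to swap by accident, a prefix check before the first request can catch the mistake early. A minimal sketch using only the prefixes named in this document, including the `bm_sess_` session-token prefix from the Key Management API section; `classify_key` and `role_problems` are hypothetical names.

```python
KEY_PREFIXES = {
    "bm_key_": "agent",     # goes in the Authorization header
    "bm_ext_": "browser",   # goes in the "extension" field
    "bm_sess_": "session",  # session token, accepted by key management endpoints
}


def classify_key(value):
    """Return "agent", "browser", "session", or None for an unrecognized prefix."""
    for prefix, kind in KEY_PREFIXES.items():
        if value.startswith(prefix):
            return kind
    return None


def role_problems(api_key, extension):
    """Return a list of human-readable problems; empty when both keys look right."""
    problems = []
    if classify_key(api_key) != "agent":
        problems.append("BROWSERMAN_API_KEY should start with bm_key_")
    if classify_key(extension) != "browser":
        problems.append("BROWSERMAN_EXTENSION should start with bm_ext_")
    return problems
```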

### Key Management API

All key management endpoints accept both session tokens (`bm_sess_`) and Agent keys (`bm_key_`) for authentication.

**Agents (API Keys):**
```
GET    /api/keys          — list your agents
POST   /api/keys          — create: { "name": "my-agent" } → returns full key (once)
DELETE /api/keys/:id      — delete an agent
```

**Browsers (Extension Keys):**
```
GET    /api/extensions          — list your browsers (with live online/offline status)
POST   /api/extensions          — create: { "name": "work-chrome" } → returns full key (once)
DELETE /api/extensions/:id      — delete a browser
GET    /api/extensions/:id/status — check if browser is online
```
