Glossary

Quick definitions for project-specific terms. For deeper explanations, follow the links into the relevant section.

Term	Definition
Action	A unit of agent behavior the env can execute. One of `CLICK`, `DOUBLE_CLICK`, `TYPE`, `KEY_PRESS`, `SCROLL`, `DRAG`, `WAIT`, `DONE`.
Brain	An inference client. Today: `BrainHolo3` (llama.cpp / OpenAI-compat), `BrainClaude` (Anthropic API).
CUA	Computer Use Agent. The category Mantis sits in.
Detached run	A run started with `detached: true`. The server returns a `run_id` immediately and continues work in the background. The caller polls for status or waits for a webhook.
Gate	A micro-plan step with `gate: true`. Claude verifies a free-text condition; if it fails, the run halts.
GymEnvironment	The abstract env interface: `reset()`, `step(action)`, `screenshot()`, `close()`. `XdotoolGymEnv` is the concrete production impl.
Holo3	Holo3-35B-A3B, the small specialist multimodal model used for click / scroll / type / drag. Runs on a single GPU as GGUF.
Idempotency-Key	Request header used as a deduplication key. Repeating a request with the same key returns the same `run_id`. Keys are cached for 24 h.
MicroPlan	A list of micro-step objects: `[{intent, type, section, ...}, ...]`. Reliable for high-volume workflows.
MicroPlanRunner	The orchestrator. Walks the micro-plan, runs each step, handles retries / gates / loops / checkpoint resume.
Plan text	Plain-English instruction submitted via `plan_text`. The server decomposes it via Claude, and the decomposition is cached.
Polished video	The composed walkthrough (title card → captioned run with action overlays → outro card) produced when `record_video: true`.
Raw recording	The unprocessed Xvfb screencast. Available via `?raw=1` on the video endpoint.
`run_id`	Server-generated identifier for a detached run. Format: `<YYYYMMDD>_<HHMMSS>_<random_hex>`.
`state_key`	Caller-chosen identifier for a workflow. The server prefixes it with `tenant_id` so that tenants stay isolated from one another. Controls browser-profile reuse and checkpoint resume.
Step	One element of a micro-plan. Has `intent`, `type`, optional `section`, `gate`, `verify`, `loop_target`, etc.
Task suite	Multi-task plan format (`{tasks: [...]}`). Each task has its own `verify` predicate.
Tenant	A unit of isolation. Each tenant has its own token, scopes, caps, browser profile dir, run dir, and optionally its own Anthropic key.
TenantConfig	The resolved configuration for one request: `(tenant_id, scopes, max_*, anthropic_secret_name, allowed_domains, webhook_*)`.
Tier 1 / Tier 2	The hardening tiers. Tier 1: multi-tenant safe (auth, caps, isolation). Tier 2: production-quality (rate limits, idempotency, webhooks, allowlist, metrics).
xdotool	The X11 input-injection tool used to send clicks / keys to Chrome inside Xvfb. Fingerprint-free (no DevTools / CDP signal).
Xvfb	X virtual framebuffer. The headless display the agent's browser paints to and ffmpeg captures.
`X-Mantis-Token`	The container-level auth header. Custom header (not `Authorization: Bearer`) so it doesn't collide with platform-gateway auth.
`register_tool`	Host-provided tool registration on `MicroPlanRunner`. Lets the embedding host expose its own tools — `auth_credentials`, `request_user_input`, `bash`, etc. — to the brain mid-plan. JSON-schema input definition. See issue #71.
`PauseRequested`	Exception a registered tool raises when it needs out-of-band input (OTP, 2FA, confirmation). The runner catches it, snapshots state, and returns a serializable `PauseState`. Resume by calling `runner.resume(state, user_input=..., plan=plan)`. See issue #73.
`PauseState`	JSON-safe snapshot of a paused run: step index, pending tool + arguments, replayed step results, loop counters. Designed to round-trip through Postgres JSONB.
`RunnerResult`	Rich return type from `MicroPlanRunner.run_with_status(plan)` and `runner.resume(...)`. Carries `steps`, `status` (`completed` / `halted` / `cancelled` / `paused`), `cancelled` bool, and optional `pause_state`.
`cancel_event`	External cancellation hook on `MicroPlanRunner`. Pass any object with `.is_set()` (e.g. `threading.Event`) or a no-arg callable. Checked at every step boundary. ECS deploy SIGTERM is the canonical use case. See issue #76.
`step_callback`	Per-step observability hook: `Callable[[idx, intent, action_or_none, ok], None]`. Errors are logged, never raised — observability bugs cannot break a run. See issue #74.
`LAUNCH_APP`	An `ActionType` that lets a plan start a desktop binary on demand (`chromium`, custom wrapper script, ...). Closes the symmetry gap with the Claude backend's `bash` tool. See issue #72.
`scale_brain_to_display`	Helper that maps brain-space pixel coordinates to display-space pixels when a host resizes screenshots before inference. The contract that prevents the 1.5× click-offset bug class. See `reference/coordinate-spaces.md` and issue #75.
`fill_field`	Step type for a labelled text input. `params={"label": "User ID", "value": "alice"}`. The runner clicks the field, clears any pre-filled value, and types `value`. Uses `find_form_target` (single labelled element), not `find_all_listings`. See issue #80.
`submit`	Step type for a single labelled button, such as Login, Save, Update, Submit, or Continue. `params={"label": "Login", "aliases": ["Sign In", "Continue"]}`, where `aliases` is optional. The runner clicks once, then waits 2.5 s. When the button isn't in the viewport, the runner presses `Page_Down` up to 4 times to bring it into view (issue #89 §2). `aliases` covers copy variation across products (e.g. "Update Lead" / "Save" / "Save Changes").
`select_option`	Step type for a dropdown / `<select>` choice. `params={"dropdown_label": "Industry Vertical", "option_label": "Space Exploration"}`. Two-phase: click dropdown → screenshot the open menu → click option.
`find_form_target`	`ClaudeExtractor` method that locates one labelled element on a non-listings page. Counterpart to `find_all_listings` for forms. Returns `{x, y, action, value, label}`.
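
The `cancel_event` and `step_callback` entries above describe two contracts: cancellation is checked at every step boundary, and callback errors are logged but never raised. The sketch below illustrates both contracts in a toy loop. It is not the `MicroPlanRunner` implementation: `run_steps_sketch` and `is_cancelled` are hypothetical names, the steps are not actually executed, and the plain dict it returns only loosely mirrors the `status` values listed under `RunnerResult`.

```python
import logging
import threading
from typing import Any, Callable, Optional

logger = logging.getLogger("hooks_sketch")

# Mirrors the glossary's signature: (idx, intent, action_or_none, ok) -> None
StepCallback = Callable[[int, str, Optional[Any], bool], None]


def is_cancelled(cancel_event: Any) -> bool:
    """Accept either an object with .is_set() or a no-arg callable."""
    if cancel_event is None:
        return False
    if hasattr(cancel_event, "is_set"):
        return bool(cancel_event.is_set())
    return bool(cancel_event())


def run_steps_sketch(steps, cancel_event=None,
                     step_callback: Optional[StepCallback] = None):
    """Hypothetical loop (assumption, not the real runner)."""
    completed = []
    for idx, step in enumerate(steps):
        # Cancellation is checked at the step boundary, before any work.
        if is_cancelled(cancel_event):
            return {"status": "cancelled", "steps": completed}
        ok = True  # placeholder: a real runner would execute the step here
        completed.append(step["intent"])
        if step_callback is not None:
            try:
                step_callback(idx, step["intent"], None, ok)
            except Exception:
                # Observability errors are logged and swallowed.
                logger.exception("step_callback failed at step %d", idx)
    return {"status": "completed", "steps": completed}


stop = threading.Event()


def noisy_callback(idx, intent, action, ok):
    if idx == 1:
        stop.set()  # request cancellation after the second step
    raise RuntimeError("observability bug")  # must not break the run


plan = [
    {"intent": "open login page"},
    {"intent": "fill user id"},
    {"intent": "submit"},
]
result = run_steps_sketch(plan, cancel_event=stop, step_callback=noisy_callback)
print(result["status"], result["steps"])
```

In this sketch the callback raises on every step, yet the loop keeps going until the event is set. The third step never runs because the cancellation check happens before the step starts.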