Skip to content

Environment variables

Reference for every server-side env knob. Set on the container (Baseten / k8s Deployment / docker run -e ...) — never in client-side code.

Auth

Var Default Effect
MANTIS_API_TOKEN unset Single-tenant mode: any caller with this token gets DEFAULT_TENANT permissions. Ignored if MANTIS_TENANT_KEYS_PATH is set.
MANTIS_TENANT_KEYS_PATH unset Path to JSON keys file. When set, the server runs in multi-tenant mode.

Caps

Var Default Effect
MANTIS_MAX_STEPS_PER_PLAN 200 Reject plans larger than this with 400
MANTIS_MAX_LOOP_ITERATIONS 50 Silently clamp loop_count in micro-plans
MANTIS_MAX_RUNTIME_MINUTES 60 Hard wall-time cap on every run
MANTIS_MAX_COST_USD 25.0 Hard cost cap on every run

These are global hard caps; tenant config can be tighter, never looser.

Paths

Var Default Effect
MANTIS_DATA_DIR /workspace/mantis-data Top-level data volume. Per-tenant subtree at tenants/<tenant_id>/.
MANTIS_REPO_ROOT /workspace/cua-agent Where task_file / micro paths are resolved from.
MANTIS_DEBUG_DIR <MANTIS_DATA_DIR>/screenshots/claude_debug Where Claude extraction prompt + screenshot debug bundles land.
MANTIS_IDEMPOTENCY_DIR <MANTIS_DATA_DIR>/idempotency Sidecar files for idempotency cache.
MANTIS_CHROME_PROFILE_DIR set per-request by handler Chrome profile dir used by the Xvfb env. The handler overrides this per tenant + profile_id (#341; falls back to legacy state_key when profile_id isn't set).

Inference

Var Default Effect
MANTIS_LLAMA_PORT 18080 Internal port the in-pod llama.cpp server binds to. The /v1/chat/completions proxy forwards here.
MANTIS_BRAIN holo3 Brain backend selector. One of holo3, claude, opencua, llamacpp, gemma4, agent-s, mock. Wins over the legacy MANTIS_MODEL. mock is a deterministic always-DONE stub for plan authoring without GPU / API cost (#274).
MANTIS_MODEL (set by Truss) Legacy alias of MANTIS_BRAIN for one minor release. gemma4-cua aliases to gemma4.
MANTIS_HOLO3_MODEL_DIR /models/holo3 Where Holo3 GGUF weights are mounted.
ANTHROPIC_API_KEY unset Default Anthropic key. Per-tenant anthropic_secret_name overrides per request.
MANTIS_PROMPTS_DIR unset Override directory for prompt files. When set, the loader reads <dir>/<name>.txt before falling back to the in-tree constant — lets a tenant tune wording without forking the wheel. Names: system_v1, gemma4_system, holo3_system, claude_system, opencua_system, llamacpp_system.

Proxy (IPRoyal)

Var Default Effect
PROXY_URL unset host:port for the upstream IPRoyal proxy
PROXY_USER unset session id
PROXY_PASS unset password
MANTIS_PROXY_CITY unset Default proxy geo override (caller can override per request)
MANTIS_PROXY_STATE unset Same

Webhooks

Var Default Effect
MANTIS_WEBHOOK_SECRET_DEFAULT unset Fallback HMAC signing secret when a tenant's webhook_secret_name doesn't resolve

Cost model (#122)

Var Default Effect
MANTIS_COST_GPU_HOURLY_USD 3.25 GPU compute, $/hour. Used by CostConfig.gpu_cost.
MANTIS_COST_CLAUDE_CALL_USD 0.003 Per-Claude-API-call rate. Multiplied by claude_extract + claude_grounding counters.
MANTIS_COST_PROXY_PER_GB_USD 5.00 Egress proxy bandwidth $/GB.
MANTIS_COST_GPU_SECONDS_PER_STEP 3.0 Per-step GPU seconds when the runner doesn't measure exact wall time.
MANTIS_COST_PROXY_MB_PER_NAV 5.0 Estimated proxy MB per page load.
MANTIS_COST_PROXY_MB_PER_SCROLL 0.5 Estimated proxy MB per scroll.

See operations/cost.md for the full rate-tuning workflow.

Trace export (#155)

Var Default Effect
MANTIS_TRACE_EXPORT_DIR unset Enable per-run trace export. When set, every completed / halted / cancelled / paused run writes <dir>/<tenant_id>/<run_id>.json with the full step list, costs, status, and predicted/observed outcomes. Empty tenant_id falls back to __shared__/. Off by default — feature flag for the continual-fine-tuning pipeline.
MANTIS_TRACE_INCLUDE_SCREENSHOTS unset When truthy (1/true/yes/on) and trace export is enabled, also persists per-step PNG screenshots to <dir>/<tenant_id>/<run_id>_screens/<step:04d>.png. Default off because screenshot bytes ~100× the on-disk trace size.

Augur observability (#509)

Active only when the augur-sdk package is importable; install via pip install 'mantis-agent[observability]'. See Augur integration for the full contract.

Var Default Effect
AUGUR_DSN unset Sentry-style DSN. When set, the SDK opens a streaming sink to the workspace alongside the on-disk bundle. When unset, only the bundle is written.
AUGUR_CAPTURE_MODE screenshots One of off / metadata / trace / screenshots / video / model_io / dispatch / replay / full. Controls what the SDK captures.
MANTIS_AUGUR_DIR unset Override the root directory where bundles are written. Run id is still appended. Falls back to <MANTIS_DATA_DIR>/augur/.
MANTIS_AUGUR_DISABLED unset Truthy (1/true/yes/on) → adapter is a no-op even with the SDK installed. Useful for tests / CI.
MANTIS_VERSION unset Surfaced as client.version on the bundle manifest — useful for bisecting which build produced a bundle.
MANTIS_GIT_SHA unset Surfaced as client.git_sha on the bundle manifest.

Logging

Var Default Effect
LOG_LEVEL INFO Standard Python logging level
MANTIS_LOG_FORMAT json json (default) emits one-line JSON per record with tenant_id enrichment; plain reverts to ad-hoc format

Runner / verification

Var Default Effect
MANTIS_PREDICATE_VERIFY enabled Per-step world-model verification (#291). When the brain emits a structured prediction ({"expected": [...]} or Predicted: ...), the runner parses, evaluates, and writes per-predicate booleans into the trajectory plus a world_model_error reward component. Set to disabled to ablate — predicted_outcome is still recorded for distillation, but no evaluation runs. See Predicate grammar.
MANTIS_DONE_GATE enabled Deterministic done-acceptance gate (#303). Runs cheap predicates (empty summary, plan steps incomplete, pending form values, etc.) before the model-based verify_done. Set to disabled to ablate — the runner falls through to the existing model verifier and done_rejections_by_reason stays empty. See Done-acceptance gate.
MANTIS_FORM_CONTROLLER enabled First-class runtime form controller (#301) owning pending-values / used-regions / submit-latch state. Set to disabled to ablate — the runner falls back to the legacy scattered force_fill_* locals; runner.form_controller is None. See Form controller.
MANTIS_ADAPTIVE_SETTLE enabled Replaces post-action time.sleep(settle_time) (#294) with a frame-stability gate (xdotool path) or wait_for_load_state("networkidle") gate (Playwright path), capped at the legacy budget. Set to disabled to ablate — both gates short-circuit back to a fixed sleep without a redeploy. See Adaptive settle.
MANTIS_CHROME_REUSE enabled Container-scoped Xvfb + Chrome session reuse (#311). Successive /v1/cua requests with the same (profile_dir, proxy_key) reuse the live browser instead of paying the ~10 s launch tax. Set to disabled to ablate. Per-request opt-out: payload["reuse_session"]=false. See Chrome session reuse.
MANTIS_SPECULATIVE_INFERENCE disabled Wraps the inner brain in SpeculativeBrain (#118) so think() overlaps with the post-action settle. Default OFF because the E2E ablation on Holo3 Q8 + single-llama.cpp showed a wall-time regression (GPU contention between speculative + sync requests, 55.6% hit rate → +52% wall). Quality is preserved by the strict validator. Enable on multi-GPU backends where the two think() requests don't serialize. See Speculative inference.
MANTIS_PERCEPTUAL_VERIFY enabled Perceptual-diff verifier (#293) for high-risk actions (submit, confirm, buy, send, delete, login, save). Compares pre/post frame hashes — both global and a 200×200 region around the click — and emits action_effect_observed: bool per step. WARNING line injected into next step's feedback on no-effect. Observational only — never blocks or substitutes the action. Set to disabled to ablate. See Perceptual diff verifier.
MANTIS_LOOP_RECOVERY enabled Action-class-transition policy (#302) that forces TAB / TYPE / RETURN when the brain loops on a no-effect click. Runs after the existing substitution chain (force-fill, force-submit, claude-director, top-click-guard) — the last gate before dispatch. Per-reason count surfaces on RunResult.loop_recoveries_by_reason. Set to disabled to ablate. See Loop recovery policy.

API documentation surface

Var Default Effect
MANTIS_ENABLE_DOCS_UI 1 Serve /docs (Swagger) and /redoc (Redoc) over the FastAPI app. Set to 0 / false / no / off on production tenant fleets that don't want the interactive UIs exposed publicly. /openapi.json is served regardless.
MANTIS_GIT_SHA unset Surfaced verbatim in GET /v1/version so clients can pin to a specific build. Typically populated by the deploy pipeline.
MANTIS_BUILD_TIME unset Surfaced verbatim in GET /v1/version. Populated by the deploy pipeline.

Context (set per request, not per deployment)

The handler sets these on every /v1/predict so downstream code (the runtime, the JSON log formatter) can read them via os.environ. Don't rely on them being set at deployment time.

  • MANTIS_TENANT_ID — current request's tenant id
  • MANTIS_CHROME_PROFILE_DIR — per-tenant per-profile_id Chrome user-data-dir for this run (#341)

See also