Runs and polling¶
Submitting a detached run returns a run_id. Every follow-up operation is a POST /v1/predict whose body carries an action field.
Actions¶
{ "action": "status", "run_id": "20260428_…" }
{ "action": "result", "run_id": "..." }
{ "action": "logs", "run_id": "...", "tail": 200 }
{ "action": "cancel", "run_id": "..." }
| Action | Returns | Notes |
|---|---|---|
| status | Current state + summary block when terminal | Cheapest call; use for polling |
| result | Full result JSON (leads, per-step trace, costs) | Only meaningful after terminal |
| logs | Last tail events written by the runner | Useful for debugging stuck runs; max tail is 10000 |
| cancel | Marks the run cancelled; runner halts at the next checkpoint boundary | Cooperative, not instant |
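The four action bodies differ only in the action name and the optional tail, so a client can build them from one helper. The following is a minimal sketch: `action_body`, `VALID_ACTIONS`, and `MAX_TAIL` are hypothetical names rather than part of any Mantis client, and the lower bound of 1 on tail is an assumption. Only the 10000 cap comes from the table above.

```python
MAX_TAIL = 10_000  # maximum tail for the logs action, per the table above
VALID_ACTIONS = {"status", "result", "logs", "cancel"}


def action_body(action: str, run_id: str, tail=None) -> dict:
    """Build the JSON body for an action-based POST /v1/predict call."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    body = {"action": action, "run_id": run_id}
    if tail is not None:
        if action != "logs":
            raise ValueError("tail only applies to the logs action")
        # Lower bound of 1 is an assumption; the upper bound is documented.
        if not 1 <= tail <= MAX_TAIL:
            raise ValueError(f"tail must be between 1 and {MAX_TAIL}")
        body["tail"] = tail
    return body
```

Validating tail on the client side avoids spending a rate-limited call on a request the server would presumably reject.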
Status lifecycle¶
| Status | Meaning |
|---|---|
| queued | Server accepted the request; not yet on a worker |
| running | Runner is working on it |
| succeeded | Run completed; summary populated |
| failed | Run threw an unhandled error; error populated |
| cancelled | A cancel action was processed |
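A client usually needs to know two things from a status response: whether the run is finished, and which field carries the detail. The sketch below encodes the table above. `outcome` and `TERMINAL` are hypothetical names, and it assumes that summary and error are top-level keys of the status response, which the table implies for error but does not show.

```python
TERMINAL = frozenset({"succeeded", "failed", "cancelled"})


def outcome(response: dict):
    """Return (status, detail) for a terminal run, or None to keep polling."""
    status = response["status"]
    if status not in TERMINAL:
        return None  # queued, running, or any other non-terminal state
    if status == "succeeded":
        return status, response.get("summary")
    if status == "failed":
        # Assumed field name: the table says "error populated".
        return status, response.get("error")
    return status, None  # cancelled carries no documented detail field
```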
Polling pattern¶
RUN_ID="..."
while true; do
STATE=$(curl -fsS -X POST "$ENDPOINT/v1/predict" \
-H "Authorization: Api-Key $BASETEN_API_KEY" \
-H "X-Mantis-Token: $MANTIS_API_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"action\":\"status\",\"run_id\":\"$RUN_ID\"}" | jq -r .status)
echo "$(date '+%H:%M:%S') $STATE"
case "$STATE" in succeeded|failed|cancelled) break ;; esac
sleep 30
done
Recommended polling cadence:
| Phase | Interval |
|---|---|
| First minute | every 5–10 s (catch quick failures fast) |
| Next 5 min | every 30 s |
| Beyond 5 min | every 60 s |
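Each status call counts against the rate limiter. The per-tenant rate_limit_per_minute defaults to 30, so even the fastest cadence above (one call every 5 s, or 12 per minute) stays under the limit for a single run. Polling several runs at once adds up, so slow down accordingly.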
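The cadence table can be expressed as a function of elapsed time, which replaces the fixed `sleep 30` in a polling loop. This is a sketch under one interpretation: "Next 5 min" is read as minutes 1 through 6, and the first-minute interval is taken as 5 s, the low end of the suggested range. `poll_interval_s` is a hypothetical helper name.

```python
def poll_interval_s(elapsed_s: float) -> int:
    """Seconds to wait before the next status poll, given time since submit."""
    if elapsed_s < 60:
        return 5   # first minute: catch quick failures fast
    if elapsed_s < 60 + 5 * 60:
        return 30  # next five minutes
    return 60      # long-running: back off to once a minute
```

In a loop, record the submit time once and call `poll_interval_s(time.monotonic() - started)` before each sleep.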
Status response¶
{
"status": "succeeded",
"run_id": "20260428_021432_076255ef",
"started_at": "2026-04-28T02:14:32Z",
"finished_at": "2026-04-28T02:24:01Z",
"summary": {
"total_time_s": 569,
"steps_executed": 17,
"viable": 3,
"leads_with_phone": 1,
"cost_total": 0.42,
"cost_breakdown": {
"gpu": 0.12,
"claude": 0.12,
"proxy": 0.18
},
"dynamic_verification_summary": { ... }
}
}
The summary.dynamic_verification_summary block contains the per-page health checks the runner emits for extraction workflows (page_1_required_filters_present: pass, etc.).
Result response¶
{
"result": {
"leads": [
"VIABLE | Year: <YYYY> | Make: <Make> | Model: <Model> | Price: <Price> | Phone: <Phone or 'none'>",
"VIABLE | Year: <YYYY> | Make: <Make> | Model: <Model> | ...",
"VIABLE | Year: <YYYY> | Make: <Make> | Model: <Model> | ..."
],
"steps": [
{ "intent": "Navigate to ...", "type": "navigate", "success": true, "duration": 1.2, ... },
...
],
"video": {
"path": "/.../recording.mp4",
"polished_path": "/.../recording_polished.mp4",
"actions": { "clicks": 17, "keys": 3, "types": 2, "scrolls": 8, "drags": 0 },
"duration_seconds": 567.3
},
"summary": { ... same as status response ... }
},
...
}
The lead format depends on the plan — for marketplace-listings extractions it's the VIABLE | … strings above; for task_suite runs each task gets its own result block.
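For marketplace-listings runs, the lead strings can be split into fields on the client side. The sketch below assumes the layout shown above: a leading verdict token followed by `Key: Value` pairs separated by `|`. `parse_lead` is a hypothetical helper, and the sample values in the usage note are made up for illustration.

```python
def parse_lead(line: str) -> dict:
    """Split a 'VIABLE | Key: Value | ...' lead string into a dict."""
    parts = [p.strip() for p in line.split("|")]
    lead = {"verdict": parts[0]}
    for part in parts[1:]:
        key, sep, value = part.partition(":")
        if not sep:
            continue  # skip fragments that are not "Key: Value"
        lead[key.strip().lower()] = value.strip()
    # The format uses the literal word 'none' when no phone was found.
    phone = lead.get("phone")
    if phone is not None and phone.strip("'\"").lower() == "none":
        lead["phone"] = None
    return lead
```

For example, `parse_lead("VIABLE | Year: 2019 | Make: Honda | Phone: none")` yields a dict with `verdict`, `year`, `make`, and `phone` set to `None`. Values stay as strings; converting the year or price to numbers is left to the caller because the price format is not specified here.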
Idempotency¶
Pass an Idempotency-Key: <unique-id> header on the original POST /v1/predict. Retries that carry the same key return the cached run_id instead of starting a new run.
curl -X POST "$ENDPOINT/v1/predict" \
-H "Authorization: Api-Key $BASETEN_API_KEY" \
-H "X-Mantis-Token: $MANTIS_API_TOKEN" \
-H "Idempotency-Key: order-7afc3b91" \
-H "Content-Type: application/json" \
-d '{ ... }'
The cache TTL is 24 hours (configurable on the server). This is useful when your client has a flaky network and can't easily tell whether a POST landed.
See Idempotency for the operator side.
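The key only helps if it is generated once and reused on every retry of the same logical submit. The sketch below shows that shape with the transport injected, so it makes no claim about how requests are actually sent. `submit_with_retry` and the `post(payload, headers)` callable are hypothetical, and the response is assumed to be a decoded JSON dict with a `run_id` key.

```python
import uuid


def submit_with_retry(post, payload: dict, attempts: int = 3) -> str:
    """Submit a run, retrying dropped connections under one idempotency key."""
    # Generate the key once; a fresh key per attempt would defeat the cache.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for _ in range(attempts):
        try:
            return post(payload, headers)["run_id"]
        except ConnectionError as exc:
            last_error = exc  # unclear whether the POST landed; retry safely
    raise RuntimeError(f"submit failed after {attempts} attempts") from last_error
```

If the first attempt did reach the server, the retry returns the same run_id rather than starting a second run.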
Webhooks (instead of polling)¶
If your tenant has webhook_url configured (or you pass callback_url per request), you don't need to poll — the server will POST a signed JSON to your URL when the run reaches a terminal state.
Body:
{
"run_id": "20260428_021432_076255ef",
"tenant_id": "tenant_a",
"status": "succeeded",
"summary": { ... same as status response ... },
"delivered_at": "2026-04-28T02:24:01Z"
}
Verify the HMAC-SHA256 signature in X-Mantis-Signature: sha256=<hex>:
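import hashlib
import hmac

def verify(body: bytes, header: str, secret: str) -> bool:
    # Recompute the signature over the raw request body.
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, header)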
If your endpoint returns a non-2xx response, the server retries up to 3 times with increasing backoff (1 s, 5 s, 30 s). After that the server gives up, so keep polling as a fallback. See Webhooks for the operator side.
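On the receiving side, the signature should be checked against the raw request bytes before the JSON is parsed, because re-serializing the payload can change the bytes and break the HMAC. The sketch below is framework-agnostic. `sign` and `handle_webhook` are hypothetical helpers, and the only thing assumed about the server is the `sha256=<hex>` header format described above.

```python
import hashlib
import hmac
import json


def sign(body: bytes, secret: str) -> str:
    """Compute the expected X-Mantis-Signature value for a raw body."""
    return "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def handle_webhook(raw_body: bytes, signature_header: str, secret: str):
    """Return the parsed payload, or None if the signature does not match."""
    expected = sign(raw_body, secret).encode()
    # Compare as bytes so a non-ASCII header cannot raise inside compare_digest.
    if not hmac.compare_digest(expected, signature_header.encode()):
        return None
    return json.loads(raw_body)
```

A handler would respond with a non-2xx status when this returns None. Since deliveries are retried, it is also worth treating a repeated run_id as already processed.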
Cancelling¶
curl -fsS -X POST "$ENDPOINT/v1/predict" \
-H "Authorization: Api-Key $BASETEN_API_KEY" \
-H "X-Mantis-Token: $MANTIS_API_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"action\":\"cancel\",\"run_id\":\"$RUN_ID\"}"
Cancels are cooperative — the runner finishes the current step's checkpoint write before stopping. Expect 5–60 s before the status flips to cancelled.
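Because the status flips with a delay, a client that needs confirmation can send the cancel and then poll until a terminal state appears. This is a sketch with the HTTP call injected: `cancel_and_wait` and `call_action` are hypothetical names, and the 90 s timeout is an assumption chosen to sit above the 60 s upper bound mentioned above.

```python
import time


def cancel_and_wait(call_action, run_id: str, timeout_s: float = 90.0,
                    interval_s: float = 5.0, sleep=time.sleep,
                    clock=time.monotonic) -> str:
    """Request cancellation, then poll until the run reaches a terminal state."""
    call_action({"action": "cancel", "run_id": run_id})
    deadline = clock() + timeout_s
    while True:
        status = call_action({"action": "status", "run_id": run_id})["status"]
        # The run may finish on its own before the cancel is observed.
        if status in {"succeeded", "failed", "cancelled"}:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} still {status} after {timeout_s} s")
        sleep(interval_s)
```

The return value is the final status rather than a boolean, since a run that was about to finish can end as succeeded or failed even after a cancel was sent.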
Resuming¶
If a run failed mid-way (network hiccup, replica restart, OOM), re-submit with the same workflow_id and resume_state: true. Keep the same profile_id too so the resumed run reuses the logged-in Chrome session from the failed attempt (#341):
{
"detached": true,
"micro": "plans/example/...json",
"profile_id": "marketplace-prod", // ← same profile (cookies / sessions)
"workflow_id": "marketplace-listings-v1", // ← same workflow (checkpoint to resume from)
"resume_state": true,
"max_cost": 2
}
The runner picks up from the last checkpoint (extracted leads kept, browser profile + cookies still on the volume). You get a new run_id but the same logical workflow continues.
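A resume request is the original submit payload with resume_state switched on, so deriving it from the stored original avoids accidentally changing workflow_id or profile_id. A minimal sketch follows. `resume_payload` is a hypothetical helper, and treating a missing workflow_id as a client-side error is an assumption about what the server needs to locate the checkpoint.

```python
def resume_payload(original: dict) -> dict:
    """Copy the original submit payload and mark it as a resume."""
    if "workflow_id" not in original:
        # Assumed requirement: the checkpoint is looked up by workflow_id.
        raise ValueError("cannot resume without the original workflow_id")
    payload = dict(original)        # leave the caller's dict untouched
    payload["resume_state"] = True  # profile_id, if present, is carried over
    return payload
```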
Paused runs¶
OTP / 2FA / human-in-the-loop confirmation — #344.
A run can pause mid-flight — for OTP / 2FA / explicit human confirmation — without failing. The default request_user_input host tool is wired into every /v1/predict run: if the brain emits Action(TOOL_CALL, name="request_user_input", params={"prompt": "..."}), the runner snapshots state and the run's status flips to paused.
from mantis_agent.client import MantisClient

client = MantisClient.from_env()
handle = client.predict({"micro": "plans/login-flow.json", "profile_id": "alice"})

def on_status(s):
    if s.status == "paused":
        code = input(s.prompt + " ")  # surface to the human
        client.resume(handle.run_id, user_input=code)

# wait_for_completion polls past `paused` (it's NON-terminal); the
# on_status callback above handles the prompt + calls resume() and
# the loop continues to terminal succeeded / failed / cancelled.
final = client.wait_for_completion(handle.run_id, on_status=on_status)
Key points:
- paused is non-terminal. wait_for_completion keeps polling; it only stops on succeeded / failed / cancelled (the TERMINAL_STATUSES set is unchanged).
- RunStatus.prompt, RunStatus.reason, and RunStatus.pause_state are populated when status is paused. The pause_state is opaque: the server keeps the canonical copy on disk, so you don't need to send it back yourself.
- client.resume(run_id, user_input=...) is synchronous: it returns {status: running} once the worker has been kicked back into the runner. The continuation runs in the background; keep polling status until terminal.
- The runner can pause more than once on the same run (OTP → confirmation → another OTP). Each pause flips status back to paused with a fresh prompt; resume the same way.
- Plan-signature mismatch on resume returns 400. This is usually a sign that someone edited the on-disk state, since the plan can't change between submit and resume.
See API / Pause / resume for the full request/response shapes.
Lifecycle endpoints (cheap-poll + queue)¶
The action-based POST /v1/predict {action: status} returns the full
detail payload — fine for one-off checks, expensive when polled in a
loop. The lifecycle endpoints below give a cheap phase + backoff hint
poll surface plus a per-tenant queue snapshot. Use these for active
polling, then call the action-based status route once a terminal phase
arrives if you need full detail.
GET /v1/runs/{run_id} — phase + backoff hint¶
Returns:
{
"run_id": "<run_id>",
"phase": "running",
"last_event_at": 1722500400.0,
"polling_backoff_ms_hint": 1500,
"started_at": 1722500390.0,
"finished_at": null,
"halt_class": null
}
Phases (queued / running / recovering / complete / halted /
cancelled) are stable wire constants. complete covers both fully
successful runs and completed_with_failures. polling_backoff_ms_hint
is adaptive: terminal phases give a long hint (state won't change),
fresh transitions give a short hint, idle runs grow progressively
longer.
Client pattern:
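import os
import time

import requests

# Environment variable names follow the curl examples above.
ENDPOINT = os.environ["ENDPOINT"]
TOKEN = os.environ["MANTIS_API_TOKEN"]

def watch(run_id: str) -> dict:
    while True:
        r = requests.get(
            f"{ENDPOINT}/v1/runs/{run_id}",
            headers={"X-Mantis-Token": TOKEN},
            timeout=30,  # avoid hanging forever on a stalled connection
        )
        r.raise_for_status()
        body = r.json()
        if body["phase"] in {"complete", "halted", "cancelled"}:
            return body
        # Sleep for as long as the server suggests before polling again.
        time.sleep(body["polling_backoff_ms_hint"] / 1000)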
GET /v1/queue — tenant queue snapshot¶
Returns:
Scoped to the calling tenant. Terminal runs are excluded — operators
wanting historical totals can scan status.json files via
POST /v1/predict {action: status}. eta_ms is best-effort and may
be null when no recent dispatch samples exist.
POST /v1/runs/{run_id}/cancel¶
Cancellation behavior is documented under Cancelling above. The route
accepts an optional JSON body {"reason": "<operator-tag>"} for log
attribution; the response carries the post-cancel status.
Note for self-hosters running multiple API replicas: the lifecycle data structures shipped in mantis_agent.run_lifecycle.RunLifecycleStore are single-process (in-memory) by design. The deployed routes here derive phase from the file-backed status.json written by the executor, so they work across replicas without modification. If you build a richer integration on the RunLifecycleStore API directly (for example, adding should_cancel() polling inside an executor), back the store with Redis or Modal Dict before scaling past one replica.
See also¶
- Recordings — fetching the screencast
- Errors — full error reference
- Reference / HTTP API — endpoint detail