Authentication

Mantis uses two layers of auth when deployed behind a platform gateway (Baseten / GCLB / ALB):

Header	Layer	Set by	Required
`Authorization: Api-Key <BASETEN_API_KEY>`	gateway	platform admin	Baseten only
`X-Mantis-Token: <tenant_token>`	container	operator (per tenant)	Always

Self-hosted deployments (your own Docker / EKS / GKE without a gateway in front) only need X-Mantis-Token.

Getting a tenant token

Tokens are issued by your operator. The flow:

1. Operator generates a token (openssl rand -hex 32) and adds an entry to the deployment's tenant keys file with your tenant_id, scopes, and caps. See Tenant keys for the operator side.
2. Operator shares the token with you out-of-band (1Password / Vault / signed email; never in plaintext over Slack).
3. You keep it in your secrets store (AWS Secrets Manager, GCP Secret Manager, Vault, etc.) and pass it via X-Mantis-Token on every call.
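The token steps above can be sketched in Python. This is a minimal illustration, not Mantis code: `generate_tenant_token`, `load_tenant_token`, and the `MANTIS_TENANT_TOKEN` environment variable name are hypothetical, and a real caller would typically read the value from its secrets store rather than the environment.

```python
import os
import secrets


def generate_tenant_token() -> str:
    """Operator side: 32 random bytes rendered as 64 hex characters,
    the same shape as the output of `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def load_tenant_token(env_var: str = "MANTIS_TENANT_TOKEN") -> str:
    """Caller side: read the token from the environment.

    The variable name is an assumption; a secrets-store lookup
    (AWS Secrets Manager, Vault, ...) would replace this in practice.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set")
    return token
```

Stripping whitespace on load guards against a trailing newline picked up when the token is copied into a secret.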

Single-tenant deployments use the MANTIS_API_TOKEN env var on the server side; the same token works for everyone.

Request shape

POST /v1/predict HTTP/1.1
Host: model-qvvgkneq.api.baseten.co
Authorization: Api-Key bsk_live_…           ← platform (Baseten only)
X-Mantis-Token: 9f3e1b2a4c8d…                ← container, always
Content-Type: application/json

{ "detached": true, "micro": "plans/...", ... }
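As a rough Python equivalent of the request above, the sketch below builds the headers and a `POST /v1/predict` request with the standard library. The helper names `build_headers` and `build_predict_request` are hypothetical, and the payload is whatever JSON body your deployment expects; only the two header names and the path come from this page.

```python
import json
import urllib.request
from typing import Optional


def build_headers(tenant_token: str, baseten_api_key: Optional[str] = None) -> dict:
    """X-Mantis-Token is always sent; Authorization only behind Baseten."""
    headers = {
        "X-Mantis-Token": tenant_token,
        "Content-Type": "application/json",
    }
    if baseten_api_key:
        headers["Authorization"] = f"Api-Key {baseten_api_key}"
    return headers


def build_predict_request(
    base_url: str,
    payload: dict,
    tenant_token: str,
    baseten_api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/predict request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(tenant_token, baseten_api_key),
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it. For a self-hosted deployment, omit `baseten_api_key` and no `Authorization` header is added.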

The same headers go on every endpoint:

- POST /v1/predict
- POST /v1/chat/completions
- GET /v1/runs/{run_id}/video

Open / un-auth'd endpoints (no tokens needed):

- GET /health, GET /v1/health: platform liveness probes
- GET /v1/models: model discovery
- GET /metrics: Prometheus scrape

Scopes

Each tenant has a list of allowed scopes:

Scope	What it lets you do
`run`	`POST /v1/predict` to start a new run
`status`	Poll an existing run for status
`result`	Fetch the final result of a completed run
`logs`	Fetch the live event log

A read-only key (e.g., for a dashboard that scrapes run history) can be issued with just ["status", "result"]. It gets a 403 on any POST /v1/predict that tries to start a new run.
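The read-only example can be illustrated with a small scope check. This is only a sketch of the rule described here, not the server's actual code; the function name `check_scope` and its return shape are hypothetical.

```python
from typing import Iterable, Tuple


def check_scope(tenant_scopes: Iterable[str], required: str) -> Tuple[int, str]:
    """Return (200, "ok") if the tenant holds the scope, else a 403."""
    if required in set(tenant_scopes):
        return 200, "ok"
    return 403, f"missing scope: {required}"


# A read-only key can poll and fetch results but cannot start runs.
read_only = ["status", "result"]
print(check_scope(read_only, "status"))
print(check_scope(read_only, "run"))
```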

Error responses

Status	Cause	What to do
401 missing	No `X-Mantis-Token` header sent	Add it
401 invalid	Token doesn't match any tenant key	Check the token wasn't truncated; ask operator to re-issue
403 scope	Token valid but tenant lacks the required scope	Operator needs to add the scope to your tenant config
429 rate	Tenant exceeded `rate_limit_per_minute`	Honor the `Retry-After` header
429 concurrent	Tenant at `max_concurrent_runs`	Wait or have operator raise the cap
503 auth-not-configured	Server has neither `MANTIS_API_TOKEN` nor `MANTIS_TENANT_KEYS_PATH` set	Operator misconfiguration
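On the caller side, the table suggests that only 429 is worth retrying automatically. The sketch below is one way to turn a response into a wait time. It assumes `Retry-After` carries a number of seconds and falls back to a default otherwise; the function name and the 5-second default are illustrative choices, not part of Mantis.

```python
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], default: float = 5.0
) -> Optional[float]:
    """Seconds to wait before retrying, or None if a retry will not help.

    401, 403, and 503 need a token or operator-side fix, so they return None.
    """
    if status != 429:
        return None
    # Header names are case-insensitive, so compare in lowercase.
    value = next(
        (v for k, v in headers.items() if k.lower() == "retry-after"), None
    )
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Header missing, or in a format this sketch does not parse
        # (for example an HTTP-date): fall back to the default.
        return default
```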

Token rotation

Tokens are hot-reloaded from the server's keys file with a 5-second cache. Rotation is operator-side:

1. Operator updates the JSON keys file (replaces your entry's key string).
2. Within 5 seconds, the new token works and the old one is rejected.
3. You update your secrets store with the new value.

No pod restart is needed.
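To picture how a 5-second cache makes rotation take effect without a restart, here is a minimal sketch. It is not the Mantis implementation: the class name `CachedKeysFile`, the injectable clock, and the file shape (a JSON list of objects with a "key" field) are all assumptions made for illustration.

```python
import json
import time
from typing import Callable, Dict, Optional


class CachedKeysFile:
    """Re-reads a JSON keys file at most once per `ttl` seconds."""

    def __init__(
        self,
        path: str,
        ttl: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._path = path
        self._ttl = ttl
        self._clock = clock
        self._loaded_at: Optional[float] = None
        self._by_key: Dict[str, dict] = {}

    def lookup(self, token: str) -> Optional[dict]:
        """Return the tenant entry for `token`, or None if unknown."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            with open(self._path, encoding="utf-8") as fh:
                # Assumed shape: [{"key": "...", "tenant_id": "...", ...}, ...]
                entries = json.load(fh)
            self._by_key = {entry["key"]: entry for entry in entries}
            self._loaded_at = now
        return self._by_key.get(token)
```

Under this model, an old token keeps working until the cached copy expires, which is why the changeover is bounded by the cache lifetime rather than being instantaneous.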

What `X-Mantis-Token` reveals on the server

Once the server validates your token, it resolves a TenantConfig with the fields below. As a caller you don't need to know the internals, but understanding what is enforced helps when debugging 403 and 429 responses:

TenantConfig(
    tenant_id="tenant_a",
    scopes=("run", "status", "result", "logs"),
    max_concurrent_runs=3,
    max_cost_per_run=5.0,
    max_time_minutes_per_run=30,
    rate_limit_per_minute=60,
    anthropic_secret_name="anthropic_api_key_tenant_a",
    allowed_domains=("*.marketplace.example.com", "crm.example.com"),
    webhook_url="https://callbacks.example.com/mantis",
    webhook_secret_name="webhook_secret_tenant_a",
)

The server clamps the max_cost and max_time_minutes in your request to the smaller of the requested value and your tenant cap (max_cost_per_run and max_time_minutes_per_run). For example, if you ask for max_cost: 50 and your tenant cap is 5, you get 5.
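The clamping rule amounts to taking a minimum. The sketch below shows it with a hypothetical helper named `clamp`. How the server treats a request that omits the field is not stated here, so the fallback to the tenant cap is an assumption.

```python
from typing import Optional


def clamp(requested: Optional[float], cap: float) -> float:
    """Effective limit: the smaller of the requested value and the tenant cap."""
    if requested is None:
        # Assumption: an omitted field falls back to the tenant cap.
        return cap
    return min(requested, cap)


# Tenant caps taken from the TenantConfig example.
print(clamp(50, 5.0))   # request above the cap: the cap wins
print(clamp(10, 30))    # request below the cap: the request wins
```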