Tenant keys (provisioning + registration)¶
The operator-side runbook for everything tenant-shaped: issuing tokens, attaching scopes/caps/Anthropic keys, rotating, revoking.
How it works¶
The server reads a JSON keys file at MANTIS_TENANT_KEYS_PATH (typically a Baseten / k8s secret mounted as a file). Every POST /v1/predict looks the presented X-Mantis-Token up in this file (constant-time compare, 5 s read cache for hot reload) and resolves a TenantConfig for the request.
JSON keys file → in-memory hash → constant-time lookup per request
↑
│ 5s cache, hot-reloads on edit
│
operator edits secret store → mount → file changes
If MANTIS_TENANT_KEYS_PATH is not set, the server falls back to single-tenant mode using MANTIS_API_TOKEN. The same token works for all callers in that mode.
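The lookup path described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the helper names `load_tenant_keys` and `resolve_tenant`, the module-level cache, and the `"default"` tenant_id returned in single-tenant mode are all assumptions made for the example.

```python
import hmac
import json
import os
import time
from typing import Optional

CACHE_TTL_SECONDS = 5.0  # mirrors the 5 s read cache described above
_cache = {"loaded_at": float("-inf"), "path": None, "keys": {}}


def load_tenant_keys(path: str, now: Optional[float] = None) -> dict:
    """Return the tenant_keys mapping, re-reading the file at most every 5 s."""
    now = time.monotonic() if now is None else now
    stale = now - _cache["loaded_at"] >= CACHE_TTL_SECONDS
    if stale or _cache["path"] != path:
        with open(path, "r", encoding="utf-8") as fh:
            _cache["keys"] = json.load(fh).get("tenant_keys", {})
        _cache["loaded_at"] = now
        _cache["path"] = path
    return _cache["keys"]


def resolve_tenant(presented_token: str) -> Optional[dict]:
    """Map an X-Mantis-Token value to a tenant entry; None means reject (401)."""
    keys_path = os.environ.get("MANTIS_TENANT_KEYS_PATH")
    presented = presented_token.encode()
    if not keys_path:
        # Single-tenant fallback: one shared token taken from MANTIS_API_TOKEN.
        expected = os.environ.get("MANTIS_API_TOKEN", "")
        if expected and hmac.compare_digest(presented, expected.encode()):
            return {"tenant_id": "default"}  # hypothetical tenant_id
        return None

    match = None
    # Compare against every entry so timing does not hint at which one matched.
    for token, entry in load_tenant_keys(keys_path).items():
        if hmac.compare_digest(presented, token.encode()):
            match = entry
    return match
```

Because the cache is keyed on elapsed time, an edit to the mounted file becomes visible on the first request after the 5 s window closes, without a restart.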
Keys file shape¶
{
  "tenant_keys": {
    "<x-mantis-token-value-1>": {
      "tenant_id": "tenant_a",
      "scopes": ["run", "status", "result", "logs"],
      "max_concurrent_runs": 3,
      "max_cost_per_run": 5.0,
      "max_time_minutes_per_run": 30,
      "rate_limit_per_minute": 60,
      "anthropic_secret_name": "anthropic_api_key_tenant_a",
      "allowed_domains": ["*.marketplace.example.com", "crm.example.com"],
      "webhook_url": "https://callbacks.example.com/mantis",
      "webhook_secret_name": "webhook_secret_tenant_a"
    },
    "<x-mantis-token-value-2>": {
      "tenant_id": "readonly_dashboard",
      "scopes": ["status", "result"]
    }
  }
}
Every field except tenant_id is optional and falls back to the DEFAULT_TENANT defaults if missing. Use the minimal shape for read-only / observability keys.
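One way to picture the fallback behaviour is a dataclass whose field defaults are the values listed in the field reference. The sketch below is an assumption about shape only: `TenantConfig` is named in this document, but its real definition, the helper `tenant_from_entry`, and the choice to ignore unknown keys are illustrative.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass(frozen=True)
class TenantConfig:
    tenant_id: str  # the only required field
    scopes: List[str] = field(
        default_factory=lambda: ["run", "status", "result", "logs"]
    )
    max_concurrent_runs: int = 5
    max_cost_per_run: float = 25.0
    max_time_minutes_per_run: int = 60
    rate_limit_per_minute: int = 30
    anthropic_secret_name: str = "anthropic_api_key"
    allowed_domains: List[str] = field(default_factory=list)
    webhook_url: str = ""
    webhook_secret_name: str = ""


def tenant_from_entry(entry: dict) -> TenantConfig:
    """Build a config from one keys-file entry, filling gaps with defaults."""
    if "tenant_id" not in entry:
        raise ValueError("tenant_id is required")
    known = {f.name for f in fields(TenantConfig)}
    # Assumption: unknown keys are ignored rather than rejected.
    return TenantConfig(**{k: v for k, v in entry.items() if k in known})
```

With this shape, the minimal read-only entry `{"tenant_id": "readonly_dashboard", "scopes": ["status", "result"]}` keeps its narrowed scopes and inherits every other default.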
Field reference¶
| Field | Default | Purpose |
|---|---|---|
| tenant_id | (required) | Identifier used as the prefix for state_key, browser profile dir, run dir, log lines, and metric labels. Pick something stable and human-readable. |
| scopes | ["run", "status", "result", "logs"] | Which actions this token can perform. Use ["status", "result"] for read-only consumers. |
| max_concurrent_runs | 5 | Per-tenant concurrency gauge. 6th in-flight run returns 429. |
| max_cost_per_run | 25.0 | Per-tenant cost cap (clamps each request's max_cost). |
| max_time_minutes_per_run | 60 | Per-tenant wall-time cap. |
| rate_limit_per_minute | 30 | Token-bucket rate limit per tenant. 0 disables. |
| anthropic_secret_name | anthropic_api_key | Secret file name to read this tenant's Anthropic key from. Lets each tenant have its own Anthropic billing. |
| allowed_domains | [] (no restriction) | Wildcards matched against navigate URLs. Empty = no allowlist (legacy). |
| webhook_url | "" | Per-tenant default webhook for run-completion notifications. Caller can override with per-request callback_url. |
| webhook_secret_name | "" | Secret file name for the HMAC-SHA256 signing secret. |
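To make a few of these fields concrete, here is a hedged sketch of how scopes, max_cost_per_run, and allowed_domains might be applied to a request. The function names are hypothetical, and the wildcard semantics (shell-style matching on the URL's hostname via `fnmatch`) are an assumption; the server's real matching rules may differ.

```python
from fnmatch import fnmatch
from typing import List
from urllib.parse import urlsplit


def has_scope(scopes: List[str], action: str) -> bool:
    """True if the token's scopes permit the requested action."""
    return action in scopes


def clamp_cost(requested_max_cost: float, tenant_cap: float = 25.0) -> float:
    """A request may ask for less than the tenant cap, never more."""
    return min(requested_max_cost, tenant_cap)


def domain_allowed(url: str, allowed_domains: List[str]) -> bool:
    """Check a navigate URL against the tenant's allowlist."""
    if not allowed_domains:
        return True  # empty list = no allowlist (legacy behaviour)
    host = (urlsplit(url).hostname or "").lower()
    # Assumption: patterns are shell-style wildcards matched on the hostname.
    return any(fnmatch(host, pattern.lower()) for pattern in allowed_domains)
```

Under these assumptions, `*.marketplace.example.com` matches `shop.marketplace.example.com` but not the bare `marketplace.example.com`, so list the apex domain separately if it is needed.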
Issuing a new tenant token¶
- Pick a tenant_id. Stable, human-readable. Used in metrics labels, log lines, and the data-volume directory layout. Examples: tenant_a, customer_acme, internal_ops_team. Avoid spaces, slashes, or anything URL-unsafe.
- Generate a token. Hex-encoded 256 bits (64 hex characters) is plenty. The commands in the following steps assume it is stored in $TOKEN.
- Decide scopes + caps. Start tight: grant only the scopes the tenant needs and set a low max_cost_per_run. Raise caps later as the tenant proves out.
- Add it to the keys file. Pull the current file, append the entry, push it back:

      # Pull the current keys
      aws secretsmanager get-secret-value \
        --secret-id mantis-prod/mantis_tenant_keys \
        --query SecretString --output text > /tmp/keys.json

      # Edit /tmp/keys.json, add the new entry under tenant_keys
      jq --arg tok "$TOKEN" \
        --argjson cfg '{ "tenant_id": "new_tenant", "scopes": ["run","status","result"], "max_cost_per_run": 5 }' \
        '.tenant_keys[$tok] = $cfg' \
        /tmp/keys.json > /tmp/keys-new.json

      # Push back
      aws secretsmanager put-secret-value \
        --secret-id mantis-prod/mantis_tenant_keys \
        --secret-string file:///tmp/keys-new.json

  Within 5 s the new token is live (hot reload).
- Provision the tenant's Anthropic key (if the tenant needs its own billing):

      aws secretsmanager create-secret \
        --name mantis-prod/anthropic_api_key_new_tenant \
        --secret-string "sk-ant-..."

  Reference it from the tenant config: "anthropic_secret_name": "anthropic_api_key_new_tenant".
- Hand off the token out-of-band: 1Password share, Vault entry, or signed email. Never paste it in Slack.
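The token-generation and append steps above can also be sketched in Python. This is only an illustration under stated assumptions: `new_token` and `add_tenant` are hypothetical helper names, and refusing to overwrite an existing token is a design choice of the example, not documented server behaviour.

```python
import secrets


def new_token() -> str:
    """32 random bytes from the OS CSPRNG = 256 bits = 64 hex characters."""
    return secrets.token_hex(32)


def add_tenant(keys_doc: dict, token: str, config: dict) -> dict:
    """Return a copy of the keys document with one more tenant entry."""
    if "tenant_id" not in config:
        raise ValueError("tenant_id is required")
    tenant_keys = dict(keys_doc.get("tenant_keys", {}))
    if token in tenant_keys:
        # Assumption for this sketch: never silently replace a live token.
        raise ValueError("token already present in keys file")
    tenant_keys[token] = config
    return {**keys_doc, "tenant_keys": tenant_keys}
```

Returning a new document instead of mutating the input keeps the pulled copy intact, which makes it easy to diff old against new before pushing the secret back.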
Rotation¶
When you need to rotate:
- Generate a new token.
- Add a second entry for the same tenant_id with the new token.
- Tenant updates their secret store with the new token.
- After 24 h (or whatever overlap window you're comfortable with), remove the old token entry.
{
  "tenant_keys": {
    "OLD_TOKEN_HEX": { "tenant_id": "acme", ... },
    "NEW_TOKEN_HEX": { "tenant_id": "acme", ... }   // ← add during rotation
  }
}
Both tokens resolve to the same tenant_id. Once the tenant has flipped, drop the old entry.
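During the overlap window it can help to see which tenants currently hold more than one token, so that old entries are not forgotten. The following is a small sketch with hypothetical helper names; it prints short SHA-256 fingerprints instead of the tokens themselves so the output is safe to paste into a ticket.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def token_fingerprint(token: str) -> str:
    """Short, non-reversible label for a token (first 8 hex chars of SHA-256)."""
    return hashlib.sha256(token.encode()).hexdigest()[:8]


def tokens_by_tenant(keys_doc: dict) -> Dict[str, List[str]]:
    """Group token fingerprints by the tenant_id they resolve to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for token, entry in keys_doc.get("tenant_keys", {}).items():
        grouped[entry["tenant_id"]].append(token_fingerprint(token))
    return dict(grouped)


def tenants_in_rotation(keys_doc: dict) -> Dict[str, List[str]]:
    """Tenants with more than one live token, i.e. mid-rotation."""
    return {
        tenant_id: prints
        for tenant_id, prints in tokens_by_tenant(keys_doc).items()
        if len(prints) > 1
    }
```

Running `tenants_in_rotation` against the pulled keys file after the overlap window gives a checklist of old entries that are still waiting to be dropped.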
Revocation¶
Remove the token entry from the keys file. Within 5 s the server rejects it with 401. No pod restart needed.
# Compact one-liner
jq 'del(.tenant_keys["BAD_TOKEN_HEX"])' /tmp/keys.json > /tmp/keys-new.json
aws secretsmanager put-secret-value --secret-id mantis-prod/mantis_tenant_keys --secret-string file:///tmp/keys-new.json
Migrating from single-tenant to multi-tenant¶
Existing single-tenant deployments use MANTIS_API_TOKEN env. To migrate:
- Create the keys file with one entry that uses your existing token as its key, so current callers keep working.
- Mount it as a secret at MANTIS_TENANT_KEYS_PATH (on Baseten, as a secret mounted as a file), then push the deployment.
- Add real tenants to the keys file as you onboard them.
- Eventually drop MANTIS_API_TOKEN once nothing relies on the legacy single-tenant path.
The default tenant entry can stay forever — it's harmless and gives existing callers continuity.
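As a rough sketch of the first migration step, the snippet below builds a one-entry keys document from the existing token and writes it with owner-only permissions. The helper names and the tenant_id value `"default"` are assumptions for illustration; use whatever identifier fits your deployment.

```python
import json
import os


def build_migration_keys(existing_token: str, tenant_id: str = "default") -> dict:
    """One-entry keys document that keeps the legacy token working."""
    if not existing_token:
        raise ValueError("existing token is empty; is MANTIS_API_TOKEN set?")
    # Omitted fields fall back to the defaults from the field reference.
    return {"tenant_keys": {existing_token: {"tenant_id": tenant_id}}}


def write_keys_file(doc: dict, path: str) -> None:
    """Write the document readable by the owner only, since it holds tokens."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(doc, fh, indent=2)
```

A typical call would be `write_keys_file(build_migration_keys(os.environ["MANTIS_API_TOKEN"]), "/tmp/keys.json")`, after which the file can be pushed to the secret store like any other keys file.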
Per-platform secret-store snippets¶
Modal Secrets are env-var-shaped, so the keys file is shipped as a
single env var MANTIS_TENANT_KEYS_JSON that the container writes
to /tmp/tenant_keys.json at boot (see
deploy/modal/modal_mantis_server.py::_bootstrap_tenant_keys).
# Create / overwrite the secret with the contents of /tmp/keys.json
modal secret create --force mantis-tenant-keys \
MANTIS_TENANT_KEYS_JSON="$(cat /tmp/keys.json)"
# Force a fresh container to pick up the new secret on next request.
# Modal scales replicas at request time, so this is automatic after
# a few minutes of idle, but `app stop` makes it immediate.
modal app stop mantis-server
To edit (e.g. add a domain to allowed_domains or rotate a token):
# 1. Pull the current keys
modal secret get mantis-tenant-keys MANTIS_TENANT_KEYS_JSON > /tmp/keys.json
# 2. Edit /tmp/keys.json — add domains, change caps, rotate tokens
# 3. Push it back + restart replicas
modal secret create --force mantis-tenant-keys \
MANTIS_TENANT_KEYS_JSON="$(cat /tmp/keys.json)"
modal app stop mantis-server
The 5-second hot-reload still works within a container (edits to the
file the bootstrap wrote), but secret updates require a new container
boot to pick up — that's what the app stop gives you. Don't
redeploy the function unless the code itself changed; app stop is
enough to retire all warm replicas.
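For orientation, a boot-time step of this kind might look like the sketch below. It is not the contents of `_bootstrap_tenant_keys`, which this page does not show; the function name, the validation step, and the assumption that the bootstrap also sets MANTIS_TENANT_KEYS_PATH are all illustrative.

```python
import json
import os
from typing import MutableMapping, Optional


def bootstrap_tenant_keys(
    env: Optional[MutableMapping[str, str]] = None,
    dest: str = "/tmp/tenant_keys.json",
) -> Optional[str]:
    """Write MANTIS_TENANT_KEYS_JSON to a file; return its path, or None."""
    env = os.environ if env is None else env
    raw = env.get("MANTIS_TENANT_KEYS_JSON")
    if not raw:
        return None  # nothing to write; caller stays in single-tenant mode
    json.loads(raw)  # fail fast at boot if the secret is not valid JSON
    fd = os.open(dest, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(raw)
    # Assumption: the server is pointed at the file via this variable.
    env["MANTIS_TENANT_KEYS_PATH"] = dest
    return dest
```

Validating the JSON before writing means a malformed secret surfaces as a boot failure rather than as 401s for every tenant once the container is serving traffic.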
Auditing the live keys¶
The runtime never logs tokens, only tenant_id. Every request emits:
{
  "ts": "2026-04-28T02:14:32Z",
  "level": "INFO",
  "logger": "mantis_agent.baseten_server",
  "msg": "predict tenant=tenant_a scope=run state_key=… detached=true action=run",
  "tenant_id": "tenant_a"
}
To see who's active right now:
kubectl logs -l app=mantis-holo3-server --tail=1000 \
| jq -r 'select(.tenant_id) | .tenant_id' \
| sort | uniq -c | sort -rn
For longer windows, use the mantis_predict_requests_total counter from /metrics (see Metrics).
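If the log stream mixes JSON records with plain-text lines, the jq pipeline above can stop on the first line it cannot parse. A tolerant Python equivalent might look like this sketch; the function name `count_tenants` is hypothetical, and it assumes one JSON object per line as in the sample record.

```python
import json
from collections import Counter
from typing import Iterable


def count_tenants(lines: Iterable[str]) -> Counter:
    """Count log records per tenant_id, skipping lines that are not JSON."""
    counts: Counter = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text or blank line; ignore it
        if isinstance(record, dict) and record.get("tenant_id"):
            counts[record["tenant_id"]] += 1
    return counts
```

Feeding it `sys.stdin` from the same `kubectl logs` command and printing `counts.most_common()` yields the same busiest-first ordering as `sort | uniq -c | sort -rn`.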
What's NOT in this PR¶
- Self-service token issuance (admin API for tenants to rotate their own keys without operator action): Tier 3 follow-up.
- Tenant-scoped Anthropic budget tracking: costs are reported per-run today; aggregating to a tenant-level monthly budget is left to your billing system using the metrics counters.
- Token expiration / TTL: tokens currently never expire and stay valid until removed. Prune stale entries by hand or from a cron job, or wait for the Tier 3 admin API.
See also¶
- Authentication (caller side): what tenants do with their tokens
- Rate limits: caps the keys file controls
- URL allowlist: allowed_domains enforcement detail
- Webhooks: webhook_url + signature verification