Tenant keys (provisioning + registration)

The operator-side runbook for everything tenant-shaped: issuing tokens, attaching scopes/caps/Anthropic keys, rotating, revoking.

How it works

The server reads a JSON keys file at MANTIS_TENANT_KEYS_PATH (typically a Baseten / k8s secret mounted as a file). Every POST /v1/predict looks the presented X-Mantis-Token up in this file (constant-time compare, 5 s read cache for hot reload) and resolves a TenantConfig for the request.

JSON keys file → in-memory hash → constant-time lookup per request
                        ↑
                        │ 5s cache, hot-reloads on edit
                        │
                operator edits secret store → mount → file changes

If MANTIS_TENANT_KEYS_PATH is not set, the server falls back to single-tenant mode using MANTIS_API_TOKEN. The same token works for all callers in that mode.
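The lookup path described above can be sketched roughly as follows. This is an illustration only, not the server's actual code: the class name `TenantKeyStore`, its methods, and the injectable `clock` are hypothetical, and the sketch scans entries linearly with `hmac.compare_digest` rather than reproducing whatever in-memory structure the server really uses.

```python
import hmac
import json
import time
from pathlib import Path

CACHE_TTL_SECONDS = 5.0  # mirrors the 5 s read cache described above


class TenantKeyStore:
    """Hypothetical helper: caches the keys file and resolves tokens."""

    def __init__(self, path, clock=time.monotonic):
        self._path = Path(path)
        self._clock = clock
        self._loaded_at = None
        self._keys = {}

    def _current_keys(self):
        # Re-read the file once the cached copy is older than the TTL,
        # which is what makes edits to the mounted secret go live.
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= CACHE_TTL_SECONDS:
            data = json.loads(self._path.read_text())
            self._keys = data.get("tenant_keys", {})
            self._loaded_at = now
        return self._keys

    def resolve(self, presented_token):
        """Return the tenant entry for a token, or None (caller answers 401)."""
        presented = presented_token.encode()
        match = None
        for known, entry in self._current_keys().items():
            # compare_digest does not reveal how many leading bytes matched;
            # the loop deliberately keeps going after a hit.
            if hmac.compare_digest(known.encode(), presented):
                match = entry
        return match
```

A request handler would call `store.resolve(request.headers["X-Mantis-Token"])` and reject the request when the result is `None`.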

Keys file shape

{
  "tenant_keys": {
    "<x-mantis-token-value-1>": {
      "tenant_id": "tenant_a",
      "scopes": ["run", "status", "result", "logs"],
      "max_concurrent_runs": 3,
      "max_cost_per_run": 5.0,
      "max_time_minutes_per_run": 30,
      "rate_limit_per_minute": 60,
      "anthropic_secret_name": "anthropic_api_key_tenant_a",
      "allowed_domains": ["*.marketplace.example.com", "crm.example.com"],
      "webhook_url": "https://callbacks.example.com/mantis",
      "webhook_secret_name": "webhook_secret_tenant_a"
    },
    "<x-mantis-token-value-2>": {
      "tenant_id": "readonly_dashboard",
      "scopes": ["status", "result"]
    }
  }
}

Every field except tenant_id is optional and falls back to the DEFAULT_TENANT defaults if missing. Use the minimal shape for read-only / observability keys.

Field reference

Field	Default	Purpose
`tenant_id`	(required)	Identifier used as the prefix for `state_key`, browser profile dir, run dir, log lines, and metric labels. Pick something stable and human-readable.
`scopes`	`["run", "status", "result", "logs"]`	Which actions this token can perform. Use `["status", "result"]` for read-only consumers.
`max_concurrent_runs`	5	Per-tenant concurrency gauge. With the default of 5, a 6th in-flight run returns 429.
`max_cost_per_run`	25.0	Per-tenant cost cap (clamps each request's `max_cost`).
`max_time_minutes_per_run`	60	Per-tenant wall-time cap.
`rate_limit_per_minute`	30	Token-bucket rate limit per tenant. 0 disables.
`anthropic_secret_name`	`anthropic_api_key`	Secret file name to read this tenant's Anthropic key from. Lets each tenant have its own Anthropic billing.
`allowed_domains`	`[]` (no restriction)	Wildcards matched against `navigate` URLs. Empty = no allowlist (legacy).
`webhook_url`	`""`	Per-tenant default webhook for run-completion notifications. Caller can override with per-request `callback_url`.
`webhook_secret_name`	`""`	Secret file name for the HMAC-SHA256 signing secret.
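To make the defaults in the table concrete, here is a minimal sketch of how an entry from the keys file might be turned into a config object and how two of the caps might be applied. The document names a `TenantConfig` type but does not show it, so the dataclass below is a guess at its shape; `tenant_from_entry`, `effective_max_cost`, and `domain_allowed` are hypothetical helpers, and matching wildcards against the URL's hostname with `fnmatchcase` is an assumption about how `allowed_domains` is evaluated.

```python
from dataclasses import dataclass, fields
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


@dataclass(frozen=True)
class TenantConfig:
    # Defaults copied from the field reference table above.
    tenant_id: str
    scopes: tuple = ("run", "status", "result", "logs")
    max_concurrent_runs: int = 5
    max_cost_per_run: float = 25.0
    max_time_minutes_per_run: int = 60
    rate_limit_per_minute: int = 30
    anthropic_secret_name: str = "anthropic_api_key"
    allowed_domains: tuple = ()
    webhook_url: str = ""
    webhook_secret_name: str = ""


def tenant_from_entry(entry):
    """Build a config from one keys-file entry; missing fields use defaults."""
    if "tenant_id" not in entry:
        raise ValueError("tenant_id is required")
    unknown = set(entry) - {f.name for f in fields(TenantConfig)}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # JSON arrays arrive as lists; store them as tuples in the frozen config.
    kwargs = {k: tuple(v) if isinstance(v, list) else v for k, v in entry.items()}
    return TenantConfig(**kwargs)


def effective_max_cost(cfg, requested=None):
    """Clamp a request's max_cost to the tenant's per-run cap."""
    if requested is None:
        return cfg.max_cost_per_run
    return min(requested, cfg.max_cost_per_run)


def domain_allowed(cfg, url):
    """An empty allowlist means no restriction (the legacy behaviour)."""
    if not cfg.allowed_domains:
        return True
    host = urlsplit(url).hostname or ""
    return any(fnmatchcase(host, pattern) for pattern in cfg.allowed_domains)
```

Under this interpretation, `*.marketplace.example.com` matches `shop.marketplace.example.com` but not the bare `marketplace.example.com`; list the apex domain separately if it should be reachable.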

Issuing a new tenant token

Pick a tenant_id. It should be stable and human-readable, because it is used in metric labels, log lines, and the data-volume directory layout. Examples: tenant_a, customer_acme, internal_ops_team. Avoid spaces, slashes, or anything URL-unsafe.

Generate a token. Hex-encoded 256 bits is plenty:

TOKEN=$(openssl rand -hex 32)
echo "Send this token securely to the tenant: $TOKEN"

Decide scopes + caps. Start tight:

{
  "tenant_id": "<id>",
  "scopes": ["run", "status", "result"],
  "max_concurrent_runs": 2,
  "max_cost_per_run": 5.0,
  "max_time_minutes_per_run": 20,
  "rate_limit_per_minute": 30
}

Raise caps later as the tenant proves out.

Add it to the keys file. Pull the current file, append the entry, push it back:

# Pull the current keys
aws secretsmanager get-secret-value \
  --secret-id mantis-prod/mantis_tenant_keys \
  --query SecretString --output text > /tmp/keys.json

# Edit /tmp/keys.json, add the new entry under tenant_keys
jq --arg tok "$TOKEN" \
   --argjson cfg '{ "tenant_id": "new_tenant", "scopes": ["run","status","result"], "max_cost_per_run": 5 }' \
   '.tenant_keys[$tok] = $cfg' \
   /tmp/keys.json > /tmp/keys-new.json

# Push back
aws secretsmanager put-secret-value \
  --secret-id mantis-prod/mantis_tenant_keys \
  --secret-string file:///tmp/keys-new.json

Within 5 s the new token is live (hot reload).

Provision the tenant's Anthropic key (if the tenant needs its own billing):

```
aws secretsmanager create-secret \
  --name mantis-prod/anthropic_api_key_new_tenant \
  --secret-string "sk-ant-..."
```

Reference it from the tenant config: "anthropic_secret_name": "anthropic_api_key_new_tenant".

Hand off the token out-of-band: 1Password share / Vault entry / signed email. Never paste it into Slack.
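Because a malformed keys file goes live within seconds of being pushed, it may be worth checking the edited file before the put-secret-value step. The following is a sketch of such a pre-push check, not part of the server: `validate_keys`, the scope set (the four scope names that appear in this document), and the tenant_id character rule are all assumptions made for illustration.

```python
import re

# Assumption: only the four scope names shown in this document exist.
KNOWN_SCOPES = {"run", "status", "result", "logs"}
NUMERIC_CAPS = (
    "max_concurrent_runs",
    "max_cost_per_run",
    "max_time_minutes_per_run",
    "rate_limit_per_minute",
)
# Assumption: a conservative URL-safe alphabet for tenant_id.
TENANT_ID_RE = re.compile(r"[A-Za-z0-9_.-]+")


def validate_keys(doc):
    """Return a list of human-readable problems; an empty list means OK."""
    if not isinstance(doc, dict) or not isinstance(doc.get("tenant_keys"), dict):
        return ["top-level 'tenant_keys' must be an object"]
    problems = []
    for token, entry in doc["tenant_keys"].items():
        label = f"token {token[:4]}..."  # never print a full token
        if not isinstance(entry, dict):
            problems.append(f"{label}: entry must be an object")
            continue
        tenant_id = entry.get("tenant_id")
        if not isinstance(tenant_id, str) or not TENANT_ID_RE.fullmatch(tenant_id):
            problems.append(f"{label}: tenant_id missing or not URL-safe")
        scopes = entry.get("scopes", [])
        if not isinstance(scopes, list):
            problems.append(f"{label}: scopes must be a list")
        else:
            bad = [s for s in scopes if not isinstance(s, str) or s not in KNOWN_SCOPES]
            if bad:
                problems.append(f"{label}: unknown scopes {bad!r}")
        for name in NUMERIC_CAPS:
            if name in entry:
                value = entry[name]
                # bool is a subclass of int, so reject it explicitly.
                if isinstance(value, bool) or not isinstance(value, (int, float)) or value < 0:
                    problems.append(f"{label}: {name} must be a non-negative number")
    return problems
```

One way to use it is to load /tmp/keys-new.json with `json.load`, pass the result to `validate_keys`, and abort the push if the returned list is non-empty.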

Rotation

When you need to rotate:

Generate a new token.
Add a second entry for the same tenant_id with the new token.
Tenant updates their secret store with the new token.
After 24 h (or whatever overlap window you're comfortable with), remove the old token entry.

{
  "tenant_keys": {
    "OLD_TOKEN_HEX": { "tenant_id": "acme", ... },
    "NEW_TOKEN_HEX": { "tenant_id": "acme", ... }   // ← add during rotation
  }
}

Both tokens resolve to the same tenant_id. Once the tenant has flipped, drop the old entry.
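The "add a second entry" step can be scripted so the new entry is an exact copy of the existing config. The helper below is a hypothetical sketch that operates on the parsed keys file in memory; `add_rotation_token` is not an existing tool, and pulling and pushing the file is left to the secret-store commands shown elsewhere in this runbook.

```python
import copy
import secrets


def add_rotation_token(doc, tenant_id):
    """Add a second token for tenant_id, copying its current config.

    Mutates doc in place and returns the new token so it can be handed
    to the tenant out-of-band.
    """
    keys = doc["tenant_keys"]
    existing = [cfg for cfg in keys.values() if cfg.get("tenant_id") == tenant_id]
    if not existing:
        raise KeyError(f"no entry found for tenant_id {tenant_id!r}")
    # 32 random bytes, hex-encoded: equivalent to `openssl rand -hex 32`.
    new_token = secrets.token_hex(32)
    keys[new_token] = copy.deepcopy(existing[0])
    return new_token
```

Copying with `deepcopy` keeps the two entries independent, so a later cap change on one entry does not silently alter the other during the overlap window.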

Revocation

Remove the token entry from the keys file. Within 5 s the server rejects it with 401. No pod restart needed.

# Assumes /tmp/keys.json is a fresh copy of the current keys file (pull it first, as when issuing a token)
jq 'del(.tenant_keys["BAD_TOKEN_HEX"])' /tmp/keys.json > /tmp/keys-new.json
aws secretsmanager put-secret-value --secret-id mantis-prod/mantis_tenant_keys --secret-string file:///tmp/keys-new.json
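A tenant can hold more than one token (for example during a rotation overlap), so revoking by token value may leave a second token live. As a sketch, assuming the keys file has already been parsed into a dict, a hypothetical helper `revoke_tenant` could drop every entry that resolves to a given tenant_id:

```python
def revoke_tenant(doc, tenant_id):
    """Remove every token that resolves to tenant_id; return how many were removed."""
    keys = doc["tenant_keys"]
    # Collect first, then delete, to avoid mutating the dict while iterating.
    doomed = [tok for tok, cfg in keys.items() if cfg.get("tenant_id") == tenant_id]
    for tok in doomed:
        del keys[tok]
    return len(doomed)
```

A return value of 0 is worth treating as an error in a script, since it usually means the tenant_id was mistyped.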

Migrating from single-tenant to multi-tenant

Existing single-tenant deployments authenticate with the MANTIS_API_TOKEN environment variable. To migrate:

Create the keys file with one entry that uses your existing token:

{
  "tenant_keys": {
    "<existing-MANTIS_API_TOKEN-value>": {
      "tenant_id": "default",
      "scopes": ["run", "status", "result", "logs"]
    }
  }
}

Mount it as a secret at MANTIS_TENANT_KEYS_PATH. For Baseten:

# deploy/baseten/holo3/config.yaml
secrets:
  mantis_tenant_keys: null
environment_variables:
  MANTIS_TENANT_KEYS_PATH: /secrets/mantis_tenant_keys

Push the deployment.

Add real tenants to the keys file as you onboard them.
Eventually drop MANTIS_API_TOKEN once nothing relies on the legacy single-tenant path.

The default tenant entry can stay forever — it's harmless and gives existing callers continuity.

Per-platform secret-store snippets

Baseten

BASETEN_API_KEY="..."
# The JSON body is built by jq and passed via --data-binary; nothing is read from stdin
curl -sS -X POST \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  --data-binary "$(jq -n --arg v "$(cat /tmp/keys.json)" '{name:"mantis_tenant_keys",value:$v}')" \
  https://api.baseten.co/v1/secrets

AWS Secrets Manager

aws secretsmanager put-secret-value \
  --secret-id mantis-prod/mantis_tenant_keys \
  --secret-string file:///tmp/keys.json

GCP Secret Manager

gcloud secrets versions add mantis-prod-mantis_tenant_keys \
  --data-file=/tmp/keys.json

Plain k8s Secret

kubectl create secret generic mantis-tenant-keys \
  --from-file=mantis_tenant_keys=/tmp/keys.json \
  --dry-run=client -o yaml | kubectl apply -f -

Modal Secret

Modal Secrets are env-var-shaped, so the keys file is shipped as a single env var MANTIS_TENANT_KEYS_JSON that the container writes to /tmp/tenant_keys.json at boot (see deploy/modal/modal_mantis_server.py::_bootstrap_tenant_keys).

# Create / overwrite the secret with the contents of /tmp/keys.json
modal secret create --force mantis-tenant-keys \
  MANTIS_TENANT_KEYS_JSON="$(cat /tmp/keys.json)"

# Force a fresh container to pick up the new secret on next request.
# Modal scales replicas at request time, so this is automatic after
# a few minutes of idle, but `app stop` makes it immediate.
modal app stop mantis-server

To edit (e.g. add a domain to allowed_domains or rotate a token):

# 1. Pull the current keys
modal secret get mantis-tenant-keys MANTIS_TENANT_KEYS_JSON > /tmp/keys.json

# 2. Edit /tmp/keys.json — add domains, change caps, rotate tokens

# 3. Push it back + restart replicas
modal secret create --force mantis-tenant-keys \
  MANTIS_TENANT_KEYS_JSON="$(cat /tmp/keys.json)"
modal app stop mantis-server

The 5-second hot reload still works within a container (for edits to the file the bootstrap wrote), but a secret update is only picked up when a new container boots, which is what app stop gives you. Don't redeploy the function unless the code itself changed; app stop is enough to retire all warm replicas.
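For orientation, a boot-time step of the kind described above might look like the sketch below. It is not the contents of `_bootstrap_tenant_keys`: the function name `bootstrap_tenant_keys_sketch`, the fail-fast JSON check, the 0600 file mode, and setting MANTIS_TENANT_KEYS_PATH from inside the function are all assumptions.

```python
import json
import os


def bootstrap_tenant_keys_sketch(env=os.environ, target="/tmp/tenant_keys.json"):
    """Write MANTIS_TENANT_KEYS_JSON to a file and point the server at it.

    Returns the file path, or None when the env var is absent (the server
    would then stay in single-tenant mode).
    """
    raw = env.get("MANTIS_TENANT_KEYS_JSON")
    if not raw:
        return None
    json.loads(raw)  # fail at boot rather than on the first request
    # Create the file owner-readable only, since it contains live tokens.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(raw)
    # Assumption: the server locates the file through this variable.
    env["MANTIS_TENANT_KEYS_PATH"] = target
    return target
```

Because the file is written once per container boot, this also illustrates why a changed secret is invisible to containers that are already running.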

Auditing the live keys

The runtime never logs tokens, only tenant_id. Every request emits:

{
  "ts": "2026-04-28T02:14:32Z",
  "level": "INFO",
  "logger": "mantis_agent.baseten_server",
  "msg": "predict tenant=tenant_a scope=run state_key=… detached=true action=run",
  "tenant_id": "tenant_a"
}

To see who's active right now:

kubectl logs -l app=mantis-holo3-server --tail=1000 \
  | jq -r 'select(.tenant_id) | .tenant_id' \
  | sort | uniq -c | sort -rn

For longer windows, use the mantis_predict_requests_total counter from /metrics (see Metrics).
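Where jq is not available, the same per-tenant tally can be computed in a few lines of Python. This sketch assumes one JSON object per log line with a top-level `tenant_id` field, as in the sample record above; the function name `count_requests_by_tenant` is hypothetical.

```python
import json
from collections import Counter


def count_requests_by_tenant(lines):
    """Return (tenant_id, count) pairs, busiest tenant first."""
    counts = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as startup banners
        if isinstance(record, dict) and record.get("tenant_id"):
            counts[record["tenant_id"]] += 1
    return counts.most_common()
```

Passing `sys.stdin` as `lines` lets it sit at the end of the same `kubectl logs` pipeline.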

What's NOT in this PR

Self-service token issuance (an admin API that lets tenants rotate their own keys without operator action): Tier 3 follow-up.
Tenant-scoped Anthropic budget tracking: costs are reported per run today; aggregating them into a tenant-level monthly budget is left to your billing system, using the metrics counters.
Token expiration / TTL: tokens currently never expire and stay valid until removed. Until the Tier 3 admin API exists, prune stale tokens by hand or with a cron job.