block. The verdict lives in the body so your code can
read the reason and act accordingly. Non‑200 codes are reserved for
protocol failures (auth, rate limit, quota).
POST /api/runtime-security/scan/input
Scan a user prompt before forwarding it to your LLM.
Request
| Field | Type | Required | Notes |
|---|---|---|---|
| text | string | yes | The prompt to scan. |
| source_app | string | no | Free‑form app identifier (≤ 128 chars). Diagnostic; not the App UUID. |
| provider | string | no | openai, anthropic, gemini, vertex, bedrock, groq, etc. ≤ 32 chars. |
| model | string | no | Model name (≤ 128 chars). |
| metadata | object | no | Anything you want kept on the audit record. |
| custom_params | object | no | Typed key / value pairs declared by the resolved App. |
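As a rough illustration of the request fields above, the following sketch builds (but does not send) a `/scan/input` request using only the Python standard library. The base URL, the bearer-token `Authorization` header, and the helper name `build_input_scan_request` are assumptions made for the example; only the path, the field names, and the length limits come from the table.

```python
import json
import urllib.request

# Assumption: placeholder host, not taken from this reference.
BASE_URL = "https://firewall.example.com"

# Length limits as listed in the request table.
MAX_LENGTHS = {"source_app": 128, "provider": 32, "model": 128}


def build_input_scan_request(text, api_key, source_app=None, provider=None,
                             model=None, metadata=None, custom_params=None):
    """Build a POST request for /scan/input without sending it."""
    if not isinstance(text, str):
        raise ValueError("text is required and must be a string")

    payload = {"text": text}
    optional_strings = {"source_app": source_app, "provider": provider, "model": model}
    for name, value in optional_strings.items():
        if value is None:
            continue  # optional field left out of the body
        if len(value) > MAX_LENGTHS[name]:
            raise ValueError(f"{name} exceeds {MAX_LENGTHS[name]} characters")
        payload[name] = value
    if metadata is not None:
        payload["metadata"] = metadata
    if custom_params is not None:
        payload["custom_params"] = custom_params

    return urllib.request.Request(
        BASE_URL + "/api/runtime-security/scan/input",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the auth scheme is not described in this section.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; validating the length limits client-side simply avoids a round trip for requests that would likely be rejected.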
Response
POST /api/runtime-security/scan/output
Scan an LLM response before returning it to the user.
Request
| Field | Type | Required | Notes |
|---|---|---|---|
| response | string | yes | The model output to scan. |
| prompt | string | no | The prompt that produced it (for context). |
| source_app | string | no | As above. |
| provider | string | no | As above. |
| model | string | no | As above. |
| metadata | object | no | As above. |
Response
Same ScanResponse shape as /scan/input. The output threat model is
leakage and jailbreak success indicators, distinct from the input
model. Configure separate thresholds via output_block_threshold and
output_redact_threshold in the App config.
Rate limit: 20 requests / second / workspace.
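One way to stay under this limit is to pace calls on the client. The sketch below spaces calls at least 1/20 of a second apart; the class name `RequestPacer` is hypothetical, and it only paces a single process, whereas the limit applies to the whole workspace, so several workers would need a lower per-worker rate.

```python
import time


class RequestPacer:
    """Spaces successive calls at least 1/rate seconds apart."""

    def __init__(self, rate=20.0, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._next_slot = None   # earliest time the next call may start

    def wait(self):
        """Block until the next call may start; return the seconds slept."""
        now = self._clock()
        if self._next_slot is None or now >= self._next_slot:
            # No pending slot: go immediately and reserve the next one.
            self._next_slot = now + self._interval
            return 0.0
        delay = self._next_slot - now
        self._sleep(delay)
        self._next_slot += self._interval
        return delay
```

Calling `pacer.wait()` immediately before each scan request is enough for a single-threaded client; a multi-threaded one would need a lock around `wait`.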
POST /api/runtime-security/scan/batch
Score up to 32 items in one call. Each item still consumes one
license call, but you save the HTTP overhead and the warm classifier
amortises across the batch.
Request
Response
A batch consumes len(items) license calls; if the workspace runs out of quota
mid‑batch, the whole call returns 402 before any scoring runs.
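Because a single call accepts at most 32 items, longer lists have to be split before sending. The helper below is a minimal sketch of that split; the name `chunk_items` is hypothetical, and it deliberately does not build the request body, since the batch request shape is not shown here.

```python
def chunk_items(items, max_batch=32):
    """Split a list into consecutive chunks of at most max_batch items."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    return [items[i:i + max_batch] for i in range(0, len(items), max_batch)]


def license_calls_needed(items):
    """Each item consumes one license call, batched or not."""
    return len(items)
```

Checking `license_calls_needed` against the remaining quota before sending may help avoid a 402 on a large batch, assuming your application tracks that quota somewhere.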
The ScanResponse schema
Every scan call returns the same shape.
| Field | What to do with it |
|---|---|
| verdict | Branch on this. See Verdicts. |
| redacted_text | On verdict: redact, send this to your model instead of the original. PII spans are replaced with <CATEGORY> markers. |
| blocked_reason | Sanitise and surface to your end user as a refusal message. |
| uuid | Log it. The same UUID appears on the audit event so you can correlate dashboards with your application logs. |
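Handling these fields might look like the sketch below, which takes an already-parsed ScanResponse as a dict. The `block` and `redact` verdicts appear in this reference; the `allow` value, the function names, and the specific sanitisation rules (collapse whitespace, drop non-printable characters, cap the length) are assumptions for illustration.

```python
import logging

logger = logging.getLogger("scan")


def sanitise_reason(reason, max_len=200):
    """Assumed sanitisation: collapse whitespace, drop non-printables, truncate."""
    collapsed = " ".join(reason.split())
    printable = "".join(ch for ch in collapsed if ch.isprintable())
    return printable[:max_len]


def apply_verdict(scan_response, original_text):
    """Return (text_to_forward, refusal_message); exactly one is not None."""
    # Log the UUID so application logs can be correlated with audit events.
    logger.info("scan uuid=%s", scan_response.get("uuid"))

    verdict = scan_response.get("verdict")
    if verdict == "block":
        reason = scan_response.get("blocked_reason") or "Request blocked."
        return None, sanitise_reason(reason)
    if verdict == "redact":
        # Forward the redacted text instead of the original.
        return scan_response["redacted_text"], None
    if verdict == "allow":  # assumption: name of the pass-through verdict
        return original_text, None
    # Fail closed on anything this sketch does not recognise.
    raise ValueError(f"unexpected verdict: {verdict!r}")
```

Raising on an unrecognised verdict is a design choice for the sketch: it avoids silently forwarding text if a verdict value is added later.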
Wrapping it around your model
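A hedged sketch of this wrapping pattern follows. The three callables are injected so the flow is independent of any HTTP client or model SDK: `scan_input` and `scan_output` are assumed to return parsed ScanResponse dicts, `call_model` is assumed to map a prompt string to a response string, and the function name `guarded_completion` is hypothetical. It also assumes output scans can return `redact` with a `redacted_text`, since the response shape is shared.

```python
def guarded_completion(prompt, scan_input, call_model, scan_output):
    """Scan the prompt, call the model, scan the answer; return text for the user."""
    # 1. Scan the user prompt before it reaches the model.
    scanned_in = scan_input(prompt)
    if scanned_in["verdict"] == "block":
        return scanned_in.get("blocked_reason") or "Request blocked."
    if scanned_in["verdict"] == "redact":
        prompt = scanned_in["redacted_text"]

    # 2. Call the model with the (possibly redacted) prompt.
    answer = call_model(prompt)

    # 3. Scan the model output before returning it to the user.
    scanned_out = scan_output(answer, prompt)
    if scanned_out["verdict"] == "block":
        return scanned_out.get("blocked_reason") or "Response blocked."
    if scanned_out["verdict"] == "redact":
        answer = scanned_out["redacted_text"]
    return answer
```

In a real integration the blocked reason would be sanitised before it is shown to the end user, and both scan UUIDs would be logged.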
The minimal integration: scan input, call your model, scan output.
When to prefer the Scan API over the proxy
- You don’t want the firewall in the chat request path.
- You’re scanning text that isn’t going to an LLM at all.
- You’re processing a large batch (/scan/batch is the right tool).
- You’re wiring custom middleware in a language without a maintained OpenAI / Anthropic SDK.

