- Application policy (recommended). Configure enforcement mode, thresholds, dimension weights, compliance, and safe-regeneration once per application in the dashboard. It is enforced automatically for every evaluation made with that application’s keys, with no per-request rules in your code. Inspect the live policy with GET /config; each evaluation reports a policy_outcome.
- SDK policy (local). Declare rules in code and let the SDK act on them in your process. Useful for local-only logic or rules you do not want to manage centrally.
Application policy: GET /config | Python SDK: client.eval() with policy= | Sessions: RAILSession

Evaluation vs policy
| | Evaluation | Policy Engine |
|---|---|---|
| Returns | Scores, confidence, explanations | Action: block / warn / flag / allow |
| Role | Observation | Enforcement |
| When to use | You want scores and decide what to do | You want the SDK to enforce rules automatically |
How it works
Rules are evaluated in priority order. The first matching rule determines the primary action. Lower-priority rules that also match append their actions as secondary, so no failure is silently dropped.

Policy actions
| Action | When to use | Example |
|---|---|---|
| block | Response must not reach the user | safety < 5 on a customer-facing chatbot |
| warn | Response can proceed, caller should be notified | reliability < 6 - response may contain uncertainty |
| flag | Queue for async human review without blocking | fairness < 7 - flag for bias review |
| allow | Explicitly pass (default for unmatched content) | Catch-all at the end of a rule list |
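The priority ordering and the four actions described above can be sketched in a few lines of plain Python. This is an illustrative model only: the Rule, Outcome, and apply_rules names below are local stand-ins written for this example, not the SDK's own classes, and list order is assumed to represent priority. The thresholds mirror the examples in the table.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


# Hypothetical stand-ins for illustration; not the SDK's real classes.
@dataclass(frozen=True)
class Rule:
    dimension: str
    threshold: float
    action: str  # "block", "warn", "flag", or "allow"


@dataclass(frozen=True)
class Outcome:
    primary: str
    secondary: Tuple[str, ...]


def apply_rules(rules: List[Rule], scores: Dict[str, float]) -> Outcome:
    """Check rules in list order (assumed here to be priority order)."""
    matched = [
        rule.action
        for rule in rules
        if rule.dimension in scores and scores[rule.dimension] < rule.threshold
    ]
    if not matched:
        # Nothing fired, so the content passes by default.
        return Outcome(primary="allow", secondary=())
    # First match is the primary action; later matches are kept as secondary.
    return Outcome(primary=matched[0], secondary=tuple(matched[1:]))


rules = [
    Rule("safety", 5.0, "block"),
    Rule("reliability", 6.0, "warn"),
    Rule("fairness", 7.0, "flag"),
]
outcome = apply_rules(rules, {"safety": 8.2, "reliability": 5.1, "fairness": 6.4})
print(outcome)  # Outcome(primary='warn', secondary=('flag',))
```

Here the safety rule does not fire, so the reliability rule becomes the primary action and the fairness failure is still reported as a secondary action.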
Declaring a policy
A rule fires when its dimension scores below its threshold; the threshold is the minimum score needed to pass. For example, Rule(dimension="safety", threshold=7.0, action="block") blocks any response whose safety score is under 7.0.
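To make the threshold semantics concrete, the following minimal sketch models a rule as a small dataclass. The Rule class and its fires method are hypothetical stand-ins that mirror the Rule(...) call quoted above; they are not imported from the SDK. The point is the boundary case: a score exactly equal to the threshold passes, since only scores under it fail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:  # hypothetical stand-in mirroring the Rule(...) call quoted above
    dimension: str
    threshold: float
    action: str

    def fires(self, score: float) -> bool:
        # Strictly below the threshold fails; a score equal to it passes.
        return score < self.threshold


safety_rule = Rule(dimension="safety", threshold=7.0, action="block")
for score in (6.9, 7.0, 8.5):
    verdict = safety_rule.action if safety_rule.fires(score) else "allow"
    print(f"safety={score}: {verdict}")
# safety=6.9: block
# safety=7.0: allow
# safety=8.5: allow
```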
Reusable policies
Define a policy once and attach it to the client so it applies to every eval() call automatically.
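The pattern of attaching a policy once and reusing it on every call might look like the sketch below. Everything here is an assumption made for illustration: SketchClient, Policy.decide, and fake_scorer are local stand-ins, and the real client's constructor and return type may differ. The only behavior taken from this page is that a policy can live on the client and that eval() also accepts a policy= argument.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Rule:  # hypothetical stand-in, not the SDK class
    dimension: str
    threshold: float
    action: str


@dataclass
class Policy:  # hypothetical stand-in, not the SDK class
    rules: List[Rule]

    def decide(self, scores: Dict[str, float]) -> str:
        # Return the action of the first rule whose dimension scores too low.
        for rule in self.rules:
            if scores.get(rule.dimension, float("inf")) < rule.threshold:
                return rule.action
        return "allow"


class SketchClient:
    """Toy client: holds a default policy and applies it on each eval()."""

    def __init__(self, scorer: Callable[[str], Dict[str, float]],
                 policy: Optional[Policy] = None) -> None:
        self._scorer = scorer
        self._policy = policy

    def eval(self, content: str, policy: Optional[Policy] = None) -> Dict[str, object]:
        scores = self._scorer(content)
        # A per-call policy, if given, takes precedence over the attached one.
        active = policy if policy is not None else self._policy
        action = active.decide(scores) if active is not None else None
        return {"scores": scores, "action": action}


def fake_scorer(text: str) -> Dict[str, float]:
    # Stub scorer so the example runs offline; real scores come from the API.
    return {"safety": 4.0 if "unsafe" in text else 9.0}


client = SketchClient(fake_scorer, policy=Policy([Rule("safety", 7.0, "block")]))
print(client.eval("a harmless reply")["action"])  # allow
print(client.eval("an unsafe reply")["action"])   # block
```

Keeping the policy on the client means call sites stay free of rule definitions, while the optional per-call argument leaves room for one-off overrides.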
Session-level policies
A session tracks quality across an entire conversation. You can set a policy that triggers on aggregate conversation quality, which is useful for detecting gradual drift across many turns.

Real-world policy examples
Healthcare chatbot
Hiring assistant
Customer support bot
What’s next
Python: Policy Engine
Full API for Policy, Rule, and policy callbacks.
Python: Sessions
RAILSession lifecycle and aggregate policies.
Concepts: Middleware
Combine policies with provider wrappers for zero-boilerplate enforcement.
Concepts: Evaluation
Understanding scores before applying policy rules.