Concept: Evaluation | Python:
client.eval()Parameters
The AI-generated text to evaluate. Must be 10–10,000 characters.
Evaluation mode:
"basic"— RAIL’s core scoring models for fast, real-time scoring."deep"— a deeper, more detailed analysis that can also return per-dimension explanations and issue tags."auto"— runsbasic, and automatically escalates todeeponly when a real issue is detected (a low-scoring or low-confidence dimension, or a flagged signal). You get fast scoring on clean content and deep analysis exactly where it matters. The responseresultincludesresolved_mode("basic"or"deep"— which tier actually ran) andescalated(boolean). Billed at the tier that ran.
Subset of dimensions to score. Omit to score all 8. Options:
fairness, safety, reliability, transparency, privacy, accountability, inclusivity, user_impact.Custom dimension weights. Values must sum to 100. E.g.
{"safety": 25, "reliability": 20, ...}.Domain context hint:
"general", "healthcare", "legal", "finance", "code". Improves scoring accuracy.Include per-dimension explanations (deep mode only).
Include detected issue tags per dimension (deep mode only).
Include improvement suggestions per dimension (deep mode only).
Request
Response
How your application’s policy judged this result.
enforcement— the policy’s mode (log_only,block, orregenerate).threshold— the overall score required to pass.score— this result’s overall score.passed— whether the score met the threshold.enforced— whether the outcome was acted on. Whenfalse, the policy is in monitor mode: the verdict is reported but the response is not altered, so you can see what would be blocked. Check the live state withGET /config.
block policy returns 422 POLICY_BLOCKED and a regenerate policy attempts a safe rewrite before applying its fallback.Overall RAIL score (0.0–10.0), weighted average of all evaluated dimensions.
Model confidence in the score (0.0–1.0).
Per-dimension scores. Each entry has
score (0–10) and confidence (0–1). In deep mode: also explanation, issues, suggestions.true if this result was returned from cache (0 credits charged).For
mode: "auto", the tier that actually ran — "basic" or "deep". Use result.escalated to tell whether the deeper analysis was invoked.Credits charged for this request.
0 for cached responses.