Concept: Evaluation | Python

client.eval()

Parameters
- The AI-generated text to evaluate. Must be 10–10,000 characters.
- Evaluation mode: "basic" (ML classifier, fast, 1.0 credit) or "deep" (LLM-as-judge, 2–5 s, 3.0 credits).
- Subset of dimensions to score. Omit to score all 8. Options: fairness, safety, reliability, transparency, privacy, accountability, inclusivity, user_impact.
- Custom dimension weights. Values must sum to 100. E.g. {"safety": 25, "reliability": 20, ...}.
- Domain context hint: "general", "healthcare", "legal", "finance", or "code". Improves scoring accuracy.
- Include per-dimension explanations (deep mode only).
- Include detected issue tags per dimension (deep mode only).
- Include improvement suggestions per dimension (deep mode only).
Request
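The request example did not survive extraction, and the parameter names above are not given in this page. A minimal sketch of what a request payload might look like, with a local check of the documented constraints; the field names (`text`, `mode`, `dimensions`, `weights`, `domain`) are assumptions, not the SDK's actual signature:

```python
# Hypothetical request payload; key names are assumptions, not confirmed by the SDK.
payload = {
    "text": "The loan application was denied based on the applicant's credit history.",
    "mode": "deep",  # "basic" (1.0 credit) or "deep" (3.0 credits)
    "dimensions": ["safety", "fairness", "reliability"],
    "weights": {"safety": 40, "fairness": 35, "reliability": 25},  # must sum to 100
    "domain": "finance",
}

def validate_payload(p: dict) -> list[str]:
    """Check the documented constraints client-side before spending credits."""
    errors = []
    if not 10 <= len(p["text"]) <= 10_000:
        errors.append("text must be 10-10,000 characters")
    if p.get("mode") not in ("basic", "deep"):
        errors.append('mode must be "basic" or "deep"')
    if "weights" in p and sum(p["weights"].values()) != 100:
        errors.append("weights must sum to 100")
    return errors

# A payload that satisfies all documented constraints produces no errors.
print(validate_payload(payload))  # → []
```

Validating locally avoids a round trip (and a credit charge) on requests the API would reject anyway.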
Response
- Overall RAIL score (0.0–10.0), weighted average of all evaluated dimensions.
- Model confidence in the score (0.0–1.0).
- Per-dimension scores. Each entry has score (0–10) and confidence (0–1). In deep mode: also explanation, issues, and suggestions.
- true if this result was returned from cache (0 credits charged).
- Credits charged for this request. 0 for cached responses.
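The overall score is described above as a weighted average of the evaluated dimensions. A small sketch of that arithmetic, assuming weights that sum to 100 as required by the request parameters (dimension names and score values are illustrative, not from a real response):

```python
# Per-dimension scores (0-10) as they might appear in a response; values are illustrative.
dimension_scores = {"safety": 8.5, "fairness": 7.0, "reliability": 9.0}
weights = {"safety": 40, "fairness": 35, "reliability": 25}  # must sum to 100

# Weighted average: the weights sum to 100, so divide the weighted sum by 100.
overall = sum(dimension_scores[d] * weights[d] for d in weights) / 100
print(round(overall, 2))  # → 8.1
```

With equal weights this reduces to a plain mean; custom weights let a domain like finance emphasize fairness and reliability.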