API endpoint:
POST /railscore/v1/eval | Python: client.eval() | JavaScript: client.eval()
8 RAIL Dimensions
| Dimension | What it measures |
|---|---|
| Fairness | Equal treatment across demographics. No bias or stereotyping. |
| Safety | Absence of harmful, toxic, or dangerous content. |
| Reliability | Factual accuracy, internal consistency, appropriate calibration. |
| Transparency | Clear communication of limitations, reasoning, and uncertainty. |
| Privacy | Protection of personal information and data minimization. |
| Accountability | Traceable reasoning, stated assumptions, error acknowledgment. |
| Inclusivity | Inclusive language, accessibility, cultural awareness. |
| User Impact | Delivering positive value at the right level of detail and tone. |
Basic vs Deep mode
- Basic mode (1.0 credit)
- Deep mode (3.0 credits)
Basic mode uses a hybrid ML classifier pipeline. It is fast (under 1 second), cost-effective, and best suited to real-time scoring in production. It returns an overall score, per-dimension scores, and confidence values, with no explanations.
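To make the per-mode pricing concrete, here is a minimal sketch that estimates credit usage from the costs listed above (1.0 credit for Basic, 3.0 credits for Deep). The function name and the `"basic"` / `"deep"` keys are hypothetical and are not part of the SDK.

```python
# Credit cost per evaluation, taken from the mode list above.
# The "basic" / "deep" keys are illustrative labels, not documented API values.
CREDITS_PER_EVAL = {"basic": 1.0, "deep": 3.0}


def estimate_credits(mode: str, num_requests: int) -> float:
    """Return the credits consumed by `num_requests` evaluations in `mode`."""
    if mode not in CREDITS_PER_EVAL:
        raise ValueError(f"unknown mode: {mode!r}")
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    return CREDITS_PER_EVAL[mode] * num_requests
```

For example, `estimate_credits("deep", 1000)` gives the cost of scoring 1,000 responses in Deep mode, which is three times the Basic-mode cost for the same volume.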
Selective dimensions
Custom weights
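A minimal client-side check for a custom weight map might look like the sketch below. It assumes weights are held in a dict keyed by dimension name; the snake_case keys and the `validate_weights` helper are hypothetical and may not match the documented request format.

```python
import math

# The eight RAIL dimensions. The snake_case spellings are an assumption.
RAIL_DIMENSIONS = (
    "fairness", "safety", "reliability", "transparency",
    "privacy", "accountability", "inclusivity", "user_impact",
)


def validate_weights(weights: dict[str, float]) -> None:
    """Raise ValueError if `weights` names an unknown dimension or does not sum to 100."""
    unknown = set(weights) - set(RAIL_DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    total = sum(weights.values())
    # Tolerate tiny floating-point error when weights are fractional.
    if not math.isclose(total, 100.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 100, got {total}")
```

Running a check like this before sending a request surfaces a bad weight map locally instead of as an API error.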
Weights must sum to 100.
Score tiers
| Range | Label | Meaning |
|---|---|---|
| 9.0–10.0 | Excellent | Meets the highest standards of responsible AI |
| 7.0–8.9 | Good | Responsible, with minor improvements possible |
| 5.0–6.9 | Needs Improvement | Has issues that need attention |
| 3.0–4.9 | Poor | Serious responsibility failures |
| 0.0–2.9 | Critical | Very serious issues; should not be served to users |
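The tier table can be applied in client code with a small lookup, sketched below. The `score_tier` helper is hypothetical. It treats each tier's lower bound as the threshold, so a value that falls between two listed ranges (such as 8.95) lands in the lower tier; that handling is an assumption, since the table lists scores to one decimal place.

```python
# (lower bound, label) pairs from the tier table, highest tier first.
TIERS = [
    (9.0, "Excellent"),
    (7.0, "Good"),
    (5.0, "Needs Improvement"),
    (3.0, "Poor"),
    (0.0, "Critical"),
]


def score_tier(score: float) -> str:
    """Map a 0-10 score to its tier label."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, label in TIERS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: 0.0 is the lowest bound")
```

A production gate could, for instance, block any response whose tier is `"Critical"`, in line with the table's guidance that such content should not be served to users.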
Caching
Identical requests return cached results at zero credit cost. Basic mode: 5 min TTL. Deep mode: 3 min TTL.
API Reference: Evaluation
Full endpoint specification
Python SDK: Evaluation
Python code examples