Evaluation is the foundation of the RAIL Score system. All other features depend on evaluation scores.
API endpoint: POST /railscore/v1/eval | Python: client.eval() | JavaScript: client.eval()

8 RAIL Dimensions

Dimension | What it measures
Fairness | Equal treatment across demographics. No bias or stereotyping.
Safety | Absence of harmful, toxic, or dangerous content.
Reliability | Factual accuracy, internal consistency, appropriate calibration.
Transparency | Clear communication of limitations, reasoning, and uncertainty.
Privacy | Protection of personal information and data minimization.
Accountability | Traceable reasoning, stated assumptions, error acknowledgment.
Inclusivity | Inclusive language, accessibility, cultural awareness.
User Impact | Delivering positive value at the right level of detail and tone.

Basic vs Deep mode

Basic mode uses a hybrid ML classifier pipeline. It is fast (under 1 second) and cost-effective, which makes it the best fit for real-time scoring in production. It returns an overall score, per-dimension scores, and confidence values, with no explanations.
result = client.eval(content="Your text here", mode="basic")
# result.rail_score.score       -> 8.4
# result.dimension_scores       -> {fairness: {score: 9.0, confidence: 0.9}, ...}
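
To act on per-dimension scores, one option is to list the dimensions that fall below a chosen threshold. The sketch below is illustrative only: it assumes `dimension_scores` can be read as a plain dict shaped like the comment above, and both the helper name `weakest_dimensions` and the default threshold of 7.0 are hypothetical choices, not part of the SDK.

```python
def weakest_dimensions(dimension_scores, threshold=7.0):
    """Return (name, score) pairs scoring below `threshold`, lowest first.

    Assumes a mapping like {"fairness": {"score": 9.0, "confidence": 0.9}, ...};
    the real SDK object may expose these values differently.
    """
    low = [
        (name, entry["score"])
        for name, entry in dimension_scores.items()
        if entry["score"] < threshold
    ]
    # Sort by score so the most urgent dimension comes first.
    return sorted(low, key=lambda pair: pair[1])


# Hand-written sample data, not real API output.
sample_scores = {
    "fairness": {"score": 9.0, "confidence": 0.9},
    "safety": {"score": 6.5, "confidence": 0.8},
    "privacy": {"score": 4.0, "confidence": 0.7},
}
flagged = weakest_dimensions(sample_scores)
```

With the sample data, `flagged` contains `privacy` and then `safety`, since both score below 7.0.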

Selective dimensions

result = client.eval(
    content="Your text here",
    mode="basic",
    dimensions=["safety", "privacy", "reliability"],
)
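
Before sending a selective request, it can help to check the dimension names locally. This is a minimal sketch: the identifiers are taken from the keys used in this page's examples, and `check_dimensions` is a hypothetical helper, not an SDK function. Whether the API itself rejects unknown or duplicate names is not stated here.

```python
# The eight dimension identifiers as they appear in this page's examples.
RAIL_DIMENSIONS = (
    "fairness", "safety", "reliability", "transparency",
    "privacy", "accountability", "inclusivity", "user_impact",
)


def check_dimensions(dimensions):
    """Hypothetical helper: return the names as a list, or raise ValueError."""
    unknown = [name for name in dimensions if name not in RAIL_DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    if len(set(dimensions)) != len(dimensions):
        raise ValueError("duplicate dimensions")
    return list(dimensions)


selected = check_dimensions(["safety", "privacy", "reliability"])
```

The validated list can then be passed as the `dimensions` argument.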

Custom weights

Weights must sum to 100:
result = client.eval(
    content="Patient should take 500mg ibuprofen every 4 hours.",
    mode="deep",
    domain="healthcare",
    weights={
        "safety": 25, "privacy": 20, "reliability": 20,
        "accountability": 15, "transparency": 10,
        "fairness": 5, "inclusivity": 3, "user_impact": 2,
    },
)
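
Because the weights must sum to 100, a local check can catch mistakes before a request is made. The sketch below assumes weights are non-negative numbers; `check_weights` is a hypothetical helper, not part of the SDK.

```python
import math


def check_weights(weights):
    """Hypothetical helper: return `weights` if valid, else raise ValueError."""
    negative = [name for name, value in weights.items() if value < 0]
    if negative:
        raise ValueError(f"negative weights: {negative}")
    total = sum(weights.values())
    # isclose tolerates floating-point rounding when weights are not integers.
    if not math.isclose(total, 100):
        raise ValueError(f"weights sum to {total}, expected 100")
    return weights


healthcare_weights = check_weights({
    "safety": 25, "privacy": 20, "reliability": 20,
    "accountability": 15, "transparency": 10,
    "fairness": 5, "inclusivity": 3, "user_impact": 2,
})
```

The healthcare weights from the example above pass this check: 25 + 20 + 20 + 15 + 10 + 5 + 3 + 2 = 100.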

Score tiers

Range | Label | Meaning
9.0 - 10.0 | Excellent | Meets the highest standards of responsible AI
7.0 - 8.9 | Good | Responsible, with minor improvements possible
5.0 - 6.9 | Needs Improvement | Has issues that need attention
3.0 - 4.9 | Poor | Serious responsibility failures
0.0 - 2.9 | Critical | Very serious issues; should not be served to users
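
The tier table can be expressed as a small lookup. This is a sketch rather than SDK code: `tier_label` is a hypothetical helper, and it assumes each tier starts at its lower bound, so a value that falls between two listed ranges (for example 8.95) is assigned to the lower tier.

```python
# (lower bound, label) pairs from the score tier table, highest first.
TIERS = (
    (9.0, "Excellent"),
    (7.0, "Good"),
    (5.0, "Needs Improvement"),
    (3.0, "Poor"),
    (0.0, "Critical"),
)


def tier_label(score):
    """Map a 0-10 score to its tier label; raise ValueError if out of range."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, label in TIERS:
        if score >= lower_bound:
            return label
    # Unreachable for valid input: the last lower bound is 0.0.
    raise AssertionError("no tier matched")


label = tier_label(8.4)
```

Here `label` is `"Good"`, because 8.4 lies in the 7.0 to 8.9 range.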

Caching

Identical requests return cached results at zero credit cost. Basic mode: 5 min TTL. Deep mode: 3 min TTL.
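
To make the TTL behavior concrete, here is a small in-memory model of a per-mode expiring cache. It only illustrates the semantics and is not the service's implementation. The `EvalCache` class is hypothetical, and keying entries on `(content, mode)` is an assumption, since this page does not define what makes two requests identical.

```python
import time

# TTLs stated on this page, in seconds.
TTL_SECONDS = {"basic": 5 * 60, "deep": 3 * 60}


class EvalCache:
    """Illustrative TTL cache; the clock is injectable so expiry can be tested."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, content, mode, result):
        # Remember when the result was stored so its age can be checked later.
        self._entries[(content, mode)] = (self._clock(), result)

    def get(self, content, mode):
        entry = self._entries.get((content, mode))
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= TTL_SECONDS[mode]:
            # Entry has outlived its TTL: drop it and report a miss.
            del self._entries[(content, mode)]
            return None
        return result


# Simulated clock: time advances only when we change `now[0]`.
now = [0.0]
cache = EvalCache(clock=lambda: now[0])
cache.put("Your text here", "basic", {"score": 8.4})
now[0] = 299.0
hit = cache.get("Your text here", "basic")
now[0] = 300.0
miss = cache.get("Your text here", "basic")
```

In this model, `hit` is the stored result (the entry is 299 seconds old) and `miss` is `None` (the 5 minute TTL has elapsed).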

API Reference: Evaluation

Full endpoint specification

Python SDK: Evaluation

Python code examples