Agent Evaluation

Agent Evaluation provides three safety checkpoints for agentic AI systems: evaluate tool calls before execution, scan tool results before passing them back to the agent, and detect prompt injection in any text the agent is about to process.

The three checkpoints

Tool Call Evaluation

Returns an ALLOW, FLAG, or BLOCK recommendation before a tool runs. 1.5–3.0 credits.

Tool Result Scanning

Runs PII detection and an injection check on tool output. 0.5–1.0 credits.

Prompt Injection Detection

Runs a fast injection scan on any input text. 0.5 credits.
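To budget for a full pass through all three checkpoints, the credit ranges listed above can be added up. The sketch below is only arithmetic on those published ranges; it does not call the API. The names `CreditRange` and `estimate_credits` are hypothetical helpers, not part of the SDK, and the sketch assumes every tool call gets exactly one evaluation and one result scan. What determines where a call lands inside its range is not stated here, so the result is a lower and upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditRange:
    """Lower and upper bound on credits (hypothetical helper, not an SDK type)."""
    low: float
    high: float


# Credit ranges copied from the checkpoint descriptions above.
TOOL_CALL_EVALUATION = CreditRange(1.5, 3.0)
TOOL_RESULT_SCANNING = CreditRange(0.5, 1.0)
PROMPT_INJECTION_DETECTION = CreditRange(0.5, 0.5)


def estimate_credits(tool_calls: int, injection_scans: int = 1) -> CreditRange:
    """Bound the credits for a request.

    Assumes each tool call is evaluated once and its result scanned once.
    """
    if tool_calls < 0 or injection_scans < 0:
        raise ValueError("counts must be non-negative")

    per_tool_low = TOOL_CALL_EVALUATION.low + TOOL_RESULT_SCANNING.low
    per_tool_high = TOOL_CALL_EVALUATION.high + TOOL_RESULT_SCANNING.high

    return CreditRange(
        low=injection_scans * PROMPT_INJECTION_DETECTION.low + tool_calls * per_tool_low,
        high=injection_scans * PROMPT_INJECTION_DETECTION.high + tool_calls * per_tool_high,
    )


if __name__ == "__main__":
    print(estimate_credits(tool_calls=1))
```

Under these assumptions, a request with one injection scan and one tool call falls between 2.5 and 4.5 credits.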

Quick example

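from rail_score_sdk import RailScoreClient

client = RailScoreClient(api_key="YOUR_RAIL_API_KEY")


# handle_request is an illustrative wrapper so the early returns below are valid Python.
# user_input is the raw user text; agent_draft is the email body written by the agent.
def handle_request(user_input, agent_draft):
    tool_name = "send_email"
    tool_input = {"to": "user@example.com", "body": agent_draft}

    # 1. Check for injection in user input
    injection = client.agent.detect_injection(text=user_input)
    if injection.injection_detected:
        return "Invalid input detected."

    # 2. Evaluate tool call before execution
    tool_check = client.agent.evaluate_tool_call(
        tool_name=tool_name,
        tool_input=tool_input,
        agent_context="Customer support agent",
    )
    if tool_check.recommendation == "block":
        return f"Tool call blocked: {tool_check.explanation}"

    # 3. Execute the tool, then scan the result
    # execute_tool is your application's own tool dispatcher, not part of the SDK.
    tool_output = execute_tool(tool_name, tool_input)
    result_scan = client.agent.scan_tool_result(
        tool_name=tool_name,
        tool_result=tool_output,
    )
    if result_scan.pii_detected:
        # Pass the redacted version back to the agent instead of the raw output.
        tool_output = result_scan.redacted_result

    return tool_output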
What’s next

API Reference: Tool Call

Full specification for tool call evaluation.

Python SDK: Agent Evaluation

Python SDK reference for all three agent endpoints.

Middleware

Drop-in provider wrappers that intercept every LLM response and attach a RAIL score automatically.
