# Middleware

> Drop-in provider wrappers that intercept every LLM response and attach a RAIL score automatically.

Middleware is the pattern of intercepting every LLM response and attaching a RAIL score before it reaches the rest of your application. You replace your LLM client with a RAIL wrapper. The wrapper calls the LLM, evaluates the response, and returns both the content and the scores in a single object.

**Python SDK:** [Integrations reference](/integrations/overview) | **API:** [Evaluation endpoint](/api-reference/evaluation)

## The problem it solves

Without middleware, adding responsible-AI checks means writing evaluation code everywhere you call the LLM. That duplicates logic, risks coverage gaps, and clutters your application code:

```python Without middleware
# Eval code scattered everywhere
from openai import AsyncOpenAI
from rail_score_sdk import RailScoreClient

openai_client = AsyncOpenAI(api_key="...")
rail_client = RailScoreClient(api_key="YOUR_RAIL_API_KEY")

async def get_response(user_message):
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}]
    )
    content = response.choices[0].message.content

    # Must remember to eval in every function
    score = rail_client.eval(content=content, mode="basic")
    if score.rail_score.score < 7.0:
        raise ValueError("Response below quality threshold")

    return content
```

With middleware, scoring is configured once on the client:

```python With middleware
# Scoring is automatic - set up once, forget about it
from rail_score_sdk.integrations import RAILOpenAI

client = RAILOpenAI(
    openai_api_key="...",
    rail_api_key="YOUR_RAIL_API_KEY",
    eval_mode="basic",
    threshold=7.0,
)

async def get_response(user_message):
    response = await client.chat(messages=[{"role": "user", "content": user_message}])
    return response.content  # .rail_score and .threshold_met are also available
```

## How it works

```mermaid
flowchart LR
    App["Your Application"] --> Wrapper["RAIL Wrapper"]
    Wrapper -->|"1. Forward request"| LLM["LLM API\n(OpenAI / Gemini / Anthropic)"]
    LLM -->|"2. Response content"| Wrapper
    Wrapper -->|"3. Evaluate"| RAIL["RAIL Score API"]
    RAIL -->|"Score + dimensions"| Wrapper
    Wrapper -->|"content + rail_score\n+ threshold_met"| App
```

When you call a method on the RAIL wrapper, three things happen transparently:

1. Your messages are forwarded to the underlying LLM API as a normal API call.
2. The LLM response is submitted to the RAIL evaluation endpoint in the mode you configured.
3. A wrapped response object is returned containing the original content, the RAIL score, per-dimension scores, and a `threshold_met` boolean, all in one return value.

## Supported providers

| Wrapper         | Wraps                   | Python | JavaScript |
| --------------- | ----------------------- | ------ | ---------- |
| `RAILOpenAI`    | OpenAI chat completions | Yes    | Yes        |
| `RAILGemini`    | Google Gemini           | Yes    | Yes        |
| `RAILAnthropic` | Anthropic Claude        | Yes    | Yes        |
| `RAILLangChain` | Any LangChain LLM       | Yes    | No         |
| Custom wrapper  | Any HTTP-based LLM      | Yes    | Yes        |

## Observe mode vs enforce mode

In observe mode, the wrapper scores every response and never blocks. Use this to measure quality without interrupting the response flow.

```python
from rail_score_sdk.integrations import RAILOpenAI

client = RAILOpenAI(
    openai_api_key="...",
    rail_api_key="...",
    eval_mode="basic",
    # No threshold: always returns the response
)

# Run inside an async function
response = await client.chat(messages=[...])
print(response.content)        # The LLM's response
print(response.rail_score)     # RAIL score (always present)
print(response.threshold_met)  # None: no threshold configured
```

In enforce mode, the wrapper raises `ThresholdError` when a response scores below the configured threshold.
```python
# ThresholdError comes from the RAIL SDK; see the Integrations reference for its import path.
client = RAILOpenAI(
    openai_api_key="...",
    rail_api_key="...",
    eval_mode="basic",
    threshold=7.0,
)

async def get_response(messages):
    try:
        response = await client.chat(messages=messages)
        return response.content
    except ThresholdError as e:
        # e.rail_score and e.failed_dimensions are available
        return fallback_response()  # your own fallback, defined elsewhere
```

With `on_fail="regenerate"`, the wrapper automatically triggers Safe Regeneration when a response falls below the threshold.

```python
client = RAILOpenAI(
    openai_api_key="...",
    rail_api_key="...",
    eval_mode="basic",
    threshold=7.0,
    on_fail="regenerate",
    max_iterations=3,
)

# Returns the best content, original or regenerated (run inside an async function)
response = await client.chat(messages=[...])
print(response.content)
print(response.iterations_taken)  # 1 if original passed
```

## Writing custom middleware

If you use an LLM provider without a built-in wrapper, build your own middleware using the core `eval()` call:

```python
from rail_score_sdk import RailScoreClient

rail = RailScoreClient(api_key="...")

async def rail_middleware(llm_call, messages, threshold=7.0):
    """Generic RAIL middleware for any async LLM call."""
    content = await llm_call(messages)

    # Score the raw LLM output before handing it back to the caller
    result = rail.eval(content=content, mode="basic")

    if result.rail_score.score < threshold:
        # Collect the dimensions whose individual score is below the threshold
        failed = [d for d, s in result.dimension_scores.items() if s.score < threshold]
        raise ValueError(
            f"Response scored {result.rail_score.score:.1f}, below threshold {threshold}. "
            f"Failed: {failed}"
        )

    return content, result

# Use with any LLM (inside an async function):
content, score = await rail_middleware(my_llm_call, messages, threshold=7.5)
```

## What's next

- Declarative rules to act on scores across a session.
- Full provider wrapper documentation and options.
- TypeScript wrappers for OpenAI, Gemini, Anthropic.
- RAILMiddleware - wrap any LLM function.
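
As a closing illustration of the observe and enforce modes described on this page, the following is a minimal, SDK-free sketch of the wrapper pattern: forward the request, score the response, then either return a combined object or raise. It does not use the RAIL SDK or any real LLM. Every name in it (`ScoredResponse`, `BelowThreshold`, `score_middleware`, `fake_llm`, `fake_scorer`) is hypothetical, and the toy scorer only stands in for a real evaluation call.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class ScoredResponse:
    """Hypothetical combined return value: content plus its score."""
    content: str
    rail_score: float
    threshold_met: Optional[bool]  # None when no threshold is configured


class BelowThreshold(Exception):
    """Hypothetical stand-in for the SDK's ThresholdError."""

    def __init__(self, rail_score: float, threshold: float):
        super().__init__(f"score {rail_score:.1f} is below threshold {threshold:.1f}")
        self.rail_score = rail_score


async def score_middleware(
    llm_call: Callable[[list], Awaitable[str]],
    scorer: Callable[[str], float],
    messages: list,
    threshold: Optional[float] = None,
) -> ScoredResponse:
    content = await llm_call(messages)  # 1. forward the request
    score = scorer(content)             # 2. evaluate the response
    if threshold is None:
        # Observe mode: always return, never block
        return ScoredResponse(content, score, None)
    if score < threshold:
        # Enforce mode: block responses that score too low
        raise BelowThreshold(score, threshold)
    return ScoredResponse(content, score, True)  # 3. content and score together


async def fake_llm(messages: list) -> str:
    """Toy LLM: echoes the last user message."""
    return "echo: " + messages[-1]["content"]


def fake_scorer(content: str) -> float:
    """Toy scorer: low score if the text contains 'bad', high otherwise."""
    return 3.0 if "bad" in content else 9.0


async def demo() -> None:
    risky = [{"role": "user", "content": "bad idea"}]
    observed = await score_middleware(fake_llm, fake_scorer, risky)
    print(observed)  # observe mode returns even though the score is low
    try:
        await score_middleware(fake_llm, fake_scorer, risky, threshold=7.0)
    except BelowThreshold as exc:
        print("blocked:", exc)


if __name__ == "__main__":
    asyncio.run(demo())
```

Passing the LLM call and the scorer in as plain callables keeps the sketch provider-agnostic; a real integration would swap `fake_scorer` for an evaluation call such as the `eval()` usage shown in the custom middleware section.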