Skip to content
@airblackbox

AIR Blackbox

The flight recorder for autonomous AI agents — record, replay, enforce, audit

AIR Blackbox

AIR Blackbox

Open-source infrastructure for safe deployment of autonomous AI agents.

The flight recorder for AI — record every decision, replay every incident, enforce every policy.

License Status OTel


The Problem

AI agents are making real decisions — calling APIs, executing code, moving money, accessing databases. But there is no standard infrastructure for:

  • Auditing what an agent actually did
  • Enforcing policies before an action executes
  • Shutting down a runaway agent in real time
  • Replaying an incident after something goes wrong
  • Redacting secrets before they hit your logging stack

Every team re-invents this differently. Secrets leak. Budgets burn. Regulators ask questions nobody can answer.

AIR Blackbox is the missing layer between your AI agents and your infrastructure.


How It Works

Your Agent ──→ Gateway ──→ Policy Engine ──→ LLM Provider
                 │               │
                 ▼               ▼
           OTel Collector   Kill Switches
                 │          Trust Scoring
                 ▼          Risk Tiers
           Episode Store
           Jaeger · Prometheus

One line change — swap your base_url — and every agent call flows through AIR Blackbox automatically. No SDK changes, no code refactoring.


5-Minute Quickstart

git clone https://github.com/airblackbox/air-platform.git && cd air-platform
cp .env.example .env          # add your OPENAI_API_KEY
make up                       # 6 services running in ~8 seconds

Then point any OpenAI-compatible client at localhost:8080. That's it.

  • Traceslocalhost:16686 (Jaeger)
  • Metricslocalhost:9091 (Prometheus)
  • Episodeslocalhost:8081 (Episode Store API)

Interactive Demos

Explore each component without installing anything:

Component Try It
Platform Orchestration Launch Demo →
Policy Engine Launch Demo →
Episode Store Launch Demo →
Gateway Launch Demo →
OTel Collector Launch Demo →

Repositories

Core Runtime

Repo What It Does
air-platform Full stack in one command — Docker Compose orchestration
gateway OpenAI-compatible reverse proxy — traces every LLM call
agent-episode-store Groups traces into replayable task-level episodes
agent-policy-engine Risk tiers, kill switches, trust scoring

Safety & Governance

Repo What It Does
otel-collector-genai PII redaction, cost metrics, loop detection
otel-prompt-vault Encrypted prompt storage with pre-signed URL retrieval
otel-semantic-normalizer Normalizes gen_ai.* attributes to a standard schema
agent-tool-sandbox Sandboxed execution for agent tool calls
runtime-aibom-emitter AI Bill of Materials generation at runtime

Instrumentation

Repo What It Does
python-sdk Python SDK — wraps OpenAI, Anthropic, and other LLM clients
trust-crewai Trust plugin for CrewAI
trust-langchain Trust plugin for LangChain / LangGraph
trust-autogen Trust plugin for Microsoft AutoGen
trust-openai-agents Trust plugin for OpenAI Agents SDK

Evaluation & Security

Repo What It Does
eval-harness Replay and score episodes against policies
trace-regression-harness Detect behavioral regressions across agent versions
agent-vcr Record and replay agent interactions for testing
mcp-security-scanner Scan MCP server configs for vulnerabilities
mcp-policy-gateway Policy enforcement for Model Context Protocol

Why Infrastructure-Level?

The same reason you don't implement TLS differently in every microservice.

Agent safety needs to be a standardized layer, not something each team builds ad hoc. AIR Blackbox operates in the OTel pipeline, as a reverse proxy, and as a policy engine — so it works across any framework, any model, any deployment.


Contributing

We're looking for contributors interested in AI safety, observability, and governance. See our Contributing Guide to get started.

Current priorities:

  • New framework connectors (Haystack, DSPy, Semantic Kernel)
  • Policy templates for common compliance scenarios
  • Documentation and integration examples

Apache 2.0 · Built on OpenTelemetry · 21 repositories

Pinned Loading

  1. gateway gateway Public

    A flight recorder for AI systems. OpenAI-compatible reverse proxy that records every LLM call for audit, replay, and incident reconstruction.

    Go 7

  2. python-sdk python-sdk Public

    Python SDK for AIR Blackbox Gateway — record, replay, and govern every AI decision

    Python 1

  3. trust-openclaw trust-openclaw Public

    AIR Trust Layer for OpenClaw — Drop-in security, audit, and compliance for OpenClaw TypeScript agent workflows. Part of the AIR Blackbox ecosystem.

    TypeScript 1

  4. trust-openai-agents trust-openai-agents Public

    AIR Trust Layer for OpenAI Agents SDK — Drop-in security, audit, and compliance for OpenAI agent workflows. Part of the AIR Blackbox ecosystem.

    Python

  5. trust-langchain trust-langchain Public

    AIR Trust Layer for LangChain — Drop-in security, audit, and compliance for LangChain agent workflows. Part of the AIR Blackbox ecosystem.

    Python 1

  6. trust-crewai trust-crewai Public

    AIR Trust Layer for CrewAI — Drop-in security, audit, and compliance for CrewAI multi-agent workflows. Part of the AIR Blackbox ecosystem.

    Python 1

Repositories

Showing 10 of 22 repositories

People

This organization has no public members. You must be a member to see who’s a part of this organization.

Top languages

Loading…

Most used topics

Loading…