The Governance Gap No One Is Talking About

Every protocol in the AI agent ecosystem defines how agents communicate. MCP, A2A, ACP, AGNTCY — all four. None of them define whether a worker should be trusted to execute. That gap has a name now: WCP.

The protocols we have

The AI agent infrastructure layer has been moving fast. In the last 18 months, we got four serious protocols that define how agents talk to each other and to tools:

Protocol Publisher What it governs
MCP Anthropic How agents call tools
A2A Google How agents communicate with each other
ACP IBM (BeeAI) Agent communication patterns
AGNTCY AGNTCY consortium Agent discovery and marketplace

These are real, solid protocols. MCP has production adoption. A2A is backed by Google's full weight. They solve real problems.

But look at the "What it governs" column. Every row covers the communication layer. None of them cover the execution layer. That is the gap.

"Technical protocols for inter-agent communication are solid. What's missing are organizational protocols — governance and policy frameworks for worker execution." — O'Reilly, AI Agent Infrastructure Report 2025 (Shyamsundar)

None of these protocols answer: When an AI worker executes, what policy governs it? Who approved it? What data did it touch? Can you prove it?

What "governance gap" means in practice

Here is what happens without a governance layer. These are not hypothetical:

Air Canada, 2024

An AI chatbot promised a bereavement fare discount that the airline's actual policy did not permit. When the customer tried to claim it, Air Canada's defense was that the chatbot was "a separate legal entity" responsible for its own statements.

The British Columbia Civil Resolution Tribunal rejected this argument. Air Canada was ordered to honor the discount and pay damages. The tribunal found Air Canada responsible for what its AI agent said.

Legal finding
"Air Canada did not explain why it...could not be held responsible for information provided by one of its agents." — BC Civil Resolution Tribunal, 2024

The chatbot had no policy layer, no record of what it promised or why, and no audit trail. Nothing showed whether the discount had been approved. The execution had no governance.

OpenAI Operator, February 2025

OpenAI's Operator agent autonomously completed a $31.43 online purchase without explicit per-transaction user consent. The Washington Post documented the case. Operator had been configured to handle tasks, and it handled one the user did not intend.

This is not an attack. This is an agent doing exactly what it was told — just without the right controls on when and what it could execute.

Our own lab

We spent three hours debugging a worker routing failure. The agent kept calling cap.recall.fetch, a pre-WCP ID that does not conform to the spec's identifier rules. The routing rules expected cap.mem.retrieve.rag. Both looked plausible. The router silently denied the request and routed to the default. No error. No audit trail. Three hours.

That was a governance failure. The worker had no declared capability contract. The router had no way to verify what the worker could handle. The agent had no way to know it had been silently redirected.
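The fix is structural, not procedural. Here is a minimal sketch of fail-closed capability resolution, assuming a simple in-memory registry; the class and method names are illustrative, not the PyHall API:

```python
class UnknownCapabilityError(Exception):
    pass

class CapabilityRegistry:
    """Illustrative registry sketch; not the PyHall API."""

    def __init__(self, enrolled):
        self.enrolled = set(enrolled)

    def resolve(self, cap_id):
        # Fail-closed: an unknown ID is a loud, attributable error,
        # never a silent fallback to a default worker.
        if cap_id not in self.enrolled:
            raise UnknownCapabilityError(
                f"{cap_id!r} is not an enrolled capability; "
                f"enrolled: {sorted(self.enrolled)}")
        return cap_id

registry = CapabilityRegistry({"cap.mem.retrieve.rag", "cap.doc.summarize"})
registry.resolve("cap.mem.retrieve.rag")   # enrolled: accepted
try:
    registry.resolve("cap.recall.fetch")   # the three-hour bug, surfaced at once
except UnknownCapabilityError as err:
    print(err)
```

With a check like this, a non-conformant or unenrolled ID surfaces as an explicit error at dispatch time instead of a silent reroute.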

Why observability alone is not enough

"Observability without enforcement creates a false sense of safety. You can see everything that happened. You cannot prevent what you're watching." — O'Reilly, AI Agent Infrastructure Report 2025 (Raj)

The existing tooling (LangSmith, Helicone, Arize Phoenix) solves observability. You can trace what happened. You can see the call graph. You can see which workers were invoked.

Observability answers: what happened?

Governance answers: was it allowed to happen?

These are different questions. In a regulated environment — finance, healthcare, federal contracting — you need both. And the governance layer has to run before execution, not after.

87% of AI agents lack safety documentation (MIT CSAIL)
7% of organizations have embedded AI governance (Knostic, 2025)
97% of AI breach victims lacked access controls (IBM Security)

The token efficiency problem

There is a practical engineering argument for a governance layer beyond compliance. It is the token budget.

The current pattern for giving an agent access to workers: describe every worker inline in the system prompt. Ten workers, six hundred tokens each. That is six thousand tokens of overhead on every call. On top of that, the agent has to reason about which worker to call — another few hundred tokens.

Without WCP Hall: 6,300 tokens overhead per call (10 workers × 600 tokens + 300 routing)
With WCP Hall: 280 tokens overhead per call (10 × 20-token IDs + 80 dispatch payload)

The WCP pattern replaces inline worker descriptions with capability IDs. Instead of six hundred tokens describing what a worker does, you send cap.doc.summarize. The Hall looks it up. The agent asks for capabilities; it does not have to describe them.

This is not theoretical. Cloudflare's engineering blog documented a production AI agent system that reduced context size from 1.17 million tokens to approximately 1,000 tokens using exactly this pattern — a 99.9% reduction in context overhead per agent call (February 2026).

There is also a capability cliff. A small model served with a 4,096-token context window (a common constraint for local deployments of models like Llama 3.2 3B) cannot fit ten inline worker descriptions: at 600 tokens each, they consume 6,000 tokens before the actual task is added. That model cannot run at all with inline worker descriptions. With WCP capability IDs: 200 tokens. The small model becomes viable.
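The overhead arithmetic can be checked directly; this sketch just reproduces the per-item estimates used above (they are estimates, not measurements):

```python
# Per-item token counts are the article's own estimates.
WORKERS = 10
INLINE_DESC_TOKENS = 600   # inline description per worker
ROUTING_TOKENS = 300       # agent reasoning about which worker to call
CAP_ID_TOKENS = 20         # capability ID per worker
DISPATCH_TOKENS = 80       # dispatch payload sent to the Hall
SMALL_CONTEXT = 4096       # a small model's context window

without_hall = WORKERS * INLINE_DESC_TOKENS + ROUTING_TOKENS   # 6,300
with_hall = WORKERS * CAP_ID_TOKENS + DISPATCH_TOKENS          # 280

# The capability cliff: inline descriptions overflow the window on their own,
# while capability IDs leave almost the whole window for the actual task.
assert without_hall > SMALL_CONTEXT
assert with_hall < SMALL_CONTEXT

print(without_hall, with_hall, f"{1 - with_hall / without_hall:.1%}")
# → 6300 280 95.6%
```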

The protocol, not the product

We built WCP as an open protocol, not a product. The reasoning is the same as OpenTelemetry.

OpenTelemetry is 100% free, Apache 2.0, governed by CNCF. It makes zero revenue. The ecosystem built on top of it — Datadog ($2.7B ARR), Grafana ($400M ARR), Honeycomb, New Relic — captures billions. OTel's authors work at those companies. They benefit from the standard they wrote.

WCP follows the same path. The spec is the protocol layer. PyHall is the reference implementation. The revenue happens on top:

Layer What it is OTel equivalent
WCP_SPEC.md MIT open standard OTel Specification
PyHall Reference implementation (Python, TypeScript, Go) OTel SDKs
pyhall.cloud Managed Hall SaaS (coming) Grafana Cloud
Compliance profiles FedRAMP, SOC2, EU AI Act profiles Honeycomb paid tier

The protocol has to be free and unencumbered for this to work. WCP is MIT licensed — the simplest possible terms for anyone to implement. PyHall, the reference implementation (SDK + Hall Monitor + Hall API server), is Apache 2.0, adding the patent grant that matters for enterprise adoption and CNCF submission. The "PyHall" and "WCP" names remain trademarks regardless of how the protocol is forked or extended.

What WCP actually is

WCP defines five required behaviors for any compliant worker dispatch system:

WCP-Basic required behaviors
1. Fail-closed — unknown capability request = deny, never pass-through
2. Deterministic — same inputs always produce same routing decision
3. Declared controls — every worker declares what it needs before enrolling
4. Mandatory telemetry — three events minimum: dispatch, complete/fail, evidence
5. Dry-run — every request can be routed without execution for testing
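The five behaviors fit in one small function. Here is a sketch of the required semantics, assuming a flat rules dict keyed by capability ID; it illustrates the contract, not the PyHall implementation:

```python
import hashlib
import json

def route(request, rules, dry_run=False):
    """Minimal sketch of the five WCP-Basic behaviors (illustrative, not PyHall)."""
    cap_id = request["capability_id"]
    if cap_id not in rules:
        # 1. Fail-closed: unknown capability is an explicit deny, never pass-through.
        decision = {"allowed": False, "reason": f"unknown capability {cap_id!r}"}
    else:
        rule = rules[cap_id]
        # 2. Deterministic: the decision is a pure function of request + rules.
        # 3. Declared controls: only controls the worker declared at enrollment.
        decision = {"allowed": True, "worker": rule["worker"],
                    "controls_verified": sorted(rule["declared_controls"])}
    # 5. Dry-run: the flag is recorded so callers can route without executing.
    decision["dry_run"] = dry_run
    # 4. Mandatory telemetry: every decision carries a hashed evidence record.
    record = json.dumps({"request": request, "decision": decision}, sort_keys=True)
    decision["evidence_hash"] = hashlib.sha256(record.encode()).hexdigest()
    return decision

rules = {"cap.doc.summarize": {"worker": "summarizer",
                               "declared_controls": ["data_label", "env"]}}
ok = route({"capability_id": "cap.doc.summarize", "env": "prod"}, rules)
deny = route({"capability_id": "cap.recall.fetch", "env": "prod"}, rules)
assert ok["allowed"] and not deny["allowed"]
# Determinism: the same request produces bit-identical decisions and hashes.
assert route({"capability_id": "cap.doc.summarize", "env": "prod"}, rules) == ok
```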

The routing decision object includes: the selected worker, why it was selected, what controls were verified, the blast radius score (how much damage could this worker do if it malfunctions), and an evidence receipt with a SHA-256 hash of the dispatch record.

# Three lines to route a capability request through a Hall
from pyhall import make_decision, RouteInput, load_rules, Registry

rules = load_rules("routing_rules.json")
registry = Registry(registry_dir="enrolled/")

decision = make_decision(RouteInput(
    capability_id="cap.doc.summarize",
    env="prod",
    data_label="CONFIDENTIAL",
    tenant_risk="low",
    qos_class="P2",
    tenant_id="acme-corp",
), rules, registry)

# decision.selected_worker_species_id — which worker handles this
# decision.controls_verified          — what was checked before routing
# decision.blast_score                — risk if this worker fails
# decision.evidence_receipt.hash      — SHA-256 of request payload

That is the entire routing call. The Hall checked capability availability, applied policy gates (data classification, environment, tenant risk), computed blast radius, and emitted a signed evidence receipt — before any worker ran.
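The receipt is verifiable after the fact. A sketch, assuming the hash is SHA-256 over a canonical JSON serialization of the dispatch record; the actual canonicalization is whatever the spec's evidence receipt format defines:

```python
import hashlib
import json

def receipt_hash(dispatch_record):
    # Assumption: canonical form is JSON with sorted keys and no whitespace.
    canonical = json.dumps(dispatch_record, sort_keys=True,
                           separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()

record = {"capability_id": "cap.doc.summarize", "env": "prod",
          "tenant_id": "acme-corp", "data_label": "CONFIDENTIAL"}
stored = receipt_hash(record)

# An auditor recomputes the hash from the stored record; any edit changes it.
assert receipt_hash(record) == stored
assert receipt_hash({**record, "env": "dev"}) != stored
```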

Why the clock is running

EU AI Act — Article 12
High-risk AI systems must maintain comprehensive event logs for the system's lifetime, with full traceability of inputs, outputs, and decisions. Deadline: August 2, 2026. Penalty: up to 7% of global annual revenue.

The EU AI Act's high-risk system requirements are not abstract. AI systems making decisions in employment, credit, critical infrastructure, healthcare, and law enforcement qualify. Many enterprise AI agent deployments are in scope today.

WCP's evidence receipt builder generates what Article 12 requires: lifetime event logs and tamper-evident per-decision artifact hashes, with full traceability of dispatch decisions (the hash-chain ledger is in progress for v0.3.x). This is not a compliance feature bolted on afterward. It is built into the dispatch protocol from the first call.
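The hash-chain ledger planned for v0.3.x can be sketched in a few lines, assuming each entry commits to the hash of the previous entry; the function names here are illustrative:

```python
import hashlib
import json

GENESIS = "0" * 64  # anchor hash for the first ledger entry

def chain_append(ledger, receipt):
    # Each entry commits to the previous entry's hash, so editing or deleting
    # any past receipt invalidates every hash after it.
    prev = ledger[-1]["entry_hash"] if ledger else GENESIS
    body = json.dumps({"prev": prev, "receipt": receipt}, sort_keys=True)
    ledger.append({"prev": prev, "receipt": receipt,
                   "entry_hash": hashlib.sha256(body.encode()).hexdigest()})

def chain_verify(ledger):
    prev = GENESIS
    for entry in ledger:
        body = json.dumps({"prev": prev, "receipt": entry["receipt"]},
                          sort_keys=True)
        if entry["prev"] != prev:
            return False
        if entry["entry_hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["entry_hash"]
    return True

ledger = []
for receipt in ({"cap": "cap.doc.summarize"}, {"cap": "cap.mem.retrieve.rag"}):
    chain_append(ledger, receipt)
assert chain_verify(ledger)
ledger[0]["receipt"]["cap"] = "cap.recall.fetch"   # tamper with history
assert not chain_verify(ledger)
```

This is what makes the logs lifetime-traceable rather than merely stored: a regulator can verify the whole chain from the genesis hash forward.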

What ships today

WCP v0.1 is published. PyHall ships in three languages:

Package Install Status
pyhall (Python) pip install pyhall-wcp v0.1.0 — routing engine, conformance, CLI
@pyhall/core (TypeScript) npm install @pyhall/core v0.1.0 — 21/21 tests passing
pyhall-go (Go) go get github.com/pyhall/pyhall-go@latest v0.1 — interfaces and routing stub

v0.2 (60 days): WorkerBase class, HallClient for agents, pyhall serve (one command to run a Hall), and the evidence receipt auto-builder with hash chaining.

Try it in five minutes

Enroll a worker, route a capability request, verify the evidence receipt. WCP is MIT licensed; PyHall and the Hall are Apache 2.0. See pyhall.dev for the managed registry service.

$ pip install pyhall-wcp

The spec, not just the tool

The goal is not to build another AI governance SaaS. The goal is to own the protocol layer — the way OTel owns observability instrumentation — so that every governance tool built in the next five years builds on top of WCP rather than reinventing it.

WCP_SPEC.md v0.1 is published at workerclassprotocol.dev/spec under MIT. The spec covers: identifier rules, compliance levels (WCP-Basic / WCP-Standard / WCP-Full), five required behaviors, routing decision schema, evidence receipt format, and the blast radius scoring model.

Read the spec. Implement it. Extend it. If you are building AI agent infrastructure and you want a governance layer that is not locked to your vendor, this is the protocol to build on.

Sources

Claim Source
O'Reilly governance gap quotes O'Reilly AI Agent Infrastructure Report, 2025
87% agents lack safety documentation MIT CSAIL AI Agent Safety Checklist Study, 2024
7% of orgs have embedded AI governance Knostic AI Governance Survey, 2025
97% of AI breach victims lacked access controls IBM Security, AI Threat Intelligence Report, 2025
99.9% token reduction (1.17M → ~1K tokens) Cloudflare Engineering Blog, February 2026
Air Canada chatbot liability ruling Moffatt v. Air Canada, BC Civil Resolution Tribunal, 2024
OpenAI Operator autonomous purchase Washington Post, February 7, 2025
EU AI Act Article 12, 7% revenue penalty EU Official Journal, Regulation 2024/1689, August 2024
OTel: Datadog $2.7B ARR, Grafana $400M ARR Datadog Q4 2024 earnings; Grafana Labs funding announcement, 2024