Production Data

From 17% to 96%

How policy enforcement matured an autonomous engineering fleet in 8 days.

158

Receipts Issued

Agents Tracked

69%

Overall Acceptance

96%

Current Day

The Setup

Summit Cognitive runs on autonomous AI agents: Devin writes features, Codex refactors libraries, Jules triages bugs, and Dependabot manages dependencies. Human-authored PRs flow through the same pipeline. Six agents. One monorepo. Before Decision Receipt, no gatekeeping layer stood between an agent's decision and production.

Day 1: May 12, 2026

Decision Receipt went live. It evaluated every PR merge event against 9 policy rules, issued a cryptographically signed receipt for each, and blocked any action that failed.
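The evaluate, sign, and block flow described above can be sketched in a few lines. This is a minimal illustration, not the Decision Receipt implementation: the three rules, the event field names, the `Receipt` shape, and the HMAC-SHA256 signing scheme are all assumptions made for the example. The production system uses 9 rules and its signing method is not specified here.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical rules: stand-ins for the 9 production rules, which are not listed here.
RULES: Dict[str, Callable[[dict], bool]] = {
    "has_ci_results": lambda event: bool(event.get("ci_results")),
    "has_review_evidence": lambda event: bool(event.get("review_evidence")),
    "has_provenance": lambda event: bool(event.get("provenance")),
}

# Assumption: a shared secret used only for this demonstration.
SIGNING_KEY = b"demo-key-not-for-production"


@dataclass(frozen=True)
class Receipt:
    event_id: str
    decision: str                  # "accepted" or "blocked"
    failed_rules: Tuple[str, ...]  # names of the rules that failed
    signature: str                 # hex HMAC over the fields above


def _sign(event_id: str, decision: str, failed_rules: Tuple[str, ...]) -> str:
    """Sign a canonical JSON encoding of the receipt fields."""
    payload = {"event_id": event_id, "decision": decision, "failed_rules": list(failed_rules)}
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()


def evaluate(event: dict) -> Receipt:
    """Check every rule; any failure blocks the event. A receipt is issued either way."""
    failed = tuple(name for name, rule in RULES.items() if not rule(event))
    decision = "blocked" if failed else "accepted"
    return Receipt(event["id"], decision, failed, _sign(event["id"], decision, failed))


def verify(receipt: Receipt) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = _sign(receipt.event_id, receipt.decision, receipt.failed_rules)
    return hmac.compare_digest(expected, receipt.signature)
```

A blocked receipt carries the names of the failed rules, which is what gives a submitter something concrete to fix. Editing a receipt after issuance (for example, flipping `decision` to `"accepted"`) makes `verify` return `False`.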

Day 1 acceptance rate: 16.7%. Out of 6 events, 5 were blocked. The agents were submitting work that looked fine in a log but failed basic evidence requirements when held to a formal standard.

The Improvement Curve

Date	Events	Accepted	Blocked	Rate
May 12	6	1	5	16.7%
May 13	5	2	3	40.0%
May 14	12	6	6	50.0%
May 15	7	3	4	42.9%
May 16	35	18	17	51.4%
May 17	28	28	0	100.0%
May 18	26	24	2	92.3%
May 19	25	24	1	96.0%
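The Rate column is accepted events divided by total events, expressed as a percentage and rounded to one decimal place. The short script below recomputes it from the rows above. The variable and function names are illustrative only.

```python
# Rows copied from the table above: (date, events, accepted, blocked).
DAILY = [
    ("May 12", 6, 1, 5),
    ("May 13", 5, 2, 3),
    ("May 14", 12, 6, 6),
    ("May 15", 7, 3, 4),
    ("May 16", 35, 18, 17),
    ("May 17", 28, 28, 0),
    ("May 18", 26, 24, 2),
    ("May 19", 25, 24, 1),
]


def acceptance_rate(accepted: int, events: int) -> float:
    """Acceptance rate as a percentage, rounded to one decimal place."""
    return round(100 * accepted / events, 1)


for date, events, accepted, blocked in DAILY:
    # Every event is either accepted or blocked, so the two counts must sum to the total.
    assert accepted + blocked == events
    print(f"{date}: {accepted}/{events} accepted = {acceptance_rate(accepted, events):.1f}%")
```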

Per-Agent Breakdown

BrianCLong Human

Highest volume. Adapted fastest. Drove acceptance from 17% to 96% by improving evidence packaging.

Dependabot 0% Acceptance

100% block rate. Dependency bumps arrive with no evidence provenance, no source diversity, and no confidence scoring. Dependabot cannot read policy feedback, so it never adapts.

Devin Autonomous

Learned to include CI results and review evidence after initial blocks.

Codex Refactoring

Focused PRs. Improved after early blocks on provenance requirements.

Jules Triage

Intermittent activity. Subject to the same 9-rule evaluation.

demo-agent Demo

Used for prospect demonstrations against demo-org/demo-repo.

The Insight

The agents did not get smarter. The evidence got better.

No agent was retrained. No model was fine-tuned. Policy enforcement created a feedback loop. The 9 rules set a clear, deterministic bar. Agents that could adapt improved within days. Agents that could not (Dependabot) were identified immediately.
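The feedback loop can be illustrated with a toy simulation. It is a sketch under stated assumptions, not how any of the agents actually work: the evidence field names, the `can_adapt` flag, and the retry limit are hypothetical. The point is that the check itself never changes. Only the submission does.

```python
from typing import List, Tuple

# Hypothetical evidence fields, loosely named after the requirements mentioned above.
REQUIRED_EVIDENCE = ("ci_results", "review_evidence", "provenance")


def missing_evidence(submission: dict) -> List[str]:
    """Deterministic check: the same submission always yields the same result."""
    return [key for key in REQUIRED_EVIDENCE if not submission.get(key)]


def submit_with_feedback(submission: dict, can_adapt: bool, max_attempts: int = 3) -> Tuple[bool, int]:
    """Return (accepted, attempts_used) for a submitter that may or may not read feedback."""
    current = dict(submission)  # copy so the caller's dict is not mutated
    for attempt in range(1, max_attempts + 1):
        missing = missing_evidence(current)
        if not missing:
            return True, attempt
        if can_adapt:
            # An adaptive submitter attaches exactly what the block reported as missing.
            for key in missing:
                current[key] = f"attached:{key}"
        # A non-adaptive submitter resubmits unchanged and is blocked again.
    return False, max_attempts


print(submit_with_feedback({"id": "pr-1"}, can_adapt=True))
print(submit_with_feedback({"id": "pr-2"}, can_adapt=False))
```

In this toy model the adaptive submitter is accepted on its second attempt, while the non-adaptive one is blocked on every attempt, which mirrors the split between agents that improved and Dependabot.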

Data from the live Decision Receipt production API. All numbers reflect real enforcement receipts. Last updated: May 19, 2026.
