From 17% to 96%

How policy enforcement matured an autonomous engineering fleet in 8 days.

158 Receipts Issued
6 Agents Tracked
69% Overall Acceptance
96% Current Day

The Setup

Summit Cognitive runs on autonomous AI agents: Devin writes features, Codex refactors libraries, Jules triages bugs, Dependabot manages dependencies, and human-authored PRs flow through the same pipeline. Six agents. One monorepo. Before Decision Receipt, no gatekeeping layer stood between an agent's decision and production.

Day 1: May 12, 2026

Decision Receipt went live. It evaluated every PR merge event against 9 policy rules, issued a cryptographically signed receipt for each, and blocked any action that failed.

Day 1 acceptance rate: 16.7%. Out of 6 events, 5 were blocked. The agents were submitting work that looked fine in a log but failed basic evidence requirements when held to a formal standard.
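
The evaluate, sign, and block flow described above can be sketched roughly as follows. This is a minimal illustration, not the Decision Receipt implementation: the 9 rules are not listed here, so the three checks below are hypothetical stand-ins named after evidence types mentioned in this write-up, and HMAC-SHA256 with a shared key is an assumed signing scheme (a production system may well use asymmetric signatures). The event fields `pr`, `agent`, `ci_results`, `review_evidence`, and `provenance` are also assumptions.

```python
import hashlib
import hmac
import json

# Hypothetical rule set: stand-ins for the 9 real rules, which are not listed here.
RULES = {
    "has_ci_results": lambda ev: bool(ev.get("ci_results")),
    "has_review_evidence": lambda ev: bool(ev.get("review_evidence")),
    "has_provenance": lambda ev: bool(ev.get("provenance")),
}

# Assumption: a shared-secret key used only for this illustration.
SIGNING_KEY = b"demo-key-not-for-production"


def issue_receipt(event: dict, key: bytes = SIGNING_KEY) -> dict:
    """Evaluate every rule, then sign the verdict so it can be audited later."""
    failed = sorted(name for name, check in RULES.items() if not check(event))
    body = {
        "pr": event["pr"],
        "agent": event["agent"],
        "decision": "blocked" if failed else "accepted",
        "failed_rules": failed,
    }
    # Canonical JSON (sorted keys) so the same body always yields the same bytes.
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_receipt(receipt: dict, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])
```

In this sketch, a submission with no evidence attached comes back as `blocked` with all three rule names in `failed_rules`, and changing any field of a receipt after issuance makes `verify_receipt` return `False`.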

The Improvement Curve

Date    Events  Accepted  Blocked   Rate
May 12       6         1        5  16.7%
May 13       5         2        3  40.0%
May 14      12         6        6  50.0%
May 15       7         3        4  42.9%
May 16      35        18       17  51.4%
May 17      28        28        0   100%
May 18      26        24        2  92.3%
May 19      25        24        1  96.0%
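
Each rate in the table is accepted events divided by total events for that day, rounded to one decimal place. A short sketch that recomputes the column from the daily counts, using only the figures in the table (the names `DAILY` and `acceptance_rate` are illustrative):

```python
# (date, events, accepted, blocked) copied from the daily table.
DAILY = [
    ("May 12", 6, 1, 5),
    ("May 13", 5, 2, 3),
    ("May 14", 12, 6, 6),
    ("May 15", 7, 3, 4),
    ("May 16", 35, 18, 17),
    ("May 17", 28, 28, 0),
    ("May 18", 26, 24, 2),
    ("May 19", 25, 24, 1),
]


def acceptance_rate(accepted: int, events: int) -> float:
    """Acceptance rate as a percentage, rounded to one decimal place."""
    return round(100 * accepted / events, 1)


for date, events, accepted, blocked in DAILY:
    # Every event is either accepted or blocked, so the counts must add up.
    assert accepted + blocked == events
    print(f"{date}: {acceptance_rate(accepted, events):.1f}%")
```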

Per-Agent Breakdown

BrianCLong (Human)
Highest volume. Adapted fastest. Drove acceptance from 17% to 96% by improving evidence packaging.

Dependabot (0% Acceptance)
100% block rate. Dependency bumps arrive with no evidence provenance, no source diversity, no confidence scoring. Dependabot cannot read policy feedback.

Devin (Autonomous)
Learned to include CI results + review evidence after initial blocks.

Codex (Refactoring)
Focused PRs. Improved after early blocks on provenance requirements.

Jules (Triage)
Intermittent activity. Subject to the same 9-rule evaluation.

demo-agent (Demo)
Used for prospect demonstrations against demo-org/demo-repo.

The Insight

The agents did not get smarter. The evidence got better.

No agent was retrained. No model was fine-tuned. Policy enforcement created a feedback loop. The 9 rules set a clear, deterministic bar. Agents that could adapt improved within days. Agents that could not (Dependabot) were identified immediately.
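
The feedback loop can be illustrated with a toy model. This is a sketch under stated assumptions, not how any of the agents above actually work: the evidence field names in `REQUIRED_EVIDENCE` are hypothetical, and `can_adapt` simply stands in for whether a submitter can read the block feedback and act on it.

```python
# Hypothetical evidence fields; the real 9 rules are not listed here.
REQUIRED_EVIDENCE = ("ci_results", "review_evidence", "provenance")


def evaluate(submission: dict) -> list:
    """Return the missing evidence fields; an empty list means accepted."""
    return [field for field in REQUIRED_EVIDENCE if not submission.get(field)]


def submit_until_accepted(submission: dict, can_adapt: bool, max_attempts: int = 3):
    """Return the attempt number that was accepted, or None if always blocked."""
    current = dict(submission)
    for attempt in range(1, max_attempts + 1):
        missing = evaluate(current)
        if not missing:
            return attempt
        if can_adapt:
            # An adaptive submitter reads the feedback and attaches what was missing.
            for field in missing:
                current[field] = f"attached after block on attempt {attempt}"
        # A non-adaptive submitter resubmits the same payload unchanged.
    return None
```

Because the bar is deterministic, the same submission always gets the same verdict. In this model an adaptive submitter that starts with no evidence is accepted on its second attempt, while a non-adaptive one is blocked on every attempt.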

Data from the live Decision Receipt production API. All numbers reflect real enforcement receipts. Last updated: May 19, 2026.
