The Promotion Protocol
Formally: Autonomy State Machine
AI earns autonomy through demonstrated performance, just like an employee. You do not deploy it or not deploy it. You promote it through a 3-state progression with statistical graduation criteria at each step - and you roll it back when performance degrades.
The Problem
Binary thinking about AI - “it works” vs. “it does not work” - misses the entire middle ground where AI assists but does not decide. Most teams are stuck at one extreme: either they refuse to deploy because they cannot guarantee perfection, or they deploy fully autonomous and pray.
The Promotion Protocol makes the middle ground operational.
The Three States
AI output is blocked from production entirely. Used for new deployments, post-incident recovery, or when error rates exceed the safety threshold. The system is off.
Exit criteria: system passes smoke tests and human review of a sample batch confirms baseline competence.
AI produces output. A human verifies every item before it reaches production. The human is the quality gate. This is where most AI deployments should live initially - and where many should stay permanently.
Promotion criteria: error rate below acceptance threshold for N consecutive samples. Not one good batch - N consecutive good batches.
AI output goes directly to production. Humans spot-check only. This state is earned, not assumed. And it is revocable.
Rollback triggers: error rate exceeds rollback threshold, distribution shift detected, or drift detector fires. Rollback goes to HITL, not Disabled (unless critical).
Promotion Criteria
Promotion is earned through statistical evidence, not vibes. The criteria are explicit and measurable:
The key word is consecutive. One good batch is noise. Ten consecutive good batches is signal. The graduation window prevents promotion on a lucky streak.
Rollback Triggers
Hysteresis
The promotion threshold and the rollback threshold must be different. If they are the same, the system oscillates: promote at 2% error, roll back at 2% error, promote again, roll back again.
The Parameters
| Parameter | Description | Example |
|---|---|---|
| acceptance_threshold | Max error rate for promotion | 2% |
| graduation_window | Consecutive batches required | 10 batches |
| graduation_sample_size | Total items in window | 500 items |
| rollback_threshold | Error rate triggering demotion | 5% |
| drift_threshold | Distribution shift sensitivity | KL > 0.1 |
Worked Example: Invoice Processing
New deployment. Smoke test on 50 sample invoices passes. System promoted to HITL. Every extraction is reviewed by an operator before posting to the ERP.
500 invoices processed. Error rate: 1.4% (below 2% threshold). Ten consecutive daily batches all below threshold. Graduation criteria met.
Extractions go directly to ERP. Operator reviews a 10% sample daily. Error rate holds at 1.2%.
New vendor with a non-standard invoice format. Distribution shift detected. Error rate spikes to 6.2%. System automatically rolled back to HITL. Operator verifies all output from the new vendor while the model is retrained.
Model retrained on new format. Error rate back to 1.1%. Ten consecutive batches pass. Re-promoted to Autonomous.
Why “Promote” Not “Deploy”
The metaphor matters. “Deploy” is binary - the system is deployed or it is not. “Promote” implies earned trust, demonstrated competence, and the possibility of demotion.
CTOs present this to boards. Board members understand promotions. “We promoted the AI from supervised to autonomous last quarter, then rolled it back when drift fired” is a sentence that parses for everyone in the room. “We adjusted the autonomy state machine transition parameters” is not.
Connection to Other Frameworks
Quality Hillclimb - the HITL review is a quality gate. The graduation criteria are the ratchet mechanism.
Dollarized Confusion Matrix - error costs determine the acceptance and rollback thresholds. The cost of a false positive vs. false negative sets the promotion criteria.
Designed Convergence - the Promotion Protocol is mechanism design for AI autonomy. The incentive structure ensures the system converges toward the right autonomy level.