← Home

Lexicon

The operating vocabulary. Every term here was invented to name something that existed but had no word for it. If people use these terms in meetings you are not in, the framework is working.

Templeton Ratio

T = time_to_do / time_to_check. The ratio of generation difficulty to verification difficulty for any task. Determines whether AI automation creates leverage or doubles the work.

Most AI pilots fail not because the AI is bad, but because nobody measured how hard it is to check the output. T = 1 means you are doing the work twice.

Scientists recognize this as the P vs NP intuition. Allocators call it verification leverage. Builders call it the check ratio.
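In code, the ratio is a one-line screen for automation candidates. A minimal sketch - the example durations are illustrative:

```python
def templeton_ratio(time_to_do: float, time_to_check: float) -> float:
    """T = time_to_do / time_to_check.
    T >> 1 means verification leverage; T near 1 means doing the work twice."""
    if time_to_check <= 0:
        raise ValueError("time_to_check must be positive")
    return time_to_do / time_to_check

# A report that takes 4 hours to draft but 15 minutes to review: T = 16.
# A summary that takes 10 minutes to write and 10 minutes to verify: T = 1.
```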

Soft Spot

An edge in the directed graph of your business where value is leaking that no one has named yet. Identified by walking the graph and listening to the people closest to the work.

The people who find the biggest leaks are rarely executives. They are the people closest to the work who finally get the vocabulary to name what they have been feeling. Give a team the language of edges and soft spots, and the graph fills itself in.

Allocators call this a value leak. Operators call it process bleed. Builders call it tech debt. Same thing: value is leaving the system and nobody named it.

Verification Trap

A task that is easy to generate but hard to verify. The AI produces output effortlessly, but checking whether it is correct takes as long as doing it manually. T approaches 1.

A common failure mode in early AI deployments, and frequently more damaging than any technical limitation. The task looks automatable because generation is easy. But the cost of answering "is this output correct?" scales linearly with volume.

AI Sweet Spot

A task where generation is hard but verification is cheap. T >> 1. You can review 50 AI outputs in the time it takes to manually produce one. This is the P vs NP intuition applied to operations.

Look for tasks where the gap between doing and checking is widest. That is where the leverage lives.

Proof Layer

The verification rubric, asymmetry profile, and verification cost analysis built BEFORE the capability. Every AI system needs one. No exceptions.

Generation costs are collapsing. The bottleneck moved. It is now verification - knowing whether the output is correct. Build the rubric first. If you cannot describe what "good" looks like before building, you will not recognize it after.

Allocators call this pre-investment due diligence. Builders call it TDD. Same principle: define success before you deploy capital.
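One way to make "rubric first" concrete: write the checks as executable predicates before any generation code exists. A sketch - the checks and field names here are invented for illustration, not prescribed by the framework:

```python
# A minimal proof-layer rubric, written before the capability it will verify.
# Every check and field name below is illustrative.
RUBRIC = {
    "cites_a_source": lambda out: bool(out.get("sources")),
    "under_word_limit": lambda out: len(out.get("text", "").split()) <= 300,
    "no_placeholders": lambda out: "TODO" not in out.get("text", ""),
}

def verify(output: dict) -> dict:
    """Run every rubric check against one output."""
    return {name: check(output) for name, check in RUBRIC.items()}

def passes(output: dict) -> bool:
    return all(verify(output).values())
```

If you cannot fill in a dictionary like this before building, you have not yet described what "good" looks like.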

Autonomy State Machine

A graduated trust system for AI deployments with three states: Disabled, HITL (human-in-the-loop: a human verifies every output), and Autonomous (spot-check only). Transitions are driven by statistical evidence with hysteresis to prevent oscillation. See: The Promotion Protocol.

Binary thinking about AI (works / does not work) misses the entire middle ground where AI assists but does not decide. The state machine makes the middle ground operational.

Allocators call this graduated delegation. Builders call it feature flag progression. Operators call it staged rollout. Same structure: earn trust through evidence, then widen the aperture.
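The state machine fits in a dozen lines. A sketch with hypothetical thresholds - the hysteresis gap (promote at 97%, demote at 90%) is what prevents oscillation:

```python
from enum import Enum

class State(Enum):
    DISABLED = "disabled"
    HITL = "hitl"              # human verifies every output
    AUTONOMOUS = "autonomous"  # spot-check only

PROMOTE_AT = 0.97  # hypothetical: promote HITL -> AUTONOMOUS above this pass rate
DEMOTE_AT = 0.90   # hypothetical: demote AUTONOMOUS -> HITL below this pass rate
MIN_SAMPLES = 200  # require statistical evidence before any transition

def next_state(state: State, pass_rate: float, n: int) -> State:
    if n < MIN_SAMPLES:
        return state  # not enough evidence to move in either direction
    if state is State.HITL and pass_rate >= PROMOTE_AT:
        return State.AUTONOMOUS
    if state is State.AUTONOMOUS and pass_rate < DEMOTE_AT:
        return State.HITL
    return state
```

A 93% pass rate leaves an autonomous system autonomous and a HITL system in HITL: that dead band is the hysteresis.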

Gold Standard

A set of human-verified, ground-truth examples used to calibrate and evaluate AI output. The gold standard IS the verification instrument. Without it, you are measuring with a broken ruler.

You cannot improve what you cannot measure. You cannot measure without a reference. The gold standard is the reference.
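Measured against a gold set, quality becomes a number instead of an opinion. A minimal sketch using exact-match scoring - real rubrics are usually richer:

```python
def gold_score(outputs: dict, gold: dict) -> float:
    """Fraction of gold-standard examples the system reproduces exactly."""
    if not gold:
        raise ValueError("empty gold set: you have no ruler")
    hits = sum(1 for key, truth in gold.items() if outputs.get(key) == truth)
    return hits / len(gold)

# gold = {"inv-001": "approve", "inv-002": "reject"}  # human-verified ground truth
```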

Compile Time

Time spent building systems, frameworks, rubrics, and processes that produce returns across many future periods. The ROI is multiplicative.

Track your compile-to-runtime ratio. Every hour of compile time produces returns in every future period. Runtime produces returns in exactly one.

Allocators call this CapEx. Builders call this compile time. Same concept: invest now, compound later.

Runtime

Time spent executing tasks, fighting fires, reviewing outputs, and attending meetings. Produces returns in a single period. Necessary but should not dominate a leader's schedule.

The distinction gives leaders a measurable ratio. If the ratio is falling, you are trading compounding returns for single-period returns.

Allocators call this OpEx. Builders call this runtime. If your CTO is 100% in meetings, you are running 100% OpEx and 0% CapEx on your most expensive asset.
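The ratio itself is trivial to compute; the discipline is logging the hours. A sketch:

```python
def compile_share(compile_hours: float, runtime_hours: float) -> float:
    """Share of working time invested in compounding (compile-time) work."""
    total = compile_hours + runtime_hours
    if total <= 0:
        raise ValueError("no hours logged")
    return compile_hours / total

# 10 hours building rubrics, 30 hours in meetings and reviews: 25% CapEx.
```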

Dollarized Confusion Matrix

A confusion matrix where counts are replaced with costs. The optimal threshold follows: theta* = C_FP / (C_FP + C_FN). Costs drive thresholds, thresholds drive autonomy levels.

Teams commonly default to 0.5 or an intuitive cutoff. The optimal threshold is a calculation, and the inputs are the actual dollar cost of being wrong in each direction.

Allocators call this the cost of being wrong. Scientists call it the Bayes risk matrix. Builders call it "how much does each type of mistake cost us?"
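The threshold formula in code, with illustrative costs:

```python
def optimal_threshold(cost_fp: float, cost_fn: float) -> float:
    """theta* = C_FP / (C_FP + C_FN): act when P(positive) exceeds theta*."""
    return cost_fp / (cost_fp + cost_fn)

def dollarized_cost(fp: int, fn: int, cost_fp: float, cost_fn: float) -> float:
    """Replace error counts with dollars; the correct cells cost nothing."""
    return fp * cost_fp + fn * cost_fn

# Illustrative costs: a false alarm burns $50 of review time, a missed
# case costs $450 downstream. theta* = 50 / 500 = 0.10 - act on weak
# evidence, because misses are nine times as expensive as false alarms.
```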

Quadrant Shifting

Capital investments that move a task to a better position on the Verification Quadrant. Five moves: build a verifier, decompose the task, enrich inputs, constrain outputs, build a rubric.

The quadrant is not destiny. If you can build a verifier that makes verification cheap, a task that was "do not automate" becomes "automate now." That verifier is a capital investment with compounding returns.

Allocators call this CapEx for OpEx leverage. Builders call it building the test infrastructure. Same play: spend once, earn forever.

Operational Alpha

Excess return on enterprise value generated through systematic identification and capture of mispriced edges in business operations. The operational equivalent of alpha in financial markets.

The same discipline that finds mispriced securities can find mispriced business processes. The directed graph is the model. The tools are the execution engine. The alpha is the return.

Allocators already know alpha. This is the same thing applied to operations instead of securities. Builders call it the automation opportunity. Operators call it the edge.

Construction Spread

S = (annual_value x P(success)) / build_cost. The risk-adjusted return on the capital deployed to build a knowledge asset. Rank opportunities by spread descending. Deploy capital top-down until budget is exhausted.

PE funds rank deals by risk-adjusted return on deployed capital. Knowledge assets are the same asset class with a different depreciation curve. If you cannot calculate the spread, you are guessing where to invest.

Allocators know this as yield on cost - the standard real estate development metric. Same formula, different asset class. Builders call it "is it worth building?"
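The ranking discipline in code. The opportunities and budget below are made up for illustration:

```python
def spread(annual_value: float, p_success: float, build_cost: float) -> float:
    """S = (annual_value * P(success)) / build_cost."""
    return (annual_value * p_success) / build_cost

def allocate(opportunities: list[dict], budget: float) -> list[str]:
    """Rank by spread descending; deploy capital top-down until it runs out."""
    ranked = sorted(opportunities,
                    key=lambda o: spread(o["value"], o["p"], o["cost"]),
                    reverse=True)
    funded = []
    for o in ranked:
        if o["cost"] <= budget:
            funded.append(o["name"])
            budget -= o["cost"]
    return funded
```

Note that the highest-value opportunity is not necessarily the highest-spread one; the formula exists precisely to surface that difference.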

Dual Curve

The simultaneous depreciation of AI models (distribution shift, competitive erosion) and appreciation of knowledge assets (data moats, verifier learning, institutional rubrics). The net rate determines whether an automation is a wasting asset or a compounder.

Physical capital allocation assumes depreciation. Knowledge capital allocation has both curves. Invest in the appreciating side (verifiers, data) not the depreciating side (models).

Allocators already think in depreciation schedules. This adds an appreciation schedule running simultaneously. Builders call it model decay vs data moat.
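Projecting both curves side by side makes the net rate visible. A sketch with illustrative rates:

```python
def dual_curve(model_value: float, asset_value: float,
               decay: float, growth: float, years: int) -> list[float]:
    """Total value per year: a depreciating model plus an appreciating asset."""
    return [model_value * (1 - decay) ** t + asset_value * (1 + growth) ** t
            for t in range(years + 1)]

# A model worth 100 decaying 30%/yr plus a data moat worth 100 growing
# 20%/yr: the total dips early, then the appreciating side takes over.
```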

Quality Ratchet

A CI-enforced floor that only moves up. Once a quality metric hits a threshold, the system blocks any change that drops below it. Each improvement becomes the new minimum. Formally: the sequence of baselines is monotonically non-decreasing - a monotonic ratchet.

Ship a 15% improvement on Tuesday; without a ratchet, it regresses to baseline by Friday. The ratchet makes that regression structurally impossible - the floor is enforced by CI, not by good intentions.

Allocators call this a covenant - a contractual floor that prevents backsliding. Builders call it a CI floor. Same mechanism: enforce the minimum structurally, not with good intentions.
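The mechanism is small enough to read in one glance. A sketch - in a real system the floor lives in CI config, not in process memory:

```python
class QualityRatchet:
    """A floor that only moves up: every improvement becomes the new minimum."""

    def __init__(self, floor: float):
        self.floor = floor

    def gate(self, metric: float) -> bool:
        """Return True and raise the floor if the metric clears it;
        return False (CI blocks the change) if it regresses."""
        if metric < self.floor:
            return False
        self.floor = metric  # monotonically non-decreasing baseline
        return True
```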

Oracle Gradient

The computed vector from your current performance toward the unobservable optimum in a domain's performance space. You never reach the oracle. You approach it. Each measurement sharpens your estimate of where it is.

In every domain, there is an ideal performance point you cannot directly observe. The oracle gradient gives you a direction to move, even when you cannot see the destination. It is the operational derivative of the Performance Frontier.

Structured Elicitation

A controlled experiment designed to learn the operator's preferences. Pairwise comparisons, best-worst scaling, or adaptive conjoint analysis. Highest information per query of the three Deity Problem channels, but requires operator attention.

Passive observation is cheap but noisy. Sometimes you need to design an experiment that asks exactly the right question - targeting the specific preference dimension where uncertainty is highest.
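The simplest structured channel is pairwise comparison. A sketch: binary-insertion ordering asks roughly log2(n) questions per item, which is why this channel carries the most information per query. `ask(a, b)` is a stand-in for putting two options in front of the operator:

```python
def elicit_order(items: list, ask) -> list:
    """Recover a full preference order from pairwise questions.
    ask(a, b) returns True if the operator prefers a over b."""
    order: list = []
    for item in items:
        lo, hi = 0, len(order)
        while lo < hi:  # binary search: each question halves the candidates
            mid = (lo + hi) // 2
            if ask(item, order[mid]):
                hi = mid
            else:
                lo = mid + 1
        order.insert(lo, item)
    return order
```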

Revealed Preference

Inferring the operator's preferences by watching what they actually do - not what they say they want. Based on revealed preference theory (Afriat's theorem, GARP). Cheapest evidence channel because the operator is doing what they would do anyway.

People are unreliable narrators of their own preferences. What they choose when real stakes are on the line reveals what they actually value. Behavioral observation captures this without interrupting the operator.

Direct Query

A question posed to the operator, used only when the expected value of the answer exceeds the cost of the operator's attention. An agent that asks too many questions is not diligent - it is poorly calibrated.

Some ambiguities cannot be resolved through observation alone. Direct queries are the escape hatch - but they have a cost (operator attention), and that cost must be weighed against the expected improvement in decision quality.
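The calibration is a value-of-information gate. A sketch with made-up numbers:

```python
def should_ask(value_if_resolved: float, p_resolves: float,
               attention_cost: float) -> bool:
    """Ask only when the expected value of the answer beats its cost."""
    return value_if_resolved * p_resolves > attention_cost

# A $100 decision the answer resolves 80% of the time, against $50 of
# operator attention: ask. The same question at 30%: stay quiet.
```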

Drift Detector

A posterior predictive check that detects when the operator's preferences have drifted from the learned model. Computed as the fraction of recent decisions the model predicted incorrectly. When the drift score exceeds a threshold, the agent triggers re-elicitation.

Preferences are not static. People change their minds, priorities shift, new constraints emerge. The drift detector is how the agent notices that the operator changed their mind - and responds proportionally.
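A rolling miss rate is enough for a first version. A sketch - the window and threshold are illustrative:

```python
from collections import deque

class DriftDetector:
    """Posterior predictive check: fraction of recent decisions mispredicted."""

    def __init__(self, window: int = 50, threshold: float = 0.3):
        self.misses = deque(maxlen=window)  # rolling window of hit/miss flags
        self.threshold = threshold

    def record(self, predicted, actual) -> None:
        self.misses.append(predicted != actual)

    def score(self) -> float:
        return sum(self.misses) / len(self.misses) if self.misses else 0.0

    def should_reelicit(self) -> bool:
        return self.score() > self.threshold
```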

Demand Gravity

The inescapable pull of real demand on product trajectories. Demand is always there, always pulling, and you cannot negotiate with it. You can map it or you can crash into it.

Ask "what can we build with this?" and you get a solution looking for a problem. Demand gravity is what kills it - the invisible force pulling users toward what they actually need, whether you measured it or not.

Allocators call this market pull. Builders call it product-market fit. Scientists call it an attractor basin. Same force: real demand bends every trajectory toward it eventually.

The Designer's Seat

The position of designing the game rather than playing it. Every multi-agent system is a game. You can optimize your moves within existing rules (playing), or you can choose the rules so that the equilibrium of self-interested behavior is your desired outcome (designing). The CTO's job is the second one.

Two jobs look identical from outside: optimizing your moves within existing rules, and choosing the rules so that the equilibrium IS your desired outcome. The second one is the job.

Scientists call this mechanism design (Hurwicz, Maskin, Myerson - 2007 Nobel). Allocators call it the principal position. Builders call it platform vs application. Same insight: design the rules so self-interested behavior produces your desired outcome.