
I Have This Problem

Find the right framework, tool, or vocabulary term for your AI deployment challenge. Each answer links to the full concept with worked examples and the math behind it.

We keep shipping improvements and then regressing. How do I make sure quality only goes up?

Quality Ratchet (Lexicon)

A CI-enforced floor that only moves up. Each improvement becomes the new minimum.
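As a minimal sketch of the idea (the function name and scores are illustrative, not the site's implementation; in practice the floor would be persisted between CI runs):

```python
def ratchet_check(current_score: float, floor: float) -> float:
    """CI quality gate: fail the build if quality drops below the recorded floor.

    Returns the new floor, which only ever moves up: each improvement
    becomes the new minimum that every later run must clear.
    """
    if current_score < floor:
        raise AssertionError(
            f"quality {current_score:.3f} fell below floor {floor:.3f}"
        )
    return max(current_score, floor)
```

A run scoring 0.91 against a 0.88 floor passes and raises the floor to 0.91; the next run must clear 0.91.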

How do I make AI agents improve over time without writing an improvement plan?

Quality Hillclimb (Framework)

Ratcheted quality gates on stochastic output create emergent ascent. The agent does not need a plan.

We deployed AI but we are scared to remove the human. How do we safely give it more independence?

The Promotion Protocol (Framework)

A three-state progression: Disabled, HITL (human-in-the-loop), Autonomous. Promote on statistical evidence, roll back on drift.

Nobody agrees what "good" looks like for this task. How do I define quality?

The Performance Frontier (Framework)

Map the distribution of human performance, find the 99th percentile, compute the gradient toward it.

My AI keeps doing things I don't want. I can't write a complete spec of what I want.

The Deity Problem (Framework)

Four evidence channels: structured elicitation (conjoint), revealed preference (behavioral observation), direct query (ask when VOI exceeds attention cost), and drift detection (posterior predictive check).

Should I automate this specific task? How do I decide?

TaskVector (Tool)

Score across 9 dimensions. If any single dimension is a landmine (1), it is a hard no.

Is this task a good candidate for AI? What is the ROI?

Verification Quadrant + Templeton Ratio (Tool)

T = time_to_do / time_to_check. High T = AI creates leverage. Low T = you are doing the work twice.
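The ratio itself is one line; a hypothetical sketch with illustrative numbers:

```python
def templeton_ratio(time_to_do: float, time_to_check: float) -> float:
    """T = time_to_do / time_to_check.

    T >> 1: checking is cheap relative to doing, so AI generation creates
    leverage. T near 1: you verify as long as you would have worked, i.e.
    you are doing the work twice (the verification trap).
    """
    return time_to_do / time_to_check

# A report that takes 4 hours to write but 15 minutes to review:
templeton_ratio(240, 15)  # T = 16: strong AI candidate
```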

The AI output looks good but I don't know if it is correct. Checking takes as long as doing it.

Verification Trap (Lexicon)

Easy to generate, hard to verify. T approaches 1. You have added a step without saving time.

How do I brainstorm business ideas that actually match real demand?

The Demand Field (Framework)

Fix demand (immutable), vary means (mutable). Demand is a hidden force on your optimization gradient.

How do I guarantee this initiative succeeds instead of hoping?

Designed Convergence (Framework)

Design the game so rational agents converge to your outcome. Finite state + Bayesian search + ratchet = theorem.

What are the actual dollar costs of AI being wrong?

Dollarized Confusion Matrix (Tool)

Replace accuracy with dollars. Optimal threshold: theta* = C_FP / (C_FP + C_FN).
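A minimal sketch of the threshold formula, with illustrative costs:

```python
def optimal_threshold(cost_fp: float, cost_fn: float) -> float:
    """theta* = C_FP / (C_FP + C_FN).

    Act on a case when the model's probability of a positive exceeds
    theta*. Expensive false negatives push the threshold down (act on
    weaker evidence); expensive false positives push it up.
    """
    return cost_fp / (cost_fp + cost_fn)

# A $100 false positive against a $900 false negative:
optimal_threshold(100, 900)  # theta* = 0.1: flag even weak positives
```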

The task is in a bad quadrant. What capital investment moves it to a better one?

Quadrant Shifting (Tool)

Five moves: build a verifier, decompose, enrich inputs, constrain outputs, build a rubric.

Where is value leaking in my business that nobody has named?

Directed Graph + Soft Spots (Framework)

Your chart of accounts is a directed graph. Walk the edges. The soft spots are where value leaks.

I spend all my time in meetings and firefighting. How do I invest in systems?

Compile Time vs. Runtime (Lexicon)

Compile time: building systems with multiplicative ROI. Runtime: executing tasks with single-period returns.

Is this AI automation a wasting asset or a compounder?

Dual Curve + Knowledge Capital (Lexicon)

Models depreciate, data appreciates. The net rate determines the investment type. See also: knowledge-capital framework.

What is the NPV of automating this task? Should I build, buy, or hire?

Automation NPV (Tool)

Calculate NPV, IRR, and payback period. Compare to hiring, SaaS, or doing nothing. Same math, different asset class.

How should I think about AI output I cannot observe directly? How does the agent learn what I want?

Structured Elicitation, Revealed Preference, Direct Query, Drift Detector (Lexicon)

The four evidence channels from The Deity Problem: structured-elicitation (conjoint), revealed-preference (behavioral observation), direct-query (ask when worth it), and drift-detector (posterior predictive check). See also: oracle-gradient, designers-seat.

How do I infer what the operator actually wants without asking? How do I learn from behavior instead of instructions?

Revealed Preference (Lexicon)

Watch what the operator does, not what they say. Revealed preference theory (Afriat, GARP) applied to AI alignment. Cheapest evidence channel because the operator behaves naturally.

When should the AI ask the operator a question? How do I avoid over-asking or under-asking?

Direct Query (Lexicon)

Ask only when the expected value of the answer exceeds the cost of the operator's attention. An agent that asks too many questions is poorly calibrated, not diligent.

How do I detect when the operator's preferences have changed and the model is stale?

Drift Detector (Lexicon)

A posterior predictive check on recent decisions. When the fraction the model predicted incorrectly exceeds a threshold, trigger re-elicitation. Preferences drift; the system must detect it.
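One way the check could look, as an illustrative sketch (the 20% threshold is an assumption, not a prescribed value):

```python
def drift_detected(predicted: list, actual: list, threshold: float = 0.2) -> bool:
    """Posterior predictive check on recent operator decisions.

    `predicted` holds the preference model's predicted choices,
    `actual` the operator's observed choices. When the miss rate over
    the recent window exceeds `threshold`, the model is stale and
    re-elicitation should be triggered.
    """
    misses = sum(p != a for p, a in zip(predicted, actual))
    return misses / len(predicted) > threshold
```

A window where the model missed 2 of the last 5 decisions (40% miss rate) clears a 20% threshold and triggers re-elicitation.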

What does "operational alpha" mean? How is it different from just doing a good job?

Operational Alpha (Lexicon)

Excess return on enterprise value generated through systematic identification of mispriced edges. The directed graph finds them. The tools evaluate them.

What is the AI Sweet Spot? When does AI create the most value?

AI Sweet Spot + Templeton Ratio (Lexicon)

Hard to do, easy to check. T >> 1. The templeton-ratio measures the gap. High ratio = transformative ROI. Also: the proof-layer and gold-standard define what "correct" means.

What should I build first - the AI system or the verification instrument?

Proof Layer + Gold Standard (Lexicon)

Build the rubric first. The gold-standard IS the verification instrument. Without it, you are measuring with a broken ruler. See: soft-spot for finding where to look.

How do I manage the pull of real demand on my product trajectory?

Demand Gravity (Lexicon)

The inescapable pull of real demand on product trajectories. Map it or crash into it.

Should I be playing the game or designing it? What is the CTO's real job?

The Designer's Seat (Lexicon)

Design the game so self-interested agents produce the outcome you want. Most engineering is game-playing. Mechanism design is game-designing.

My AI deployment is in the HITL state of the autonomy state machine. When do I promote it to Autonomous?

Promotion Protocol + Autonomy State Machine (Lexicon)

Promote after N consecutive batches below the acceptance threshold, not after one good batch. The construction-spread is the gap between build cost and operational value.
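An illustrative sketch of the consecutive-batch rule (names and thresholds are assumptions):

```python
def eligible_for_promotion(batch_error_rates: list[float],
                           acceptance_threshold: float,
                           n_required: int) -> bool:
    """Promote HITL -> Autonomous only after N *consecutive* passing batches.

    A single bad batch resets the streak to zero: one good batch is luck;
    N consecutive good batches is statistical evidence.
    """
    streak = 0
    for err in batch_error_rates:
        streak = streak + 1 if err <= acceptance_threshold else 0
    return streak >= n_required

# One early miss (0.05) resets the streak, then three clean batches qualify:
eligible_for_promotion([0.05, 0.01, 0.02, 0.01], 0.03, n_required=3)  # True
```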

How do I measure what I should invest in next for my AI operations portfolio?

Knowledge Capital + Permutations (Framework)

Knowledge work either compounds or depreciates. Invest in the appreciating side: verifiers, data, rubrics. Not the depreciating side: models.

The Three Layers

Frameworks tell you where to look: strategic models for finding enterprise value.

Tools tell you how to evaluate: interactive calculators for specific decisions.

Lexicon gives you the language: vocabulary that travels in meetings you are not in.