
The Dollarized Confusion Matrix

Replace accuracy with dollars. The optimal threshold for any AI classifier is not a feeling - it is a calculation, and the inputs are the actual cost of being wrong in each direction.

The Matrix

A standard confusion matrix counts errors. A dollarized confusion matrix costs them. The two wrong cells drive every decision that follows.

             Predicted Neg            Predicted Pos
Actual Neg   True Negative   ($0)     False Positive  ($3)
Actual Pos   False Negative  ($3k)    True Positive   ($0)

The insight: C_FP and C_FN are almost never equal.

Threshold Calculator

Cost of a false positive (said yes when should have said no): $3
Cost of a false negative (said no when should have said yes): $3,000

Optimal Threshold: θ* = 0.0010
Asymmetry: 1000:1
Threshold range: 0 (flag everything) · 0.5 (default) · 1 (flag nothing)

Missing a positive is 1000x more expensive than a false alarm. Set the threshold low - flag aggressively. At θ* = 0.0010, you flag anything above 0.1% confidence. The asymmetry is extreme - even a small probability of a positive should trigger a flag.

The Formula

# Optimal threshold from cost asymmetry
θ* = C_FP / (C_FP + C_FN)
# When C_FP = C_FN → θ* = 0.50 (symmetric, use default)
# When C_FN >> C_FP → θ* → 0 (flag aggressively)
# When C_FP >> C_FN → θ* → 1 (flag conservatively)

For an item with calibrated probability p of being positive, flagging it risks a false positive with expected cost (1 − p) × C_FP, while passing it risks a false negative with expected cost p × C_FN. Flagging is cheaper exactly when p > C_FP / (C_FP + C_FN), which gives the optimal threshold θ* = C_FP / (C_FP + C_FN).
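
The per-item tradeoff can be sketched in a few lines, using the $3 / $3,000 costs from the matrix above:

```python
# Per-item expected-cost comparison around the optimal threshold.
C_FP = 3.0     # cost of a false alarm
C_FN = 3000.0  # cost of a miss

theta_star = C_FP / (C_FP + C_FN)  # ≈ 0.0010

def expected_cost(p, flag):
    """Expected cost for one item with calibrated probability p of being positive."""
    # Flagging risks a false positive (item is actually negative);
    # passing risks a false negative (item is actually positive).
    return (1 - p) * C_FP if flag else p * C_FN

# Just below theta_star, passing is cheaper; just above, flagging is cheaper.
below, above = theta_star / 2, theta_star * 2
assert expected_cost(below, flag=False) < expected_cost(below, flag=True)
assert expected_cost(above, flag=True) < expected_cost(above, flag=False)
```

The crossover point of the two expected costs is exactly θ*, which is the whole derivation in one comparison.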

Your threshold should be the proportion of total possible error cost attributable to false positives. If false positives are cheap relative to false negatives, the threshold is low - you accept more false alarms to avoid the expensive miss.

Worked Examples
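
A minimal sketch of a few scenarios run through the formula; the $3 / $3,000 case is from the matrix above, the other two are hypothetical costs chosen to illustrate each regime:

```python
# Worked examples for θ* = C_FP / (C_FP + C_FN).
def optimal_threshold(c_fp, c_fn):
    return c_fp / (c_fp + c_fn)

cases = [
    ("fraud flag:    $3 FP, $3,000 FN", 3, 3000),  # miss is 1000x worse: flag aggressively
    ("symmetric:     $10 FP, $10 FN",   10, 10),   # reduces to the 0.5 default
    ("cold outreach: $50 FP, $5 FN",    50, 5),    # false alarm is worse: flag conservatively
]
for label, c_fp, c_fn in cases:
    print(f"{label} -> θ* = {optimal_threshold(c_fp, c_fn):.4f}")
# fraud: 0.0010, symmetric: 0.5000, outreach: 0.9091
```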

From Threshold to Autonomy

The threshold tells you where to draw the line. The expected cost per item tells you how much autonomy the classifier earns.

  • HARMFUL (expected cost >$50/item): The classifier is worse than manual review. Disable it - you are paying for AI and getting negative ROI.
  • HITL (expected cost $5-50/item): The classifier assists but humans verify everything. AI does the first pass, humans make the call. Graduate to Autonomous when expected cost drops below $5.
  • AUTONOMOUS (expected cost <$5/item): The classifier runs independently. Spot-check only. Demote back to HITL if cost rises above $10 (hysteresis prevents oscillation).
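
The ladder can be sketched as a small state machine; the tier boundaries ($50, $5, and the $10 demotion line) are from the text, while the function name is a hypothetical helper:

```python
# Autonomy ladder with hysteresis. next_tier is an illustrative name.
def next_tier(current, expected_cost_per_item):
    if expected_cost_per_item > 50:
        return "HARMFUL"  # worse than manual: disable
    if current == "AUTONOMOUS":
        # Hysteresis: demote only once cost rises above $10, not $5,
        # so the classifier does not oscillate around the boundary.
        return "AUTONOMOUS" if expected_cost_per_item <= 10 else "HITL"
    return "AUTONOMOUS" if expected_cost_per_item < 5 else "HITL"

assert next_tier("AUTONOMOUS", 8) == "AUTONOMOUS"  # inside the hysteresis band
assert next_tier("HITL", 8) == "HITL"              # same cost, but no promotion
assert next_tier("HITL", 4) == "AUTONOMOUS"        # graduated below $5
assert next_tier("AUTONOMOUS", 60) == "HARMFUL"    # negative ROI: disable
```

The asymmetric promote/demote thresholds are the whole point: a classifier hovering near $5/item should not flip tiers every review cycle.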

Connection to the Templeton Ratio

The Verification Quadrant asks: how hard is it to check? The Dollarized Confusion Matrix asks: what happens when you check wrong?

Cost asymmetry modifies the effective Templeton Ratio. A task with T = 10 (fast verification) but 1,000:1 cost asymmetry needs verification at a much higher confidence level. The time to verify stays cheap, but the required thoroughness is driven by the stakes.

# Effective Templeton Ratio
T_effective = time_to_do / time_to_check_at_required_confidence
# Required confidence is set by cost asymmetry
required_confidence = f(C_FP, C_FN, θ*)
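
The text leaves f(C_FP, C_FN, θ*) unspecified, so the following is illustrative only: it assumes, purely for the sketch, that each order of magnitude of cost asymmetry doubles the time needed to check at the required confidence.

```python
import math

# Hedged sketch: an assumed mapping from cost asymmetry to check time.
# The doubling-per-order-of-magnitude rule is an assumption, not the source's.
def effective_templeton(time_to_do, base_time_to_check, c_fp, c_fn):
    asymmetry = max(c_fp, c_fn) / min(c_fp, c_fn)
    orders = math.log10(asymmetry)                  # 1000:1 -> 3 orders
    time_to_check = base_time_to_check * 2 ** orders
    return time_to_do / time_to_check

# T = 10 on raw timings, but 1000:1 asymmetry drops T_effective to 10 / 8 = 1.25
```

Under this assumption, the fast-to-verify task from the example (T = 10) loses most of its automation advantage once the stakes force more thorough checking.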

The Verification Quadrant tells you where to automate. This matrix tells you how carefully to calibrate what you automate.

When to Use This

Use when

  • Costs of FP and FN are measurable
  • Asymmetry exists (at least 2:1)
  • You control the decision threshold
  • Volume justifies the optimization

Skip when

  • Costs are symmetric (just use 0.5)
  • Scores are not calibrated probabilities
  • Regulatory bright-lines override cost math
  • Zero-tolerance domain (infinite FN cost)