Shows two discrete distributions over the same bins: the true distribution P (bright) and a model Q (dim). Samples are drawn from P, and the visualization highlights the per-sample extra log-loss log2(P(X)/Q(X)) incurred by coding with Q instead of P. A running average and a decomposition into per-bin contributions illustrate that the KL divergence D_KL(P||Q) is an expectation under P and is always non-negative, reaching 0 only when P and Q match.
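The quantity being visualized can be sketched in a few lines: the exact KL divergence is the P-weighted sum of per-bin log-ratios, and the running average of per-sample log2(P(X)/Q(X)) over draws from P converges to it. The distributions and seed below are illustrative stand-ins, not the sketch's actual data.

```python
import math
import random

# Hypothetical example distributions over four bins (assumed, not the sketch's data).
P = [0.5, 0.25, 0.15, 0.10]
Q = [0.25, 0.25, 0.25, 0.25]

# Exact KL divergence in bits: an expectation under P of log2(P/Q).
kl_exact = sum(p * math.log2(p / q) for p, q in zip(P, Q))

# Monte Carlo estimate: average per-sample extra log-loss log2(P(X)/Q(X))
# over samples X drawn from P.
rng = random.Random(42)  # seeded for reproducibility
n = 200_000
total = 0.0
for _ in range(n):
    i = rng.choices(range(len(P)), weights=P)[0]
    total += math.log2(P[i] / Q[i])
kl_mc = total / n

print(round(kl_exact, 4))  # → 0.2573 (non-negative, and 0 only when P == Q)
```

The sample average fluctuates around the exact value, which is the behavior the running-average readout in the visualization demonstrates.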
Uses a discrete-bins approximation: D_KL(P||Q) ≈ Σ_i P(i) log2(P(i)/Q(i)), using base-2 logs to match the per-sample loss in bits. Q smoothly shifts and warps away from P over a 4.2 s loop. Samples are generated deterministically via a seeded PRNG and inverse-CDF sampling from P, and the sample cursor is smoothed with easing. All drawing is grid-snapped for a blocky green-on-black aesthetic.