Shows SGD as repeated parameter updates on a 2D loss surface using a noisy mini-batch gradient estimate g^_t. The true (full-data) gradient direction is drawn alongside the stochastic estimate; the mini-batch size cycles to demonstrate variance reduction while the estimate remains unbiased: E[g^_t] = ∇f(θ_t).
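A minimal sketch of the update loop described above, assuming a 2D quadratic loss and a gradient estimate formed as true gradient plus zero-mean noise (the names theta, lr, and noise_scale are illustrative, not from the original):

```python
import random

random.seed(0)
a, b = 1.0, 3.0          # curvature of f(theta) = 0.5*(a*x^2 + b*y^2)
theta = [4.0, -2.0]      # initial parameters
lr = 0.1                 # learning rate
noise_scale = 0.5        # std of the per-step gradient noise

def true_grad(th):
    # exact (full-data) gradient of the quadratic: (a*x, b*y)
    return [a * th[0], b * th[1]]

def noisy_grad(th):
    # unbiased stochastic estimate: E[g^_t] = true gradient
    return [g + random.gauss(0.0, noise_scale) for g in true_grad(th)]

for t in range(200):
    g_hat = noisy_grad(theta)
    theta = [p - lr * g for p, g in zip(theta, g_hat)]

# theta ends up hovering near the minimum at (0, 0), jittered by the noise
print(theta)
```

The noise keeps θ from converging exactly; it settles into a small region around the minimum, which is what the animation renders as jitter.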
Implements a simple convex quadratic loss with an unbiased per-example gradient (true gradient plus zero-mean noise). g^_t is computed by averaging batchSize samples, so its variance shrinks as 1/batchSize (standard deviation as 1/sqrt(batchSize)); θ is then updated once per stepTime. Rendering snaps to a 4px grid for a retro blocky look; the animation interpolates θ between discrete updates using the provided ease(t) function.
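The batch-averaging step can be checked empirically. This sketch (batch_size here stands in for the text's batchSize; true_g and noise_std are illustrative) averages B noisy per-example gradients and measures the spread of the resulting estimate, which should shrink by about 1/sqrt(B):

```python
import random
import statistics

random.seed(0)
true_g = 2.0      # the true gradient component being estimated
noise_std = 1.0   # per-example noise standard deviation

def g_hat(batch_size):
    # mini-batch estimate: mean of batch_size noisy per-example gradients
    samples = [true_g + random.gauss(0.0, noise_std) for _ in range(batch_size)]
    return sum(samples) / batch_size

def empirical_std(batch_size, trials=2000):
    # spread of the mini-batch estimate over many repeated draws
    return statistics.pstdev(g_hat(batch_size) for _ in range(trials))

std_1 = empirical_std(1)
std_16 = empirical_std(16)
print(std_1, std_16)  # ratio is close to sqrt(16) = 4
```

This is exactly the effect the cycling batch size visualizes: a 16x larger batch gives roughly a 4x tighter gradient estimate, not a 16x tighter one.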