Mean, Median, Mode

Probability & StatisticsDifficulty: █░░░░Depth: 0Unlocks: 0

Basic measures of central tendency for data sets.

Interactive Visualization

t=0s

Core Concepts

▸Arithmetic mean: the average found by combining all values into a single representative number
▸Median: the middle value that separates the higher half and lower half of an ordered data set
▸Mode: the value or values that occur most frequently in a data set

Key Symbols & Notation

x-bar (x-bar) - symbol for the sample meancapital Sigma (Sigma) - summation operator used to add all values

Essential Relationships

↔Each measure summarizes 'center' differently: the mean depends on the magnitudes of all values (sum/count), the median depends only on ordered position(s), and the mode depends only on value frequency

▶ Advanced Learning Details

Graph Position

Depth Cost

Fan-Out (ROI)

Bottleneck Score

Chain Length

Cognitive Load

Atomic Elements

Total Elements

Percentile Level

Atomic Level

All Concepts (13)

• Central tendency: a single-number summary representing the 'center' of a data set
• Arithmetic mean: the average obtained by summing all values and dividing by the count
• Sample mean (concept of computing the mean from a sample) versus population mean (concept of the mean for a whole population)
• Median: the middle value of a data set after ordering
• Median computation for odd number of observations (select the middle element)
• Median computation for even number of observations (take the average of the two middle elements)
• Mode: the value(s) that occur most frequently in the data set
• Possibility of multiple modes (bimodal, multimodal) or no mode at all
• Outlier: an observation that is unusually large or small relative to the rest of the data
• Frequency table / relative frequency: counting how often each value (or class) occurs
• Weighted mean: computing a mean using frequencies or explicit weights (value × weight summed, divided by total weight)
• Applicability of the three measures to different data types (e.g., mode for categorical data)
• Sorting/ordering the data as a required step for computing the median

Teaching Strategy

Deep-dive lesson - accessible entry point but dense material. Use worked examples and spaced repetition.

When you summarize a data set with “one number,” you’re really choosing a definition of “typical.” Mean, median, and mode are three different (and useful) ways to say what “typical” means—especially when your data are messy, skewed, or have outliers.

TL;DR:

Mean (x̄) averages all values; median is the middle value after sorting; mode is the most frequent value. Mean uses every number but is sensitive to outliers. Median is robust to outliers. Mode captures the most common category/value and can be more than one (or none).

What Is Mean, Median, Mode?

Why do we need “central tendency”?

Data sets can be long lists of numbers: test scores, delivery times, heights, or daily temperatures. Looking at a whole list makes it hard to compare groups or describe what’s normal.

A measure of central tendency compresses a data set into a single representative value. But “representative” can mean different things:

•Balance point of the data (mean)
•Middle of the data (median)
•Most common value (mode)

These are not interchangeable; they answer different questions.

The three definitions (with intuition)

Suppose we have a data set of n values:

x₁, x₂, …, xₙ

1) Mean (arithmetic mean)

•Idea: Share the total equally among all data points.
•Symbol: x̄ (“x-bar”) for the sample mean.
•Interpretation: If the data were weights on a number line, the mean is the balance point.

2) Median

•Idea: The “middle” observation after sorting.
•Interpretation: Half the data are ≤ median, and half are ≥ median.

3) Mode

•Idea: The most frequent value(s).
•Interpretation: What value shows up the most often?

When they agree—and when they don’t

If a distribution is perfectly symmetric and has no extreme outliers, mean ≈ median ≈ mode.

But with skew or outliers:

•The mean is pulled toward extreme values.
•The median stays near the center of the ordered list.
•The mode depends only on frequency.

Quick comparison table

Measure	What it uses	What it represents	Sensitive to outliers?	Can be multiple?
Mean (x̄)	All values	Balance point / average	Yes	No
Median	Order only	Middle value	No (very robust)	No
Mode	Counts/frequency	Most common value	Not in the same way	Yes (bimodal, multimodal)

Important note about data types

•Mean and median generally require numeric data.
•Mode works for numeric or categorical data (e.g., most common color).

Core Mechanic 1: The Mean (x̄) and Summation (∑)

Why the mean is useful

The mean answers: “If I had to replace the whole data set with a single number that preserves the total, what number would that be?”

If you sum all values and then distribute that sum equally across n items, each would get the mean.

Definition

For values x₁, x₂, …, xₙ, the sample mean is

x̄ = (1/n) ∑ᵢ₌₁ⁿ xᵢ

Where:

•n is the number of data points
•∑ (“Sigma”) means “add up a bunch of things”

What ∑ means (unpacking the notation)

The expression

∑ᵢ₌₁ⁿ xᵢ

means:

x₁ + x₂ + x₃ + … + xₙ

So the mean is:

x̄ = (x₁ + x₂ + … + xₙ) / n

A key property: the mean preserves the total

If x̄ is the mean, then

n · x̄ = ∑ᵢ₌₁ⁿ xᵢ

This is a simple but powerful fact: using the mean keeps the sum consistent.

Mean as a balance point (intuition)

Imagine each data value xᵢ as a weight placed on a number line. The mean is where the system balances.

Another way to say this uses deviations from the mean:

Let dᵢ = xᵢ − x̄.

Then the deviations sum to zero:

∑ᵢ₌₁ⁿ (xᵢ − x̄) = 0

Show the work:

∑ᵢ₌₁ⁿ (xᵢ − x̄)

= ∑ᵢ₌₁ⁿ xᵢ − ∑ᵢ₌₁ⁿ x̄

= ∑ᵢ₌₁ⁿ xᵢ − n·x̄

= ∑ᵢ₌₁ⁿ xᵢ − ∑ᵢ₌₁ⁿ xᵢ

= 0

This is one reason the mean appears everywhere in statistics and machine learning: it’s the “center” that makes positive and negative deviations cancel.

The mean and outliers

Because the mean uses every value, one extreme point can change it a lot.

Example intuition:

•Data: 10, 10, 10, 10
•Mean = 10
•Add an outlier 1000:
•New mean = (10+10+10+10+1000)/5 = 208

Even if 4 out of 5 values are still 10, the mean shifts dramatically.

When the mean is the right tool

Use the mean when:

•Outliers are rare or not meaningful
•You care about totals (e.g., average revenue)
•The distribution is roughly symmetric

Be cautious when:

•There are extreme values (income, house prices)
•The distribution is strongly skewed

Core Mechanic 2: The Median and Mode (Robustness and Frequency)

Why median and mode exist

Sometimes “typical” should not be affected by extremes.

•If one billionaire moves into a neighborhood, the mean income can jump a lot.
•But the median income usually barely changes.

Likewise, for categorical data (like shirt sizes), mean may not even make sense—but mode still does.

Median

Definition

1) Sort the data from smallest to largest.

2) If n is odd: the median is the single middle value.

3) If n is even: the median is the average of the two middle values.

Let the sorted values be:

x₍₁₎ ≤ x₍₂₎ ≤ … ≤ x₍ₙ₎

•If n is odd, median = x₍(n+1)/2₎
•If n is even, median = (x₍n/2₎ + x₍n/2+1₎)/2

Why sorting matters

Median depends on order, not the actual magnitudes beyond their relative position.

So if you change an extreme value (like the max) but it remains at the far end, the median often stays the same.

Robustness to outliers

Because median is determined by the middle position(s), outliers have limited influence.

Example idea:

•Data: 2, 3, 4, 5, 1000
•Median is 4 (the middle)
•Even though 1000 is huge, it does not pull the median upward.

When the median is the right tool

Use the median when:

•You want a “typical” value in skewed distributions (income, time-to-complete)
•Outliers are meaningful but shouldn’t dominate the summary

Mode

Definition

The mode is the value (or values) that occur most frequently.

•If one value occurs most frequently: unimodal
•If two values tie: bimodal
•If many: multimodal
•If all values occur equally often: sometimes said to have no mode

Why mode is useful

Mode answers: “What is the most common outcome?”

This is especially useful for:

•Categorical data (most common color, most common major)
•Discrete numeric data (most common number of siblings)

Mode can be “messy”

Mode can be unstable if:

•The data are continuous with many unique values (e.g., exact measured heights)
•Small sample sizes create accidental ties

In practice, for continuous data we often bin values (histograms) and talk about the most common range rather than the exact value.

Putting mean, median, mode together

A common pattern in right-skewed data (long tail to the right) is:

mode ≤ median ≤ mean

Because the mean is pulled toward the tail.

In left-skewed data:

mean ≤ median ≤ mode

These are not laws, but good intuition checks.

Application/Connection: Choosing the Right Measure in Real Problems

The real skill: choosing what “center” should mean

In practice, computing mean/median/mode is easy. The hard part is choosing the right one for the question and the data.

Common scenarios

1) Salaries / wealth (skew + outliers)

•A few extremely high values
•Mean can be misleading as a “typical” salary
•Median is often reported (median household income)

2) Delivery times / latency (skew + occasional spikes)

•Most deliveries are fast; some are delayed heavily
•Median gives a typical experience
•Mean captures the impact of delays (important for cost)

Often teams report both:

•Median latency (user experience)
•Mean latency (overall performance/cost)

3) Test scores in a class (often roughly symmetric)

•Mean works well if no extreme anomalies
•Median is a backup if one student has an unusual score due to absence/cheating/etc.

4) Product sizes / categories

For categorical data like “small/medium/large,” mean is not meaningful.

•Mode tells you the most common size ordered.

A decision table

If your data look like…	And you care about…	Prefer…
Symmetric, few outliers	Overall average level	Mean (x̄)
Skewed or has big outliers	Typical central experience	Median
Categories or repeated discrete values	Most common outcome	Mode
You want a full picture	Different notions of “typical”	Report mean + median (and sometimes mode)

Connection to later statistics

Mean, median, and mode are the first step toward:

•Variance/standard deviation (spread around the mean)
•Quantiles/percentiles (generalizing the median)
•Distributions (shape explains why mean and median differ)

If you remember one guiding idea: mean uses magnitudes, median uses order, mode uses counts.

Worked Examples (3)

Compute mean, median, and mode for a small data set

Data (minutes to finish a task): 5, 7, 7, 9, 12

Mean (x̄):
Compute the sum using ∑:
∑ᵢ₌₁⁵ xᵢ = 5 + 7 + 7 + 9 + 12 = 40
Then divide by n = 5:
x̄ = (1/5)·40 = 8
Median:
The data are already sorted: 5, 7, 7, 9, 12
n = 5 is odd, so the middle position is (n+1)/2 = 3
Median = the 3rd value = 7
Mode:
Count frequencies:
5 occurs 1 time
7 occurs 2 times
9 occurs 1 time
12 occurs 1 time
Mode = 7

Insight: Here the mean (8) is higher than the median (7) because the larger values (9, 12) pull the average upward a bit. The mode matches the median because 7 is both common and centrally located.

How an outlier changes mean vs median

Original data: 10, 11, 11, 12, 13

Add an outlier: 100

Original mean:
Sum = 10 + 11 + 11 + 12 + 13 = 57
n = 5
x̄ = 57/5 = 11.4
Original median:
Sorted data are the same.
Middle value (3rd) = 11
Median = 11
New data set with outlier: 10, 11, 11, 12, 13, 100
New mean:
New sum = 57 + 100 = 157
n = 6
x̄ = 157/6 ≈ 26.1667
New median:
With n = 6 (even), median is the average of the 3rd and 4th values.
3rd value = 11, 4th value = 12
Median = (11 + 12)/2 = 11.5

Insight: One outlier changed the mean from 11.4 to about 26.17 (a huge shift), but the median moved only from 11 to 11.5. This is why median is often preferred for skewed data.

Mode with ties (bimodal example)

Data (number of messages received per day for 8 days): 2, 2, 3, 3, 4, 5, 5, 7

Count frequencies:
2 occurs 2 times
3 occurs 2 times
4 occurs 1 time
5 occurs 2 times
7 occurs 1 time
Identify the maximum frequency:
The highest count is 2
List all values with that frequency:
Mode(s) = 2, 3, 5 (multimodal)

Insight: Mode isn’t always a single number. When multiple values tie for highest frequency, the data can be multimodal—useful for spotting multiple common behaviors.

Key Takeaways

✓
Mean (x̄) is the arithmetic average: x̄ = (1/n) ∑ᵢ₌₁ⁿ xᵢ.
✓
Median is the middle value after sorting (or the average of the two middle values when n is even).
✓
Mode is the most frequent value; it can be one value, multiple values, or (in some cases) none.
✓
Mean uses magnitudes and is sensitive to outliers; median uses order and is robust to outliers.
✓
Mode is especially useful for categorical data where mean/median may not be meaningful.
✓
For right-skewed data, a common pattern is mode ≤ median ≤ mean (mean pulled right).
✓
A good summary often reports more than one measure (e.g., mean and median together).

Common Mistakes

✗
Computing the median without sorting the data first.
✗
Forgetting that when n is even, the median is the average of the two middle values (not one of them).
✗
Assuming the mode must be unique; ties can create multiple modes.
✗
Using the mean as “typical” for highly skewed data and getting a misleading result.

Practice

easy

Compute mean, median, and mode for: 1, 2, 2, 2, 9

Hint: Mean uses the sum; median is the 3rd value after sorting (since n = 5); mode is the most frequent.

Show solution

Sorted data: 1, 2, 2, 2, 9

Mean: (1+2+2+2+9)/5 = 16/5 = 3.2

Median: 3rd value = 2

Mode: 2

easy

Find the median of: 4, 10, 6, 8

Hint: Sort, then average the two middle values because n is even.

Show solution

Sorted: 4, 6, 8, 10

Median = (6 + 8)/2 = 7

medium

A data set has mean x̄ = 12 for n = 5 values. The first four values are 10, 12, 12, 14. Find the fifth value.

Hint: Use n·x̄ = ∑ xᵢ to get the total sum, then subtract the known values.

Show solution

Total sum = n·x̄ = 5·12 = 60

Known sum = 10 + 12 + 12 + 14 = 48

Fifth value = 60 − 48 = 12

Connections

Next concepts you can unlock from here:

Quality: A (4.5/5)

← back to tree browse all →