Exercise 10: Median

Statisticians like to report the median of a set of data because the median is a robust estimator. That is, it condenses a bunch of data into a single typical value, and that value is not affected much by a few extreme values.

To calculate the median, you first arrange the data from least to greatest. Then, pick the middlemost value. If there is an odd number of values, such as a count of 3, 9, or 789, the single middle value is the median. If there is an even number of values, such as a count of 4, 12, or 2,000, then the median is the average of the middlemost pair of values.

The list method sort() is helpful when calculating the median.
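
The odd and even cases described above can be wrapped into a small function. What follows is only a minimal sketch, not part of the original exercise: the function name median and the sample lists are made up for illustration. It uses sorted(), which returns a new sorted list, instead of sort(), so the list passed in is left unchanged. It is written to run under both Python 2 and Python 3.

```python
def median(values):
    """Return the median of a non-empty list of numbers (illustrative helper)."""
    ordered = sorted(values)      # sorted copy; the original list is untouched
    n = len(ordered)
    middle = n // 2               # integer division in Python 2 and 3
    if n % 2 == 1:
        # Odd count: the single middle value is the median
        return ordered[middle]
    # Even count: average the middlemost pair
    return (ordered[middle - 1] + ordered[middle]) / 2.0

print(median([7, 1, 4]))        # odd count: middle value of 1, 4, 7
print(median([7, 1, 4, 10]))    # even count: average of 4 and 7
```

Dividing by 2.0 rather than 2 keeps the average from being rounded down when both middle values are whole numbers in Python 2.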

A Tale of Two Medians

Determine the median weight of Oxford's crew with the coxswain, then determine the median weight without the coxswain.

# Data for all 9 crew members (8 rowers plus the coxswain)
oxfordWeights = [186, 184.5, 204, 184.5, 195.5, 202.5, 174, 183, 109.5]

# Sort the data, lowest value to highest value
oxfordWeights.sort()

# Display the sorted data
print oxfordWeights

# How to find the median:
print "Finding Median #1:"
print "The value in position", (len(oxfordWeights) + 1)/2.0, "of this list is the median."
print ""

# Data for just the 8 rowers, without the coxswain
oxfordWeightsNoCox = [186, 184.5, 204, 184.5, 195.5, 202.5, 174, 183]

# Sort the data
oxfordWeightsNoCox.sort()

# Display the sorted data
print oxfordWeightsNoCox

# How to find the median:
print "Finding Median #2:"
print "The value in position", (len(oxfordWeightsNoCox) + 1)/2.0, "of this list is the median."

Save the program as oxford-median.py.

What you should get

After clicking Run, you should get this:

[109.5, 174, 183, 184.5, 184.5, 186, 195.5, 202.5, 204]
Finding Median #1:
The value in position 5.0 of this list is the median.

[174, 183, 184.5, 184.5, 186, 195.5, 202.5, 204]
Finding Median #2:
The value in position 4.5 of this list is the median.

To find median #1, we have to count five items in, from the left or the right. What is the median? It is 184.5 pounds.

To find median #2, we have to count four and a half items in. Since there can't be a fourth-and-a-half item, just calculate the average of the item that is fourth from the left, 184.5, and the item that is fourth from the right, 186. So, what's the median? It is (184.5 + 186)/2 = 185.25.
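
The counting steps for both medians can also be checked in code. This is a rough sketch under a few assumptions: the helper name value_at_position and the variable names are invented for illustration, and the helper expects a list that is already sorted. It rounds the position down and up to find the neighboring items, then averages them. For a whole-number position such as 5.0, both neighbors are the same item, so the average is just that item.

```python
import math

def value_at_position(sorted_values, position):
    """Look up a 1-based, possibly fractional, position (hypothetical helper)."""
    lower = int(math.floor(position))   # item at or just below the position
    upper = int(math.ceil(position))    # item at or just above the position
    # Lists count from 0, so position 1 is index 0
    return (sorted_values[lower - 1] + sorted_values[upper - 1]) / 2.0

with_cox = sorted([186, 184.5, 204, 184.5, 195.5, 202.5, 174, 183, 109.5])
without_cox = sorted([186, 184.5, 204, 184.5, 195.5, 202.5, 174, 183])

median_1 = value_at_position(with_cox, (len(with_cox) + 1) / 2.0)
median_2 = value_at_position(without_cox, (len(without_cox) + 1) / 2.0)
print(median_1)   # 184.5
print(median_2)   # 185.25
```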

Notice how close in value median #1 (184.5) and #2 (185.25) are. The median tool is not affected much by outliers like a coxswain, so it is a robust estimator of data.

Study Drill

  • Calculate the means for Oxford's crew with and without the coxswain. How different are the two means? Reflect on why the mean tool is not a robust estimator like median.

Creative Commons License
Learn Stats in 10,000 Hours by Jonathan B. Miller is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.