Mean, Median, and Cutpoint Percentages?
Posted: August 8, 2012
My class is doing some projects on NH this fall, infographic things, like incidence of melanoma in NH. And one thing you have to do with such things of course is look at the state demographic profile — we’re #1 in melanoma in the country (per capita basis), but we’re also an elderly state in terms of demographics.
Or so I thought. It turns out that when we use the cutpoint of 65+ we’re actually the 37th most elderly state (2000 data):
So where’d I get the idea we were an elderly state? Because I keep hearing that in median age we’re in the top 10 “oldest” states in the country. And we are. We’re number 7:
So what’s going on here? It’s obvious once you think about it: the median age is far more affected by the fertility rate, which varies state to state, is highly impacted by culture, and skews the age distribution to the right. A lot of the time the median will line up with the cutpoint percentage of elderly. Utah is the youngest state on a median basis because they have big families in Utah, and they also have a very small percentage of people over 65.
But in New Hampshire we have the lowest fertility rate in the country and a migration inflow that consists more of mid-career professionals than young adults. So that tends to reduce the expected population skew.
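To see how the two measures can disagree, here is a minimal Python sketch. The two age lists are invented toy populations (not census data, and not meant to model any real state), and the helper name `share_at_or_above` is just something I made up for the illustration.

```python
from statistics import median

def share_at_or_above(ages, cutpoint):
    """Percent of people whose age is at or above the cutpoint."""
    return 100.0 * sum(a >= cutpoint for a in ages) / len(ages)

# Invented toy populations, ten people each. Assumption: purely illustrative.
few_children = [12, 28, 35, 41, 44, 47, 50, 53, 58, 70]   # low fertility, lots of mid-career adults
many_children = [3, 6, 9, 14, 22, 30, 45, 66, 72, 80]     # big families, but also more people 65+

for name, ages in [("few_children", few_children), ("many_children", many_children)]:
    print(name, "median age:", median(ages), "percent 65+:", share_at_or_above(ages, 65))
```

In this toy setup the low-fertility population has the higher median age but the smaller 65+ share, which is the same kind of split as ranking 7th on one measure and 37th on the other.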
Cutpoint percentages are a really simple analytical tool, and like mean, median, and mode they can be expressed as single-number summaries. For example, you can say things like 15% of the population is 65+, or that only 0.2% of undergraduates graduate with over $100,000 in debt. (By the way, you read that right: despite those student debt examples every newspaper article on the subject leads off with, the actual incidence of that sort of thing is about 2 in 1,000. That’s about as representative of the graduate population as a 6 foot 6 inch tall male would be of the male population. Perhaps the reporters should also interview Kobe Bryant to find out what it’s like to be average height?)
In short, cutpoint percentages are incredibly useful tools for quick and dirty analysis of a distribution, and they are used all the time in business and policy analysis. And given a cutpoint and a set of data, they aren’t that much harder to compute than the median. So why aren’t we placing them next to mean, median, and mode in our students’ toolboxes?
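For what it’s worth, here is roughly how little work it takes, sketched in Python next to the usual summaries. The function name `cutpoint_percentage` and the list of ages are my own made-up examples, not real data, and I’m assuming “65+” means at or above the cutpoint.

```python
from statistics import mean, median

def cutpoint_percentage(values, cutpoint):
    """Percent of values at or above the cutpoint."""
    if not values:
        raise ValueError("values must be non-empty")
    at_or_above = sum(1 for v in values if v >= cutpoint)
    return 100.0 * at_or_above / len(values)

# Made-up ages for illustration only.
ages = [4, 9, 17, 23, 31, 38, 44, 52, 67, 71]

print("mean:", mean(ages))                             # 35.6
print("median:", median(ages))                         # 34.5
print("percent 65+:", cutpoint_percentage(ages, 65))   # 20.0
```

Count the values past the cutpoint, divide by the total, and you’re done. If anything, that is less machinery than the median, which needs a sort.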