Since the statistics class I teach is supposed to be integrative (that is, to show connections between various disciplines and other aspects of life), I’m always on the lookout for ways to borrow an understanding from one domain to make sense of another. I think I just found a neat example.
But first, look at these two different stories of the Obama record on jobs:
To the average viewer these may seem like incompatible stories. In the top graph, Obama begins to pull us out of the recession on day one of his presidency, slowing job losses and eventually moving us to job gains — digging us out of the hole that Bush 43 got us into.
In the bottom graph, Obama takes office, unemployment skyrockets to a historic level, and even now Obama has not returned us to the point we were at on inauguration day. He still hasn’t cleaned up his mess.
Which brings me to the medical stats terms incidence and prevalence.
From http://tirgan.com/incidence.htm :
Incidence refers to the frequency of development of a new illness in a population in a certain period of time, normally one year. When we say that the incidence of this cancer has increased in past years, we mean that more people have developed this condition year after year, e.g., the incidence of thyroid cancer has been rising, with 13,000 new cases diagnosed this year.
Prevalence refers to the current number of people suffering from an illness in a given year. This number includes all those who may have been diagnosed in prior years, as well as in the current year. An incidence of 20,000 a year with a prevalence of 80,000 means that there are 20,000 new cases diagnosed every year and 80,000 people living in the United States with this illness, 60,000 of whom were diagnosed in the past decade and are still living with the disease.
I think you see where I’m going with this. If you apply the terminology of epidemiology, the unemployment rate is a prevalence measure. It’s influenced heavily by how long a person who gets a condition (in this case, the state of being unemployed) stays in that condition. Prevalence is a helpful measure of the social and economic impact of a disease.
The job-creation numbers, on the other hand, are a measure of incidence, in this case measured month by month, year by year.
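To see how both stories can be true at once, here is a minimal sketch in Python. The monthly figures are made up for illustration (they are not the data behind either chart), and the running jobs total is only a rough stand-in for the unemployment picture. It tracks the same series two ways: as a flow (the change each month) and as a stock (the running total since the starting month).

```python
from itertools import accumulate

# Made-up monthly net job changes, in thousands (NOT real data):
# heavy losses that shrink steadily and eventually turn into gains.
monthly_change = [-700, -600, -450, -300, -150, -50, 50, 100, 150, 200]

# The flow story: is each month better than the one before it?
improving = all(b > a for a, b in zip(monthly_change, monthly_change[1:]))

# The stock story: running total of jobs relative to the starting month.
jobs_vs_start = list(accumulate(monthly_change))

print("flow improves every month:", improving)
print("jobs relative to start:", jobs_vs_start)
print("back to starting level:", jobs_vs_start[-1] >= 0)
```

With these invented numbers the flow improves every single month, yet the running total is still well below where it started. Same data, two honest-looking and opposite headlines.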
Which measure you use depends on what you are trying to figure out, but in general, when attacking diseases at least, it is the incidence rate that is watched most closely: if you can make progress on incidence, the prevalence problem will eventually take care of itself. Meanwhile, prevalence can mislead. A deadly disease has lower prevalence because it is killing people faster (and taking them off the books), just as the unemployment rate does not include people who have stopped looking. On the other side of the equation, prevalence often under-represents positive change, which can be dwarfed by a large backlog of existing cases. If we start to make progress on diabetes, for example, it won’t be easily seen in a prevalence chart until many, many years later.
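The backlog effect is easy to sketch. The toy model below is my own simplification, not a real epidemiological model: each year a fixed fraction of existing cases leaves the pool (through recovery or death) and that year's new cases are added. The 20,000 and 80,000 figures come from the quoted definition above; the 25% yearly exit rate and the function name simulate_prevalence are assumptions chosen so that those two numbers balance.

```python
def simulate_prevalence(new_cases_per_year, exit_rate, start):
    """Track the number of people living with a condition, year by year.

    Each year a fixed fraction (exit_rate) of current cases leaves the
    pool, and that year's new cases are added.
    """
    prevalence = [start]
    for new_cases in new_cases_per_year:
        remaining = prevalence[-1] * (1 - exit_rate)
        prevalence.append(remaining + new_cases)
    return prevalence


# Illustrative numbers only. 20,000 new cases a year with a 25% yearly
# exit rate (an assumption) holds the pool steady at 80,000.
# Here incidence drops by half overnight and stays there for ten years.
incidence = [10_000] * 10
path = simulate_prevalence(incidence, exit_rate=0.25, start=80_000)

for year, cases in enumerate(path):
    print(year, round(cases))
```

Under these assumptions incidence is cut in half immediately, but prevalence falls only from 80,000 to 70,000 in the first year and is still above its new resting level of 40,000 a decade later.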
There are other problems with the charts, absolutely. I’m not calling the game for Obama here; the job-creation chart tells lies in some other ways. It does not control for population growth: a couple hundred thousand jobs are needed just to account for new people coming into the economy. A lot of the positive-looking growth is just treading water.
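A rough way to picture the adjustment, again with invented numbers: subtract a break-even figure from each period's job gains. The break-even value of 200 (in thousands) simply stands in for the "couple hundred thousand" above; treating it as a per-month threshold, and the function name jobs_net_of_growth, are my assumptions.

```python
def jobs_net_of_growth(jobs_added, break_even):
    """Subtract the jobs needed just to absorb new entrants each period."""
    return [added - break_even for added in jobs_added]


# Hypothetical figures in thousands, NOT real data.
jobs_added = [150, 180, 220, 250]
adjusted = jobs_net_of_growth(jobs_added, break_even=200)

print(adjusted)
```

In this made-up series, four straight months of "job growth" become two months of falling behind and two months of modest progress.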
And I may have just massacred econometrics; I don’t know. I am sure economists have their own terms for these things. But I think these sorts of approaches should be at the heart of an integrative statistics course, encouraging people to try insights from one domain and see whether they have explanatory power in another.