I knew about the 1936 poll that changed everything, the one where the two million responses collated by the Literary Digest were dead wrong while the 50,000 responses scientifically selected by George Gallup were right. If you need a refresher, here is Wikipedia's summary:
In 1936, [Gallup’s] new organization achieved national recognition by correctly predicting, from the replies of only 5,000 [sic?] respondents, that Franklin Roosevelt would defeat Alf Landon in the U.S. Presidential election. This was in direct contradiction to the widely respected Literary Digest magazine whose poll based on over two million returned questionnaires predicted that Landon would be the winner. Not only did Gallup get the election right, he correctly predicted the results of the Literary Digest poll as well using a random sample smaller than theirs but chosen to match it.
What I didn’t know was that the data he had collected on the non-response bias of that poll was still available. The chart above might make a good addition to a class on non-response bias, as it shows how non-response tends to exaggerate extreme values: the people most likely to reply are presumably those with the strongest feelings, which in this case meant anti-incumbency feelings.
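For a class, a few lines of arithmetic can make the mechanism concrete. This is only a sketch with made-up numbers, not the 1936 data: the function name `observed_share`, the 60% true support, and the response rates are all assumptions chosen for illustration.

```python
def observed_share(true_share, rate_for, rate_against):
    """Share of *respondents* who favor the incumbent when supporters
    and opponents return their questionnaires at different rates."""
    replies_for = true_share * rate_for
    replies_against = (1 - true_share) * rate_against
    return replies_for / (replies_for + replies_against)

# Hypothetical inputs: 60% of the population favors the incumbent.
true_share = 0.60

# Same response rate in both groups: the poll recovers the true share.
unbiased = observed_share(true_share, rate_for=0.20, rate_against=0.20)

# Opponents reply twice as often as supporters: the poll understates support.
biased = observed_share(true_share, rate_for=0.15, rate_against=0.30)

print(f"equal response rates:   {unbiased:.1%}")
print(f"unequal response rates: {biased:.1%}")
```

Under these assumed rates, a 60% majority shows up as roughly 43% of the replies. Nothing in the calculation depends on how many questionnaires go out, which is why a mountain of returned ballots does not fix the problem.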
The chart is from this article, which is worth a read. It also provides a chart dealing with the sampling bias issue: