I want to do this in a class…

What a neat way of combining two textbooks to get a novel course design (which meshes with current theories of interleaving):

In an effort to maximize spacing and encoding variability, Robert Bjork once taught an honors introductory psychology course twice in one term. Up to the point of the midterm, the basic concepts of introductory psychology were covered using a textbook that adopted a history of psychology approach and emphasized the contributions of key individuals in the history of psychology, such as Pavlov, Freud, and Skinner. After the midterm exam, the basic concepts were covered again, this time using a textbook that adopted a brain mechanisms approach. The goal was to have key concepts come up in each half of the course (spacing) and from a different standpoint (variation).

From here.

Divided Attention During Lecture

I’ve been having some fun reading Bjork and his followers on elements of instruction. It’s good stuff! This comes from Successful Lecturing: Presenting Information in Ways That Engage Effective Processing by Patricia Ann de Winstanley & Robert A. Bjork:

In addition to its having a strong negative impact on encoding, divided attention has been shown to have much larger effects on direct, or explicit, tests of memory than on indirect, or implicit, tests of memory (MacDonald and MacLeod, 1998; Szymanski and MacLeod, 1996). The implication is that divided attention during a lecture may leave students with a subsequent sense of familiarity, or feeling of knowing, or perceptual facilitation for the presented material but without the concomitant ability to recall or recognize the material on a direct test of memory, such as an examination. As a consequence, students may misjudge the amount of time needed for further study.

Dividing students’ attention during a lecture therefore poses a double threat. First, information is learned less well when attention is divided. Second, one’s feeling of knowing or processing facility remains unaffected by divided attention, which may result in the assumption that information is learned well enough and no further study time is needed (see Bjork, 1999, and Jacoby, Bjork, and Kelley, 1994, for reviews of the literature on illusions of comprehension and remembering).

Concept Inventories and Dan Meyer’s Linear Modeling Exercise

I’ve talked a bit in the past about good concept inventory questions — questions that probe difficult concepts but have black-and-white answers and don’t require any special vocabulary to answer.

Dan Meyer’s Linear Modeling exercise [PDF] is a good example. The first question has a specific answer, and answering it requires the right set of intuitions about linear processes, but it doesn’t matter what terms you use, and the student doesn’t need to guess what you are trying to assess to get it right.

I’ll add that the exercise has one other mark of a great inventory question — apart from the title, it contains no hints that this is an application of linear modeling. This jibes with what we know from research on interleaving — that deciding which model to apply is as important as applying the model itself.

One final thing — I can’t help noticing that, like many ConcepTests and many questions on the FCI, it is a prediction question. There’s something very powerful about the way prediction focuses the mind. More on that later.

Comparing Electoral Behavior

From the Utne Reader, in an article showing that we “are segregating [our]selves politically and geographically” in the U.S.:

In 1992, 38 percent of Americans lived in counties decided by landslide elections; by 2004, that figure was 48 percent.

One thing that jumps out at me immediately is that elections are very hard to compare to one another. In this case, 1992 represented the unseating of an incumbent in a three-way race (remember Perot?), whereas 2004 was the two-way reaffirmation of an incumbent president with no real third-party presence.

How might this affect things? Well, let’s say we define a landslide as getting 60% or more of the vote. In a three-way race (like 1992), that would be difficult. Clinton won the election with 43% of the vote (to Bush’s 38 and Perot’s 19). Assuming some county-to-county deviation around that average, you are still unlikely to get a 60% landslide — even a county 50% more Democratic than the average county is only barely breaking the landslide barrier (0.43 * 1.5 = 0.645).

In a two-way race the dynamics are different. In the Kerry/Bush 50/50 split, a candidate in a county 50% more Democratic than average is going to win with 75% of the vote (0.50 * 1.5 = 0.75).
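
Here is that back-of-the-envelope arithmetic as a small sketch. The 60% landslide threshold and the 1.5x county tilt are the assumptions from above, not figures derived from actual returns:

```python
# Back-of-the-envelope landslide check. The 60% threshold and the
# 1.5x "more Democratic than average" tilt are the assumptions from
# the text above, not anything derived from actual election data.
LANDSLIDE = 0.60

def county_share(national_share, tilt=1.5):
    """Winner's vote share in a county that favors the winner
    `tilt` times more than the national average does."""
    return national_share * tilt

# Three-way race (1992): Clinton won nationally with 43%.
print(county_share(0.43))               # ~0.645 -- just over the line
print(county_share(0.43) >= LANDSLIDE)  # True, but barely

# Two-way race (2004): roughly a 50/50 Kerry/Bush split.
print(county_share(0.50))               # 0.75 -- well past the line
print(county_share(0.50) >= LANDSLIDE)  # True, comfortably
```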

The second problem (which we ignored in the above calculation) is that counties are not a good unit of measurement. The majority of counties are small, Republican entities, even though voters are roughly split nationwide (Democrats live in more populous counties). I imagine, too, that because the majority of counties are Republican, Republican wins will look polarizing (look at all those deep red counties!) whereas Democratic wins will look less polarizing (the counties that go dark blue will be fewer but more populous, while the Democratic votes eat into the red counties).
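
The unit-of-measurement problem is easy to make concrete with a toy simulation. The numbers below are entirely made up, but they show how one party can carry nearly every county while winning well under half of the vote:

```python
# Made-up counties: many small Republican ones and a few large
# Democratic ones. The county map looks deep red even though the
# electorate as a whole is roughly split.
small_gop = [(10_000, 0.65)] * 80    # (population, GOP vote share)
big_dem = [(350_000, 0.40)] * 5      # 60% Democratic: landslide counties

counties = small_gop + big_dem
gop_county_wins = sum(1 for _, share in counties if share > 0.5)
total_pop = sum(pop for pop, _ in counties)
gop_votes = sum(pop * share for pop, share in counties)

print(f"GOP carries {gop_county_wins} of {len(counties)} counties")  # 80 of 85
print(f"GOP share of all votes: {gop_votes / total_pop:.0%}")        # 48%
```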

In any case, why not compare something more comparable — like 1984 and 2004? That has problems too, but a whole lot fewer, I think. Or compare midterms, where national politics is less confounding.



Fireside Tutorials and Punk Economics

What do we call this genre of video — these informal explanations by Khan Academy, RSA:Animate, Common Craft, Vi Hart, and others that sit across the desk from you and talk things through? I have no idea. But I’m fascinated with the form, and with how rethinking video this way makes a lecture feel more like tutoring — even when (as in the case of RSA:Animate) the material is often an adapted lecture.

Anyway, this is my most recent find — Punk Economics, by David McWilliams:

If you have other great examples, shoot me an email or post in the comments.

Hill’s Nine Criteria for Causal Association

Sir Austin Bradford Hill’s classic article on the characteristics of a causal relationship is well worth reading, and it remains one of the most concise lists of what to look for in any research you encounter. Here’s a summary of what helps us make the leap from association to causation:

  1. Strength (the risk is large)
  2. Consistency (the results have been replicated, by different researchers in different situations)
  3. Specificity (the predictor is not related to a broad array of outcomes)
  4. Temporality (predictor always precedes outcome)
  5. Biological gradient (also known as a dose-response relationship: the more of the predictor, the more of the outcome)
  6. Plausibility (there is a plausible mechanism — we have a credible theory of how the causal relationship might work)
  7. Coherence (the association is consistent with the natural history and biology of the disease)
  8. Experimental evidence (experimental interventions show results consistent with the association)
  9. Analogy (similar causal relationships exist that we can draw a parallel to)

It’s worth noting that, as Fung points out in Numbers Rule Your World, there are an awful lot of situations where we don’t need causality. You can work with strong association in places where you only need to predict (insurance rates, at-risk determinations), and rely on causality only when you have to determine effective interventions.

The biggest problem I find with students and causality is not that they over-assign causality to situations, but that they see causality as a binary concept. In the minds of many students, there are two buckets — “caused” and “not caused”. The idea that one association can be more likely causal than another — that it is probably more likely that diets high in animal fat increase heart disease risk than that coffee cures Alzheimer’s, yet neither is proved beyond a doubt — mostly escapes them. Causality is seen as a finish line that is crossed, usually once and for all.

Problems of Definition: Elsevier’s Prices

The recent boycott of Elsevier provides us with a great quote for use in a statistical literacy class. People are boycotting for a number of reasons, particularly because of the high cost of the “bundles” Elsevier sells.

Claiming that their journals are some of the cheapest in the industry, an Elsevier rep states:

“Over the past 10 years, our prices have been in the lowest quartile in the publishing industry,” said Alicia Wise, Elsevier’s director of universal access. “Last year our prices were lower than our competitors’. I’m not sure why we are the focus of this boycott, but I’m very concerned about one dissatisfied scientist, and I’m concerned about 2,000.”

From the perspective of definition of terms, this may initially seem pretty straightforward, but it’s anything but. What does “our prices” mean?

  • Mean or median price computed across total offerings? In that case Elsevier could offer hundreds of free and worthless journals that no one uses or orders individually, which would pretty handily offset the higher-priced offerings (a toy calculation after this list shows how).
  • Mean or median price computed by individual sales? This would be a better measure, because it only counts the journals people use and skips the junk they carry. But it is impossible to compute the number this way, because of their practice of bundling.
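
To make the first worry concrete, here is a toy calculation (the prices are invented) contrasting a mean taken across all offerings with a mean over the journals people actually want — the number that bundling makes unobservable in practice:

```python
# Invented numbers: a handful of expensive journals that libraries
# actually order, padded out with hundreds of free filler titles.
wanted_prices = [3000, 2500, 2800]   # journals people actually use
junk_prices = [0] * 300              # free titles nobody orders

catalog = wanted_prices + junk_prices
mean_by_offerings = sum(catalog) / len(catalog)          # counts the junk
mean_by_sales = sum(wanted_prices) / len(wanted_prices)  # junk excluded

print(f"${mean_by_offerings:,.2f}")  # $27.39 -- "lowest quartile" material
print(f"${mean_by_sales:,.2f}")      # $2,766.67 -- what buyers actually face
```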

That second point, about bundling, is pretty important. Imagine you have two cable companies. One charges you for only the channels you want, à la carte. You get BBC America, SyFy, and PBS for $12.

The other cable company makes you buy a package to get these channels, and it cleverly arranges its packages so that no cheaper package includes all three. So you get your BBC America, SyFy, and PBS, but you have to buy the Super-Mega Package to get them. You therefore get 120 channels for $120.

Which cable company offers channels at the cheapest price? From your perspective, you are getting charged $4 a channel by Company A and $40 a channel by Company B.

But since that information (what you were actually trying to order) is recorded nowhere, any public number is more likely going to be the price you paid divided by the number of channels you got. By that measure, Company A is charging you $4 a channel, whereas Company B is charging you $1 a channel. Company B (the grifters) comes out cheapest.
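
Here is the same arithmetic as a small sketch, using the made-up prices and channel counts from the example:

```python
# Made-up prices and channel counts from the cable example above.
wanted = 3                       # channels you actually set out to buy

a_price, a_channels = 12, 3      # Company A: a la carte
b_price, b_channels = 120, 120   # Company B: Super-Mega Package only

# What the buyer experiences: price per channel actually wanted.
print(a_price / wanted)          # 4.0  -> $4 a channel
print(b_price / wanted)          # 40.0 -> $40 a channel

# What any public statistic can see: price per channel delivered.
print(a_price / a_channels)      # 4.0  -> $4 a channel
print(b_price / b_channels)      # 1.0  -> Company B looks "cheapest"
```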

What’s the point? Having the “lowest” prices in this case is a symptom of the bundling problem, not an excuse for it. The fact that Elsevier’s prices are in the lowest quartile is most likely a sign of excessive bundling, not of a functional market.

Possibly worth some class time on the cable TV example.