Comparison of the Day: BMI/Mortality J-Curve

My new favorite term from epidemiology: J-Curve.

There are a lot of things that increase your mortality in a more-or-less linear way. The more you smoke, the greater your all-cause mortality risk, for example. This isn’t to say you increase your chance of death by 100% moving from one pack a day to two. But on average, your mortality goes up for each additional cigarette you smoke a day. Ten cigarettes is not going to be better for you than five, ever.

Some things, though, don’t work like that. It’s harmful to be overweight, but it’s harmful to be underweight too. Some studies claim alcohol is like this: having no alcohol correlates with higher mortality than having a drink or two a day, but past that point mortality climbs again. The curve is shaped like a “J”, hence the name.
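
If it helps to picture the difference, here is a minimal sketch of the two shapes. The coefficients and the placement of the dip are invented purely for illustration; nothing here is fit to real mortality data:

```python
import numpy as np
import matplotlib.pyplot as plt

dose = np.linspace(0, 10, 200)  # e.g. drinks per day, arbitrary units

# Linear relationship: every additional unit adds the same amount of risk.
linear_risk = 1.0 + 0.15 * dose

# J-shaped relationship: risk is somewhat elevated at zero, lowest around a
# moderate dose, and climbs well past the zero-dose level at high doses.
j_risk = 1.0 + 0.08 * (dose - 2.0) ** 2

plt.plot(dose, linear_risk, label="linear")
plt.plot(dose, j_risk, label="J-curve")
plt.xlabel("dose (arbitrary units)")
plt.ylabel("relative mortality risk (illustrative)")
plt.legend()
plt.show()
```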

Understanding that things can work this way is important. Vitamin E deficiencies have been correlated with increased cancer mortality, so a lot of people take vitamin E supplements, assuming it’s a linear relationship. But vitamin E supplements have been correlated with increased cancer risk.

Likewise, a lot of health gurus today will point to the harmful effects of over-consumption of sugar, gluten, or dairy (or heck, even fat/oils), and act as though this proves that eliminating the thing will dramatically improve your health. It might, if it is a linear relationship. But if it’s a J-curve, you could end up doing as much harm as good.

Pro-Privacy Viruses

The Silicon Valley conception of privacy isn’t working for anyone except Silicon Valley. We know that. Charlie Stross, who is one smart dude, points out that if you follow the corporate-driven push to overshare to its logical conclusion, your phone becomes a handy-dandy genocide machine, or, in the near term, the perfect device for this year’s roofie-carrying stalker. Moreover, this is not some bizarre side effect of social software, but a flaw built into how the software thinks about you, the product it is serving up to others.

That seems shrill and alarmist, but lately I don’t think it is. There are a lot of benefits to sharing, but also a lot of drawbacks, as any college grad who has missed out on a job because of a red Solo cup picture can tell you. And because we get our media from the entities that came up with this system, we tend to see the benefits as systemic and the downsides as localized. But think about that for a minute or two and you realize it can’t possibly be right.

Anyway, I’ve been thinking how it all ends lately. I don’t think it ends with us all running our own open source servers, going off the corporate surveillance grid. I don’t think we’ll be switching to Diaspora. We’re locked into these services.

So what’s the next vector? I think what we’ll be seeing soon are pro-privacy viruses. Imagine a “benevolent virus” that, instead of keylogging your credit card number, resets all your Facebook settings to the most private options and sets your homepage to instructions for reopening permissions (if that’s what you want to do). Or a virus that sits resident in memory and corrupts cross-site tracking cookies in real time. Or one that shows you every bit of information that is retrievable about you on the internet and asks if you are good with that.

I don’t think these should be created; there would be a lot of unforeseen side effects. But I think they are coming, and I think they are likely to have a broader impact on privacy than scattered DIY projects.

In the end, I imagine they will fail — but it will be an interesting phase of this drama…

Do More Books Mean More Titles or More Editions? (A critique of that graph going around)

This has been one of the most interesting charts of the week, but it is also generating a lot of wrong pronouncements, I think:

The buzz around this is that it shows the influence of copyright, and it definitely does: far fewer of the 2,500 books sampled come from the period still under copyright. But the question is what sort of copyright effect it is demonstrating. Almost all the commentators I’ve seen suggest it indicates a massive gap in our copyright-era offerings; the claim is that copyright is making titles much less available.

But that’s not necessarily the case. It all comes down to what you mean by title.

Meaning this: when something is in copyright, it is usually published by one publisher, maybe a couple of publishers if there are overseas agreements. If it’s an absolute classic, there may be more, but not that many. There are three Kindle versions of Hemingway’s For Whom the Bell Tolls on the U.S. Amazon site, and one of those is in Bulgarian and another in Portuguese. There are five paperback versions listed as “new”, and only one of them actually appears to be in print currently.

On the other hand, there appear to be almost a hundred Kindle versions of Jane Eyre, each with its own ISBN. Go to paperback, and there are 400 versions of Jane Eyre. There are 298 hardcovers of it.

And it’s not just popular works — Eliot’s forgotten masterpiece Silas Marner has 301 versions, whereas Wolfe’s 1980s classic Bonfire of the Vanities has three.

Want to really freak out? There are almost 5,000 “new” editions of the works of Dickens available. (Again, these searches include some out-of-print works in mint condition, which I can’t seem to filter out, but the point holds.) You’d have to lump together a city’s worth of single-publisher authors for several years to get to a figure like that.

I can’t see any way you could conceivably control for this in a random sample, at least given how Amazon’s search is constructed, so I’m going to assume it wasn’t controlled for, in which case the graphic tells us nothing at this point. Copyright may also be reducing the availability of titles; it would make sense that it is, to some extent. But this graph doesn’t tell you anything about that.
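
To be clear about what “controlling for this” would even mean: you would have to collapse edition-level records down to distinct works before counting. Here is a rough sketch of that idea; the record fields, the sample data, and the normalization rule are my own assumptions for illustration, not anything Amazon’s search actually exposes:

```python
import re
from collections import defaultdict

# Hypothetical edition-level records, the kind a sample like this would pull in.
editions = [
    {"title": "Jane Eyre", "author": "Charlotte Bronte", "isbn": "978-0-00-000001-1"},
    {"title": "Jane Eyre (Penguin Classics)", "author": "Charlotte Bronte", "isbn": "978-0-00-000002-8"},
    {"title": "For Whom the Bell Tolls", "author": "Ernest Hemingway", "isbn": "978-0-00-000003-5"},
]

def work_key(record):
    """Collapse edition-level noise (series tags, case, punctuation) into a crude work-level key."""
    title = re.sub(r"\(.*?\)", "", record["title"])         # drop parenthetical series names
    title = re.sub(r"[^a-z ]", "", title.lower()).strip()   # lowercase, strip punctuation
    return (title, record["author"].lower().strip())

works = defaultdict(list)
for record in editions:
    works[work_key(record)].append(record["isbn"])

print(f"{len(editions)} editions, {len(works)} distinct works")
for key, isbns in works.items():
    print(key, "->", len(isbns), "edition(s)")
```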

Comparison of the Day: Barefoot Running

A decent point about comparison that’s often missed: comparing like-to-like means that interventions must be executed at the same level of proficiency as controls:

For the past few years, proponents of barefoot running have argued that modern athletic shoes compromise natural running form. But now a first-of-its-kind study suggests that, in the right circumstances, running shoes make running physiologically easier than going barefoot.

The study, conducted by researchers at the University of Colorado in Boulder, began by recruiting 12 well-trained male runners with extensive barefoot running experience. “It was important to find people who are used to running barefoot,” says Rodger Kram, a professor of integrative physiology, who oversaw the study, which was published online in the journal Medicine & Science in Sports & Exercise.

“A novice barefoot runner moves very differently than someone who’s used to running barefoot,” Dr. Kram says. “We wanted to look at runners who knew what they were doing, whether they were wearing shoes or not.”

Specifically, he and his colleagues hoped to determine whether wearing shoes was metabolically more costly than going unshod. In other words, does wearing shoes require more energy than going barefoot?

You see this a lot in educational research: the teachers involved are better trained in either the intervention or the control, which can foul the results quite a bit, even in a crossover design.

There’s actually a lot more great stuff in this article. What the researchers found was that the weight of the shoes was a confounding variable in judging the efficiency of other aspects of barefoot running. The like-to-like comparison they designed pitted ultralight running shoes against bare feet plus small weighted band-aids, and once the variable of shoe weight was controlled for in this way, the efficiency association was reversed. It’s another reminder that it’s usually more about the definitions than the stats.

I should add that this study probably addresses the concerns of only a small number of barefoot runners; not everybody cares about efficiency.

Blackboard, Moodle, and the Commodity LMS

I haven’t seen this graph referenced in the recent discussion around Blackboard’s latest purchase, which is strange, because it explains almost everything:

A while back, Blackboard decided that the saturation and commodification of the LMS market meant that the path to greater profitability was not more contracts, but a higher average contract price. Under such a model, Blackboard Basic was seen as cannibalizing potential sales of the enterprise product, and they began running a series of special offers to move people off of the basic product and into the enterprise one. And they were successful to a point. In the data we have, enterprise licenses increase, and total licenses fall slightly, indicating that at least initially the higher contracts may have offset customer loss (though even this is debatable given the deals they ran for upgrades, and the percentage of that bump that is really the Angel acquisition).

We can’t tell what happened after that, though, since Bb has not released licensing data. The best estimate I’ve seen indicates that Bb lost between 150 and 400 licenses a year from 2009 through 2011.

That’s a problem for Blackboard, and not just because of the loss of core LMS revenue. Why? Because Blackboard sees the future of contracts as selling add-on modules and other higher education services. Take a look at their front page:

The Learn product is placed first, but the message is clear — the transaction system, analytics system, and campus messaging system are products in their own right. And the future is selling these products — if you don’t believe me, just look at these figures Michael Feldstein put together a couple years back:

I’m sorry this is out of date: since Blackboard went private in 2011, there has not been much data released about such matters, so we have to rely on old snapshots. But the idea here is clear:

  • Use the LMS as a foot in the door, and make a profit off the enterprise version of it
  • Purchase other companies to get a foot in the door to sell the add-on services

The purchases of ANGEL and WebCT were not about winning the “LMS wars”. Blackboard sees the LMS at this point as a commodity product. The purchases were about getting a seat at the table to sell other, more profitable products: analytics, collaboration add-ons, early warning systems, financial transaction systems. Compared to the LMS, the ratio of price to support cost for an analytics, transaction, or emergency communication system is a dream. The LMS, on the other hand, is a headache: high support, low margin. But it’s a perfect foot in the door to make other sales.

There was only one problem with this — as Blackboard acquired more customers via purchases of WebCT and Angel, they realized there was a leak — Moodle. Customers that were gained through acquisition were moving to Moodle, and current customers pressured into the enterprise product were also bailing.

Buying Moodlerooms plugs that leak, and keeps Blackboard at the table selling the more profitable building access, commerce, human resources, and donor management systems that they make their money on.

What Moodle is to Blackboard is a way to keep a foot in the door of the more price-sensitive customers while not cannibalizing sales of Learn. Ultimately, this allows them to maximize profit on the LMS side while staunching the customer bleed they have been experiencing for the past three years. And this preserves what has been their planned path for a while: to move beyond what they see as a dead LMS market and into more profitable products in less saturated areas.

Blackboard bought Moodlerooms for the same reason it bought Angel and WebCT: it’s not an LMS company anymore. It’s really that simple.

Comparison of the Day: Gas Prices

When things have a seasonal cycle, it’s often difficult to make direct comparisons. Ideally you compare to this time last year, or to the ten-year average for this time of year, but what people really want is a sense of how high it will go. This article does a decent job with that: look, it’s already $3.90, and it’s going to keep rising until May or so; absent some large event, it’s hard to see how we won’t at least break last year’s peak of $3.98 and get close to the record of $4.11.

This is also an example of where controlling for inflation is probably superfluous. We’re comparing against last year’s price on the one hand and a date four years back on the other, through some pretty lean years in terms of inflation, so we can probably live without it in a presentation like this. (Although, interestingly, in inflation-adjusted dollars we were paying almost as much for gas in 1980 as we have the past couple of years; a recession will do that to prices.)
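
For the curious, the inflation adjustment itself is just a CPI ratio. A minimal sketch, using approximate, illustrative CPI and price figures rather than an official series (the function name is mine):

```python
def adjust_for_inflation(price, cpi_then, cpi_now):
    """Convert a historical price into today's dollars using a CPI ratio."""
    return price * (cpi_now / cpi_then)

# Approximate, illustrative figures only (not an official BLS series):
# 1980 average gas price ~ $1.19/gallon, CPI-U ~ 82.4; 2012 CPI-U ~ 229.6.
price_1980_in_2012_dollars = adjust_for_inflation(1.19, 82.4, 229.6)
print(f"${price_1980_in_2012_dollars:.2f}")  # roughly $3.32, in the ballpark of recent prices
```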

That Millennial Study and Baselines

By now you’ve seen or heard about the APA study on millennials and civic-mindedness. It turns out that millennials are not as civic-minded as Howe and others have claimed. Fair enough.

But another thing caught my eye: all the stories tended to compare the Millennial numbers to a baseline Boomer figure, leading everyone to blame the self-focus on coddling parents and Barney songsmithing.

But if you look at the figure above, you’ll see that the Millennial jump only continues the radical jump already made by the Gen Xers. And the Gen Xers were the latchkey generation.

So all these explanations that finger a cause located in a 1990s childhood? Think again. There’s definitely something interesting going on here, but it’s been going on a lot longer than most articles indicate…

The Golden Rule of Comparison and the ACA

The golden rule of comparison, we tell our students, is simple:

Compare like-to-like where this is possible; account for differences where it is not.

Honestly, if you just apply this one rule religiously to anything billed as a comparison, you’ll outperform most people in evaluating comparisons.

Case in point: the Congressional Budget Office just published an update of its analysis of the Affordable Care Act. In the document they state:

CBO and JCT now estimate that the insurance coverage provisions of the ACA will have a net cost of just under $1.1 trillion over the 2012–2021 period—about $50 billion less than the agencies’ March 2011 estimate for that 10-year period [Emphasis mine].

That’s a good comparison. They are comparing the 2012-2021 estimate they made previously to the new estimate for 2012-2021. It’s the same agency, and we assume it’s the same analytical framework, but with updated data. It’s about as like-to-like as you tend to get in life.

Yet this was the response from Tom Price, of the House Republican Policy Committee:

House Republican Policy Committee Chairman Tom Price, M.D. (R-GA) issued the following statement regarding the Congressional Budget Office’s (CBO) updated cost estimate of the president’s health care law. The new CBO projection estimates that the law will cost $1.76 trillion over 10 years – well above the $940 billion Democrats originally claimed.

Why this discrepancy? There are multiple reasons. But I find this one, which Ezra Klein points out, the most interesting:

One other thing that’s confused some people is that this estimate is looking at a different timeframe than the original estimates. The CBO’s first pass at the bill looked at 2010-2019. But years have passed, and so now they’re looking at 2012-2021. That means they have two fewer years of implementation, when the bill costs almost nothing, and two more years of operation, when it costs substantially more.

The idea is that since the ACA doesn’t *really* kick in until 2014, a 2010-2019 estimate is effectively a six-year cost, and a 2012-2021 estimate is an eight-year cost. There are other issues as well, perhaps even more important ones, but it occurs to me that this is a pretty common parlor trick people play with numbers.
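
To make the trick concrete, here is a toy calculation with made-up numbers (nothing like the actual CBO figures): a program that costs almost nothing before 2014 and a flat amount per year afterward looks far more expensive when the scoring window slides from 2010-2019 to 2012-2021, even though nothing about the program changed.

```python
# Toy annual cost schedule, in billions; purely illustrative, not CBO data.
annual_cost = {year: (5 if year < 2014 else 150) for year in range(2010, 2022)}

def ten_year_total(start_year):
    """Sum the cost over a ten-year scoring window starting at start_year."""
    return sum(annual_cost[y] for y in range(start_year, start_year + 10))

print(ten_year_total(2010))  # 2010-2019 window: 4*5 + 6*150 = 920
print(ten_year_total(2012))  # 2012-2021 window: 2*5 + 8*150 = 1210
```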

As for how to settle the question of whether the 2010 estimate was too high or too low, Ezra correctly suggests that the easiest way is to ignore the totals and look at the revisions. The net effect of the revisions identified by the CBO is negative: the bill costs less than initially thought.

Compare like-to-like where this is possible; account for differences where it is not.

Plenary Workshop at NELIG: What is Critical Thinking, and Why Is It So Hard to Teach?

I call this a plenary workshop, but as I learned after I agreed to do it, it was not only a plenary session, but it was the only session. Apparently NELIG, at least in its quarterly meetings, is structured as one giant workshop. No pressure there, then… 😉

In any case, I think it worked out (the abstract is here). This was a reformulation of some of the material covered in the Critical Skills Workshop over break, but redirected to issues of information literacy. If there’s one big idea in it, it’s that when we think critically, we rarely do the computation-intensive sort of processing we tend to conceptualize as critical thinking. The most important pieces of critical thinking (as practiced in daily life) happen before you start to “think”: they come from the conceptual frameworks that shape our intuitive responses. To address problems in critical thought, you have to understand the conceptual frameworks in use by students, work with the students to actively deconstruct them, and provide more useful frameworks to replace them.

If you can’t do that — if your idea is that the students will just learn to think harder — you’re lost.

The participants were great: actively engaged, great thinkers asking all the right questions. I want library faculty in all my presentations from now on; you really can’t do better. In the activity, they identified the differences between the conceptual frameworks librarians use to parse results lists and the frameworks used by students. Students use “familiarity” and “match” as their guideposts; to them, the act of choosing a resource is like choosing a puzzle piece. Librarians look at genre and bias: what sort of document is this (journal article, news story, conference proceeding, blog post), and what markers of bias can we spot (URL, language, title, etc.)? For librarians, this is an exercise in seeking out construction materials, not finding puzzle pieces.

We talked a little about how, to students, these processes may appear the same: librarians talk about bias, and students hear “use familiar sources”; librarians talk about genre, and students hear “fit” or “match” (“How many journal articles do I need to collect? How many news stories?”), which is really just another way of asking what shape the puzzle piece should be. Until you address the underlying conceptual misunderstanding directly through well-structured activities, students will keep plugging what you teach them into a conceptual framework that undermines the utility of the new knowledge.

Slides are here. There’s some good stuff in there, but much is incomprehensible without the activities and narration.

To all NELIG participants, thanks for a great Friday morning. It was a pleasure to talk with you all!