Does More Books Mean More Titles or More Editions? (A critique of that graph going around)

This has been one of the most interesting charts of the week, but it is also one generating a lot of wrong pronouncements I think:

The buzz around this is that it shows the influence of copyright, and it definitely does: far fewer of the 2500 books sampled come from the period of copyright. But the question is what sort of effect of copyright it is demonstrating. For instance, I’ve seen almost all the commentators suggest this indicates that there is a massive gap in our copyright-era offerings; the claim is that copyright is making titles much less available.

But that’s not necessarily the case. It all comes down to what you mean by title.

Meaning this: when something is in copyright it is usually published by one publisher, or maybe a couple of publishers if there are overseas agreements. If it’s an absolute classic, there may be more, but not that many. There are three Kindle versions of Hemingway’s For Whom the Bell Tolls on the U.S. Amazon site, and one is in Bulgarian, and another in Portuguese. There are five paperback versions listed as “new”, and only one of them actually appears to be in print currently.

On the other hand, there appear to be almost a hundred Kindle versions of Jane Eyre, each with its own ISBN. Go to paperback, and there are 400 versions of Jane Eyre. There are 298 hardcovers of it.

And it’s not just popular works — Eliot’s forgotten masterpiece Silas Marner has 301 versions, whereas Wolfe’s 1980s classic Bonfire of the Vanities has three.

Want to really freak out? There are almost 5,000 “new” editions of the work of Dickens available. (Again, these searches include some out-of-print works in mint condition, which I can’t seem to filter out, but the point holds.) You’d have to lump-sum a city’s worth of single-publisher authors for several years to get to a figure like that.

I can’t see any way that you could conceivably control for this in a random sample, at least given how Amazon’s search is constructed, so I’m going to assume it wasn’t controlled for. In that case the graphic tells us nothing at this point about how many distinct titles are available. Copyright may also be reducing availability of titles; it would make sense that it was, to some extent. But this graph doesn’t tell you anything about that.
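To make the edition-versus-title problem concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the catalog, the era labels, and the edition counts (300 per public-domain title and 3 per in-copyright title, loosely echoing the Silas Marner and Bonfire of the Vanities figures above). It assumes the sample is drawn from listings, one per edition, which is my guess about how the sample was built, not something the graph’s author has confirmed.

```python
import random

def sample_shares(catalog, sample_size, seed=0):
    """Compare the public-domain share of a listing sample with the share of titles.

    catalog is a list of (title, era, edition_count) tuples (hypothetical data).
    """
    rng = random.Random(seed)
    # One entry per edition: this is what a random draw from search results hits.
    listings = [era for _title, era, editions in catalog for _ in range(editions)]
    sample = [rng.choice(listings) for _ in range(sample_size)]
    share_of_sample = sample.count("public_domain") / sample_size
    # The share of distinct titles ignores how many editions each title has.
    public_domain_titles = sum(1 for _t, era, _e in catalog if era == "public_domain")
    share_of_titles = public_domain_titles / len(catalog)
    return share_of_sample, share_of_titles

# Hypothetical catalog: equal numbers of titles, very unequal numbers of editions.
catalog = (
    [(f"pd-{i}", "public_domain", 300) for i in range(100)]
    + [(f"ic-{i}", "in_copyright", 3) for i in range(100)]
)

share_of_sample, share_of_titles = sample_shares(catalog, sample_size=2500)
print(f"public-domain share of sampled listings: {share_of_sample:.2f}")
print(f"public-domain share of distinct titles:  {share_of_titles:.2f}")
```

In this toy catalog exactly half the titles are in copyright, yet about 99% of the listings (30,000 of 30,300) are public-domain editions, so a sample of listings would show the same sort of gap even though no titles are missing.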

Comparison of the Day: Barefoot Running

A decent point about comparison that’s often missed: comparing like-to-like means that interventions must be executed at the same level of proficiency as controls:

For the past few years, proponents of barefoot running have argued that modern athletic shoes compromise natural running form. But now a first-of-its-kind study suggests that, in the right circumstances, running shoes make running physiologically easier than going barefoot.

The study, conducted by researchers at the University of Colorado in Boulder, began by recruiting 12 well-trained male runners with extensive barefoot running experience. “It was important to find people who are used to running barefoot,” says Rodger Kram, a professor of integrative physiology, who oversaw the study, which was published online in the journal Medicine & Science in Sports & Exercise.

“A novice barefoot runner moves very differently than someone who’s used to running barefoot,” Dr. Kram says. “We wanted to look at runners who knew what they were doing, whether they were wearing shoes or not.”

Specifically, he and his colleagues hoped to determine whether wearing shoes was metabolically more costly than going unshod. In other words, does wearing shoes require more energy than going barefoot?

You see this a lot in educational research: the teachers involved are better trained in either the intervention or the control, which can foul the results quite a bit, even in a cross-over design.

There’s actually lots more great stuff in this article. What the researchers found was that the missing weight of shoes was actually a confounding variable in judging the efficiency of other aspects of barefoot running. Basically, the like-to-like comparison they designed compared ultralight running shoes to barefoot plus small weighted band-aids, and once shoe weight was controlled for in this way the efficiency association was reversed. It’s another reminder that it’s usually more about the definitions than the stats.

I should add that this study probably addresses the concerns of only a small number of barefoot runners; not everybody cares about efficiency.

Blackboard, Moodle, and the Commodity LMS

I haven’t seen this graph referenced in the recent discussion around Blackboard’s latest purchase, which is strange, because it explains almost everything:

A while back, Blackboard decided that the saturation and commodification of the LMS market meant that the path to greater profitability was not more contracts, but a higher average contract price. Under such a model, Blackboard Basic was seen as cannibalizing potential sales of the enterprise product, and they began running a series of special offers to move people off of the basic product and into the enterprise one. And they were successful to a point. In the data we have, enterprise licenses increase, and total licenses fall slightly, indicating that at least initially the higher contracts may have offset customer loss (though even this is debatable given the deals they ran for upgrades, and the percentage of that bump that is really the Angel acquisition).

We can’t tell what happened after that, though, since Bb has not released data on licensing. The best estimate I’ve seen indicates that Bb lost between 150 and 400 licenses a year from 2009 through 2011.

That’s a problem for Blackboard, and not just because of the loss of core LMS revenue. Why? Because Blackboard sees the future of contracts as selling add-on modules and other higher education services. Take a look at their front page:

The Learn product is placed first, but the message is clear — the transaction system, analytics system, and campus messaging system are products in their own right. And the future is selling these products — if you don’t believe me, just look at these figures Michael Feldstein put together a couple years back:

I’m sorry this is out of date; since Blackboard went private in 2011, there has not been much data released about such matters, so we have to rely on old snapshots. But the idea here is clear:

  • Use the LMS as a foot in the door, and make a profit off the enterprise version of it
  • Purchase other companies to get a foot in the door to sell the add-on services

The purchases of ANGEL and WebCT were not about winning the “LMS wars”. Blackboard sees the LMS at this point as a commodity product. The purchases were about getting a seat at the table to sell other more profitable products: analytics, collaboration add-ons, early warning systems, financial transaction systems. Compared to the LMS, the ratio of price to support cost for an analytics, transaction, or emergency communication system is a dream. The LMS, on the other hand, is a headache: high-support, low-margin. But it’s a perfect foot in the door to make other sales.

There was only one problem with this — as Blackboard acquired more customers via purchases of WebCT and Angel, they realized there was a leak — Moodle. Customers that were gained through acquisition were moving to Moodle, and current customers pressured into the enterprise product were also bailing.

Buying Moodlerooms plugs that leak, and keeps Blackboard at the table selling the more profitable building access, commerce, human resources, and donor management systems that they make their money on.

For Blackboard, Moodle is a way to keep a foot in the door with the more price-sensitive customers while not cannibalizing sales of Learn. Ultimately, this allows them to maximize profit on the LMS side while stanching the customer bleed they have been experiencing over the past three years. And this preserves what has been their planned path for a while: to move beyond what they see as a dead LMS market and into more profitable products in less saturated areas.

Blackboard bought Moodlerooms for the same reason it bought Angel and WebCT: it’s not an LMS company anymore. It’s really that simple.

Comparison of the Day: Gas Prices

When things have a seasonal cycle, it’s often difficult to make direct comparisons. Ideally you compare to this time last year, or to the ten-year average for this time of year, but what people really want is a sense of how high it will go. This article does a decent job with that: look, it’s already $3.90, and it’s going to keep rising until May or so. Absent some large event, it’s hard to see how we won’t at least break last year’s peak of $3.98 and get close to the record of $4.11.

This is also an example of where controlling for inflation is probably superfluous. We’re comparing against last year’s price on the one hand and a date four years back on the other, through some pretty lean years in terms of inflation, so we can probably live without it in a presentation like this. (Although interestingly we were paying almost as much for gas in 1980 in inflation-adjusted dollars as we were the past couple of years; a recession will do that to prices….)

That Millennial Study and Baselines

By now you’ve seen or heard about the APA study on millennials and civic-mindedness. Turns out that millennials are not as civic-minded as Howe and others have claimed. Fair enough.

But another thing caught my eye: all the stories tended to compare Millennial numbers to a baseline Boomer figure, leading everyone to blame the self-focus on coddling parents and Barney songsmithing.

But if you look at the figure above, you’ll see the jump only continues the radical jump made by the Gen Xers. And the Gen Xers were the latchkey generation.

So all these explanations that finger a cause located in a 1990s childhood? Think again. There’s definitely something interesting going on here, but it’s been going on a lot longer than most articles indicate…

The Golden Rule of Comparison and the ACA

The golden rule of comparison, we tell our students, is simple:

Compare like-to-like where this is possible; account for differences where it is not.

Honestly, if you just apply this one rule religiously to anything billed as a comparison, you’ll outperform most people in evaluating comparisons.

Case in point: the Congressional Budget Office just published an update of its analysis of the Affordable Care Act. In the document they state:

CBO and JCT now estimate that the insurance coverage provisions of the ACA will have a net cost of just under $1.1 trillion over the 2012–2021 period—about $50 billion less than the agencies’ March 2011 estimate for that 10-year period [Emphasis mine].

That’s a good comparison. They are comparing the 2012-2021 estimate they made previously to the new estimate for 2012-2021. It’s the same agency, and we assume it’s the same analytical framework, but with updated data. It’s as like-to-like as you tend to get in life.

Yet this was the response from Tom Price, of the House Republican Policy Committee:

House Republican Policy Committee Chairman Tom Price, M.D. (R-GA) issued the following statement regarding the Congressional Budget Office’s (CBO) updated cost estimate of the president’s health care law. The new CBO projection estimates that the law will cost $1.76 trillion over 10 years – well above the $940 billion Democrats originally claimed.

Why this discrepancy? There are multiple reasons. But I find this one, which Ezra Klein points out, the most interesting:

One other thing that’s confused some people is that this estimate is looking at a different timeframe than the original estimates. The CBO’s first pass at the bill looked at 2010-2019. But years have passed, and so now they’re looking at 2012-2021. That means they have two fewer years of implementation, when the bill costs almost nothing, and two more years of operation, when it costs substantially more.

The idea is that since the ACA doesn’t *really* kick in until 2014, a 2010-2019 estimate is a 6-year cost, and a 2012-2021 estimate is an 8-year cost. There are other issues as well, perhaps even more important, but it occurs to me that this is a pretty common parlor trick people play with numbers.
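The window effect is easy to see with toy numbers. The sketch below is Python with entirely hypothetical figures: it assumes the law costs nothing before 2014 and a flat 100 (in whatever units you like) every year after. These are not CBO numbers; the real annual costs are neither zero early on nor flat later.

```python
def window_total(annual_cost, start, end):
    """Sum yearly costs over an inclusive range of years."""
    return sum(annual_cost.get(year, 0) for year in range(start, end + 1))

# Hypothetical cost stream: nothing until 2014, then a flat 100 per year.
annual_cost = {year: (0 if year < 2014 else 100) for year in range(2010, 2022)}

first_window = window_total(annual_cost, 2010, 2019)   # covers 2014-2019: 6 costly years
second_window = window_total(annual_cost, 2012, 2021)  # covers 2014-2021: 8 costly years

print(first_window)   # 600
print(second_window)  # 800
```

Not a single year got more expensive between the two calls, yet the ten-year “total” rose by a third, purely because the window slid forward past two cheap years and picked up two costly ones.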

As for how to settle the question of whether the 2010 estimate was high or low, Ezra correctly suggests the easiest way is to ignore totals and look at the revisions. The net effect of the revisions identified by the CBO is negative: the bill costs less than initially thought.

Compare like-to-like where this is possible; account for differences where it is not.