Assignment: Sourcing a Quote

So this is not a photo assignment (reverse image search will get you nowhere!). But here’s a photo anyway, for aesthetic reasons:

DG4MmcVU0AQY2-i

It’s a quote from current U.S. Defense Secretary General Mattis emblazoned on a coffee mug:

“I come in peace. I didn’t bring artillery. But I’m pleading you with tears in my eyes: if you fuck with me, I’ll kill you all.”

So the questions are:

  • Did he say this?
  • Who heard him say this?
  • When and where did he say it?
  • What publication was the original reporting source for this quote?

So let me say a few things about tracking down quotes (and about novice behavior when tracking down quotes). What a novice will do is this: they’ll do a web search like this for [[Mattis “i come in peace”]]:

novice

And they’ll find a good solid publication that sources the quote:

snip

And maybe that’s enough for daily use, but it’s not what we need here for this assignment. Quotes are some of the most bungled information on the planet. Back about ten years ago, in fact, I showed how a Washington Post story complaining about the web getting quotes wrong actually had it reversed: the web was right and the Post was wrong. (Incidentally, reviewing that post I find that it articulates pretty much what I am pushing today about web literacy; I’d forgotten how long I’d been beating this drum.)

My advice for getting quotes right is the same as for everything else. Your two choices are:

  • Get as close as you can to the time and place of the quote. The original reporting, or the first reporting of the reporting. For the most part, the further you get from a quote the more it changes.
  • Alternatively, get a source that is not just solid but rock-solid (such as Quote Investigator, a reputable monthly like The Atlantic, or a scholarly/major book publisher) that you know does the hard work of tracking a quote to an original source.

In case you haven’t noticed, these two options map to two of our moves:

  1. Check for previous work, and
  2. Go upstream

Anyway, go to it. Get as far upstream as you can, or to a source you consider to be rock-solid.

 

Assignment: Titanic Photograph

OK, I’m still on a photos kick, but this showed up on Twitter this morning:

picture of titanic

Here’s the photo alone:

DCNDf1xXkAIr3cE

And the tweet:

Was this really taken aboard the Titanic? What else can you tell us about the photo? Go to it.

As usual, don’t read the comments until after you do the assignment.

I’m going to do some text and stat based assignments next week, for folks sick of the pictures.

LazyWeb: Why Did Trust In Press Collapse in the mid-80s/early-90s?

I have a question I’d love others to answer for me.

So I was looking at longer term declines in trust in the press. And what I expected to see was a long steady fall-off from the peak trust of Watergate, and if you look at some charts you see that.

trustchartTumblr

But when you look closer at those charts, they really elide the 1980s.

If you get granular, and look at the 1980s, the charts look like this:

20561968_10100935173801327_1783419035_n

That’s a dramatic drop, which coincides with Clinton’s election, and maybe with the rise of AM radio news hosts? But man — that’s really steep.

rise

Again, you can peg the launch of Fox in here at the end (amid that elision at the end of the graph), but here there is even a significant uptick through 1985-88 (Iran-Contra?).

543-1

Here’s a different story. And again, I think about Iran-Contra for that late 1980s dive, but it just seems naive to lay it totally at the feet of one story like that. And again, AM radio is important — Limbaugh gets syndicated in 1988 and has 5 million listeners within two years. Is that enough? Or is his rapid growth more of a symptom of underlying causes here?

Let me be clear: I’m not at a loss for explanations. In fact I have many that can adequately explain this. But I’m curious if I am missing something. Having read some literature on the impact of Hollywood on press perceptions, I’m particularly curious whether anyone can remember popular media which may have fed this viewpoint. There’s a case to be made that we don’t pay enough attention to the impact of popular media on perceptions; one can make a good argument that police have maintained good reputations, despite evidence that should undermine them, due to a relatively sympathetic portrayal on television.

Jay Rosen looked into this a couple years back; I’m not sure where he eventually landed (and I note that AM radio, Iran-Contra, and Hollywood are not among the explanations featured, so maybe I’m in the weeds here).

And of course, maybe it’s noise! It could be. But if it corresponds with something in particular that would be one route into investigation.

UPDATE: A Gallup study from the period supports the idea that at least a chunk of it was due to Iran-Contra. Brendan Nyhan (more famous recently for work on the cognitive science of correction) had a solidly sourced article from 2007 arguing this is part of a pattern, with coverage of major scandals eroding trust of the governing party, and trust not returning to trend.

 

Converting a Word Doc into a Digipo Article

I had some Word documents students had produced that needed to go up on Digipo. I could have directly copied and pasted them, but they were in a slightly different document template. So I pasted and then pasted the separate pieces into the right places. I sped up the video below slightly, but in reality the whole process still took under two minutes, including rephrasing some text to meet current project norms.

Take a look:

The thing I find neat is that since we strip out over-formatting, all the little annoying style mismatches disappear when the static page generator produces the page:

download (1).png

This heavy stripping of over-formatting saves time and effort when converting: no fonts, font-sizes, colors, or other advanced styling is pushed to the site, so we can just leave it as is.
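For the curious, here is a rough sketch of what this kind of stripping can look like. It is not the actual Digipo generator code: the class name, function name, and the lists of dropped tags and attributes are all assumptions picked for illustration, and it uses only Python's standard `html.parser` module.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical rules for this sketch; not Digipo's real configuration.
DROPPED_TAGS = {"font", "span"}  # wrappers that usually exist only for styling
DROPPED_ATTRS = {"style", "class", "color", "face", "size"}


class FormatStripper(HTMLParser):
    """Rebuild HTML while discarding presentational tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _render(self, tag, attrs, self_closing=False):
        # Keep only attributes that are not purely presentational.
        kept = [(k, v) for k, v in attrs if k not in DROPPED_ATTRS]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}{' /' if self_closing else ''}>"

    def handle_starttag(self, tag, attrs):
        if tag not in DROPPED_TAGS:
            self.parts.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag not in DROPPED_TAGS:
            self.parts.append(self._render(tag, attrs, self_closing=True))

    def handle_endtag(self, tag):
        if tag not in DROPPED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives already unescaped, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def strip_formatting(html_text):
    """Return html_text with styling tags and attributes removed."""
    parser = FormatStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    pasted = '<p style="color:red"><font face="Arial">Hi</font> <a href="http://example.com" class="c">link</a></p>'
    print(strip_formatting(pasted))
```

The sketch keeps structural markup (paragraphs, links, emphasis) and throws away everything that only affects appearance, which is why mismatched fonts and colors from a pasted Word document would not survive. HTML comments are dropped as well, since the parser ignores them unless told otherwise.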

I’m sure we’ll get hit eventually by some non-standard formatting needs, but for the moment I’m pleasantly surprised by how much it simplifies the process.

By the way, our Neuro 490 articles are now in their own directory here. We give course directories to whoever wants them, provided they can live by the Digipo Code.

A Call to Info-Environmentalism

When I was at Keene State College, we had a student life group that was heavily into environmentalism, and lots of extra-curricular activities and student learning were structured around making the local environment better. As an example, we’d have a clean up day each year where the students would pick a target — a local river, or park, etc — and clean up the area. You’d start by picking up trash but eventually end up pulling a defunct TV set out of a stream somewhere because people are awful.

It was enjoyable because a) you could see how messed up things were, and b) you could see the impact you had after you were done. Some things feel good, some things do good. This did both.

As we look at our broken information environment, I wonder if this is a way we can think about student engagement. Take for example this Google search:

avocados

Now, I love avocados. I do. And they have health benefits. But Dr. Mercola and Natural Health 365 are the equivalent of a broken TV set sitting in your public reservoir. The fact that they top this search is criminal. Both these sites mix the hyping of small inconclusive studies with the sale of “health products”.

While it may not be all that harmful to convince people to put more faith in eating avocados than they really should, these sites lure readers into a web of anti-science conspiracy thinking which has real impacts. As an example, the Mercola site is implicated in originating a rumor that the Swine Flu vaccine was dangerous:

In her book On Immunity, an examination of vaccination fears, Eula Biss wrote about Mercola as one of the originating disinformation agents. She traced a persistent online rumor that the vaccine for H1N1, otherwise known as the swine flu, contained a chemical called squalene to an article written by Mercola, “Squalene: The Swine Flu Vaccine’s Dirty Little Secret Exposed.”

“The reproductions of Mercola’s article that proliferated across the Web early in the pandemic were then, and still remain, uncorrected. But by the time I traced them to the version on his website in the fall of 2009, the original article already included a correction in the header clarifying that none of the H1N1 vaccines distributed in the United States contained squalene. This was not a minor point of correction, but the article had gone viral before being corrected,” Biss wrote. “Like a virus, it had replicated itself repeatedly, overwhelming more credible information about the vaccine.”

Dissuading people from taking a needed vaccine (especially those most at risk) has deadly consequences. While the virus was milder than anticipated, it was also much more viral. Sixty million people contracted it. Over 270,000 people were hospitalized with H1N1 in the U.S. and more than 12,000 people died.

And yet this source is at the top of Google’s search results.

This isn’t an isolated case. Until recently, if you hit the “I’m Feeling Lucky” button on Google, the question “Did the Holocaust happen?” led you to Stormfront, a white supremacist site sporting a page explaining why the Holocaust was a myth. “What happened to dinosaurs?” produced a creationist site in Google Snippets.

what-happened-to-dinosaurs-search-800x411

 

And of course, from the political fringe we have this gem, in response to “presidents in the Ku Klux Klan”

presidents-in-the-klu-klux-klan-search

Protip: None of these presidents are known Ku Klux Klan members.

Again, these issues are not harmless. People forgo treatments that could save them, or make them less sick. People fear threats that are minor (Islamic terrorism in Topeka, clown killings, Ebola in America) while not believing in things that are real threats (climate change, declining productivity gains, painkiller addiction). To some extent the bad information environment online is a result of our bad information environment elsewhere. But unlike the nightly news, the web is still a collectively maintained and produced environment. We can clean it up. We can pull those TVs and shopping carts and plastic bags out of our shared information streams and Google results.

We can actually change things, one search result at a time. We can provide better answers, more balanced answers, more useful answers. We can stem the erosion of faith in science by selecting sources more carefully than the latest clickbait. We can debunk the scams and fakes that litter people’s feeds.

I started on my own project somewhat recently (Digipo), and was amazed by how easy it was, if you targeted a particular phrasing of a question, to have an impact on search results. Here’s a more balanced treatment of music and IQ that some WSU neuroscience students wrote up, making its way to the top Google results:

music

What I love about this is that the third result is *also* student work, in this case from a common course blog by a student at Penn State. Imagine a world where all our students wrote different variations on questions like this, and fed them into wiki-like communities that carried forward the research and maintained it? Could we clean up our information environment? Save lives? Make for more enlightened discourse?

And if we can do that, why aren’t we doing that?

 

60-Second Check: Aircraft Waste Hits Cruise Ship

When I say you can fact check a lot of things in one to two minutes, I mean, literally, one to two minutes. Here’s an example:

 

You can sit around and think critically about whether this is possible all day, of course. But the easiest way to debunk this is to discover that the pictures are lifted from a different context, and to do that you need web skills, not what we traditionally call “critical thinking”.

Information Underload

For many years, the underlying thesis of the tech world has been that there is too much information and therefore we need technology to surface the best information. In the mid 2000s, that technology was pitched as Web 2.0. Nowadays, the solution is supposedly AI.

I’m increasingly convinced, however, that our problem is not information overload but information underload. We suffer not because there is just too much good information out there to process, but because most information out there is low quality slapdash takes on low quality research, endlessly pinging around the spin-o-sphere.

Take, for instance, the latest news on Watson. Watson, you might remember, was IBM’s former AI-based Jeopardy winner that was going to go from “Who is David McCullough?” to curing cancer.

So how has this worked out? Four years later, Watson has yet to treat a patient. It’s hit a roadblock with some changes in backend records systems. And most importantly, it can’t figure out how to treat cancer because we don’t currently have enough good information on how to treat cancer:

“IBM spun a story about how Watson could improve cancer treatment that was superficially plausible – there are thousands of research papers published every year and no doctor can read them all,” said David Howard, a faculty member in the Department of Health Policy and Management at Emory University, via email. “However, the problem is not that there is too much information, but rather there is too little. Only a handful of published articles are high-quality, randomized trials. In many cases, oncologists have to choose between drugs that have never been directly compared in a randomized trial.”

This is not just the case with cancer, of course. You’ve heard about the reproducibility crisis, right? Most published research findings are false. And they are false for a number of reasons, but primary reasons include that there are no incentives for researchers to check the research, that data is not shared, and that publications aren’t particularly interested in publishing boring findings. The push to commercialize university research has also corrupted expertise, putting a thumb on the scale for anything universities can license or monetize.

In other words, there’s not enough information out there, and what’s out there is generally worse than it should be.

You can find this pattern in less dramatic areas as well — in fact, almost any place that you’re told big data and analytics will save us. Take Netflix as an example. Endless thinkpieces have been written about the Netflix matching algorithm, but for many years that algorithm could only match you with the equivalent of the films in the Walmart bargain bin, because Netflix had a matching algorithm but nothing worth watching. (Are you starting to see the pattern here?)

In this case at least, the story has a happy ending. Since Netflix is a business and needs to survive, they decided not to pour the majority of their money into newer algorithms to better match people with the version of Big Momma’s House they would hate the least. Instead, they poured their money into making and obtaining things people actually wanted to watch, and as a result Netflix is actually useful now. But if you stick with Netflix or Amazon Prime today it’s more likely because you are hooked on something they created than that you are sold on the strength of their recommendation engine.

Let’s belabor the point: let’s talk about Big Data in education. It’s easy to pick on MOOCs, but remember that the big value proposition of MOOCs was that with millions of students we would finally spot patterns that would allow us to supercharge learning. Recommendation engines would parse these patterns, and… well, what? Do we have a bunch of superb educational content just waiting in the wings that I don’t know about? Do we even have decent educational research that can conclusively direct people to solutions? If the world of cancer research is compromised, the world of educational research is a control group wasteland.

We see this pattern again and again — companies coming along to tell us that their platform will help us with the firehose of content. But the big problem is not that it’s a firehose, but that it’s a firehose of sewage. It’s all haystack and no needle. And the reason this happens again and again is that what we so derisively call “content” nowadays is expensive to produce, and gets produced by a large number of well-paid people who in general have no significant marketing arm. To scale up that work is to employ a lot of people, but it doesn’t change your return on investment ratio. To make a dollar, you need to spend ninety cents, and that doesn’t change no matter how big you get. And who wants to spend ninety cents to make a dollar in today’s world?

Processing and promotion platforms, however, like Watson or MOOCs or Facebook, offer the dream of scalability, where there is zero marginal cost to expansion. They also offer the potential of monopoly and lock-in, to drive out competitors. And importantly, that dream drives funding which drives marketing which drives hype.

And this is why there is endless talk about the latest needle in a haystack finder, when what we are facing is a collapse of the market that funds the creation of needles. Netflix caught on. Let’s hope that the people who are funding cancer research and teaching students get a clue soon as well. More money to the producers of valuable content. Less to platforms, distributors, and needle-finders. Do that, and the future will sort itself out.


I’m guessing if you are reading this you already know this, but if you are interested in this stuff, make sure to read Audrey Watters’ This Week In Robots religiously, as well as her writing in this area, which has been very influential on me.