QAnon and Pinterest Is Just the Beginning

I have been talking about Pinterest as a disinformation platform for a long time, so this article on QAnon memes on Pinterest is not surprising at all:

Many of those users also pinned QAnon memes. The net effect is a community of middle-aged women, some with hundreds of followers, pinning style tips and parfait recipes alongside QAnon-inspired photoshops of Clinton aide John Podesta drinking a child’s blood. The Pinterest page for a San Francisco-based jewelry maker sells QAnon earrings alongside “best dad in the galaxy” money clips.

Pinterest’s algorithm automatically suggests tags with “ideas you might love,” based on the board currently being viewed. In a timely clash of Trumpist language and Pinterest-style relatable content, the board that hosts the Podesta photoshop suggests viewers check out tags for “fake news” and “so true.”

The story is a bit more complex than that, of course. It’s not clear to me that the users noted here aren’t spammers (as we’ll see below). It’s quite possible many of these accounts are people mixing memes and merchandise as a marketing amplification strategy. We don’t know anything about real reach, either; there are no good numbers on this.

But the threat is real, because Pinterest’s recommendation engine is particularly prone to sucking users down conspiracy holes. Why? As far as I can tell, it’s a couple of things. The first problem is that Pinterest’s business model lies in providing very niche and personalized content. Its algorithm is designed to recognize stuff at the level of “I like pictures of salad in canning jars”, and as Zeynep Tufekci has demonstrated with YouTube, engines of personalization are also engines of radicalization.

But it’s more than that: it’s how it goes about recommendation. The worst piece of this, from a vulnerability perspective, is that it uses “boards” as a way to build its model of related things to push to you, and that spammers have developed ways to game these boards that both amplify radicalizing material and provide a model for other bad actors to emulate.
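To make that concrete, here is a minimal sketch of board-based recommendation. This is my toy model with invented board and pin names, not Pinterest’s actual system; the point is just that a pin inherits relatedness from every board it sits on, so one mixed board is all a spammer needs.

```python
from collections import defaultdict

# Hypothetical data: each board is a set of pins. A spammer's board
# mixes innocuous bait, conspiracy memes, and merchandise.
boards = {
    "salads_in_jars":    {"mason_jar_salad", "overnight_oats", "parfait_recipe"},
    "american_politics": {"parfait_recipe", "qanon_meme", "qanon_tshirt_link"},
    "gift_ideas":        {"qanon_tshirt_link", "galaxy_dad_money_clip"},
}

def related_pins(pin):
    """Score pins by how often they share a board with `pin`.

    Plain co-occurrence counting, the simplest possible board-based
    model. Real systems are fancier, but the exposure is the same.
    """
    scores = defaultdict(int)
    for pins in boards.values():
        if pin in pins:
            for other in pins - {pin}:
                scores[other] += 1
    return sorted(scores.items(), key=lambda kv: -kv[1])

# A user who engages with an innocuous recipe pin...
print(related_pins("parfait_recipe"))
# ...gets the QAnon meme and the T-shirt link recommended, because a
# spammer pinned the recipe to the same board as the meme and the merch.
```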

How Spammers Use Pinterest Boards as Chumbuckets

The best explanation of how this works comes from Amy Collier at Middlebury, whose post on Pinterest radicalization earlier this year is a must-read for those new to the issue. Drawing on earlier work on Pinterest manipulation, Collier walks through the almost assuredly fake account of Sandra Whyte, which uses boards of extreme political material to catch the attention of other users. Here’s her “American Politics” board:

[Screenshot: Sandra Whyte’s “American Politics” Pinterest board]

These pins flow to other users’ home pages with no context, which is why the political incoherence of the board as a whole is not a problem for the user. People are more likely to see the pins through the feed than the board as a whole.

Once other users like that material, they are more likely to see links to the TeeSpring T-shirts this user appears to be selling:

[Screenshot: pins linking to TeeSpring T-shirts]

The T-shirts are print-on-demand through a third-party service, and so hastily designed that the description can’t even be bothered to spell “Mother” right.

[Screenshot: the TeeSpring listing, with “Mother” misspelled]

So two things happen here. When moms like QAnon content, they get T-shirts, which gives spammers the incentive to keep making boards that capitalize on inflammatory content. And interestingly, when moms like the T-shirts, they get QAnon content. Fun, right?

How Pinterest’s Aggressive Recommendation Engine Makes This Worse

About a year ago I wrote an article on how Pinterest’s recommendation engine makes this situation far worse. I showed how, after just 14 minutes of browsing, a new user with some questions about vaccines could move from pins on “How to Make the Perfect Egg” to something out of the Infowarverse:

[Screenshot: the recommended feed after 14 minutes, filled with anti-vaccine pins]

What was remarkable about this process was that we got from point A to point B by pinning only two pins to a board called “vaccination”.

I sped up the 14-minute process into a two-and-a-half-minute explanatory video. I urge you to watch it, because no matter how cynical you are, it will shock you.

I haven’t repeated this experiment since then, so I’m unable to comment on whether Pinterest has mitigated this in the past year. It’s something we should be asking them, however.

I should note as well that the UI-driven decontextualization that drove Facebook’s news crisis is actually worse here. Looking at a board, I have no idea why I am seeing these various bits of information, and no indication of where they come from.

[Screenshot: a Pinterest board, with no indication of where any pin comes from]

Facebook minimized provenance in the UI to disastrous results. Pinterest has completely stripped it. What could go wrong?

Pinterest Is a Major Platform and It’s Time to Talk About It That Way

Pinterest has only 175 million users, but 75 million of those users are in the United States. We can assume a number of spam accounts pad that figure, but even accounting for that, this is still a major platform, one that may be reaching up to a fifth of the U.S. population.
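For what it’s worth, the back-of-the-envelope math, using my own rough population figure and some assumed spam fractions:

```python
us_users = 75_000_000        # reported U.S. user count
us_population = 325_000_000  # rough U.S. population (my figure)

# Assumed spam-account padding; the true fraction is unknown.
for spam_fraction in (0.0, 0.1, 0.2):
    real = us_users * (1 - spam_fraction)
    print(f"{spam_fraction:.0%} spam -> {real / us_population:.0%} of the U.S. population")

# 0% spam -> 23%, 10% -> 21%, 20% -> 18%. "Up to a fifth" survives
# even a substantial haircut for spam accounts.
```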

So why don’t we talk about it? My guess is that it’s perceived as a women’s platform, which means the legions of men in tech reporting ignore it. And the Silicon Valley philosopher-king class doesn’t bring it up either. It just sounds a bit girly, you know? Housewife-ish.

This then filters down to the general public. When I’ve talked about Pinterest’s vulnerability to disinformation, the most common response is to assume I am joking. Pinterest? Balsamic lamb chops and state-sponsored disinfo? White supremacy and summer spritzers?

Yup, I say.

I don’t know how compromised Pinterest is at this point. But everything I’ve seen indicates its structure makes it uniquely vulnerable to manipulation. I’d beg journalists to start including it in their beat, and researchers to throw more resources into its study.

A Provocation for the Open Pedagogy Community

Dave Winer has a great post today on the closing of blogs.harvard.edu. These are sites run by Berkman, some dating back to 2003, which are being shut down.

My galaxy brain goes towards the idea of federation, of course: the idea that everything referencing something should store a copy of what it references, connected by unique global identifiers (if permissions and author preferences permit), and that we need a web that makes as many copies of things as the print world did. Otherwise, old copies of the Tuscaloosa News will outlast anything you are reading today on a screen. Profligate copying, as Ward Cunningham has pointed out, is biology’s survival strategy, and it should be ours as well.

(I know, nature is not teleological. It’s a metaphor.)
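If you want the idea in miniature: content addressing supplies the unique global identifiers, and “referencing means copying” supplies the profligacy. A sketch, with all names hypothetical and no claim to any particular federation protocol:

```python
import hashlib
import json

store = {}  # this site's local copies of everything it has referenced

def save_copy(content: str) -> str:
    """Store content under a unique global identifier derived from its
    bytes. Anyone hashing the same bytes gets the same identifier, so
    independent copies corroborate one another."""
    uid = hashlib.sha256(content.encode("utf-8")).hexdigest()
    store[uid] = content
    return uid

def reference(content: str) -> dict:
    """Referencing a document means keeping a copy (permissions and
    author preferences permitting), not just a link that can rot."""
    return {"id": save_copy(content), "excerpt": content[:60]}

ref = reference("Dave Winer's post on the closing of blogs.harvard.edu ...")
print(json.dumps(ref, indent=2))
# If the original host shuts down, every site that referenced the post
# still holds the bytes, and the hash proves the copy is unaltered.
```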

But my smaller provocation, perfectly engineered for Friday twitter outrage at me and my sellout-ness, is this:

All my former university-hosted sites are gone. We built up a WPMU instance at Keene in 2010, and the lack of broad adoption meant that when I left in 2013 we shut it down. I ran some wikis on university servers here and at Keene, and those are gone too.

All my self-hosted sites are corrupted from hacks or transfer errors in imports. Go back into this blog and you’ll find a sparse posting schedule for some years between 2010 and 2012, and it’s because those posts got nuked in a 2012 hack. I had to go out to the Wayback Machine and reconstruct the important ones by hand.

Transfer errors, let me tell you: go back to 2007 and look at all the images that failed imports and moves on this blog when it was self-hosted. There’s also this weird “” character that pops up in all of them, like this:

Hold on, you say, these Metro signs look different! There’s no BRAND!

The entire Blue Hampshire community I co-founded, over 15,000 posts and 100,000 comments, originally self-hosted on SoapBlox and then WordPress? Gone. It’s probably OK, I said a lot of stupid stuff. But of course it was also a historically important site, one of the most successful state political blogging communities, one of the first communities to be syndicated by Newsweek, one of the first to feature news stories that cross-posted — as news stories — to Huffington Post. One of the first sites to get individual statements from all the Democratic presidential candidates in a weekly forum. Gone, gone, gone.

I know, this doesn’t seem to be provocative, but here’s the thing:

My Blogger sites from 2005 forward? They’re up and they are pristine.

[Screenshot: one of the old Blogger sites, untouched]

I mean, I’m not sure that’s a great thing — it was where I put little experiments too small to be worth setting up another BlueHost domain for. But it also did me a solid with Keene Scene, where the 12-year-old images of Keene life have stayed up unmolested and without any maintenance. (I’d quite forgotten about it, really.)

[Image: a Keene Scene photo, still up after 12 years]

Same holds — as I’ve mentioned before — for projects students put up on Google Sites. The BlueHost server (and later the Rackspace account) was long ago shut down but Google Sites is still up.

I’m not making a specific case here. But I do want to point out that a big reason I moved to self-hosted and institutional solutions was the idea that commercially hosted stuff was too fickle. In 2006, it seemed that every week a new site shut down. For better or worse (mostly worse), monopoly consolidation has changed that dynamic a bit. There are other good reasons for self-hosting or doing institutional hosting, but durability is more a downside of these options than an upside, and we might want to let our students know that if they want something to stay up, self-hosting may not be the best choice.

Newspapers On Wikipedia Update: Initial Wikidata Pass

Thanks to initial work by folks at Wellesley and Wikidata work from 9of99 on Wikipedia, the Newspapers on Wikipedia project has both created an initial Wikidata set of extant U.S. newspapers and mapped that to needs for page and infobox creation.

The full set is here and can be queried in multiple ways:

http://tinyurl.com/yb6sng9e
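If you’d rather hit the data programmatically, a query in roughly this spirit (my approximation, not necessarily the exact query behind that link) runs against Wikidata’s public SPARQL endpoint:

```python
import requests

# Approximation: extant U.S. newspapers in Wikidata with no English
# Wikipedia article, i.e. the "needs page" bucket.
query = """
SELECT ?paper ?paperLabel WHERE {
  ?paper wdt:P31 wd:Q11032 ;   # instance of: newspaper
         wdt:P17 wd:Q30 .      # country: United States
  FILTER NOT EXISTS { ?paper wdt:P576 ?ended . }   # still extant
  FILTER NOT EXISTS {
    ?article schema:about ?paper ;
             schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "newspapers-on-wikipedia-demo/0.1"},
)
for row in resp.json()["results"]["bindings"]:
    print(row["paperLabel"]["value"])
```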

Visually, these maps overstate needs in high-density areas, since the red dots (needs page) take precedence over blue dots (has page) in a conflict, and the data has a geolocation that is only as granular as the town (hence Chicago has a single geolocation). And the data will need continued cleanup — I’ve spotted a few issues just screenshotting regions. But this initial data set will be developed alongside the rest of the project, and even when papers don’t make it into Wikipedia, we’ll make sure the Wikidata on them is accurate, and try to match them with other sets of data as we go forward.

According to the data here (which again, is imperfect) the current counts are:

  • Has Wikipedia page and Infobox: 957
  • Needs Infobox: 84
  • Needs Page: 3775

(We’ve already put a dent in some of the work before this, so we’ll go back and manually tally up a baseline.)

Anyway, some maps. Keep in mind this is very preliminary.

A Note about Cognitive Effort and Misinfo (Oh, and also I’m a Rita Allen Misinformation Solutions Forum Finalist)

So I forgot to report this, but I put together a team and submitted a proposal to the Rita Allen Misinformation Solutions Forum contest, and our project was chosen out of all the submissions as one of five finalists. I’ll be going to D.C. in October to pitch it in a competition for one of two prizes.

The project is named “Let me fact-check that for you”: a semi-automated, personalized guide generator for the “Wait! You’re wrong about that!” responder.

The tool is meant to empower current Snopes-ers to not just post links to alternative articles, but to post short, customized guides that show how they went about fact-checking the particular link, story, or image. Too often, when someone in a comment thread debunks or contextualizes something, it’s just dueling links; no one learns how to check things any better. Our hope is to make a web service where you plug in a URL or image and click through a couple of decisions. Out the other end comes a shareable little five-second, screenshot-based guide showing how you might check that specific link or image.
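We’re still designing the service, but the core logic is something like the sketch below. Every check, name, and decision here is hypothetical; the real tool would render the steps as screenshot-based cards rather than text:

```python
# Hypothetical sketch: map a URL plus one user decision onto a short,
# ordered list of checking steps.
CHECKS = {
    "unknown_source": [
        "Open a new tab and search the site's name plus 'wikipedia'.",
        "Skim the first result: who runs this site, and what's their track record?",
    ],
    "viral_image": [
        "Run a reverse image search on the image.",
        "Sort results by date: did the image exist before the event it claims to show?",
    ],
    "surprising_claim": [
        "Search the claim's key phrase plus 'fact check'.",
        "Look for coverage from a known fact-checking organization.",
    ],
}

def build_guide(url: str, content_type: str) -> str:
    """Turn a URL and one decision into a shareable mini-guide."""
    steps = CHECKS.get(content_type, CHECKS["unknown_source"])
    lines = [f"How I checked {url}:"]
    lines += [f"  {i}. {step}" for i, step in enumerate(steps, 1)]
    return "\n".join(lines)

print(build_guide("https://example.com/shocking-story", "surprising_claim"))
```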

The idea is loosely (very loosely) inspired by the old joke site “Let Me Google That For You”, where you could plug in a question someone had asked you and it would create a little animation of the process of Googling the answer. The point of LMGTFY was partially to shame people into checking Google before bothering others, but the other piece of it was to demonstrate that the time cost of consulting Google first was minimal. People were overestimating the cost of consulting Google; the little links were reminders.

People outside misinfo may not be aware of this, but there is a critique of the “people won’t fact-check because they love their own point of view” theory which posits that people — to some extent — aren’t just choosing things they agree with because they like being right, but because agreeing requires less effort than engaging in more accuracy-oriented behavior. Gordon Pennycook and David Rand, for instance, have an interesting paper on this idea, showing that people who score high on cognitive reflection (an appetite for effortful thinking) also show better headline discernment, even when headlines are ideologically aligned.

I’m not necessarily sold on the Pennycook and Rand version of this idea, but I’m interested in the broader insight. I know it doesn’t explain the worst offenders, but I’ve found with those I work with that cynicism (“Pick what you want, it’s all bullshit!”) is often driven by the cognitive exhaustion of sorting through conflicting information. This insight also aligns with Hannah Arendt’s work — totalitarianism wins the information war by deliberately overwhelming the capacity of a population to reconcile endless contradictions. The contradictions are a tool to increase the cost of pursuing truth relative to other options.

If this is the case, one approach might be to encourage people to be more effortful when looking at online media. (Meh.) But the approach I favor is to reduce both the real and perceived cost of sorting through the muck through finding cheap, good enough methods and popularizing them. Doing that — while fostering a culture that values accuracy — might cause a few more people to regard the cost of checking something to be worth it relative to other seemingly more economical options like partisan heuristics, conspiracy thinking, or cynical nihilism.
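Put crudely, as a toy decision model (mine, not anything from the literature): a reader picks whichever strategy seems to buy the most accuracy for the least effort, so lowering the perceived cost of checking can flip the choice.

```python
# Toy model, not from the literature: a reader picks the strategy
# maximizing (value of being right) minus (perceived effort cost).
strategies = {
    # name: (perceived effort cost, perceived accuracy), arbitrary units
    "partisan heuristic": (0.1, 0.5),
    "cynical nihilism":   (0.0, 0.0),
    "fact-check it":      (2.0, 0.9),  # perceived as a big research project
}

def best(strategies, value_of_accuracy=1.0):
    return max(strategies,
               key=lambda s: value_of_accuracy * strategies[s][1] - strategies[s][0])

print(best(strategies))   # 'partisan heuristic': checking looks too expensive

# Cheap, good-enough methods reset the perceived cost...
strategies["fact-check it"] = (0.2, 0.8)
print(best(strategies))   # ...and now 'fact-check it' wins
```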

As such, the methods our tool demonstrates will be useful in themselves, since they decrease the real cost of checking by falling back on cognitively inexpensive moves. But the bigger impact is just letting people see that they probably imagine the cost of weeding out the worst information to be much higher than it actually is. By resetting these expectations, we can influence the behavior they choose.

As they say, it’s a theory. Anyway, let me know if you’ll be at the forum in October. I’d love to meet up. And if you’re working on something similar, let me know.

Unintended Consequences to Google Context Cards on Conspiracy Videos?

I was putting together materials for my online media literacy class, and I was about to pull this video, which has half a million views and proposes that AIDS is the “greatest lie of the 21st century.” According to the video, HIV doesn’t cause AIDS; antiretrovirals do (I think that was the point; I honestly began to tune out).

But then I noticed one of the touches that Google has added recently: a link to a respectable article on the subject of AIDS. This is a technique that has some merit: don’t censor, but show clear links to more authoritative sources that provide better information.

At least that’s what I thought before I saw it in practice. Now I’m not sure. Take a look at what this looks like:

[Screenshot: the AIDS-denial video, with a Britannica context card beneath it]

I’m trying to imagine my students parsing this page, and I can’t help but think that without a flag to indicate this video is dangerously wrong, students will see the encyclopedic annotation and assume (without reading it, of course) that it makes the video more trustworthy. It’s clean-looking, it’s got a link to Encyclopedia Britannica, and what my own work with students and Sam Wineburg’s research have shown is that these features may contribute to a “page gestalt” that causes students to read the page as more authoritative, not less — even if the text at the link directly contradicts the video. It’s quite possible that the easiness on the eyes and the presence of an authoritative link calm the mind and open it to the stream of bullshit coming from this guy’s mouth.

Maybe I’m wrong. It seems a fairly easy thing to test, and I assume they tested it. But it’s also possible that when these things get automated, the cases you thought were edge conditions turn out to be much more the norm than anticipated. In this case, the paragraph pulled from Britannica is on “AIDS”, not “AIDS denialism”, and as such the text rebuttal probably has less impact than the page gestalt.

I get the same feeling from this one about the Holocaust:

[Screenshot: a Holocaust-denial video, with a context card summarizing the Holocaust]

What a person probably needs to know here is not this summary of what the Holocaust was. The context card functions, on a brief scan, like a label, and the relevant context for this video is not really the Holocaust but Holocaust denialism: who promotes it, and why.
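My guess at the failure mode, and it is purely a guess since I don’t know how these cards get attached: if the card is keyed to the broadest topic detected in the video, denialist content always gets annotated with the generic article. A hypothetical sketch:

```python
# Speculative sketch (I don't know Google's actual pipeline): if the
# card is keyed to the most prominent topic string in the title, a
# denialism video gets annotated with the generic topic.
CONTEXT_CARDS = {
    "aids": "AIDS | Encyclopedia Britannica",
    "aids denialism": "AIDS denialism: who promotes it, and why",
    "holocaust": "The Holocaust | Encyclopedia Britannica",
    "holocaust denial": "Holocaust denial: who promotes it, and why",
}

def pick_card(title: str) -> str:
    """Naive topic matching: longest known topic found in the title wins."""
    title = title.lower()
    matches = [t for t in CONTEXT_CARDS if t in title]
    return CONTEXT_CARDS[max(matches, key=len)] if matches else ""

# The video never calls itself "denialism", so the broad card wins:
print(pick_card("AIDS: the greatest lie of the 21st century"))
# -> 'AIDS | Encyclopedia Britannica', a label-like card that rebuts nothing.
```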

Again, I hope I’m wrong. Subtle differences in implementation can matter, and maybe my gut on this is just off. It really could be — my job involves watching a lot of people struggle with parsing web pages, and that might warp my perspective.

But it should be easy enough for a researcher to take these examples and see how it works in practice, right? Does anyone know if someone has done that?