Cultural Resistance

Fuzzy Notepad posted Twitter’s Missing Manual today, noting that obscure UI interactions in Twitter often drive people away.

Reading through the list they have compiled, however, I don’t think this stuff has much to do with lack of Twitter uptake. If the worst thing users have to deal with is the difference between “@” and “.@”, you’re doing pretty well.

Twitter’s real learning curve is cultural, and it’s interesting to consider why Twitter’s cultural rules are so developed. Here are some things you might encounter on Twitter when you first open up your feed, for instance:

  1. Tweetstorms
  2. Subtweeting vs. mentioning
  3. Weird Twitter
  4. Live-tweeting
  5. Tweet-ups
  6. ASCII art
  7. Bots
  8. Hashtag activism
  9. Tweet-stealing
  10. Reputation stealing (e.g. using “RT” vs. retweeting)
  11. Sea-lioning
  12. Hashtag meta-jokes (e.g. #sorrynotsorry)
  13. Screenshotting text to share it

All of these things are culturally complex. When you live-tweet a TV show or debate, for instance, you have to strike a complicated balance: engaging with your fan subgroup without overloading your uninterested followers. Subtweets appear bizarre to people who are not familiar with the practice. So does screenshotted text. These things are handled by cultural norms and related user innovations.

It reminds me that Twitter, despite its problems, is truly a *community* whereas Facebook is a piece of software. Twitter has a cultural learning curve and Facebook doesn’t, but that’s mostly because Facebook has little culture to speak of.

And here, it’s Facebook that’s the oddball, not Twitter: from the early PLATO online communities to Usenet to LiveJournal to Friendster to MySpace, these online spaces developed community identities, conventions, and norms that grew increasingly complex and rich over time. Online communities come to look exactly like Twitter after four or five years of growth. It’s practically a law of physics.

But Facebook seems, more or less, to have avoided that. There’s little to no user innovation in the space, and about as much culture as an Applebee’s. You don’t log into Facebook one week and find everyone is experimenting with animated gif avatars, or that people have found a workaround that allows them to do ASCII art. There’s no deciphering Shruggie (¯\_(ツ)_/¯), there’s no Horse_ebooks, no bots or pseudobots.

And so the answer to the question “Why is Twitter so culturally complex?” is that it’s the wrong question. It’s Facebook that is the weird thing here, a community that doesn’t develop an overall culture over time.

I wonder what’s going on. Why is Facebook so culture-resistant? And what does it say if it’s community culture that is preventing Reddit, Twitter, and Tumblr from getting the valuations they want?

The Tragedy of the Stream

I think on my most popular day on this blog I got about 14,000 hits on a post. Most posts get less than that, but getting 600-800 visitors over the first week is pretty usual, and the visitors are generally pretty knowledgeable people.

Yesterday I got a lot of hits on my post asking for examples of the best blog-based classes in higher education that people could look at, with a caveat that I’d love to get beyond the usual examples we use — I’m looking for variety over innovation in some ways. The result was crickets.

I just need a list that I can show faculty that describes the class, the methods used, and links to it. I want to share it with faculty. I’d like the list to be up-to-date. I’d like someone to have checked the links and make sure they are not linking to spam sites at this point. Maybe someone could also find the best example of a student post from the class and link to that. Maybe it could be ordered by discipline.

Does such a thing exist? I don’t know. Maybe. I sure as hell can’t find it, and I’ve been a part of this movement a decade now.

Do individual pages on these sorts of experiences exist? Absolutely. I’ve read blog posts for the past ten years on this or that cool thing someone was doing. But as far as I can tell, no one has chosen to aggregate these things into a maintained or even semi-maintained list. We love to talk. Curate, share, and maintain? Eh.

This is the Tragedy of the Stream, folks. The conversations of yesterday, which contain so much useful information, are locked into those conversations, frozen in time. To extract the useful information from them becomes an unrewarding and at times impossible endeavor. Few people, if any, stop to refactor, rearrange the resources, gloss or introduce them to outsiders. We don’t go back to old pieces to add links on them to the things we have learned since, or rewrite them for clarity or timelessness.

And so it becomes little more than a record of a conversation, a resource to be mined by historians but not consulted by newbies. You want an answer to your question? Here’s eighteen hours of audio tape. If you play it from the beginning it makes sense. Have fun!

There are some things which survive better than others: Quora answers, Stack Exchange replies and the like.

But in our community at least I see a whole body of knowledge slowly rotting and sinking back into the sea. Perhaps it might be time to focus less on convincing and more on documenting our knowledge?

What Are the Close-to-Best Examples of Blog and Wiki-Based Classes in Each Discipline?

We’re making a push here on both blog and wiki use in classes, but we’re finding that while there are many posts on this or that blog/wiki project in higher education:

  • There aren’t many compiled lists that show a variety of examples across many disciplines and institutions.
  • Many of the examples we continue to use are quite old, giving the appearance of a wave that broke around 2011.

I’d like to compile a list for my faculty of the best examples from each discipline of blogs and wikis utilized as a core part of traditional for-credit classes. I know the big ones that we all talk about all the time; I want the ones a level below that.

Can you help me out in the comments? Just drop a link to your favorite project that needs more recognition, and write a line or two about what you like about it.

All projects welcome, but bonus points for: projects in the hard sciences, projects involving data gathering, projects that engage with a local community, projects involving first-year students, cross-course projects, anything that wasn’t at UMW (we love UMW, but too many examples from UMW raise concerns that the model is not generally transferable).

Can Blogs and Wiki Be Merged?

I’ve been thinking lately about the architecture underlying blogs and wiki, how different these architectural choices are (RSS, revision histories, title-as-slug, etc), and whether it’s worthwhile to imagine a world where data flows seamlessly across them. It might not be. They are very different things, with different needs.

Wiki and blogs have two different cultures, two different idioms, two different sets of values.

Blogs are, in many ways, the child of BBS culture and mailing lists. They are a unique innovation on that model, allowing each person to control their part of the conversation on their own machine and software while still being tied to a larger conversation through linking, backlinks, tags, and RSS feeds.

Blogs value a separation of voices, the development of personalities, new posts over revision of old posts. They are read serially, and the larger meaning is created out of a narrative that expands and elaborates themes over time, becoming much more than the sum of its parts to the daily reader.

Through reading a good blogger on a regular basis, one is able to watch someone of talent think through issues, to the point that one is able to reconstruct the mental model of a blogger as a mini Turing machine in one’s head. I have been reading people like Jim Groom, Josh Marshall, Digby, Atrios, and Stephen Downes for years, watching how they process new information, events, and results. And when thinking through an issue, I can, at this point, conjure up a rough facsimile of how they would go about analyzing a thing.

Wiki is perhaps the only web idiom that is not a child of BBS culture. It derives historically from pre-web models of hypertext, with an emphasis on the pre. The immediate ancestor of wiki was a HyperCard stack maintained by Ward Cunningham that attempted to capture community knowledge among programmers. Its philosophical godfather was the dead-tree hypertext A Pattern Language written by Christopher Alexander in the 1970s.


Alexander’s A Pattern Language used bolded, numbered in-text links to connect “patterns” that were related ideas. The sections were formed around these patterns and written to be approached from many different directions.

What wiki brought to these models, which were personal to start with, was collaboration. Wiki values are often polar opposites of blogging values. Personal voice is meant to be minimized. Voices are meant to be merged. Rather than serial presentation, wiki values treating pages as nodes that stand outside of any particular narrative, and attempt to be timeless rather than timebound reactions.

Wiki iterates not through the creation of new posts, but through the refactoring of old posts. It shows not a mind in motion, but the clearest and fairest description of what that mind has (or more usually, what those minds have) arrived at. It values reuse over reply, and links are not pointers to related conversations but to related ideas.

These are, in many ways, as different as two technologies can be.

Yet, the recent work of Ward Cunningham to create federated wiki communities moves wiki a bit more towards blogging. Voices are still minimized in his new conception, but control is not shared or even negotiated. I write something, you make a copy, and from that point on I control my copy and you control yours. In Federated Wiki (the current coding project) what you fork can be anything: data, code, calculations, and yes, text too.
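The fork-not-share model can be pictured as a page that carries a record of where it came from. Here is a toy sketch in Python; the field names are invented for illustration and are not Federated Wiki’s actual data format:

```python
import copy

# Toy sketch of fork-style copying: forking duplicates the page wholesale
# and records provenance. From then on, each party controls only their
# own copy. Field names here are invented, not Federated Wiki's format.

def fork(page, new_owner):
    forked = copy.deepcopy(page)
    forked["owner"] = new_owner
    forked["forked_from"] = page["owner"]   # a pointer back, not shared control
    return forked

mine = {"owner": "mike", "title": "Connected Copies", "text": "Draft one."}
yours = fork(mine, "alice")

# From this point on, each party edits only their own copy.
yours["text"] = "Draft one, with my edits."
print(mine["text"])   # "Draft one." -- the original is untouched
```

Note that nothing is negotiated here: the fork succeeds without the original owner’s involvement, which is exactly the point.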

As I’ve been working on Wikity (my own federated wiki inspired project) I’ve been struggling with this question: to what extent is there value in breaking down the wall between blogging and wiki, and to what extent are these two technologies best left to do what they do best?

The questions aren’t purely theoretical. Ward has designed a data model perfectly suited for wiki use, which represents the nature and necessities of multiple, iterative authorship. Should Wikity adopt that as the model it consumes from other sites?

Or should Wikity tap into the existing community of bloggers and consume a souped up version of RSS, even though blog posts make lousy wiki pages?

Should Wikity follow the wiki tradition of supplying editable source to collaborators? Or the web syndication model of supplying encoded content? (Here, actually, I come down rather firmly on the source side of the equation: encoded content is a model suited for readers, not co-authors.)

These are just two of many things that come up, and I don’t really have a great answer to these questions. In most cases I’d say it makes sense for these to remain two conceptually distinct projects, except for one big looming issue: with the open web shrinking, it might be helpful for these communities to make common cause and solve some of the problems that have plagued both blogging and wiki in their attempts to compete with proprietary platforms.

Again, no firm answers here. Just wanted to share the interesting view I’ve found at the intersection of these two forms.

Why Learning Can’t Be “Like a Video Game”

One of the projects I’m working on with French colonial history scholar Susan Peabody this semester at WSU is building a virtual, wiki-based museum with her students in a history course. We’re using a Wikity-based WordPress template to do it, and while we’re not utilizing the forking elements in it, we’re actually finding the Markdown + card-based layout combo to be super easy for the students to master. It’s honestly been encouraging to see that while I still don’t have the forking and subscription in Wikity quite where I want it, it actually makes a kick-ass WordPress wiki. I should probably write about that at some point.

But what I wanted to talk about today was an excellent article Sue shared with me. It’s one she had her students read, and it helped crystallize some of my ambivalence around virtual reality.


The 2004 article, Forum: history and the Web: From the illustrated newspaper to cyberspace by Joshua Brown describes a series of virtual history projects he co-designed in the 1990s. An early project he worked on for Voyager, Who Built America?, was an enhanced textbook HyperCard stack that included more than twenty times as much supplementary written material as the main narrative text, and, more importantly from the author’s point of view, a host of audio-visual artifacts:

That said, the contrast between the informational capacity of a book compared to a CD-ROM was startling. The original four chapters of the book comprised 226 pages of text, 85 half-tone illustrations and 41 brief documents. The Who Built America? CD-ROM, in addition to chapter pages, contained 5000 pages of documents; 700 images; 75 charts, graphs, maps and games; four-and-a-half hours of voices, sounds and music from the period; and 45 minutes of film.

This led to another project, the George Mason University site History Matters, which did away with the narrative thread altogether in favor of what the author calls a “database” approach.

Having obtained the “immersiveness” of the encyclopedic approach, the author and his co-developers reached for another type of immersion: 3-D environments.

Working in Softimage, a wire-frame 3-D modelling program, and the flexible animation and navigation features offered by Flash, a prototype website called The Lost Museum: Exploring Antebellum Life and Culture finally went public in 2000. Entering the site, users encounter the Museum’s main room after the building has closed for the day, where they can engage with its various exhibits and attractions, experiencing its mixture of entertainment and education that would influence popular institutions up to the present (Figure 10).


At the same time, they also can look for evidence pointing to possible causes of the fire that destroyed the building in July 1865. Moving through the American Museum’s different environments and attractions, users search for one of a number of possible ‘arsonists’ who might have set the fire. These suspects represent some of the political organizations and social groups that contended for power, representation and rights in antebellum and Civil War America—for example, abolitionists (anti-slavery activists) and Copperheads (northern supporters of the Civil War Confederacy). In the process of searching for clues that point to these and other suspects, users also learn information about how P. T. Barnum and his museum expressed and exploited the compromises and conflicts of the mid-nineteenth century.

Ultimately, however, Brown felt the work failed in its pedagogical aims. Why? Because in the desire to make the game “immersive” and “seamless” they had also welded the pieces together, not allowing students to make new, unexpected uses of them:

As University of Chicago art historian Barbara Maria Stafford has pointed out, there are actually two types of digital information, or approaches to organizing collections of information: ‘one type lends itself to integration, the other to linkage.’ The distinction, Stafford argues, is crucial. The difference between systematically blending or collapsing individual characteristics (analogous to a seamless, immersive interactive virtual environment like The Lost Museum exploration) and maintaining separate entities that may be connected or rearranged (such as the fragmented multimedia scrapbook) has far-reaching repercussions. In the former case, the immersive, its ‘operations have become amalgamated and thus covert’, preventing users ‘from perceiving how combinations have been artificially contrived’, while the latter is ‘an assemblage whose man-made gatherings remain overt, and so available for public scrutiny’. In Stafford’s estimation, the immersive fosters passive spectatorship while the assemblage promotes active looking (Stafford 1997, p. 73).

And ultimately this is a problem that is acutely felt in 3-D environments: the pieces of the environment and the way they react to the participant gain their immersive quality only by being parts of a coherent whole that can’t be broken down into its constituent parts.

I don’t mean to imply that these issues couldn’t be overcome, to some extent, by brilliant design or user persistence. The many uses that Minecraft has been put to amaze me, for instance.

But Minecraft is not, I think, what the people promoting VR to teachers have in mind. The pitch most often involves exactly the type of locked-down virtual environments that invite inspection but resist deconstruction and rearrangement.

Something to think about as we hand out all those Google Cardboard glasses for virtual field trips, no?


Connected Copies, Part Two

This is a series of posts I’ve finally decided to write on the subject of what I call “connected copies”, an old pattern in software that is solving a lot of current problems. Part one of the series is here.

It’s really a bit of a brain dump. But it’s my first attempt to explain these concepts starting at a point most people can grasp (e.g. people who don’t have git or Named Data Networking or blockchain or federated wiki as a starting point). Hopefully publishing these somewhat disorganized presentations will help me eventually put together something more organized and suitable for non-technical audiences. Or maybe it will just convince me that this is too vast a subject to explain.


A Gym Full of People

Let’s think about connection a bit. I want to talk about how connection on the web happens. Because I don’t want to get too into the weeds I’m not going to talk about packets, or too much about routing either. I’m sure many people will think my story here is inadequate for understanding connection on the web, but I think it will work for our purposes.

Imagine you are in a gym full of people, and you’re only allowed to talk to the people next to you. The web sort of works like this.

First, you have to know the domain name or IP of the server that has the thing you want. As we noted in the last post, this is really crucial. The web, as designed, starts with “where,” not “what.”

For a simple example, let’s say we’re in that gym, everyone has a stack of books with them, and what you really want is a physical book: a copy of the Connie Willis classic To Say Nothing of the Dog. The first thing you have to know is where that book is.

There are huge implications to this fundamental fact of net architecture, but we’ll skip over them for the time being.

So the first thing you do is some research to find out who has this book. Your guess is that Jeff Bezos probably has it because he seems to have copies of *all* the books, as part of this little side business he runs called Amazon.

So you look to see whether any of the people around you are Jeff. But they’re not, so you say to a neighbor hey, here’s an envelope with a message in it — can you get this to Jeff? And in the envelope you put a message that says “Send me To Say Nothing of the Dog” and on the outside you write your name and address as the return address and Jeff’s name in the “To:” field.

In any case, the person you give the envelope to looks around to see if any of the people standing next to them are Jeff, and if they’re not, they figure out whom they can pass it to with the best chance of getting it to Jeff.

After five or six people pass it, it ends up at Jeff, and Jeff opens it and reads your “Send TSNOTD” message. So he makes a copy of that book and puts the copy in an envelope (or, for sticklers, puts pieces of the book in a series of separate envelopes) and then passes it back across the gym to me using the same method: “Are you Mike Caulfield’s computer? No? Hmmm. Can you get a message to it then?”

At the risk of boring you, I want to reiterate some points here.

First, before we ask for something on the web — Jennifer Jones, The Gutenberg Bible, the most recent census data — we have to know *where* it is. When what we really wanted was “access to Stanford’s SRI mainframe” or “a videocall connection to Utah” this was an easy problem. What you wanted and the server you wanted it from were inextricably tied together.

But as the Internet (and eventually the web) grew, this became a major problem. Most things we wanted were things where we didn’t know the location.

So the first major giants of the web sprouted up: search engines and directories. They could translate your real question (what you wanted) into the form the web could understand (where to get it from). Essentially, they’d figure out the address to put on the outside of the envelope, so we can mail our request.

You’ll also notice this scheme privileges big sites, because the hardest thing is knowing where things are, and big sites containing everything solve your “from what to where” problem.
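That translation step, from what you want to where it lives, can be sketched as a simple lookup table standing in for the search engine or DNS. All of the names and addresses below are made up for illustration:

```python
# A sketch of location-centric lookup: before you can ask for a thing,
# something must translate *what* you want into *where* it lives.
# The directory plays the role of a search engine or DNS; all names
# and addresses here are invented.

directory = {
    "To Say Nothing of the Dog": "amazon.example.com",
    "census data": "census.example.gov",
}

def fetch(what):
    # Step 1: translate the question (what) into an address (where).
    where = directory.get(what)
    if where is None:
        raise LookupError(f"No known location for {what!r}")
    # Step 2: only now can the envelope be addressed and routed.
    return f"GET {what!r} from {where}"

print(fetch("To Say Nothing of the Dog"))
```

If the directory doesn’t know a location for the thing you want, the request can’t even start, which is exactly the leverage the directory-keepers have.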

What Would a Content-Centric System Look Like?

There are alternate ways of thinking about networking that are based around content instead of location. These ideas are not just theoretical: they are the basis of things like torrenting, Named Data Networking, and Content-Centric Networking. The principles behind the idea were outlined, as many brilliant ideas were, by Ted Nelson many years ago.

To get a content-centric implementation of networking, we ask: “What if instead of asking people around us if they could get a message to Jeff, we instead asked them ‘Do you or anyone you know of have a copy of To Say Nothing of the Dog‘?”

And then each person turned to other people and asked that until either Jeff or someone else said “Yes, I have it right here, let me send a copy to you!”

On the positive side, you’d probably get the book from someone closer to you. Books would flow from people to people instead of always from Amazon to people (and payment systems could be worked out — this doesn’t assume that these would be free).

On the negative side, this sort of protocol would be pretty time-intensive. For every request, we’d have to ask everyone through this telephone game, they’d have to check for it, and so on. Even in the gym it’d be a disaster, never mind on the scale of the Internet.

But this is where connection comes in. Imagine that you had the Connie Willis book Doomsday Book in your own library, which is part of the same series. And let’s imagine that you open the cover of that book and inside the cover is the entire copy history:

“Antje H. copied this from Marcus Y. who copied it from Kin L. who copied it from Martha B. who copied it from Mike C. who copied it from Jeff B.”

Well, if these people have one book in the series, they might have another, right? So you start with a location based request. But you still ask the content question, because you don’t care where it comes from:

“Do you, or anyone you know, have a copy of To Say Nothing of the Dog?”

You notice that Martha B. is standing right next to you, so you ask her. It turns out that she does not have a copy of this anymore. But she used to have a copy, and she made a copy for Pedro P., so she asks him if he still has a copy. He does, so he makes a copy, and passes it to Martha who passes it to you.

You just got a copy of something without knowing where it was. Congratulations!

More importantly, you just saw the power of connected copies. Connected copies are copies that know things about other copies. The connected piece is the “list of previous owners” inside the cover of the book you already had, the knowledge that both TSNOTD and Doomsday Book are by the same author, and the system that allows you to act on that knowledge.
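The “copies that know things about other copies” idea fits in a few lines of code. A minimal sketch, using the invented people and shelves from the gym story:

```python
# A sketch of connected copies: each copy carries the chain of people it
# was copied through, so a related request can start with those holders
# instead of with a central location. All names are invented.

domesday_copy = {
    "title": "Doomsday Book",
    "author": "Connie Willis",
    "copied_via": ["Jeff B.", "Mike C.", "Martha B.", "Kin L.",
                   "Marcus Y.", "Antje H."],
}

# What each person we can reach currently has on their shelf.
shelves = {
    "Martha B.": [],                            # gave her copy away...
    "Pedro P.": ["To Say Nothing of the Dog"],  # ...to Pedro
}

def find_copy(wanted, copy_record, shelves):
    """Ask everyone in the copy chain first, then anyone else we can reach."""
    for person in copy_record["copied_via"]:
        if wanted in shelves.get(person, []):
            return person
    for person, shelf in shelves.items():
        if wanted in shelf:
            return person
    return None

print(find_copy("To Say Nothing of the Dog", domesday_copy, shelves))
# -> Pedro P.
```

The copy history turns a blind broadcast into a short list of promising leads, which is the whole trick.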

The Big Lesson

I think Content-Centric Networking (CCN) and its variants are very cool, and I hope they get traction. The Named Data Networking project, for example, was named as one of the NSF’s Future of the Internet projects, and feels to me like the early net, with a bunch of research universities running nodes on it. The CCNx work at PARC is fascinating. Maelstrom, a BitTorrent-based browser, is an interesting experiment as well. (h/t Bill Seitz for pointing me there)

But CCNx or NDN as an architecture for the entire web has an uphill climb, because it would destroy almost every current Silicon Valley business out there. Who needs Google when you can just broadcast what you want and the network finds it for you? Who needs Dropbox, when every node on the web can act like a BitTorrent node? Who needs server space or a domain, when locations no longer matter? Who needs Amazon’s distribution network when you can just ask for a film from your neighbors and pay for a decryption key?

So while these schemes will happen (see Part One for why CCN is inevitable), I don’t think they are coming in the next few years. But, importantly, you don’t have to rejigger the whole Internet to get better with content. You just have to think about the ways in which our location-centrism is contributing to the problems we are hitting, from the rise of Facebook, to the lack of findability of OER, to the Wikipedia Edit Wars.

In other words, the reason I spend time talking about the networking element above is that our location-centrism is so baked into our thinking about the web that it’s invisible to us. We think it’s very normal to have to know whose servers something is on in order to read it. We assume it’s good for things to be in one place and only one place, because we’ve structured the web in a way which doesn’t make use of multiple locations very well.

And crucially, we tend to think of documents as having definitive locations and versions (like the whiteboards) rather than being a set of things in various revisions with various annotations (like when I talk about a book like “The Great Gatsby” or play like “Hamlet”, which covers a wide range of slightly different objects). It’s that piece I want to talk about next.

Next Post: Ward Cunningham’s Idea of a ‘Chorus of Voices’ and Other Stuff I Mostly Got From Ward





Amazon, OER, and SoundCloud

So Amazon is getting into the Open Educational Resources market. What do we think about that? If you read these pages regularly, you can probably predict what I’ll say. It’s the wrong model.

For over a decade and a half we’ve focused on getting OER into central sites that everyone can find. Or developing “registries” to index all of them. The idea is if “everyone just knows the one place to go” OER will be findable.

This has been a disaster.

It’s been a disaster for two reasons. The first is that it assumes that learning objects are immutable single things, and that the evolution of the object once it leaves the repository is not interesting to us. And so Amazon thinks that what OER needs is a marketplace (albeit with the money removed). But OER are *living* documents, and what they need is an environment less like a marketplace and more like GitHub. (And that’s what we’re building at Wikity, a personal GitHub for OER.)

So that’s the first lesson: products need a marketplace but living things need an ecosystem. Amazon gives us yet another market.

The second mistake is that it centralizes resources, and therefore makes the existing ecosystem more fragile.

I talked about this yesterday in my mammoth post on Connected Copies. While writing that post I found the most amazing example demonstrating this, which I’ll repeat here.

Here’s a bookseller list from a particular bookshop from 1813:


How many of these works do you think are around today?

Answer: all of them. They all survived, hundreds of years. In fact, you can read almost all of them online today:



Now consider this. SoundCloud, a platform for music composers that hosts tens of millions of original works, is in trouble. If it goes down, how many of those works will survive? If history is a guide, very few. And the same will be true of Amazon’s new effort. People will pour effort into it, upload things and maintain them there, and then one day Amazon will pull the plug.

Now you might be thinking that what I’m proposing then is that we put the OER we create on our own servers. Power to the People! But that sucks as a strategy too, because we’ve tried that as well, and hugely important works disappear because someone misses a server payment, gets hacked, or just gets sick of paying ten bucks a month so that other people can use their stuff. We trade large cataclysmic events for a thousand tiny disasters, but the result is the same.

So I’m actually proposing something much more radical: that OER should be a system of connected copies. And because I finally got tired of people asking if they needed to drop acid to understand what I’m talking about, I’ve started to explain the problem and the solutions from the beginning over here. It’s readable and understandable, I promise. And it’s key to getting us out of this infinite “let’s make a repository” loop we seem to be in.

Honestly, it’s the product of spending a number of weekends thinking how best to explain this, and the results have been, well, above average:


I haven’t gotten to how copies evolve yet, but I actually managed to write something that starts at the beginning. I’ll try and continue with the middle, and maybe even proceed to an end, although that part has always eluded me.

See: Connected Copies, Part One


Connected Copies, Part One

This is a series of posts I’ve finally decided to write on the subject of what I call “connected copies”, an old pattern in software that is solving a lot of current problems.

It’s really a bit of a brain dump. But it’s my first attempt to explain these concepts starting at a point most people can grasp (e.g. people who don’t have git or Named Data Networking or blockchain or federated wiki as a starting point). Hopefully publishing these somewhat disorganized presentations will help me eventually put together something more organized and suitable for non-technical audiences. Or maybe it will just convince me that this is too vast a subject to explain.


The Party

Let’s imagine you are planning a birthday party for an office mate. And for both musical and technical reasons, let’s imagine you were doing this in 1983.

(Insert cassette of David Bowie’s Let’s Dance in 3… 2… 1…)

You’re planning a party for Sheila, and you want everyone to bring something. You talk to everybody and write up a list:

Cliff: Plastic silverware
Norm: Vanilla ice cream
Sam: Chocolate cake with cream cheese frosting
Rita: Card, streamers
Diane: Ice cream fixings

You write out this list, and you photocopy it. Each person gets a copy on Monday so they can prep Friday’s party.

Programmers call this sort of copying passing by value and it has a lot of advantages. Each person has a copy they can carry around. They can change that copy, annotate it. If Cliff loses his copy, Norm doesn’t lose his. And so on.

Unfortunately this party hits a snag. Norm is talking to Sheila and finds out that Sheila is lactose intolerant. On his copy he crosses out the ice cream, and replaces it with lactose-free cookies, making a note of her intolerance.

But because these sheets are copies, no one else ever sees his update. He comes with cookies, but Diane is still bringing ice cream fixings, and frankly we’re not quite sure whether that cream cheese frosting cake is such a good idea either. The party is a disaster.

Here’s another approach we could use for the party. Instead of making copies, we say “sign up on the kitchen whiteboard”.

If we run this with the office whiteboard, we get a different, and better, result. When Norm finds out about the sensitivity, he changes his item to the cookies and writes a note about the issue. Diane, seeing this, skips the ice cream fixings in favor of getting soda, and Sam reconsiders his cream-cheese cake, finding something more palatable. As each makes a change, they note it on the board.

Saying where the definitive list is (e.g. “The kitchen whiteboard”) instead of making copies of its value is what programmers call passing by reference. As you’ll note, it has certain advantages.
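The distinction is easy to sketch in code. Here's a toy version in Python (the names are mine, invented for the party example, not anything standard): a deep copy behaves like the photocopied sheets, while a second name bound to the same dictionary behaves like the shared whiteboard.

```python
import copy

# The party list everyone needs.
master = {"Norm": "Vanilla ice cream", "Diane": "Ice cream fixings"}

# Passing by value: everyone gets a photocopy (here, a deep copy).
norms_sheet = copy.deepcopy(master)
norms_sheet["Norm"] = "Lactose-free cookies"  # Norm fixes his own copy...
print(master["Norm"])  # ...but the master list still says "Vanilla ice cream"

# Passing by reference: everyone shares the one whiteboard.
whiteboard = master                           # another name for the same list
whiteboard["Norm"] = "Lactose-free cookies"   # Norm fixes the shared board...
print(master["Norm"])  # ...and now everyone sees "Lactose-free cookies"
```

Norm's edit to his photocopy never reaches Diane; his edit to the whiteboard reaches everyone at once.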

The Web Is a Series of Whiteboards

The kitchen whiteboard, of course, has one big disadvantage: you can only read it when you're in the kitchen. With the dawn of the Internet, that problem was largely solved. A web page is like a whiteboard you can read from anywhere. A calendar feed, a video link: these are all protocols that say, more or less, go find the set of values currently at this location.

The web is in fact built around this structure. It understands locations, not objects or copies.

What, for example, is the URL for “find me a copy of Moby Dick”? There is none. There is only a URL for “get me a copy of whatever is currently at the address that this one instance of Moby Dick is supposed to be at”.

It is important to note that the web doesn’t have to be structured this way. Torrents, for example, use a different approach. When you attempt to retrieve something through torrenting, your computer isn’t concerned with where the one true location is. Instead, your torrenting client asks for an object, and any locations that have that object can return pieces of it to you.
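The two addressing schemes can be sketched in a few lines. This is only a toy model (the URL, the in-memory "network", and all the function names are made up), but it captures the difference: the web hands you whatever is currently at a location, while a torrent-style system names content by a fingerprint of its bytes, which any peer can serve.

```python
import hashlib

# Location addressing (the web's model): a name points at a place,
# and you get whatever happens to be at that place right now.
locations = {"http://example.com/moby-dick.txt": "Call me Ishmael."}

def fetch_by_location(url):
    return locations.get(url)  # if the host takes it down, you get nothing

# Content addressing (the torrent-style model): the name *is* a
# fingerprint of the bytes, and any peer holding those bytes can serve them.
def content_id(text):
    return hashlib.sha256(text.encode()).hexdigest()

peers = {}

def host_copy(text):
    peers.setdefault(content_id(text), []).append(text)

def fetch_by_content(cid):
    copies = peers.get(cid, [])
    return copies[0] if copies else None  # any surviving copy will do

host_copy("Call me Ishmael.")  # two readers keep copies
host_copy("Call me Ishmael.")
```

Delete the single entry in `locations` and `fetch_by_location` returns nothing; delete one of the two peer copies and `fetch_by_content` still succeeds.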

Programmers have developed ingenious ways to make requests for locations to act sort of like requests for objects (proxy servers are a simple example) but at the heart of the web as we currently use it is this assumption: things are defined by the location from which they are served.

Again, this model is an amazing model, and one that should be embraced in a lot more circumstances. As Jon Udell has noted, people can use a system of passing by reference to maintain tight control over their personal information: if I share a Word document of my calendar with you, I lose the ability to update it; if I share my calendar feed, I retain central control.

But just as there are many circumstances where we are still passing around copies when we should be passing around references, there are also cases where we are not fully understanding some of the benefits copies provided.

In Which Whiteboards Go White

I used to have a hobby of collecting old reference works. Ancient textbooks, encyclopedias, atlases, books of trivia from the 1800s; I loved them all.


What’s amazing to me (and what’s been amazing for a long time) is how durable these books were. Not as physical books, mind you, but as an information strategy.

Here, for example, is a random page from a bookseller’s catalog of used works available in 1812 or so:


These are individual issues, available for resale, from what I can tell. But look at the dates. Even at that time you have stuff from 200 years prior being sold in that store. What do we have on the web that will survive in 200 years?

Ah, you say, it’s just survival bias. These are the works that survived; we don’t know what didn’t.

Except, no. We can take this set of books as our little experimental cohort, start the clock in 1812, and then see how many make it through to today.

And when we do that, we find the answer to how many survived is… all of them. In fact, take a look: most of these are now digitized on the web:

The only works I can’t find here are the two works in Latin, and I’m guessing that’s just due to them not being digitized (or perhaps even due to me botching the search terms).

In other words, on a random page of a random bookseller’s sell-sheet from 1812, every English language work identified has survived.

How do we compare today?

Well, here’s a screenshot of part of a page from Suck in one of its reboots around 2001.


If you don’t know what Suck was, you can read its history here, but what it was is kind of irrelevant in this example. What you need to know is that they aren’t linking out to fanzines here. They are linking out to major stories and major websites of the time.

There are about 30 links on the whole page. Three of them work. Three. One of the links that works goes to another page on Suck, one goes to the Honolulu Advertiser, and one to an article.

Go to any archived page from about the same time and you’ll find a similar pattern.

Random references to books that are 400 years old are stable. But references to web works that are 15 years old? You’ve got a one-in-ten chance that they’re still around. You’ve literally got a worse chance clicking a link on a fifteen-year-old article than you do scratching off a scratch ticket.

It’s like you were told the answer you need is on the work kitchen whiteboard, but when you go there, the whiteboard is erased. Or missing. Or filled with porn.

What the heck is going on?

Read Something, Host Something

The biggest reason that all the works referenced by the 1812 bookseller survive is that books are copies.

The original proofs of these books did not survive. The specific physical books that were in that bookseller’s shop that year are also less likely to have survived. But some of each of the maybe 1,000 books printed in these runs did survive.

Books have a weird mode of replication. The more people that read a book, the more people host that book in their personal bookshelves.

It’s not one-to-one, of course. I might lend you a book to read. You might read a book in the local library, or buy one secondhand. But the nature of the system of physical books is that if 10,000 people read a book, there will be 1,000 books hosted at various locations. And if more people start reading it, more books are produced and distributed to yet more locations to be hosted.

You read something, you host it. When you share with others, you share your own copy.

Compare this to the web model. In the web model, a million people can read a story and yet there is one single site that hosts it. The person that hosts it has to take on the expense of hosting it. When it no longer serves their interest they take the copy down, and that work is gone, forever.

It’s worth noting that the web model is less scalable than the book model. In the book model, the distribution and hosting function is spread out among current and former readers: as the book becomes popular the weight of hosting it and sharing it is dispersed. In the web system, something suddenly popular is a burden to the publisher, who can’t keep up with spikes in demand as they hit that specific URL.

You can imagine another system for the web fairly easily, where everyone has a small server of their own, the way everyone once had a bookshelf. When you share a link with a friend, you share a link to your copy of something on your bookshelf, hosted on your own server.

That friend reads the page and decides to share it with others. In doing so they make a copy to their server, and they share from there to more friends. As the work propagates out, server load is dispersed, and copies proliferate.
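The bookshelf model is easy to simulate. In this sketch (every name here is hypothetical), each person's server is a shelf of named works, and every act of sharing copies the work to the reader's own shelf, so the set of hosts grows with the readership:

```python
# Each person's server is a bookshelf of named works.
shelves = {"mike": {"connected-copies.html": "<html>the original page</html>"}}

def share(sender, reader, name):
    """Sharing hands the reader their own copy, hosted on their own server."""
    shelves.setdefault(reader, {})[name] = shelves[sender][name]

def hosts(name):
    """Who can serve this work right now?"""
    return [person for person, shelf in shelves.items() if name in shelf]

share("mike", "jon", "connected-copies.html")    # Jon reads it, hosts it
share("jon", "audrey", "connected-copies.html")  # and shares his copy onward
```

After two shares there are three hosts, and if the original server disappears, the downstream copies survive.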

These are some of the ideas underlying recent inventions such as the Interplanetary File System and federated wiki (although each of those is much more than just that idea).

Corporate Copies

These ideas may seem far out, but they’re not at all. In fact, because of the way web use has evolved, it’s inevitable that we’ll move in the direction of copies. It’s just a matter of whether we get the corporate version of this vision or a more radical reader-centered vision.

The corporate version is coming, without a doubt. Consider, for example, the day that Jessica Jones, a ten-episode Netflix series, was released. Over that weekend I’m guessing that several thousand people in my zip code binge-watched those episodes, all of them streaming it from Netflix’s servers, who knows how many miles away.

Which, of course, makes no sense.

I’ve downloaded all 10 episodes. But when my neighbor gets the urge to binge, my neighbor’s set-top box doesn’t go 20 feet to get those files. It goes thousands of miles to Netflix’s servers (or hundreds of miles to Netflix’s distributed content network).

The Internet wasn’t really built for this sort of thing. The assumption the Internet was built on was “I want a certain remote machine somewhere to do something, so send it a message and get the result.” When we look at content over the web, this translates to “Tell the remote computer to send me the file at this location or with this id.” But central to the protocol is that we know the where, and the where gives us the what.

This makes sense when we are trying to get specific information from specific computers. It starts to make less sense when we want a thing that could be in any of thousands of places, not a response from a particular machine.

Eventually this will fall away. Thirty-six percent of Internet traffic in the U.S. is Netflix shipping you videos you could more easily grab from your neighbors.

Of course, if corporations implement it, it will suck. Which is why there’s a whole bunch of others thinking through the implications of connected copies. And it’s a reason that you should be thinking about it too, if you want the next iteration of the web to be awesome and not suck.

The current system has us “know the where and get the what”. The next system will have us “know the what and get the where”. But thinking through the implications of that beyond Netflix delivery is what interests me, because it opens up a new way to think about copies entirely.

Next: Connected Copies, Part Two


SoundCloud and Connected Copies

SoundCloud, a music publishing site which holds millions of original works not held elsewhere (and over a hundred million works total), may be in trouble. And if it is in trouble, we’ll lose much of that music, forever.

The situation is, of course, ridiculous.

I know I sound like a broken, um, MP3 file on this, but there’s a simple solution to this problem: connected copies.

In such a scheme, I might share the file to SoundCloud (perhaps from my server, perhaps not). As other people share it, copies are made to their servers. These copies are connected by an ID and a protocol that allows fail-over: if the copy cannot be found on the SoundCloud server, it can be tracked down at the other locations (the “connected” in “connected copies” means that each copy points to the existence of other known copies).
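A fail-over like that is simple to sketch. Here's a toy resolver in Python (every name below is made up for illustration; real systems of this kind tend to use content hashes and distributed lookup rather than a hand-maintained list): each work has an ID and a list of known homes, and the resolver walks the list until a surviving copy answers.

```python
# What each server currently holds (the original host has gone dark).
servers = {
    "soundcloud.example": {},
    "alice.example": {"track-42": "audio-bytes"},
    "bob.example":   {"track-42": "audio-bytes"},
}

# The "connected" part: each work's ID maps to a list of other known homes.
known_homes = {"track-42": ["soundcloud.example", "alice.example", "bob.example"]}

def resolve(work_id):
    """Try each known home in turn; any copy that answers will do."""
    for home in known_homes.get(work_id, []):
        data = servers.get(home, {}).get(work_id)
        if data is not None:
            return home, data
    return None, None
```

Here the SoundCloud server comes up empty, so the resolver quietly falls over to Alice's copy, and the listener never notices the original is gone.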

Again, this is the underlying principle of lots of cool things not yet on people’s radar, like the Interplanetary File System, Named Data Networking, federated wiki, and Amber. It’s the future of the web, the next major evolution of it.

It’s worth thinking about.


P. S. While SoundCloud is still around you should check out my awesome playlist of new darkwave and neo-psychedelia tunes from little known artists.

An End to the Den Wars?

As you doubtless know by now, in 2010 I went and gave a plenary at UMW on the Liberal Arts in an era of Just-In-Time Learning, drank more than any human really should at the various parade of after-events, predicted the coming onslaught of xMOOCs in a drunken vision, and ended up going back to Jim Groom’s place at like 3 a.m. to drink some more.

It was at that point, when my liver was already in the process of packing its bags and moving to find a home in a saner individual, that Jim Groom asked me a fateful question.

“So,” he said, “What do you think of my den?”

I looked up and noticed we were sitting in a den.

“It’s nice,” I said.

I don’t remember much more about that night, but I did wake up back in the hotel, so that’s good.

It would be the last time I would drink to that level, because, no joke, I was hungover for three days. Welcome to age 40, time to stop acting like a teenager.

I forget the next time I saw Jim in person, but it was at one conference or another. Someone tried to introduce us, not knowing we go way back (2007?) and Jim said something to the effect of “Oh, I know Mike, we go way back. Last time he was over my house he insulted my den.”

I tried to piece together fragments of that night.

“I said it was nice,” I said.

“Yes,” he said, “Nice. You said it was ‘nice’. I worked hard on that den.”

This was the beginning of #denwars, which would come up anew at every conference.

But this week, heading to ELI, I realized there is another side to the story.

On the way back from that 2010 UMW event I took Amtrak, and I started writing an album called Double Phantasm. (The title is a pun involving John Lennon’s Double Fantasy album and the Derridean notion of hauntology. This is why I’m currently outselling Taylor Swift).

The first song on it, Queen of America, was loosely based on the events of that week. And until yesterday I thought that was the only song that bore any relation to that trip. The rest of the album is a science fiction concept album that details the dying romance of a man and a woman in a post-apocalyptic future. (Again, these hot song ideas are the secrets to my stunning success).

But giving it my first relisten in several years on the plane, I was suddenly struck by the last song, “Like a Great Big Meteor”. It’s the final scene of the four-song apocalyptic romance.

And there it was, in the setting of the final scene of that album. It was Jim’s den:

If all we have is now,
then what did we have then
as she sat across me nervously
in what used to be a den?

Here, in this verse, looking for a scene which would contrast former suburban opulence with post-apocalyptic decay I could find no better setting than Jim’s den. The world is over in this song, but what we miss is the den.

The song is rough, as all my songs are (I spend hours writing and tinkering with synth settings and textures, and then basically record a demo-level version of the song in thirty minutes, because I like writing and synths but hate production and singing). Sometimes this style of production works, and sometimes it doesn’t. It sort of half-works here.

But if you listen to my rough vocal on that track you can hear the raw and powerful emotions that Jim’s den stirred in me.

So there you go, Jim. I did say your den was “nice”. But that was only because the depths of my feelings for your den could only be expressed through art. Perhaps we can now lay the den wars to rest.