The Web is Broken and We Should Fix It

Via @roundtrip, this conversation from July:

There’s actually a pretty simple alternative to the current web. In federated wiki, when you find a page you like, you curate it to your own server (which may even be running on your laptop). That forms part of a named-content system, and if later that page disappears at the source, the system can find dozens of curated copies across the web. Your curation of a page guarantees the survival of the page. The named-content scheme guarantees it will be findable.

It also addresses scalability problems. Instead of linking you to someone’s page (and helping bring down their server) I curate it. You see me curate it and read my copy of that page. The page ripples through the system and the load is automagically dispersed throughout the system.
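To make the mechanics concrete, here is a toy sketch of curation plus name-based lookup. It is not Smallest Federated Wiki's actual code; the `Server` class, the `fork` and `find_copies` helpers, and the example hosts are all made up for illustration, and a real system would need discovery, versioning, and attribution on top of this.

```python
# Toy model of curation and named-content lookup.
# Every name here is hypothetical; this is not SFW's real API.

class Server:
    """One site in the federation, holding its own copies of pages by name."""

    def __init__(self, host):
        self.host = host
        self.pages = {}  # page name -> page text

    def publish(self, name, text):
        self.pages[name] = text

    def fork(self, name, source):
        """Curate a page: copy it from another server onto this one."""
        self.pages[name] = source.pages[name]


def find_copies(name, servers):
    """Return the hosts that still hold a page with this name."""
    return [s.host for s in servers if name in s.pages]


origin = Server("origin.example")
mine = Server("laptop.example")
yours = Server("reader.example")
federation = [origin, mine, yours]

origin.publish("tyranny-of-print", "Pages are data, not paper.")
mine.fork("tyranny-of-print", origin)  # I curate it from the source
yours.fork("tyranny-of-print", mine)   # you read my copy and curate that

del origin.pages["tyranny-of-print"]   # the page disappears at the source

print(find_copies("tyranny-of-print", federation))
# → ['laptop.example', 'reader.example']
```

Note that the second reader forked from my copy, not from the origin, which is the load-spreading part: the original server was only asked for the page once.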

It’s interesting that Andreessen can’t see the solution, but perhaps expected. Towards the end of a presentation I gave Tuesday with Ward Cunningham about federated content, Ward got into a righteous rant about the “Tyranny of Paper”. The idea he was digging at was that this model of a web page as a printed publication had caused us to ignore the unique affordances of digital content. We can iteratively publish, for example, and publish very unfinished sorts of things. We can treat content like data, and mash it up in new and exciting ways. We can break documents into smaller bits, and allow multiple paths through them. We can rethink what authorship looks like.

Or we can take the Andreessen path, which as Ted Nelson said in his moving but horribly misunderstood tribute to Doug Engelbart, is “the costume party of fonts that swept aside [Engelbart’s] ideas of structure and collaboration.”

The two visions are not compatible, and interestingly it’s Andreessen’s work which locked us into the latter vision. Your web browser requests one page at a time, and the layout features of MOSAIC>Netscape guarantee that you will see that page as the server has determined. The model is not one of data (eternally fluid, to be manipulated like Engelbart’s grocery list) but of the printed page, permanently fixed.

And ultimately this gives us the server-centric version of the web that we take for granted, like fish in water. The server containing the data — Facebook or Blogger, but also WordPress — controls the presentation of the data, controls what you can do with it. It’s also the One True Place the page shall live — until it disappears. We’re left with RSS hacks and a bewildering array of API calls to accomplish the simplest mashups. And that’s because we know that the author gets to control the printed page — its fonts, its layout, its delivery, its location, its future uses.

The Tyranny of Print led to us getting pages delivered as Dead Data, which led to the server-centric vision we now have of the web. The server-centric vision led to a world that looked less like BitTorrent and more like Facebook. There’s an easy way out, but I doubt anyone in Silicon Valley wants to take it.


Ward Cunningham’s explanation of federation (scheme on right) — one client can mash together products of many servers. Federation puts the client, not the server, in control. 
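As a rough illustration of the client being in control, the sketch below has a client take JSON from two servers and assemble its own page from the pieces. The `story` list of items and the `mash` helper are assumptions made for this example, not a description of how federated wiki really stores or merges pages.

```python
import json

# Hypothetical JSON responses from two different servers.
# The "title"/"story" layout is an assumed page format for this sketch.
response_a = json.dumps({
    "title": "Federation",
    "story": [{"text": "Clients choose what to show."}],
})
response_b = json.dumps({
    "title": "Dead Data",
    "story": [{"text": "A printed page cannot be remixed."}],
})


def mash(title, *responses):
    """Build a new page on the client from items served by several servers."""
    story = []
    for raw in responses:
        page = json.loads(raw)
        story.extend(page["story"])  # keep each server's items, in the order given
    return {"title": title, "story": story}


page = mash("My view", response_a, response_b)
print([item["text"] for item in page["story"]])
```

Neither server decides what the final page looks like; the client picks the title, the sources, and the order.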


It seems we got front-paged at Hacker News. So for those that don’t follow the blog I thought I’d add a one minute video to show how Smallest Federated Wiki uses a combination of JSON, NodeJS, and HTML5 to accomplish the above model. This vid is just about forking content between two different servers, really basic. Even neater stuff starts to happen when you play with connecting pages via names and people via edit journals, but leave that to another day.

This and more videos and explanations are available at the SFW tag.

21 thoughts on “The Web is Broken and We Should Fix It”

  1. I’m with you brother Mike, and have done more than my share of crying about dead (sometimes deceased sometimes killed) links.

    Yet I wonder about the federations being uneven. Would people in this new web just be forking pages all the time as curation? conservation? Do people adopt interest areas? Do we all become cottage industry internet archives?

    Or are you saying it becomes part of our flow as much as bookmarking, which one could say SFW is as a by-product?

    Does that mean federating all the media? That’s one of the failings of the Internet Archive, much as I love it: it is archiving the HTML but the media needs to be on the old site (or am I wrong?)

    And it’s not your solution you provoke, but why do browsers/web server error codes not generate possible links for busted ones back to the Internet Archive? Surely a dude that made a web browser knows how to build this.

    • The Internet Archive collects all files that can be fetched by a web crawler. So (after respecting noindex, nofollow, and disallow requests) they save everything that’s linked, including media files, and files returned by server-side scripts as the result of direct links. But they can’t get the actual server-side scripts or databases.

    • It just occurred to me that you probably meant streaming media coming from another site like YouTube. I think you’re right; they don’t save that. Maybe it works if the media link is still good, but I’m not even sure about that.

  2. Forking behavior is a social phenomenon, and therefore unpredictable. But at a basic level, we’d expect you’d fork most things to which you linked, one degree out. You’d also curate the sort of page that you currently feed to Twitter or Delicious. The other stuff you just read.

    Between those behaviors, that’s probably enough to keep the pages alive that we need alive.

    The data structure of SFW does a pretty neat thing with images, and embeds them in the page as JSON strings. Right now that comfortably supports smaller lower quality images that are at least readable (up to 15K). The idea behind that is that there will eventually be an asset pipeline: the small compressed image will always stay with the page but will also provide a link to an additional out of page resource.

    Linked YouTube vids and the like are more problematic, but Rome wasn’t built in a day I guess.

    I agree that in the meantime we should be building some hacks on top of the current system. It’s amazing to me how we tolerate the 404s as just something normal, which is why Bret Victor’s tweet hit home for me. Where else would this system be considered OK?

  3. I hope to start an online community of lifelong learners to trade notes and resources, so you can pick and choose the ones you like best to expand and refine your own ideas. The federated wiki sounds like exactly the kind of tool I’m looking for. So I’m glad I found this post. Thank you.

    To answer another question of CogDog’s: Yes, it should be easy for a browser to try the Wayback Machine when a page is missing. There are at least two Chrome extensions that say they do that. Firefox plugins might exist, too.

    • Thanks for the tips, but those extensions only send you to Wayback versions of a page you are looking at (and the reviews on one say it only takes you to the home page).

      Extensions/plugins not needed, I have an old school wayback bookmarklet that does the job


      easy peasy

      • Cool, thanks. For anyone else who hasn’t used bookmarklets, you can just “Add” a bookmark, paste the javascript in as a URL, and name it anything you want. ***BUT*** Replace those smart quotes with plain text single quotes, or it won’t work. (Took me a while to notice. I thought Chrome was being stubborn. No loss, since I found some bookmarklet tutorials as I was looking.)

  4. What is the legal status of such curating? By copying a website and serving it to other people, would one be violating copyright?

    • Words evolve as they move to new contexts.

      I am pretty sure when you drive your car you are not “urging an animal forward, as with oxen”. When you call someone on your phone you are not shouting to them to gain attention. And when you “browse” the web, you may in fact not be “browsing” in the traditional sense at all. Web pages are in fact not pages, feeds are not feeds, memes are only examples of a specific type of meme. MOOCs are not open. Bookmarks have nothing to do with books.

      In the world of the social web, curate has come to mean the selection, arrangement, storage, indexing, presentation, and redistribution of matter of interest found on the web. It has meant that for at least 10 years. That is, in fact, what you do with federated wiki. Time to move on, right?

  5. I would love to see a prototype of a web browser and hyperlink scheme based on torrent hashes so that instead of using HTTP you could use the BT protocol. That way pages would stick around for years even when the original author deletes them. It’s not as far-fetched as it sounds: central seeding servers would make it lightning fast.

