Hapgood

Mike Caulfield's latest web incarnation. Networked Learning, Open Education, and Online Digital Literacy

August 14, 2014

The Web is Broken and We Should Fix It

Via @roundtrip, this conversation from July:

There’s actually a pretty simple alternative to the current web. In federated wiki, when you find a page you like, you curate it to your own server (which may even be running on your laptop). That forms part of a named-content system, and if later that page disappears at the source, the system can find dozens of curated copies across the web. Your curation of a page guarantees the survival of the page. The named-content scheme guarantees it will be findable.
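To make the lookup idea concrete, here is a minimal sketch of how a named-content system might fall back to curated copies when the source goes away. Everything in it is a hypothetical stand-in: the in-memory `servers` map, the `findPage` function, the `.example` hosts, and the page name are assumptions for illustration, not federated wiki's actual code or API.

```javascript
// Hypothetical in-memory model: each server maps page names to page JSON.
// None of these names come from federated wiki itself.
const servers = new Map([
  ['origin.example', new Map()], // the original copy has disappeared
  ['alice.example', new Map([['the-web-is-broken', { title: 'The Web is Broken', story: [] }]])],
  ['bob.example', new Map([['the-web-is-broken', { title: 'The Web is Broken', story: [] }]])],
]);

// Look a page up by name: try the preferred server first, then every other
// server that might hold a curated copy.
function findPage(name, preferred, allServers) {
  const others = [...allServers.keys()].filter((host) => host !== preferred);
  for (const host of [preferred, ...others]) {
    const pages = allServers.get(host);
    if (pages && pages.has(name)) {
      return { host, page: pages.get(name) };
    }
  }
  return null; // no copy survives anywhere
}

const found = findPage('the-web-is-broken', 'origin.example', servers);
console.log(found ? `served from ${found.host}` : 'page lost');
```

The point of the sketch is only that the page is addressed by its name rather than by a single location, so any surviving copy can answer the request.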

It also addresses scalability problems. Instead of linking you to someone’s page (and helping bring down their server) I curate it. You see me curate it and read my copy of that page. The page ripples through the system and the load is automagically dispersed throughout the system.

It’s interesting that Andreessen can’t see the solution, but perhaps expected. Towards the end of a presentation I gave Tuesday with Ward Cunningham about federated content, Ward got into a righteous rant about the “Tyranny of Paper”. And the idea he was digging at was that this model of a web page as a printed publication had caused us to ignore the unique affordances of digital content. We can iteratively publish, for example, and publish very unfinished sorts of things. We can treat content like data, and mash it up in new and exciting ways. We can break documents into smaller bits, and allow multiple paths through them. We can rethink what authorship looks like.

Or we can take the Andreessen path, which, as Ted Nelson said in his moving but horribly misunderstood tribute to Doug Engelbart, is “the costume party of fonts that swept aside [Engelbart’s] ideas of structure and collaboration.”

The two visions are not compatible, and interestingly it’s Andreessen’s work which locked us into the latter vision. Your web browser requests one page at a time, and the layout features of MOSAIC>Netscape guarantee that you will see that page as the server has determined. The model is not one of data, eternally fluid and to be manipulated like Engelbart’s grocery list, but of the printed page, permanently fixed.

And ultimately this gives us the server-centric version of the web that we take for granted, like fish in water. The server containing the data — Facebook or Blogger, but also WordPress — controls the presentation of the data, controls what you can do with it. It’s also the One True Place the page shall live — until it disappears. We’re left with RSS hacks and a bewildering array of API calls to accomplish the simplest mashups. And that’s because we know that the author gets to control the printed page — its fonts, its layout, its delivery, its location, its future uses.

The Tyranny of Print led to us getting pages delivered as Dead Data, which led to the server-centric vision we now have of the web. The server-centric vision led to a world that looked less like BitTorrent and more like Facebook. There’s an easy way out, but I doubt anyone in Silicon Valley wants to take it.

Ward Cunningham’s explanation of federation (scheme on right) — one client can mash together products of many servers. Federation puts the client, not the server, in control.
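As a rough illustration of what "the client in control" could mean, here is a small sketch in which the client receives page data from several servers and decides for itself how to combine it. The `fetched` array, the `mashup` function, the hosts, and the page shape are all assumptions made up for this example; a real client would get the JSON over the network.

```javascript
// Hypothetical page data as it might arrive from two different servers.
const fetched = [
  {
    host: 'ward.example',
    page: { title: 'Federation', story: [{ type: 'paragraph', text: 'The client is in control.' }] },
  },
  {
    host: 'mike.example',
    page: { title: 'Tyranny of Paper', story: [{ type: 'paragraph', text: 'Pages are data, not print.' }] },
  },
];

// The client, not any one server, decides how to combine and present the items.
function mashup(results) {
  return results.flatMap(({ host, page }) =>
    page.story.map((item) => ({ source: host, from: page.title, text: item.text }))
  );
}

for (const item of mashup(fetched)) {
  console.log(`[${item.source}] ${item.from}: ${item.text}`);
}
```

Because the servers hand over data rather than a finished layout, the same items could just as easily be filtered, reordered, or merged into a page of the reader's own.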

UPDATE:

It seems we got front-paged at Hacker News. So for those that don’t follow the blog I thought I’d add a one-minute video to show how Smallest Federated Wiki uses a combination of JSON, NodeJS, and HTML5 to accomplish the above model. This vid is just about forking content between two different servers, really basic. Even neater stuff starts to happen when you play with connecting pages via names and people via edit journals, but I’ll leave that for another day.
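For readers who prefer code to video, here is a minimal sketch of what forking a page between two servers might look like when a page is just JSON. The page shape (a `title`, a `story`, and a `journal` of edits), the `forkPage` function, and the `myServer` store are assumptions for illustration, not the actual Smallest Federated Wiki implementation.

```javascript
// Illustrative page shape: a title, a story (the content), and a journal
// (the edit history). Field names here are assumptions for this sketch.
const original = {
  title: 'Link Rot',
  story: [{ type: 'paragraph', id: 'a1', text: 'Pages disappear when servers do.' }],
  journal: [{ type: 'create', date: 1407974400000 }],
};

// Forking copies the page and records in the journal where it came from.
function forkPage(page, fromSite, date) {
  const copy = JSON.parse(JSON.stringify(page)); // deep copy via a JSON round trip
  copy.journal.push({ type: 'fork', site: fromSite, date });
  return copy;
}

const myServer = new Map(); // hypothetical local store, keyed by title
const forked = forkPage(original, 'origin.example', 1407978000000);
myServer.set(forked.title, forked);

console.log(forked.journal.map((entry) => entry.type).join(' > '));
console.log(original.journal.length); // the source page is left untouched
```

Keeping the fork in the journal is what lets a copy remember its lineage even after the source server is gone.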

This and more videos and explanations are available at the SFW tag.

Mike Caulfield

federated wiki, federation, sfw

Posted by:

mikecaulfield

The infolit guy.

21 responses to “The Web is Broken and We Should Fix It”

CogDog

August 14, 2014 at 10:29 am

I’m with you brother Mike, and have done more than my share of crying about dead (sometimes deceased sometimes killed) links.

Yet I wonder about the federations being uneven. Would people in this new web just be forking pages all the time as curation? Conservation? Do people adopt interest areas? Do we all become cottage industry internet archives?

Or are you saying it becomes part of our flow as much as bookmarking, which one could say SFW is as a by product.

Does that mean federating all the media? That’s one of the failings of the Internet Archive, much as I love it: it is archiving the HTML but the media needs to be on the old site (or am I wrong?)

And it’s not your solution you provoke, but why do browsers/web server error codes not generate possible links for busted ones back to the Internet Archive? Surely a dude that made a web browser knows how to build this.

Reply
1. daveh70
  
  August 14, 2014 at 9:05 pm
  
  Archive.org collects all files that can be fetched by a web crawler. So (after respecting noindex, nofollow, and disallow requests) they save everything that’s linked, including media files, and files returned by server-side scripts as the result of direct links. But they can’t get the actual server-side scripts or databases.
  
  Reply
2. daveh70
  
  August 15, 2014 at 7:00 am
  
  It just occurred to me that you probably meant streaming media coming from another site like YouTube. I think you’re right; they don’t save that. Maybe it works if the media link is still good, but I’m not even sure about that.
  
  Reply
mikecaulfield

August 14, 2014 at 11:02 am

Forking behavior is a social phenomenon, and therefore unpredictable. But at a basic level, we’d expect you’d fork most things to which you linked, one degree out. You’d also curate the sort of page that you currently feed to Twitter or Delicious. The other stuff you just read.

Between those behaviors, that’s probably enough to keep the pages alive that we need alive.

The data structure of SFW does a pretty neat thing with images, and embeds them in the page as JSON strings. Right now that comfortably supports smaller lower quality images that are at least readable (up to 15K). The idea behind that is that there will eventually be an asset pipeline: the small compressed image will always stay with the page but will also provide a link to an additional out-of-page resource.
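Here is a rough sketch of that idea, assuming Node's `Buffer` for the base64 step. The `imageItem` function, the `full` field, the asset URL, and treating 15K as a limit on the raw bytes are all assumptions for illustration; the real SFW item format may differ.

```javascript
const MAX_INLINE_BYTES = 15 * 1024; // the "up to 15K" figure, read here as raw bytes (an assumption)

// Build a hypothetical image item: the small image travels inside the page
// JSON as a base64 data URL; a larger version, if any, is only linked.
function imageItem(bytes, mimeType, fullSizeUrl) {
  if (bytes.length > MAX_INLINE_BYTES) {
    throw new Error('image too large to inline; compress it first');
  }
  const item = {
    type: 'image',
    url: `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`,
  };
  if (fullSizeUrl) item.full = fullSizeUrl; // out-of-page resource
  return item;
}

const tiny = Uint8Array.from([0xff, 0xd8, 0xff, 0xd9]); // stand-in bytes, not a real photo
const item = imageItem(tiny, 'image/jpeg', 'https://assets.example/photo.jpg');
console.log(JSON.stringify(item));
```

Since the image is just another string in the page JSON, forking the page carries the picture along with it.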

Linked YouTube vids and the like are more problematic, but Rome wasn’t built in a day I guess.

I agree that in the meantime we should be building some hacks on top of the current system. It’s amazing to me how we tolerate the 404s as just something normal, which is why Bret Victor’s tweet hit home for me. Where else would this system be considered OK?

Reply
The Future of the Internet? (and how it works as a learning tool) | þoht-hord

August 14, 2014 at 2:30 pm

[…] Over the last couple of days, several items have surfaced in my social feeds about how the Internet doesn’t work and what to do about that. First came Ethan Zuckerman’s description of advertising as “the Internet’s original sin,” which was inspired to some degree by Maciej Ceglowski’s The Internet with a Human Face. Today Mike Caulfield wrote about link rot and data impermanence. […]

Reply
daveh70

August 14, 2014 at 9:43 pm

I hope to start an online community of lifelong learners to trade notes and resources, so you can pick and choose the ones you like best to expand and refine your own ideas. The federated wiki sounds like exactly the kind of tool I’m looking for. So I’m glad I found this post. Thank you.

To answer another question of CogDog’s: Yes, it should be easy for a browser to try the Wayback Machine when a page is missing. There are at least two Chrome extensions that say they do that. Firefox plugins might exist, too. https://chrome.google.com/webstore/search/archive.org%20wayback%20machine

Reply
1. CogDog
  
  August 14, 2014 at 10:04 pm
  
  Thanks for the tips, but those extensions only send you to Wayback versions of a page you are looking at (and the reviews on one say it only takes you to the home page).
  
  Extensions/plugins not needed, I have an old school wayback bookmarklet that does the job
  
  javascript:location.href=’http://web.archive.org/web/*/’+document.location.href;
  
  easy peasy
  
  Reply
  1. daveh70
    
    August 14, 2014 at 11:43 pm
    
    Cool, thanks. For anyone else who hasn’t used bookmarklets, you can just “Add” a bookmark, paste the javascript in as a URL, and name it anything you want. ***BUT*** Replace those smart quotes with plain text single quotes, or it won’t work. (Took me a while to notice. I thought Chrome was being stubborn. No loss, since I found some bookmarklet tutorials as I was looking.)
2. J
  
  August 23, 2014 at 1:02 pm
  
  https://addons.mozilla.org/en-US/firefox/addon/resurrect-pages/
  
  is pretty great.
  
  Reply
  1. CogDog
    
    August 23, 2014 at 1:21 pm
    
    Thanks for that link– woah, this might be a good reason to get back to Firefox.
PurpleRails

August 23, 2014 at 11:46 am

I’ve been thinking about this for a while and https://www.purplerails.com/ is my take on a solution to this problem. Pages are automatically saved in the background while you browse as usual.

Reply
thought

August 23, 2014 at 11:54 am

What is the legal status of such curating? By copying a website and serving it to other people, would one be violating copyright?

Reply
R.S.

August 23, 2014 at 12:08 pm

Please stop misusing the word “curate”.

Thanks.

Reply
1. mikecaulfield
  
  August 23, 2014 at 3:37 pm
  
  Words evolve as they move to new contexts.
  
  I am pretty sure when you drive your car you are not “urging an animal forward, as with oxen”. When you call someone on your phone you are not shouting to them to gain attention. And when you “browse” the web, you may in fact not be “browsing” in the traditional sense at all. Web pages are in fact not pages, feeds are not feeds, memes are only examples of a specific type of meme. MOOCs are not open. Bookmarks have nothing to do with books.
  
  In the world of the social web, curate has come to mean the selection, arrangement, storage, indexing, presentation, and redistribution of matter of interest found on the web. It has meant that for at least 10 years. That is, in fact, what you do with federated wiki. Time to move on, right?
  
  Reply
Neil Ellis (@neilellis)

August 23, 2014 at 12:38 pm

I would love to see a prototype of a web browser and hyperlink scheme based on torrent hashes so that instead of using HTTP you could use the BT protocol. That way pages would stick around for years even when the original author deletes them. It’s not as far-fetched as it sounds; central seeding servers would make it lightning fast.

Reply
Dave

August 23, 2014 at 10:49 pm

ccnx
https://www.ccnx.org/what-is-ccn/

Reply
You Are Not the Hero of This Story | Hapgood

March 18, 2017 at 10:37 am

[…] a huge fan of peer-to-peer sharing systems. The whole idea of federated content takes much of its inspiration from platforms like BitTorrent, and I’ve repeatedly argued here […]

Reply

About Me

My current work focuses on how students and citizens can use AI for “co-reasoning”, learning to tap into the power of LLMs to both model and critique arguments.

As creator of the SIFT methodology, I have taught thousands of teachers and students how to verify claims and sources through my workshops. My book with Sam Wineburg, Verified: How to Think Straight, Get Duped Less, and Make Better Decisions about What to Believe Online, was published by the University of Chicago Press in November 2023.
