Reclaim Hackathon

Kin and Audrey have already written up pretty extensive summaries about the Reclaim event in Los Angeles. I won’t add much.

Everything was wonderful, and I hope I don’t upset people by choosing one thing over another. But there were a few things for me that stood out.

Seeing the Domain of One’s Own development trajectory. I’ve seen this at different points, but the user experience they have for the students at this point is pretty impressive.

JSON API directories. So I really like JSON, as does Kin. But at dinner on Friday he was proposing that in the future, the same way that we query a company for its APIs, we would be able to query a person. I’d honestly never thought of this before. This is not an idea like OAuth, where I delegate some power/data exchange between entities. This is me making a call to the authoritative Mike Caulfield API directory and saying, hey how do I set up a videochat? Or where does Mike post his music? And pulling back from that an API call directly to my stuff. This plugged into the work he demonstrated the next day, where he is painstakingly finding all the services he uses, straight down to Expedia, and logging their APIs. I like the idea of hosted lifebits best, but in the meantime this idea of at least owning a directory of your APIs to stuff in other places is intriguing.
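To make the idea a bit more concrete, here is a minimal sketch of what querying a personal API directory might look like. Everything in it is hypothetical: the directory layout, the field names, and the example.com URLs are my own assumptions for illustration, not anything Kin demonstrated.

```python
import json

# Hypothetical personal API directory. The structure, field names,
# and URLs are invented for illustration only.
DIRECTORY_JSON = """
{
  "owner": "Mike Caulfield",
  "apis": [
    {"purpose": "videochat", "service": "ExampleChat",
     "endpoint": "https://chat.example.com/api/rooms"},
    {"purpose": "music", "service": "ExampleTunes",
     "endpoint": "https://tunes.example.com/api/users/mike/tracks"}
  ]
}
"""


def find_api(directory, purpose):
    """Return the first directory entry matching a purpose, or None."""
    for entry in directory["apis"]:
        if entry["purpose"] == purpose:
            return entry
    return None


directory = json.loads(DIRECTORY_JSON)

# "Where does Mike post his music?"
music = find_api(directory, "music")
print(music["endpoint"])
```

The point of the sketch is that the directory is the thing the person owns. The services it points to can live anywhere.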

Evangelism Know-how. I worked for a while at a Services-Oriented Architecture obsessed company as an interface programmer (dynamically building indexes to historical newspaper archives using JavaScript and Perl off of API-returned XML). I’m newer to GitHub, but have submitted a couple pull requests through it already. So I didn’t really need Kin’s presentation on APIs or GitHub. But I sat and watched it because I wanted to learn how he did presentations. And the thing I constantly forget? Keep it simple. People aren’t offended by getting a bit of education about what they already know, and the people for whom it’s new need you to take smaller steps. As an example, Kin took the time to show how JSON can be styled into most anything. On the other hand, I’ve been running around calling SFW a Universal JSON Canvas without realizing people don’t understand why delivering JSON is radically different (and more empowering) than delivering HTML (or worse, HTML + site chrome).
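For anyone who hasn’t seen why delivering JSON differs from delivering HTML, a small sketch may help. The payload and its field names are made up for illustration. The same data gets rendered two different ways by whoever receives it, which is not something you can easily do when the sender has already baked in the markup.

```python
import json
from html import escape

# A made-up payload; the field names are assumptions for illustration.
payload = json.loads(
    '{"title": "Office Hours", "items": ["Mon 2-4", "Thu 10-12"]}'
)


def as_html(data):
    """Render the data as an HTML fragment."""
    items = "".join(f"<li>{escape(item)}</li>" for item in data["items"])
    return f"<h2>{escape(data['title'])}</h2><ul>{items}</ul>"


def as_text(data):
    """Render the same data as plain text, e.g. for an email."""
    lines = [data["title"].upper()]
    lines += [f"- {item}" for item in data["items"]]
    return "\n".join(lines)


print(as_html(payload))
print(as_text(payload))
```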

Known. I saw Known in Portland, so it wasn’t new to me. But it was neat to see the reaction to it here. As Audrey points out, much of day two was getting on Known.

Smallest Federated Wiki. Based on some feedback, I’ve made a decision about how I am going to present SFW from now on. I am astounded by the possibilities of SFW at scale, but you get into unresolvable disagreements about what a heavily federated future would look like. Why? Because we don’t have any idea. I believe that for the class of documents we use most days, stressing out about whether you have the best version of a document will seem as quaint as stressing out about the number of results Google returns on a search term (remember when we used to look at the number of results and freak out a bit?). But I could be absolutely and totally wrong. And I am certain to be wrong in a lot of *instances*: it may be for your use case that federation is a really, really bad idea. Federation isn’t great for policy docs, tax forms, or anything that needs to be authoritative, for instance.

So my newer approach is to start from the document angle. Start with the idea that we need a general tool to store our data, our processes, our grocery lists, our iterated thoughts.  Anything that is not part of the lifestream stuff that WordPress does well. The stuff we’re now dropping into Google Docs and emails we send to ourselves. The “lightly-structured data” that Jon Udell rightly claims makes up most of our day. What would that tool have to look like?

  • It’d have to be general purpose, not single purpose (more like Google Docs than Remember the Milk)
  • It’d have to support networked documents
  • It’d have to support pages as collections of sequenced data, not visual markup
  • It’d have to have an extensible data format and functionality via plugins
  • It’d have to have some way to move your data through a social network
  • It’d have to allow the cloning and refactoring of data across multiple sites
  • It’d have to have rich versioning and rollback capability
  • It’d have to be able to serve data to other applications (in SFW, done through JSON output)
  • It’d have to have a robust flexible core that established interoperability protocols while allowing substantial customization (e.g. you can change what it does without breaking its communication with other sites).

Of those, the idea of a document as a collection of JSON data is pretty important, and the idea of federation as a “document-centered network” is amazing in its implications. But I don’t need to race there. I can just start by talking about the need for a general use, personal tool like this, and let the networking needs emerge from that. At some point it will turn out that you can replace things like wikis with things like this or not, but ultimately there’s a lot of value you get before that.
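Here is a rough sketch of what “a page as a collection of sequenced data” can mean in practice. The field names are assumptions, loosely modeled on the way SFW stores a page as a titled list of typed items; treat it as an illustration rather than the actual SFW format.

```python
import json

# A page as a sequence of typed items rather than visual markup.
# Field names are assumptions loosely modeled on SFW's page JSON.
page = {
    "title": "Grocery List",
    "story": [
        {"id": "a1", "type": "paragraph", "text": "For the weekend."},
        {"id": "b2", "type": "data", "rows": [["eggs", 12], ["milk", 1]]},
        {"id": "c3", "type": "paragraph", "text": "Check the pantry first."},
    ],
}


def items_of_type(page, item_type):
    """Pull out every item of one type, preserving page order."""
    return [item for item in page["story"] if item["type"] == item_type]


def move_item(page, item_id, new_index):
    """Return a copy of the page with one item moved to a new position."""
    story = [item for item in page["story"] if item["id"] != item_id]
    moved = next(item for item in page["story"] if item["id"] == item_id)
    story.insert(new_index, moved)
    return {**page, "story": story}


# Another application can consume the data item without scraping markup.
print(json.dumps(items_of_type(page, "data")[0]["rows"]))

# Refactoring is a list operation, not a cut-and-paste of formatted text.
reordered = move_item(page, "c3", 0)
print([item["id"] for item in reordered["story"]])
```

Because each item is addressable on its own, cloning, reordering, and serving data to other applications all become simple operations on a list.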







Gruber: “It’s all the Web”

Tim Owens pointed me to this excellent piece by John Gruber. Gruber has been portrayed in the past as a bit too in the Apple camp; but I don’t think anyone denies he’s one of the sharper commentators out there on the direction of the Web. He’s also the inventor of Markdown, the world’s best microformat, so massive cred there as well.

In any case, Gruber gets at a piece of what I’ve been digging at the past few months, but from a different direction. Responding to a piece on the “death of the mobile web”, he says:

I think Dixon has it all wrong. We shouldn’t think of the “web” as only what renders inside a web browser. The web is HTTP, and the open Internet. What exactly are people doing with these mobile apps? Largely, using the same services, which, on the desktop, they use in a web browser. Plus, on mobile, the difference between “apps” and “the web” is easily conflated. When I’m using Tweetbot, for example, much of my time in the app is spent reading web pages rendered in a web browser. Surely that’s true of mobile Facebook users, as well. What should that count as, “app” or “web”?

I publish a website, but tens of thousands of my most loyal readers consume it using RSS apps. What should they count as, “app” or “web”?

I say: who cares? It’s all the web.

I firmly believe this is true. But why does it matter to us in edtech?

  • Edtech producers have to get out of browser-centrism. Right now, mobile apps are often dumbed-down versions of a more functional web interface. But the mobile revolution isn’t about mobile, it’s about hybrid apps and the push of identity/lifestream management up to the OS. As hybrid apps become the norm on more powerful machines we should expect to start seeing the web version becoming the fall-back version. This is already the case with desktop Twitter clients, for example: you can do much more with Tweetdeck than you can with the Twitter web client, because once you’re freed from the restrictions of running everything through the same HTML-based, cookie-stated, security-constrained client you can actually produce really functional interfaces and plug into the affordances of the local system. I expect people will still launch many products to the web, but hybrid on the desktop will become a first class citizen.
  • It’s not about DIY, it’s about hackable worldware. You do everything yourself to some extent. If you don’t build the engine, you still drive the car. If you don’t drive the car, you still choose the route. DIY is a never-ending rabbit-hole as a goal in itself. The question for me is not DIY, but the old question of educational software vs. worldware. Part of what we are doing is giving students strategies they can use to tackle problems they encounter (think Jon Udell’s “Strategies for Internet citizens”). What this means in practice is that they must learn to use common non-educational software to solve problems. In 1995, that worldware was desktop software. In 2006, that worldware was browser-based apps. In 2014, it’s increasingly hybrid apps. If we are committed to worldware as a vision, we have to engage with the new environment. Are some of these strategies durable across time and technologies? Absolutely. But if we believe that, then surely we can translate our ideals to the new paradigm.
  • Open is in danger of being left behind. Open education mastered the textbook just as the battle moved into the realm of interactive web-based practice. I see the same thing potentially happening here, as we build a complete and open replacement to an environment no one uses anymore.

OK, so what can we do? The first thing is to get over the religion of the browser. It’s the king of web apps, absolutely. But it’s no more pure or less pure an approach than anything else.

The second thing we can do is experiment with hackable hybrid processes. One of the fascinating things to me about file based publishing systems is how they can plug into an ecosystem that involves locally run software. I don’t know where experimentation with that will lead, but it seems to me a profitable way to look at hybrid approaches without necessarily writing code for Android or iOS.

Finally, we need to hack apps. Maybe that means chaining stuff up with IFTTT. Maybe it means actually coding them. But if we truly want to “interrogate the technologies” that guide our daily life, we can’t do that and exclude the technologies that people use most frequently in 2014. The bar for some educational technologists in 2008 was coding up templates and stringing together server-side extensions. That’s still important, but we need to be doing equivalent things with hybrid apps. This is the nature of technology: the target moves.




Teaching the Distributed Flip [Slides & Small Rant]

Due to a moving-related injury I was sadly unable to attend ET4Online this year. Luckily my two co-presenters for the “Teaching the Distributed Flip” presentation carried the torch forward, showing what recent research and experimentation has found regarding how MOOCs are used in blended scenarios.

Here are the slides, which actually capture some interesting stuff (as opposed to my often abstract slides — Jim Groom can insert “Scottish Twee Diagram” joke here):


One of the things I was thinking as we put together these slides is how little true discussion there has been on this subject over the past year and a half. Amy and I came into contact with the University System of Maryland flip project via the MOOC Research Initiative conference last December, and we quickly found that we were running into the same unreported opportunities and barriers in our work that they were in theirs. In our work, you could possibly say the lack of coverage was due to the scattered nature of the projects (it’d be a lousy argument, but you could say it). But the Maryland project is huge. It’s much larger and better focused than the Udacity/SJSU experiment. Yet, as far as I can tell, it’s crickets from the industry press, and disinterest from much of the research community.

So what the heck is going on here? Why aren’t we seeing more coverage of these experiments, more sharing of these results? The findings are fascinating to me. Again and again we find that the use of these resources energizes the faculty. Certainly, there’s a self-selection bias here. But given how crushing experimenting with a flipped model can be without adequate resources, the ability of such resources to spur innovation is nontrivial. Again and again we also find that local modification is *crucial* to the success of these efforts, and that lack of access to flip-focussed affordances works against potential impact and adoption.

Some folks in the industry get this: the fact that the MRI conference and the ET4Online conference invited presentations on this issue shows the commitment of certain folks to exploring this area. But the rest of the world seems to have lost interest when Thrun discovered you couldn’t teach students at a marginal cost of zero. And the remaining entities seem really reluctant to seriously engage with these known issues of local use and modification. The idea that there is some tension between the local and the global is seen as a temporary issue rather than an ongoing design concern.

In any case, despite my absence I’m super happy to have brought two leaders in this area — Amy Collier at Stanford Online and MJ Bishop at USMD — together. And I’m not going to despair over missing this session too much, because if there is any sense in this industry at all this will soon be one of many such events. Thrun walked off with the available oxygen in the room quite some time ago. It’s time to re-engage with the people who were here before, are here after, and have been uncovering some really useful stuff. Could we do that? Could we do that soon? Or do we need to make absurd statements about a ten university world to get a bit of attention?

Connexions News: New Editor, Big Announcement on March 31

I’ve become interested in how forking content could help OER. The two big experiments in OER forking I know of come from WikiEducator and Connexions. (There may be others I’m forgetting; you can correct me in the comments). Connexions, in particular, has been looking at this issue for a very long time.

In an effort not to be Sebastian Thrun I’m trying to understand the difficulties these efforts have encountered in the past before building new solutions. It turns out Connexions may still have a trick or two up its sleeve, so I’m passing the information on to you. There appears to be an announcement coming up next week, and there is a new editor coming out as well:


One note about OER: this editing thing has always been a bear of a problem. You want editing to be easy for people, which means WYSIWYG. At the same time, since content has to be ported into multiple contexts you want markup to be semantic. Semantic and WYSIWYG have traditionally been oil and water, and so you end up either with a low bar to entry and documents that are a pain to repurpose, or with portable documents that no one can really edit without a mini-course. There are multiple ways to deal with this (including just giving up on module level reuse entirely), but I’m interested to see the new editor. We have invested far too little money in the tools to do this right.

Why I Don’t Edit Wikis (And Why You Don’t Either, and What We Can Do About That)

Back in the heady days of 2008, I was tempted to edit a Wikipedia article. Tempted. Jim Groom had just released EDUPUNK to the world, and someone had put up a stub on Wikipedia for the term. Given I was involved with the earlier discussions on the term, I thought I’d pitch in.

Of course, what happened instead was a talkpage war on whether there was sufficient notability to the term. Apparently the hundred or so blog posts on the term did not provide notability, since they did not exist in print form. Here’s the sort of maddening quote that followed after Jim got on the page and had granted CC-BY status to a photo so Wikipedia could use it. Speaking as a Wikipedia regular, one editor argues vociferously against the idea EDUPUNK deserves a page on the site:

This is clearly a meme. No one agrees what it means, its nice that a group of educators are so fond of wikipedia but it shouldnt be used for the purpose of promoting a new website and group. Even in this talk page this becomes clear, the poster boy says “Hey Enric, both of these images are already licensed under CC with a 2.0 nc-sa”Attribution-Share Alike 2.0 Generic.” It wouldn’t be very EDUPUNK if they weren’t ” then goes on to change the copyright of his own image to include it in this article, this is not ideology, this is a marketing campaign.

There’s a couple things to note here. First, the person whining above is not wrong, per se. This article is a public billboard of sorts, vulnerable to abuse by marketers, and vigilance makes sense. But ultimately his — and given Wikipedia’s gender bias it’s almost certainly a he — his protestations end up being ridiculous. EDUPUNK ends up a few months later being chosen as one of the words of the year by the New York Times, at the same time Wikipedia is unable to agree if it rises to the dizzying notability heights of fish finger sandwich.

But the most telling part of that comment is this:

No one agrees what it means, its nice that a group of educators are so fond of wikipedia but it shouldnt be used for the purpose of promoting a new website and group.

No one agrees what it means. Ward Cunningham, the guy who invented wikis, has been talking a while about the problem with this assumption — that we must agree immediately on these sorts of sites — and believes it to be the fundamental flaw of wikis. The idea that people should engage with one another and try to come to common understanding is a good thing, absolutely. The flaw, however, is that wiki format pushes you toward immediate consensus. The format doesn’t give people enough time to develop their own ideas individually or as a subgroup. So an article about fish finger sandwiches can get written (we’re all in agreement, good!) whereas an article on EDUPUNK can’t get written (too many different viewpoints, bad!).

It’s important to note Cunningham’s exact point here. Many people have gone after the culture of Wikipedia in recent years, a culture which is increasingly broken. Cunningham’s point is that the culture is a product of the tool itself, which doesn’t give folks enough alone time. We need to break off, develop our ideas, and come back and reconcile them. And we need a tool that encourages us to do that.

I’ve been thinking this through for a bit, trying to come up with a solution to this problem that has the spirit of Cunningham’s proposed federated wiki but is easier for people to wrap their heads around. Here’s the basic idea, mostly carried forward from Cunningham, but eliminating a couple of the more complex concepts and simplifying the implementation.

  1. I install a wiki on my server, but it’s not empty. It’s a copy of a reference on online learning (or some other reference of interest to me), with all wiki pages transcluded. For the uninitiated, what this means is my wiki “passes through” the existing wiki pages. For the purposes of imagining this, let’s pretend I just pull 2500 articles about learning and networks from Wikipedia, and transclude them on my wiki/server.
  2. I then join a federation. So let’s say I join a federation of 100 instructional designers and technologists. This changes search for me, because search on my wiki is federated now. I can search across the federation for an article on EDUPUNK. Let’s say it’s 2008 and I’m looking for a quick explanatory link on EDUPUNK to send someone. I pump in that search and find there are five or six somewhat crappy treatments, and one half decent one by Martin Weller.
  3. I don’t edit it. Or rather, I do, but the minute I edit it, this becomes a fork that only lives on my server. So I fix it up without having to get into long arguments with people about notability, etc. When done, I shoot a link to the person I wanted to send the article to. My selfish needs are met.
  4. Now, however, when anyone goes to their EDUPUNK article in the federation, they see that I’ve written a new version. Some people decide to adopt this as their version. Martin Weller sees my edits, and works about half of them into his version along with some other stuff. Jim comes by and adopts Martin’s new version with some changes. It’s better than my version, so I adopt that one.
  5. Tools start to show a coalescence around the Martin-Me-Martin-Jim version. A wiki gardener in charge of the “hub” version looks at the various versions and pulls them together, favoring the Martin-Me-Martin-Jim version, but incorporating other elements as well. This version will get distributed when new people join the federation, but as before, people can fork it, and existing forks remain intact.

The idea here is that forks preserve information by giving people the freedom to edit egocentrically, but that the system makes reconciliation easy by keeping track of the other versions, so that periodic gardening can bring these versions together back into a more generic whole.
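The steps above can be sketched in a few lines of code. This is a toy model under my own assumptions (the class, the function names, and the “most adopted version” heuristic are all invented for illustration), not how Cunningham’s federated wiki is actually implemented. What it shows is the core mechanic: editing always forks, the source is never touched, and the federation can still see every version.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One person's wiki in the federation (a hypothetical model)."""
    owner: str
    pages: dict = field(default_factory=dict)  # title -> text

    def fork(self, title, source, edit=None):
        """Copy a page from another site, optionally editing the copy.

        The source site's page is never modified.
        """
        text = source.pages[title]
        self.pages[title] = edit(text) if edit else text


def federated_search(sites, title):
    """Return every version of a page across the federation, by owner."""
    return {s.owner: s.pages[title] for s in sites if title in s.pages}


def most_adopted(sites, title):
    """Find the version text held by the most sites (for the gardener)."""
    counts = {}
    for text in federated_search(sites, title).values():
        counts[text] = counts.get(text, 0) + 1
    return max(counts, key=counts.get)


martin = Site("martin", {"EDUPUNK": "A half-decent treatment."})
me = Site("me")
jim = Site("jim")

# Step 3: my edit creates a fork that lives only on my site.
me.fork("EDUPUNK", martin, edit=lambda t: t + " Now with context.")
# Step 4 (simplified): someone else, here Jim, adopts my version as-is.
jim.fork("EDUPUNK", me)

sites = [martin, me, jim]
print(federated_search(sites, "EDUPUNK"))
print(most_adopted(sites, "EDUPUNK"))
```

A real gardener would merge versions rather than just count them, but even this crude tally shows how coalescence around a version could be surfaced by tools.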

You can think about this from any number of angles. Imagine an online textbook, for example, that allowed you to see all the modifications made to that textbook by other instructors: not edits living on a corporate server owned by Harcourt-Brace, but edits that were truly distributed. Imagine a federated student wiki, where your students could build out their articles in pieces during the semester, seeing how other students had forked and modified their articles, but keeping control of their subsite, and not being forced to accept outside edits. The student’s final work would reflect *their* set of decisions about the subject and the critiques of their treatment of it. Or imagine support documentation that kept track of localizations, making it easy to see what things various clients needed to clarify, and making those changes available to all.

Anyway, this is the idea. Encourage forking, but make reconciliation easy. It’s the way things are going, and the implications for both OER production and academic wikis are huge.

Short Notes on the Absence of Theory

Martin Weller, Stephen Downes, and Matt Crosslin have been kicking around the “post-theory” critique of MRI ’13 that came up in a discussion Jim Groom and I had Thursday night in the middle of a bar in the middle of a hotel in the middle of an ice storm.

I thought I might just add a bit of context and my two cents.

First, the conversation came up because Jim was quite nicely (and genuinely) asking an edX data analyst what Big Data was. The answer that analyst gave was that Big Data was data that was big. That’s actually technically correct — the original term was meant to refer to data that was big enough in terabytes/petabytes that it could not be processed through traditional means. If your data was big enough that you were using Hadoop, it was Big Data.

Because I’m generally a person that can’t keep my mouth shut, I interjected that while that was true from a technical standpoint, it didn’t really get at the cultural significance of the Big Data movement, which was captured in Chris Anderson’s “End of Theory” article back in 2008. Here’s a sample:

Google’s founding philosophy is that we don’t know why this page is better than that one: If the statistics of incoming links say it is, that’s good enough. No semantic or causal analysis is required. That’s why Google can translate languages without actually “knowing” them (given equal corpus data, Google can translate Klingon into Farsi as easily as it can translate French into German). And why it can match ads to content without any knowledge or assumptions about the ads or the content.

Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

While my analogy-prone brain sees parallels here to Searle’s Chinese Room problem, it’s probably more correct to see this as behaviorism writ large: where Skinner wanted us to see the mind as a black box determined by inputs and outputs, Big Data asks us to see entire classes of people as sets of statistical probabilities, and the process of research becomes the iterative manipulation of inputs to achieve desired outputs. And the same issues emerge: Chomsky’s “destruction” of behaviorism in his 1959 takedown of B. F. Skinner’s Verbal Behavior is generally overstated, but certain passages in that work seem a relevant critique of the “end of theory”; for instance, where Chomsky criticizes Skinner’s notion of reference: “The assertion (115) that so far as the speaker is concerned, the relation of reference is ‘simply the probability that the speaker will emit a response of a given form in the presence of a stimulus having specified properties’ is surely incorrect if we take the words presence, stimulus, and probability in their literal sense.”

Of course, in the past 50 years we’ve seen this Chomsky-Skinner drama played out anew in linguistics. While Chomsky’s transformational grammar underpinned efforts at computer translation for many years, Google’s translational approach, which sees language as nothing more than a set of probabilities (words are “known” to be the same in two different languages if they have the same probability of occurring in a context), is quickly outstripping the traditional methods. In fact, for a certain class of tasks it becomes increasingly obvious that correlation *is* enough. Google’s translation engine has little to no theory of language, yet adequately serves for a person who needs a quick translation of a web page. And that somewhat atheoretical nature of the engine is in fact its strength — Google’s approach needs only a robust set of web pages from any language to generate the correlations needed to start translating.

So this debate is not really new, and there’s certainly a place for this sort of radical pragmatism. Chomsky’s focus on a system of mental rules that form a universal grammar may have enlarged human knowledge, but it’s turning out to be a really inefficient way to train computers to understand language. Gains in understanding underlying models are not always the shortest route to efficacy.

But such approaches come with a downside as well. Morozov deals with this extensively in his book To Save Everything, Click Here, and in his WSJ review of the book Big Data. After noting that Big Data is very useful in situations where you don’t care what the cause is (Amazon cares not a whit *why* people who buy German chocolate also buy cake pans as long as they get to the checkout buying both), he argues that where you do care about cause, things are a bit different:

Take obesity. It’s one thing for policy makers to attack the problem knowing that people who walk tend to be more fit. It’s quite another to investigate why so few people walk. A policy maker satisfied with correlations might tackle obesity by giving everyone a pedometer or a smartphone with an app to help them track physical activity—never mind that there is nowhere to walk, except for the mall and the highway. A policy maker concerned with causality might invest in pavements and public spaces that would make walking possible. Substituting the “why” with the “what” doesn’t just give us the same solutions faster—often it gives us different, potentially inferior solutions.

A hardline proponent of a Big Data approach might object to Morozov that you just need more nuanced and informed correlations. But assuming you had no theory of ultimate causes, how would you even conceive of the possibility? (This is similar to what Michael Feldstein was getting at in his piece about the inadequacy of Big Data for education). A person who does not have a model of what is happening is unlikely to know where to look for inconsistencies. And Big Data is, by definition, big. Theory is your roadmap.

This is why at the workshop on analytics at the conference, I insisted on the “grokability” of analytics-produced guidance to the people who would use it to help students. In a way it comes down to the empowerment of the practitioner (and of the student). If I’m told I have a 50% chance of dropping out based on my “rt-score” of 2145.7, that’s one thing. But the interpretation of what to *do* about that number should depend heavily on what the inputs into it were. Was it prior GPA that pumped that score so high, or socioeconomic status? And the reason those variables are treated differently is that we have models and theories about socioeconomic status and GPA that help us understand their significance as predictors.
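A small sketch of what I mean by grokability, with loud caveats: the weights, the feature names, and the linear form of the score are all invented for illustration, and no real model is implied. The only point is that a score delivered along with its per-input breakdown lets the practitioner ask which input drove it.

```python
# Hypothetical weights for a made-up linear risk score. The numbers
# and feature names are invented; no real model is implied.
WEIGHTS = {"prior_gpa_gap": 40.0, "low_ses_index": 25.0, "missed_logins": 5.0}


def score_with_breakdown(student):
    """Return the total score plus each input's contribution to it."""
    contributions = {name: WEIGHTS[name] * student[name] for name in WEIGHTS}
    return sum(contributions.values()), contributions


def biggest_driver(contributions):
    """Name the input that contributed most to the score."""
    return max(contributions, key=contributions.get)


student = {"prior_gpa_gap": 0.5, "low_ses_index": 2.0, "missed_logins": 3}
total, parts = score_with_breakdown(student)
print(total)
print(parts)
print(biggest_driver(parts))
```

Two students with the same total can have very different breakdowns, and what you would do for each depends on the theory you hold about the input that dominates.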

Ultimately, like so many in the field, I’m actually very excited about the promise of data (though I would argue that it is actually “small data” — data that can live in a single spreadsheet — that paired with local use has the greatest potential). Still, if we are to enter this world we have to understand the trade-offs we engage in. Most of the theory-bound could certainly use a better understanding of how powerful a tool statistics can be in overcoming our own theoretical predispositions. It’s useful to understand that theory is not the only tool in the toolbox. But it’s equally true that the new breed of data scientist needs to be far more acquainted with the theories and assumptions that animate the sets of data in front of them. At the very least, they have to understand what theory is good for, why it matters, and why it is not always sufficient to tweak inputs and outputs.

Rediscovering (Semi-)Social Bookmarking

I joined Pinboard, the new, ad-free, pay-once-get-it-forever social bookmarking service a few months ago for an educational tech project I am working on. I’m not new to social-bookmarking — I’d been an early user of delicious, a Diigo migrant, and ultimately became a lapsed bookmarker, confused about why the whole thing hadn’t worked out.

I think I may no longer be confused. The thing is, I was doing bookmarking wrong. I was bookmarking articles I thought were stellar, carefully pruning my tags. I imagined strangers stumbling on my account, and being impressed by the well curated collection, like the man with the owl-eyed glasses in Gatsby’s library before he realizes the pages aren’t cut.

In other words, I thought social bookmarking was about the social element.

Now, with a Pinboard archive account that indexes the whole page text of whatever I bookmark and rock-solid API support, I’ve made social bookmarking about me again. And it’s wonderful. I no longer agonize about what to bookmark. If I read something — anything — on the web that I think I might like to remember at some point I click the toolbar Pinboard link and file it. I come up with some terms to index it, but don’t spend more than a couple seconds on them. The point of bookmarking is now to be a Memex, to turn those moments where I tell someone “I think there was an article about that I read a few months ago” into “Here’s a link to an article on that from a few months ago.” Or, more importantly, to call that article to hand when I need it for my own writing.

There are a couple developments since early social bookmarking that make this approach possible. First of all, Twitter and Tumblr have largely satisfied the “I’m recommending these links to you” market. Rather than bookmark only notable articles, I use an IFTTT script that takes anything tagged “to:haplr” and posts it to Twitter and a Tumblr linklog, along with my comments. Feed-reading is also integrated. With Feedbin, any post I star flows automatically into my Pinboard bookmarks, creating the rich searchable archive that we once had with Google Reader, only this time hosted in a paid service that is less likely to pull the rug out from under the user. The ease of private bookmarking in Pinboard also changes the dynamic — allowing you to bookmark (and archive) material that you might not want to clutter your public bookmarks with.
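The tag-routing rule is simple enough to sketch. This is not the actual IFTTT recipe or the Pinboard data format; the record fields below are assumptions made up for illustration. It just shows the logic: everything gets bookmarked, and only public items carrying the share tag get turned into link posts.

```python
# Hypothetical bookmark records; the field names are assumptions and
# do not mirror Pinboard's or IFTTT's actual data formats.
bookmarks = [
    {"url": "https://example.com/a", "title": "Flipped classrooms",
     "note": "Worth a read.", "tags": ["mooc", "to:haplr"], "private": False},
    {"url": "https://example.com/b", "title": "Draft budget",
     "note": "", "tags": ["work"], "private": True},
    {"url": "https://example.com/c", "title": "Federated wiki",
     "note": "", "tags": ["wiki"], "private": False},
]

SHARE_TAG = "to:haplr"


def to_share(bookmarks):
    """Select public bookmarks explicitly tagged for sharing."""
    return [b for b in bookmarks
            if SHARE_TAG in b["tags"] and not b["private"]]


def format_post(bookmark):
    """Build the text of a link post: comment (if any), title, URL."""
    parts = [bookmark["note"], bookmark["title"], bookmark["url"]]
    return " ".join(p for p in parts if p)


for b in to_share(bookmarks):
    print(format_post(b))
```

Separating “save it” from “recommend it” this way is what takes the agonizing out of the bookmarking step.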

But perhaps the biggest shift is seeing how unsuited Twitter, Tumblr, and other link-sharing mechanisms can be to certain forms of serious work – the number of times I have found myself paging through my tweetstream trying to find a link to an article I tweeted out that I now need to reference is embarrassing.

In any case, if you were once a bookmarker but abandoned the practice, try giving it another shot with a “bookmark it now and sort it out later” approach. Get an archive account, and start caching pages of what you read. Play around with the IFTTT options. I think you might be surprised to find that the abandoned child of the Web 2.0 revolution is actually what you’ve been yearning for the past couple of years.