The big news right now in social media-land is that a Buzzfeed editor is a plagiarist. Here’s coverage on that from TPM:
In one particularly damning example, Johnson allegedly copied a 2009 post on Yahoo! answers.
“Throughout the London Blitz, over a million incendiaries and around 50,000 high explosive bombs were dropped on London,” wrote Yahoo! user Jason B.
Johnson appears to have used identical language. Buzzfeed scrambled to alter that passage in the 2013 post after he was exposed by the Twitter duo.
When they say he used identical language, they are not talking about a larger passage, by the way. They are talking about that sentence.
After being called out on it, Buzzfeed rewrote the sentence:
London withstood a prolonged assault by the Nazis during the Blitz, with various estimates of the explosives dropped on the city ranging in the tens of thousands.
This is apparently success — to avoid plagiarism Buzzfeed has replaced a set of useful and specific estimates with some vague hand-waving.
Look, I know how hard it can be to write a sentence, and how much research and thought can lie behind a single clause. But we need to get over this.
Here it’s a Buzzfeed writer. But every day in my own job I type original paragraphs that someone somewhere has written better, and every day in your job you do the same. How much time do we spend trying to find alternate ways to string together two numbers and a conjunction? We do this wheel-reinventing instead of doing work that extends the work of others and solves new problems.
Giving no credit was a dick move on Johnson’s part, absolutely. But writing facts out of the sentence to avoid plagiarism is ridiculous. It’s time to create technology that lets credited reuse happen without showing visible stitches to the reader. Paragraph-level tracking doesn’t exist yet in SFW, but it could. The Comprehensive Attribution Statement, if outfitted with an attribution primitive for de minimis use, could be another way to go about this. I’m sure you can think of more approaches.
But this is a stupid game we’re playing, and it has to stop. It’s time we evolved. If you are reading this on Chrome or Firefox you are benefiting from thousands of lines of code written by uncredited programmers, many of whom never made a dime. The system works because in the small community of *producers* they can point to their work’s reuse as an indication of their talent or commitment. In this sort of world, it’s hard to understand why mundane sentences about bomb statistics would merit special treatment.
I mentioned earlier that I’d decided to change my explanation of federated wiki from a “top-down” explanation to a “bottom-up” one.
It makes a heck of a difference. I made this video below for one of our faculty, to show how even something as simple as notes becomes an integrative exercise in federated wiki. I don’t know about you, but watching the flow of work in SFW is just *enjoyable* to me. I mean it when I say that once you get fluidity with it, it feels like a direct extension of your own mind.
So what about the federated stuff? The JSON stuff? The Universal Canvas stuff?
It’s still there. Once you start to think of wikis as personal, it raises all the sorts of questions these things solve. But it’s better to start bottom-up than top-down.
Everything was wonderful, and I hope I don’t upset people by choosing one thing over another. But there were a few things for me that stood out.
Seeing the Domain of One’s Own development trajectory. I’ve seen this at different points, but the user experience they have for the students at this point is pretty impressive.
JSON API directories. So I really like JSON, as does Kin. But at dinner on Friday he was proposing that the future was that the same way that we query a company for its APIs we would be able to query a person. I’d honestly never thought of this before. This is not an idea like OAuth, where I delegate some power/data exchange between entities. This is me making a call to the authoritative Mike Caulfield API directory and saying, hey how do I set up a videochat? Or where does Mike post his music? And pulling back from that an API call directly to my stuff. This plugged into the work he demonstrated the next day, where he is painstakingly finding all his services he uses, straight down to Expedia, and logging their APIs. I like the idea of hosted lifebits best, but in the meantime this idea of at least owning a directory of your APIs to stuff in other places is intriguing.
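As a rough sketch of what a personal API directory might look like, here is a minimal Python example. Every URL, field name, and service in it is invented for illustration; this is my guess at the shape of the idea, not anything Kin actually specified.

```python
# A hypothetical personal API directory: one JSON-style document mapping
# the things you can do with a person to the endpoints that do them.
# Every URL, field name, and service here is invented for illustration.
directory = {
    "owner": "example-person",
    "services": {
        "videochat": {"provider": "some-video-service",
                      "endpoint": "https://video.example.com/api/call"},
        "music": {"provider": "some-music-host",
                  "endpoint": "https://music.example.com/api/tracks"},
    },
}

def lookup(directory, capability):
    """Answer 'how do I do X with this person?' with a direct endpoint."""
    service = directory["services"].get(capability)
    return service["endpoint"] if service else None

print(lookup(directory, "videochat"))  # https://video.example.com/api/call
```

The point is that the authoritative answer to "how do I set up a videochat with Mike?" becomes a lookup against a directory he owns, rather than against a platform that owns him.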
Known. I saw known in Portland, so it wasn’t new to me. But it was neat to see the reaction to it here. As Audrey points out, much of day two was getting on Known.
Smallest Federated Wiki. Based on some feedback, I’ve made a decision about how I am going to present SFW from now on. I am astounded by the possibilities of SFW at scale, but you get into unresolvable disagreements about what a heavily federated future would look like. Why? Because we don’t have any idea. I believe that for the class of documents we use most days, stressing out about whether you have the best version of a document will seem as quaint as stressing out about the number of results Google returns on a search term (remember when we used to look at the number of results and freak out a bit?). But I could be absolutely and totally wrong. And I am certain to be wrong in a lot of *instances* — it may be that for your use case federation is a really really bad idea. Federation isn’t great for policy docs, tax forms, or anything that needs to be authoritative, for instance.
So my newer approach is to start from the document angle. Start with the idea that we need a general tool to store our data, our processes, our grocery lists, our iterated thoughts. Anything that is not part of the lifestream stuff that WordPress does well. The stuff we’re now dropping into Google Docs and emails we send to ourselves. The “lightly-structured data” that Jon Udell rightly claims makes up most of our day. What would that tool have to look like?
- It’d have to be general purpose, not single purpose (more like Google Docs than Remember the Milk)
- It’d have to support networked documents
- It’d have to support pages as collections of sequenced data, not visual markup
- It’d have to have an extensible data format and functionality via plugins
- It’d have to have some way to move your data through a social network
- It’d have to allow the cloning and refactoring of data across multiple sites
- It’d have to have rich versioning and rollback capability
- It’d have to be able to serve data to other applications (in SFW, done through JSON output)
- It’d have to have a robust flexible core that established interoperability protocols while allowing substantial customization (e.g. you can change what it does without breaking its communication with other sites).
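As an illustration of several of these requirements at once, here is a sketch of a page as a collection of sequenced, typed data items with an edit journal. It is loosely modeled on the general shape of SFW’s JSON output, but the field details are simplified guesses, not the actual format.

```python
import json
import time
import uuid

def make_item(item_type, text):
    # Each piece of content is a typed item with its own ID, so plugins can
    # interpret it and other sites can address it individually.
    return {"type": item_type, "id": uuid.uuid4().hex[:16], "text": text}

def make_page(title, items):
    # A page is sequenced data plus an edit history, not visual markup.
    return {
        "title": title,
        "story": items,  # the ordered collection of typed items
        "journal": [     # versioning: every change is a recorded action
            {"type": "create", "date": int(time.time() * 1000)}
        ],
    }

page = make_page("Grocery List", [
    make_item("paragraph", "eggs"),
    make_item("paragraph", "flour"),
])

# Serving the page to other applications is just emitting the JSON.
serialized = json.dumps(page)
print(page["title"])  # Grocery List
```

Because the page is data rather than markup, cloning it to another site, rolling it back via the journal, or handing it to a different plugin are all operations on the same simple structure.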
Of those, the idea of a document as a collection of JSON data is pretty important, and the idea of federation as a “document-centered network” is amazing in its implications. But I don’t need to race there. I can just start by talking about the need for a general use, personal tool like this, and let the networking needs emerge from that. At some point it will turn out that you can replace things like wikis with things like this or not, but ultimately there’s a lot of value you get before that.
If you’ve watched the Mother of All Demos, you know that one of the aha! moments of it is when Engelbart pulls out his grocery list. The idea is pretty simple: if you put your grocery list into a computer instead of on a notepad, you could sort it, edit it, clone it, categorize it, drag-and-drop reorder it.
That was 1968. So how are we all doing?
If you’re like my family, there are probably multiple answers to that, but none particularly good. When Nicole shops, she writes it out on a sheet of paper, and spends a good amount of time trying to remember all the things she has to get. I sometimes write it out in an email I send myself, and then spend time trying to look for past emails I can raid for reminders.
Sorting? Cloning? Drag and drop refactoring?
Ha! What do you think this is, the Jetsons?
How the hell did we get here? Your average car’s fuel injector has more computing power than the machine that Engelbart demonstrated this on. And it’s not like this problem has been solved elsewhere.
I’d argue that what happened was we moved towards database-driven single use apps. The truth is that there are DOZENS of shopping list apps, from Don’t Forget the Milk, to the creatively titled GroceryList, to whatever LiveStrong is doing this week.
But they each read like GroceryIQ:
Grocery IQ (Free) includes all the features you would expect in a powerful grocery app, including a barcode scanner, list sharing, and integrated coupons. If you are like me and buy the same things every week, the favorites list will help you save time. You can also edit your list online and it will update automatically on the app.
On one level, GroceryIQ works better than anything we see with Engelbart. Barcode scanner! Favorites list! Wow, powerful!
But chances are it’s a worthless piece of junk to you compared to the email method. Why? Well, you need your specific phone to use it. You can’t print it out. The people you share with need accounts too. Your data is being compiled and sent somewhere for god knows what reason. You can choose favorites, but there’s no easy way to save this year’s Thanksgiving shopping list for next year without the programmers building that for you. If you’d like to separate your list by the store you buy it at, you’re also out of luck. You’ve got to learn a whole new interface. If you ever move off of it, you lose all the stuff you put into it — no export/import functionality. And of course the idea that your grocery list app could somehow exchange data with a recipe app or a food log — not possible in the least.
On the other hand, managing my list in email gets me 90% of these things. So why go to another app I’ll use for two weeks and give up on?
The reason we keep using email is that for that set of tasks requiring more than plaintext but less than an app we have nothing. MS Word maybe. Excel. But somewhere along the line we handed the keys to the current model, and so instead of getting general use tools that radically expand our ability to solve problems we get GroceryIQ, followed by whatever piece of crapware we download next week.
The solution is to recapture some older principles of software design. We need to look at that gap between the tools we use for unstructured data (email, Word) and the over-specified locked-in products that comprise single-use computing, or worse yet, enterprise software bloat. As Jon Udell has noted, most of our day consists of dealing with semi-structured data, yet the tools we have access to are either too rigid or too loose to be of use. It’s time we fill the gap with networked, generative tools that can take us to the next level.
I’ve borrowed Jon Udell’s term (“universal canvas”) for talking about SFW. In this video I talk about a plugin my brother Ben wrote for SFW earlier this week, and try to show what that means in semi-mechanical terms.
One of the things I think it starts to show is how much of a construction kit SFW is. As Jon Udell noted when discussing the concept of the universal canvas (in 2006!!), most of our days are spent in a world of semi-structured data, yet most of the products we have access to either have no affordances for structured data at all, or are engineered to a level of precision appropriate to accounting systems.
Over-engineering data collection is not just a waste of resources. It’s a dangerous practice where data is concerned. Most times we start collecting data or sharing it, we’re not even really certain what we want to collect — yet modern practice forces us to design tables and subroutines before we collect *anything*. How the heck can lead users move things forward in such an environment?
In practice, of course, no one moves forward at all. Most information that could help us is never logged anywhere, or it is logged in inaccessible, unparseable formats such as Word XML. When it comes to data, we have access to pea-shooters and inter-continental missiles, and little in between.
A better approach is to create semi-structured data environments that rely more on conventions and culturally adopted techniques to add meaning to data. The data won’t be perfect, but because convention is more fluid than backend schemas, practice can evolve. Despite what a database admin will tell you, the biggest problem we face is not lack of data consistency. The biggest problem we face is the amount of information captured in no way at all. Using flexible JSON documents with front-end plugins starts to address that issue — and we know from history it’s a lot easier to retrospectively clean up data we have than to capture data we never collected.
In the example I show here — how would we know that the fivestar plug-in is showing your overall movie rating, and not, for instance, the rough quality of cinematography in the film? Convention. We agree that the first rating is always your overall rating. We agree that unseen films should be rated as “0”.
As that convention solidifies, it gets encoded in a template. Now the template generates these objects with template-determined IDs. So now we don’t need to look for the first “fivestar” to get the rating — we look for object 'bd3b3ea18244c038', which is what the template called that top fivestar interface. We walk through the sitemaps in the neighborhood and find all pages named “Pulp Fiction” and average their values, exclusive of the zeros we decided would mean “not rated”.
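The walk-the-neighborhood-and-average process might look roughly like this in Python. The site data is entirely invented; only the object ID and the zero-means-unrated convention come from the example above, and the real SFW sitemap traversal involves HTTP fetches rather than an in-memory dict.

```python
# Sketch of averaging a rating across a federation "neighborhood".
# The template-assigned ID and the zero-means-unrated convention are from
# the discussion above; the site contents are invented for illustration.
RATING_ID = "bd3b3ea18244c038"  # ID the template gave the top fivestar item

neighborhood = {
    "alice.example.com": {"Pulp Fiction": {"story": [
        {"type": "fivestar", "id": RATING_ID, "rating": 5}]}},
    "bob.example.com":   {"Pulp Fiction": {"story": [
        {"type": "fivestar", "id": RATING_ID, "rating": 4}]}},
    "carol.example.com": {"Pulp Fiction": {"story": [
        {"type": "fivestar", "id": RATING_ID, "rating": 0}]}},  # unseen
}

def average_rating(neighborhood, title, item_id):
    ratings = []
    for site, pages in neighborhood.items():
        page = pages.get(title)
        if not page:
            continue
        for item in page["story"]:
            # By convention, zero means "not rated", so skip it.
            if item.get("id") == item_id and item.get("rating", 0) > 0:
                ratings.append(item["rating"])
    return sum(ratings) / len(ratings) if ratings else None

print(average_rating(neighborhood, "Pulp Fiction", RATING_ID))  # 4.5
```

Note how gracefully it degrades: a page missing the templated ID simply contributes nothing, which is exactly the kind of imprecision the next paragraphs argue we can live with.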
Is this more laborious than a join? More error-prone than a SQL Stored Procedure? Well, yeah. You’re never going to get a scientific level of precision from this.
But you don’t need it. If your process to find the best movies misses a film because you filled it out pre-template and it didn’t have the right object ID, you will still receive more films in your search results than if you had entered no films at all. Those are the problems we tend to have in life and work, and it’s time for a process and software approach that addresses them.
I’ve gotten two calls from reporters in the past week asking me about the “dangers of analytics” in higher education.
I’m always quite careful to say I think there’s a lot of promise for analytics in higher education. I can’t imagine a future where we’re not using analytics extensively to try and improve what we do. And I have no doubt those analytics will guide us on where to apply resources in some high stakes areas — how we spend advising time, where we apply remediation strategies, which students we select for special attention, what gateway courses we should fund.
But it’s precisely the power and potential of analytics that makes it so important we get this right. And what getting this right means, first and foremost, is we do it in the open, and avoid the “secret sauce” mindset we’ve tended to have about such things. As we move out of the institutional experimentation phase and into the commercialization phase of analytics, institutions are increasingly being sold a black box of formulas they can neither share nor explain. And more than anything else, it’s this “magic number” mentality which makes analytics dangerous.
The anti-analytics crowd may object that openness is not enough. We are, in fact, partially turning governance over to formula, and that’s frightening. Sugar-coat it all you like: no matter how much individual agency you preserve in your process of identifying at-risk students, you wouldn’t be using analytics if they didn’t somehow shape your decisions. You can’t simultaneously claim that analytics are powerful and harmless.
But protestations that such a shift is unprecedented are unfounded. The truth is that we are ruled by formula more than most people would admit right now. The highway fund in the United States is a percentage of the gas tax. Your Social Security benefits are indexed to inflation. Cost of living is used to compute a number of benefits and your eligibility for various forms of student aid is a function of a poverty-level formula developed in the 1960s.
The difference with education analytics as it is being implemented is not that we are turning some of our adjudication over to formulas, but that those formulas are often shielded from our view.
If a set of students who seem like they should be receiving aid are not, I can look at the poverty-level formula and understand the ways in which that formula may be distorting public policy or exacerbating inequality, because that formula is part of the public record. We can have a debate about that.
But what if I am a student who is not selected for special intervention at a college, even though I clearly need help? If it is on the basis of written policy (say, advisors will contact all students with a first semester GPA of less than 2.3), we can debate that policy. If it is a matter of personal judgment, we can hold the individual responsible for that judgment accountable, and ask them to explain their reasoning.
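Part of what makes a written policy debatable is that it can be stated as a rule anyone can read and contest. As a hypothetical sketch (the 2.3 threshold is just the figure from the example above; the field names and student data are made up):

```python
# A transparent intervention rule, as in the written-policy example above.
# The threshold and field names are illustrative, not any real institution's.
GPA_THRESHOLD = 2.3  # published, debatable, part of the public record

def needs_outreach(student):
    """Flag a student for advisor contact under the open, written policy."""
    return student["first_semester_gpa"] < GPA_THRESHOLD

students = [{"name": "A", "first_semester_gpa": 2.1},
            {"name": "B", "first_semester_gpa": 3.0}]
flagged = [s["name"] for s in students if needs_outreach(s)]
print(flagged)  # ['A']
```

A proprietary risk score performs the same gatekeeping function, but with the rule hidden, there is nothing equivalent to inspect, debate, or audit.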
If it’s a “secret sauce” algorithm on the other hand, what’s our option for public discourse on it? Where is our opportunity to examine it for racial bias, for bias against part-time students, or even for general error?
We’re left with “trust the programmers, they’re objective” in a world where the programmers have repeatedly proved that not only are they not objective, but quite often just plain wrong about the math. See, for example, Purdue’s Course Signals, or the strange and sad story of Google Flu Trends.
There’s absolutely much to be gained by using the data at our disposal to try and serve our students better. But deciding which students to expend extra resources on is not the same as guessing your likely personal rating for Pulp Fiction. Public policy, public money, and the public good require an accountability above and beyond individuals deciding whether they will give their $8 a month to Netflix, and it’s time we came to terms with that.
Analytics, yes. But no more secret sauce analytics. It’s time to open up these formulas to the light of day and get public comment on them. I understand that involves additional work on the part of institutions and corporations engaging in analytics. But it’s difficult to see how we hold institutions accountable to the public if we don’t make them show us the formulas producing their decisions.
“From the perpetual agency of this System, idleness cannot exist… [T]he whole is a beautiful picture of the most animated industry, and resembles the various machinery of a cloth manufactory, completely executing their different offices, and all set in motion by one active engine.” — Rev. Cordiner, describing the popular Madras System of education in 1820.
The Madras System consists in conducting a school, by a single Master, THROUGH THE MEDIUM OF THE SCHOLARS themselves, by an uniform and almost insensibly progressive course of study, whereby the mind of the child is often exercised in anticipating and dictating for himself his successive lessons… a course in which reading and writing are carried on in the same act, with a law of classification, by which every scholar finds his level, is happily, busily, and profitably employed every moment, is necessarily made perfectly acquainted with every lesson as he goes along, and without the use or the need of corporeal infliction, acquires habits of method, order, and good conduct, and is advanced in his learning, according to the full measure of his capacity.
For Bell and his supporters, the “system” was everything, and they saw the parallels with factory work as a point of pride. The Reverend Cordiner’s ecstatic description of his visit to the Madras School, quoted above, is typical.
Failure occurred, as it always will, when masters were slaves to “the system,” when they were satisfied with mechanical arrangements and routine work, or when they did not study their pupils, and get down to the Principles of Education.
Our principal aim in each kind of instruction is to induce the young men to think and judge for themselves. We are opposed to all mechanical study and servile transcripts. The masters of our primary schools must possess intelligence themselves in order to be able to awaken it in their pupils; otherwise the state would doubtless prefer the less expensive schools of Bell and Lancaster.
I’ve been struggling to explain the SFW interface to people.
Which is weird. Because I actually think the interface is one of the stronger features. I have come to wish I could surf the web in SFW instead of say Chrome. It solves a bunch of issues Ted Nelson tried to solve but more elegantly and with less fluff, and stops much of the nasty fraying of our focus the browser saddles us with.
So am I crazy? Increasingly, it looks like that may be the case.
But it’s funny, I was just pasting a web video into a post, and let’s look at how that looks.
Go to the YouTube page. (How do I know where that is?)
Search for the YouTube you want. Go to the page. (OK).
Click on it. (The Google link) Yes, the Google link.
(Even though there’s a video link above me). Either one. Just get to a video.
Are you on the page? (Yes.) OK, look under the video, there’s a little “share” link.
(How would I guess that that’s where I put a video in a blog?) Because I’m telling you now.
Click on share, then click on embed. (What does embed mean?) It doesn’t matter, click on it.
Select that text in there. Now press Control-C. (Huh?) Control and C at the same time.
(Oh. How on earth would I know that?) Because I told you. Oh wait. Shoot. Put in your size first.
(What size?) 480 width by 320 height.
(Which box is which?) Width is always first.
(How do you know that?) I just do. Now control-A
(How did you know the dimensions?) They just work well.
Now browse back to the tab with your post. (I closed that, sorry.) OK, re-open it.
(Do I log in again?) Yes. (I forgot my password). Click the forgot password link, and get your password.
(OK, firing up email). Ok. Waiting. (Ok, I’m in.)
Browse to the post you were making. (OK). Click “Text” instead of Visual. (OK I see a bunch of garbage).
Does it look like your post? (Yes).
Find where you want to paste it and click Control-V. (OK, it pasted my password in there).
Um. (Is this because I right-click copied my password from my WordPress invite email?)
Yes, maybe. Probably. (Right click copy and control-C go to the same place?)
Yes. (Is this in documentation somewhere? Why isn’t this documented?)
I’m sure it is, somewhere. OK, go back to the YouTube video. Click share, then embed. Set the dimensions. Control-C.
Then back to your blog. Paste it in. Click the tab to visual. Ready?
We do the copy method, because after initial difficulty, copy-pasting ends up being an incredibly quick way to put together a blog post — you copy-paste an embed, but you can also use the technique on URLs and the like.
I’ve tried other blogging products that are more user-friendly in the first sixty minutes, and I’ve always come back to the copy-paste method. Specialized ways to do these sorts of things up your productivity in your first sixty minutes by killing that productivity in your post-sixty minutes work. That’s a good trade for a website, but a lousy trade for a tool.
The question is always what sort of problem is it? Problems that appear sporadically (setup, settings, etc) should be brain-dead easy. Things you do every day should be efficient and flexible, and ideally tie coherently into an overall model of use. The test of the interface is what the average interaction with it feels like.
The first surprise about IndieWebCamp was the people.
This is going to sound like I was expecting Really Bad Things, but you have to remember this is Portland. What’s “indie” in most cities is considered corporate in Portland. I don’t know how to explain this, except to say when I went out to dinner with Gardner Campbell, Tom Woodward, and Jon Becker in Portland a couple weeks ago, on the way there I passed this guy, unicycling between me and the food carts (sans flames that day).
My thoughts? Not bad. But a tad corporate for Portland.
Now if it was a BRONY on a unicycle….
So I had this worry that the camp might be a bit — well, let’s just say the worry was that the zeal/pragmatism balance might tend more towards zeal. My feeling on decentralization personally is that a revolution that expects users to accept less functionality and higher cost in exchange for more freedom is over before it starts. Decentralization has to be made easy, and ideally it has to let me do neat new things that centralization did not allow.
So the first surprise was the introductory presentations. This is a group that was founded as a pragmatic reaction against other attempts at re-decentralization. Not only does it focus on building stuff over talking about protocols, but Amber Case’s presentation actually showed a viable plan for re-taking the web. The plan is to focus first on journalists, who have very real and present problems caused by service Balkanization. By demonstrating value to journalists, they hope to amplify the movement, and push it into the next stages.
Maybe you disagree with that analysis. But it is far from the “And once they see the power of my code, the people will rise up” bullcrap that typically dominates this set of conversations. It also acknowledges that if you want to fight an oligarchy of heavily funded corporations, you need a marketing plan. It doesn’t have to be oppressive or all-consuming. But the thing that turns a personal project into a movement is strategic and tactical thinking about how to advance it. Everyone I saw speak about their code that day had at least a thumbnail sketch of how they could get broader adoption, and that’s an oddity in this realm.
You can see that too in the Why page on the indiewebcamp site. The why is about what changes in the user’s experience of services, not on software as a political expression.
The products I saw were also surprising.
These technologies are things you check in on every year or so. And in general my experience with them has been there’s been very little focus on making them user-friendly. Why, this many years later, do you have to SSH up to your MediaWiki instance and edit a config file to make even the smallest change?
Talk to your average DIY-er, and they’ll tell you that learning to edit config files is good for you. But so is hanging out with my family, riding my bike, getting a drink at Trivia Night, and watching the new series of In The Flesh. And editing config files (and learning the ins and outs of SSH key-files, PuTTY, pico or whatever) takes time away from all those things. I’m not saying applications can’t be difficult — as much as I might curse Illustrator’s spline-based interface, it’s the right way to empower many users. But forcing your users to manually handle integrations that you couldn’t figure out and selling it to them as virtue is precisely why this movement often seems stuck in 1993.
Not so here. Every presentation I saw focused *heavily* on user experience, whether it was on IndieBox, or Known, or IndieAuth. Even the discussions of how different products could be integrated were focused on what the user would see, not on backend geekery.
Demographics & Tone
It was not all young people. It was not all old people. It was not all men or women. It was probably too white (then again, it’s Portland). It had project managers in the mix as well as coders. Some people were launching new companies. Some people were doing this on the side of other higher paying corporate jobs.
People here weren’t allergic to making money, but they weren’t high off their own pitches either.
There are some people who have been tossing around the idea of Portland as the anti-Silicon Valley. A culture of experimentation and collaboration without the sickness that the Valley has acquired over the years. I know a number of people at IndieWebCamp were from other places, but I couldn’t help but think about that during this session. It’s the same sort of attitude I’ve found in the small but productive Open Education community here. It’s the same attitude you see in the XOXO community.
Can Portland become the sane counter-balance to the hype of Palo Alto? They have Udacity; we have Lumen Learning. They have TechCrunch’s Disrupt; we have IndieWebCamp and XOXO.
Well, counterbalance is too strong a word, perhaps. But I actually expect great things out of this city.
If nothing else, it highlighted to me that I need to get more involved, and maybe move a tad more south. I’m 15 minutes away from the epicenter of something big, it’s stupid for me not to get down there more.
Ward Cunningham and Federated Wiki
One of the highlights for me was meeting Ward Cunningham in person finally. We ended up having a 90 minute conversation which clarified my thinking on a lot of issues. Particularly, Ward highlighted for me how schemas — and particularly tight, locked-down database schemas — had killed user innovation. I realized that while Ward is the most brilliant coder I know, his heart lies in Non-Programistan. He wants to get back to the idea of “applications without programmers” so that, in his words, “humanity can move forward”.
In a world where Silicon Valley’s solution to the power they have over our lives is that “people should learn to code”, Ward’s radical idea (well, radical by 2014 standards) is that maybe we should make software that lets people solve novel problems without them knowing how to code. If you’ve been following Jim’s Internet Course you know that this was the original vision of people like Ivan Sutherland (also a current Portland resident) and his Sketchpad:
So it all comes back to this question of whether you ship an application, or do you ship a user innovation toolkit. And the idea of a user innovation toolkit should NOT be “if you can code this source you can innovate”. It has to embrace non-coders, because the vast majority of ideas on how to make the world better come from non-coders, just as the vast majority of ideas on how to make equipment better come from non-engineers.
And the piece that fell into place for me in that conversation was that the choke point for a lot of that innovation is the schema (in practical terms, the tightly structured backend database which locks in assumptions about what you want to do).
Around that issue, we started to talk about a project I am working on to demonstrate the Smallest Federated Wiki as a user innovation toolkit — the Movie Night Demo. And we ended up coming up with a new feature that Ward coded the next day at IndieWebCamp (unfortunately I had prior obligations and couldn’t make Day 2).
The feature allows you to merge edits of any two pages, interleaving them into a single journal timeline. In terms of Movie Night Demo, this means I can come to a list of your top ten films you want to see and with a simple drag and drop gesture merge your list and mine (while preserving the history).
And in classic user innovation toolkit style, the way this functionality is achieved makes possible dozens of other applications as well. Imagine, for example, students doing collaborative note-taking and being able to instantly merge those notes along the timeline of the presentation. Or people merging data collected from multiple locations to produce a crowd-sourced visualization.
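The merge itself can be sketched as interleaving two journals by timestamp. This is only an illustration of the idea, assuming each journal is a date-ordered list of actions; it is not Ward's actual implementation, and the action format here is simplified.

```python
import heapq

def merge_journals(journal_a, journal_b):
    """Interleave two edit histories into one timeline, ordered by date.

    Each action is assumed to be a dict with a 'date' timestamp; real SFW
    journal entries are richer, but the merge idea is the same.
    """
    return list(heapq.merge(journal_a, journal_b, key=lambda a: a["date"]))

mine  = [{"date": 100, "text": "add: Pulp Fiction"},
         {"date": 300, "text": "add: Alien"}]
yours = [{"date": 200, "text": "add: The Thing"}]

merged = merge_journals(mine, yours)
print([a["text"] for a in merged])
# → ['add: Pulp Fiction', 'add: The Thing', 'add: Alien']
```

Because the merged page keeps every action from both histories, nothing is lost: the combined list still records who added what, and when.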
Not bad for a weekend, right?
I’ve been struggling to explain to people why federation is necessary. In practice, federation doesn’t get you much until there are people around to federate with.
Worse, it doesn’t get you anywhere until there is valuable material in your federation. Valuable material takes time to produce, and people aren’t going to spend that time making federated content until they see the value. So we have a bit of a Catch-22 here.
Luckily, I’ve come up with an example of solving a simple, pressing problem using federation that does not take much time investment. It’s about family movie night.
This video explains the problem and why a non-federated solution will not work. The next video will show how federation can solve it.