Plagiarism and Evolution and Attribution Statements

The big news right now in social media-land is that a Buzzfeed editor is a plagiarist. Here’s coverage on that from TPM:

In one particularly damning example, Johnson allegedly copied a 2009 post on Yahoo! answers.

“Throughout the London Blitz, over a million incendiaries and around 50,000 high explosive bombs were dropped on London,” wrote Yahoo! user Jason B.

Johnson appears to have used identical language. Buzzfeed scrambled to alter that passage in the 2013 post after he was exposed by the Twitter duo.

When they say he used identical language, they are not talking about a larger passage, by the way. They are talking about that sentence.

After being called out on it, Buzzfeed rewrote the sentence:

London withstood a prolonged assault by the Nazis during the Blitz, with various estimates of the explosives dropped on the city ranging in the tens of thousands.

This is apparently success: to avoid plagiarism, Buzzfeed has replaced a set of useful and specific estimates with some vague hand-waving.

Look, I know how hard it can be to write a sentence, and how much research and thought can lie behind a single clause. But we need to get over this.

Here it’s a Buzzfeed writer. But every day in my own job I type original paragraphs that someone somewhere has written better, and every day in your job you do the same. How much time do we spend trying to find alternate ways to string together two numbers and a conjunction? We do this wheel-reinventing instead of doing work that extends the work of others and solves new problems.

Giving no credit was a dick move on Johnson’s part, absolutely. But writing facts out of the sentence to avoid plagiarism is ridiculous. It’s time to create technology that lets credited reuse happen without showing visible stitches to the reader. Paragraph-level tracking doesn’t exist yet in SFW, but it could. The Comprehensive Attribution Statement, if outfitted with an attribution primitive for de minimis use, could be another way to go about this. I’m sure you can think of more approaches.

But this is a stupid game we’re playing, and it has to stop. It’s time we evolved. If you are reading this on Chrome or Mozilla you are benefitting from thousands of lines of code written by uncredited programmers, many of whom never made a dime. The system works because in the small community of *producers* they can point to their work’s reuse as an indication of their talent or commitment. In this sort of world, it’s hard to understand why mundane sentences about bomb statistics would merit special treatment.


Federated Wiki for Distributed Notetaking (and the surprising pedagogical implications of that)

I mentioned earlier that I’d decided to change my explanation of federated wiki from a “top-down” explanation to a “bottom-up” one.

It makes a heck of a difference. I made this video below for one of our faculty, to show how even something as simple as notes becomes an integrative exercise in federated wiki. I don’t know about you, but watching the flow on SFW work is just *enjoyable* to me. I mean it when I say that once you get fluidity with it, it feels like a direct extension of your own mind.

So what about the federated stuff? The JSON stuff? The Universal Canvas stuff?

It’s still there. Once you start to think of wikis as personal, it raises all the sorts of questions these things solve. But it’s better to start bottom-up than top-down.


Reclaim Hackathon

Kin and Audrey have already written up pretty extensive summaries about the Reclaim event in Los Angeles. I won’t add much.

Everything was wonderful, and I hope I don’t upset people by choosing one thing over another. But there were a few things for me that stood out.

Seeing the Domain of One’s Own development trajectory. I’ve seen this at different points, but the user experience they have for the students at this point is pretty impressive.

JSON API directories. So I really like JSON, as does Kin. But at dinner on Friday he was proposing that in the future, the same way we query a company for its APIs, we would be able to query a person. I’d honestly never thought of this before. This is not an idea like OAuth, where I delegate some power/data exchange between entities. This is me making a call to the authoritative Mike Caulfield API directory and saying, hey, how do I set up a videochat? Or where does Mike post his music? And pulling back from that an API call directly to my stuff. This plugged into the work he demonstrated the next day, where he is painstakingly finding all the services he uses, straight down to Expedia, and logging their APIs. I like the idea of hosted lifebits best, but in the meantime this idea of at least owning a directory of your APIs to stuff in other places is intriguing.
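To make the "query a person" idea a little more concrete, here is a minimal sketch in Python of what such a directory lookup might look like. Nothing here comes from Kin's actual work: the directory layout, the `purpose` field, the `find_endpoints` helper, and every URL are assumptions invented for illustration.

```python
import json

# Hypothetical personal API directory. The structure, service names,
# and URLs are all made up for this sketch.
DIRECTORY_JSON = """
{
  "owner": "Mike Caulfield",
  "apis": [
    {"purpose": "videochat", "service": "example-video",
     "endpoint": "https://video.example.com/api/rooms"},
    {"purpose": "music", "service": "example-music",
     "endpoint": "https://music.example.com/api/users/mike/tracks"}
  ]
}
"""


def find_endpoints(directory, purpose):
    """Return every endpoint the directory lists for a given purpose."""
    return [
        entry["endpoint"]
        for entry in directory["apis"]
        if entry["purpose"] == purpose
    ]


directory = json.loads(DIRECTORY_JSON)

# "Where does Mike post his music?"
print(find_endpoints(directory, "music"))
```

The point of the sketch is only that the person owns the directory; the services it points at can still live anywhere.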

Evangelism Know-how. I worked for a while at a Services-Oriented Architecture obsessed company as an interface programmer (dynamically building indexes to historical newspaper archives using JavaScript and Perl off of API-returned XML). I’m newer to GitHub, but have submitted a couple pull requests through it already. So I didn’t really need Kin’s presentation on APIs or GitHub. But I sat and watched it because I wanted to learn how he did presentations. And the thing I constantly forget? Keep it simple. People aren’t offended by getting a bit of education about what they already know, and the people for whom it’s new need you to take smaller steps. As an example, Kin took the time to show how JSON can be styled into most anything. On the other hand, I’ve been running around calling SFW a Universal JSON Canvas without realizing people don’t understand why delivering JSON is radically different (and more empowering) than delivering HTML (or worse, HTML + site chrome).

Known. I saw Known in Portland, so it wasn’t new to me. But it was neat to see the reaction to it here. As Audrey points out, much of day two was getting on Known.

Smallest Federated Wiki. Based on some feedback, I’ve made a decision about how I am going to present SFW from now on. I am astounded by the possibilities of SFW at scale, but you get into unresolvable disagreements about what a heavily federated future would look like. Why? Because we don’t have any idea. I believe that, for the class of documents we use most days, stressing out about whether you have the best version of a document will seem as quaint as stressing out about the number of results Google returns on a search term (remember when we used to look at the number of results and freak out a bit?). But I could be absolutely and totally wrong. And I am certain to be wrong in a lot of *instances*: it may be that for your use case federation is a really, really bad idea. Federation isn’t great for policy docs, tax forms, or anything that needs to be authoritative, for instance.

So my newer approach is to start from the document angle. Start with the idea that we need a general tool to store our data, our processes, our grocery lists, our iterated thoughts. Anything that is not part of the lifestream stuff that WordPress does well. The stuff we’re now dropping into Google Docs and emails we send to ourselves. The “lightly-structured data” that Jon Udell rightly claims makes up most of our day. What would that tool have to look like?

  • It’d have to be general purpose, not single purpose (more like Google Docs than Remember the Milk)
  • It’d have to support networked documents
  • It’d have to support pages as collections of sequenced data, not visual markup
  • It’d have to have an extensible data format and functionality via plugins
  • It’d have to have some way to move your data through a social network
  • It’d have to allow the cloning and refactoring of data across multiple sites
  • It’d have to have rich versioning and rollback capability
  • It’d have to be able to serve data to other applications (in SFW, done through JSON output)
  • It’d have to have a robust flexible core that established interoperability protocols while allowing substantial customization (e.g. you can change what it does without breaking its communication with other sites).
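A couple of the items above (pages as sequenced data, cloning and refactoring) are easier to see in code than in prose. The Python sketch below is only loosely modeled on the idea of a page as an ordered list of typed JSON items. The field names, the `list-item` type, and the `render_text` and `fork` helpers are assumptions made for illustration, not SFW's actual schema or API.

```python
import copy

# A hypothetical page: a title plus an ordered "story" of typed items.
# The data says nothing about fonts or layout; it is just sequenced data.
page = {
    "title": "Thanksgiving Shopping",
    "story": [
        {"type": "paragraph", "id": "a1", "text": "Things to buy this year."},
        {"type": "list-item", "id": "a2", "text": "Turkey"},
        {"type": "list-item", "id": "a3", "text": "Cranberries"},
    ],
}


def render_text(page):
    """One possible rendering of the data; an HTML renderer could reuse the same page."""
    lines = [page["title"]]
    for item in page["story"]:
        prefix = "- " if item["type"] == "list-item" else ""
        lines.append(prefix + item["text"])
    return "\n".join(lines)


def fork(page, new_title):
    """Clone a page so the copy can be refactored without touching the original."""
    clone = copy.deepcopy(page)
    clone["title"] = new_title
    return clone


next_year = fork(page, "Thanksgiving Shopping (next year)")
next_year["story"].append({"type": "list-item", "id": "a4", "text": "Pie crust"})

print(render_text(next_year))
```

Because the page is data rather than markup, saving this year's list for next year is a copy and an edit, not a feature request.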

Of those, the idea of a document as a collection of JSON data is pretty important, and the idea of federation as a “document-centered network” is amazing in its implications. But I don’t need to race there. I can just start by talking about the need for a general use, personal tool like this, and let the networking needs emerge from that. At some point it will turn out that you can replace things like wikis with things like this or not, but ultimately there’s a lot of value you get before that.


Doug Engelbart’s Grocery List

If you’ve watched the Mother of All Demos, you know that one of the aha! moments of it is when Engelbart pulls out his grocery list. The idea is pretty simple: if you put your grocery list into a computer instead of on a notepad, you could sort it, edit it, clone it, categorize it, and drag-and-drop reorder it.

That was 1968. So how are we all doing?

If you’re like my family, there are probably multiple answers to that, but none particularly good. When Nicole shops, she writes it out on a sheet of paper, and spends a good amount of time trying to remember all the things she has to get. I sometimes write it out in an email I send myself, and then spend time trying to look for past emails I can raid for reminders.

Sorting? Cloning? Drag and drop refactoring?

Ha! What do you think this is, the Jetsons?

How the hell did we get here? Your average car’s fuel injector has more computing power than the machine that Engelbart demonstrated this on. And it’s not like this problem has been solved elsewhere.

I’d argue that what happened was we moved towards database-driven single-use apps. The truth is that there are DOZENS of shopping list apps, from Don’t Forget the Milk, to the creatively titled GroceryList, to whatever LiveStrong is doing this week.

But they each read like GroceryIQ:

Grocery IQ (Free) includes all the features you would expect in a powerful grocery app, including a barcode scanner, list sharing, and integrated coupons. If you are like me and buy the same things every week, the favorites list will help you save time. You can also edit your list online and it will update automatically on the app.

On one level, GroceryIQ works better than anything we see with Engelbart. Barcode Scanner! Favorites list! Wow, powerful!

But chances are it’s a worthless piece of junk to you compared to the email method. Why? Well, you need your specific phone to use it. You can’t print it out. The people you share with need accounts too. Your data is being compiled and sent somewhere for god knows what reason. You can choose favorites, but there’s no easy way to save this year’s Thanksgiving shopping list for next year without the programmers building that for you. If you’d like to separate your list by the store you buy it at, you’re also out of luck. You’ve got to learn a whole new interface. If you ever move off of it, you lose all the stuff you put into it, because there’s no export/import functionality. And of course the idea that your grocery list app could somehow exchange data with a recipe app or a food log? Not possible in the least.

On the other hand, managing my list in email gets me 90% of these things. So why go to another app I’ll use for two weeks and give up on?

The reason we keep using email is that for that set of tasks requiring more than plaintext but less than an app we have nothing. MS Word maybe. Excel. But somewhere along the line we handed the keys to the current model, and so instead of getting general use tools that radically expand our ability to solve problems we get GroceryIQ, followed by whatever piece of crapware we download next week.

The solution is to recapture some older principles of software design. We need to look at that gap between the tools we use for unstructured data (email, Word) and the over-specified locked-in products that comprise single-use computing, or worse yet, enterprise software bloat. As Jon Udell has noted, most of our day consists of dealing with semi-structured data, yet the tools we have access to are either too rigid or too loose to be of use. It’s time we fill the gap with networked, generative tools that can take us to the next level.


The Universal JSON Canvas and Ben’s Five Star Plugin

I’ve borrowed Jon Udell’s term (“universal canvas”) for talking about SFW. In this video I talk about a plugin my brother Ben wrote for SFW earlier this week, and try to show what that means in semi-mechanical terms.

One of the things I think it starts to show is how much of a construction kit SFW is. As Jon Udell noted when discussing the concept of the universal canvas (in 2006!!), most of our days are spent in a world of semi-structured data, yet most of the products we have access to either have no affordances for structured data at all, or are engineered to a level of precision appropriate to accounting systems.

Over-engineering data collection is not just a waste of resources. It’s a dangerous practice where data is concerned. Most times we start collecting data or sharing it, we’re not even really certain what we want to collect, yet modern practice forces us to design tables and subroutines before we collect *anything*. How the heck can lead users move things forward in such an environment?

In practice, of course, no one moves forward at all. Most information that could help us is never logged anywhere, or it is logged in inaccessible, unparseable formats such as Word XML. When it comes to data, we have access to pea-shooters and intercontinental missiles, and little in between.

A better approach is to create semi-structured data environments that rely more on conventions and culturally adopted techniques to add meaning to data. The data won’t be perfect, but because convention is more fluid than backend schemas, practice can evolve. Despite what a database admin will tell you, the biggest problem we face is not lack of data consistency. The biggest problem we face is the amount of information captured in no way at all. Using flexible JSON documents with front-end plugins starts to address that issue — and we know from history it’s a lot easier to clean up data we have retrospectively than capture data after the fact.

In the example I show here — how would we know that the fivestar plug-in is showing your overall movie rating, and not, for instance, the rough quality of cinematography in the film? Convention. We agree that the first rating is always your overall rating. We agree that unseen films should be rated as “0”.

As that convention solidifies, it gets encoded in a template. Now the template generates these objects with template-determined IDs. So now we don’t need to look for the first “fivestar” to get the rating; we look for object ‘bd3b3ea18244c038’, which is what the template called that top fivestar interface. We walk through the sitemaps in the neighborhood and find all pages named “Pulp Fiction” and average their values, exclusive of the zeros we decided would mean “not rated”.
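Here is a rough Python sketch of that walk, under some loud assumptions. The `neighborhood` dictionary stands in for pages already fetched from each site's sitemap, the `stars` field name is a guess at how a fivestar item might store its value, and the site names are invented. Only the object ID and the "zero means not rated" convention come from the description above.

```python
RATING_ID = "bd3b3ea18244c038"  # the ID the template gave the top fivestar item

# Hypothetical stand-in for pages pulled from sites in the neighborhood.
neighborhood = {
    "mike.example.com": [
        {"title": "Pulp Fiction",
         "story": [{"type": "fivestar", "id": RATING_ID, "stars": 5}]},
        {"title": "Jackie Brown",
         "story": [{"type": "fivestar", "id": RATING_ID, "stars": 3}]},
    ],
    "ben.example.com": [
        {"title": "Pulp Fiction",
         "story": [{"type": "fivestar", "id": RATING_ID, "stars": 4}]},
    ],
    "sam.example.com": [
        # Zero is the agreed convention for "not rated".
        {"title": "Pulp Fiction",
         "story": [{"type": "fivestar", "id": RATING_ID, "stars": 0}]},
    ],
    "pat.example.com": [
        # Filled out pre-template, so the ID differs and the rating is missed.
        {"title": "Pulp Fiction",
         "story": [{"type": "fivestar", "id": "0000000000000000", "stars": 3}]},
    ],
}


def overall_rating(page, rating_id=RATING_ID):
    """Return the stars on the item with the template's ID, or None if absent."""
    for item in page.get("story", []):
        if item.get("type") == "fivestar" and item.get("id") == rating_id:
            return item.get("stars", 0)
    return None


def average_rating(neighborhood, title):
    """Average the overall rating across all pages with this title, skipping zeros."""
    ratings = []
    for pages in neighborhood.values():
        for page in pages:
            if page.get("title") != title:
                continue
            stars = overall_rating(page)
            if stars:  # skips both None (no matching item) and 0 (not rated)
                ratings.append(stars)
    return sum(ratings) / len(ratings) if ratings else None


print(average_rating(neighborhood, "Pulp Fiction"))  # 4.5
```

Note that the pre-template page is silently skipped rather than treated as an error, so the result is imprecise but still usable.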

Is this more laborious than a join? More error-prone than a SQL Stored Procedure? Well, yeah. You’re never going to get a scientific level of precision from this.

But you don’t need it. If your process to find the best movies misses a film because you filled it out pre-template and it didn’t have the right object ID you will still receive more films in your search results than if you had entered no films at all. Those are the problems we tend to have in life and work, and it’s time for a process and software approach that addresses them.


No More Secret Sauce Analytics

I’ve gotten two calls from reporters in the past week asking me about the “dangers of analytics” in higher education.

I’m always quite careful to say I think there’s a lot of promise for analytics in higher education. I can’t imagine a future where we’re not using analytics extensively to try and improve what we do. And I have no doubt those analytics will guide us on where to apply resources in some high stakes areas — how we spend advising time, where we apply remediation strategies, which students we select for special attention, what gateway courses we should fund.

But it’s precisely the power and potential of analytics that makes it so important we get this right. And what getting this right means, first and foremost, is we do it in the open, and avoid the “secret sauce” mindset we’ve tended to have about such things. As we move out of the institutional experimentation phase and into the commercialization phase of analytics, institutions are increasingly being sold a black box of formulas they can neither share nor explain. And more than anything else, it’s this “magic number” mentality which makes analytics dangerous.

The anti-analytics crowd may object that openness is not enough. We are, in fact, turning governance over partially to formula, and that’s frightening. Sugar-coat it all you like, no matter how much individual agency you preserve in your process of identifying at-risk students you wouldn’t be using analytics if they didn’t somehow reduce or otherwise impact your decisions. You can’t simultaneously claim that analytics are powerful and harmless.

But protestations that such a shift is unprecedented are unfounded. The truth is that we are ruled by formula more than most people would admit right now. The highway fund in the United States is a percentage of the gas tax. Your Social Security benefits are indexed to inflation. Cost of living is used to compute a number of benefits and your eligibility for various forms of student aid is a function of a poverty-level formula developed in the 1960s.

The difference with education analytics as it is being implemented is not that we are turning some of our adjudication over to formulas, but that those formulas are often shielded from our view.

If a set of students who seem like they should be receiving aid are not, I can look at the poverty level formula and understand the ways in which that formula may be distorting public policy or exacerbating inequality, because that formula is part of the public record. We can have a debate about that.

But what if I am a student who is not selected for special intervention at a college, even though I clearly need help? If it is on the basis of written policy (say, advisors will contact all students with a first semester GPA of less than 2.3), we can debate that policy. If it is a matter of personal judgment, we can hold the individual responsible for that judgment accountable, and ask them to explain their reasoning.

If it’s a “secret sauce” algorithm on the other hand, what’s our option for public discourse on it? Where is our opportunity to examine it for racial bias, for bias against part-time students, or even for general error?

We’re left with “trust the programmers, they’re objective” in a world where the programmers have repeatedly proved that not only are they not objective, but quite often just plain wrong about the math. See, for example, Purdue’s Course Signals, or the strange and sad story of Google Flu Trends.

There’s absolutely much to be gained by using the data at our disposal to try and serve our students better. But deciding which students to expend extra resources on is not the same as guessing your likely personal rating for Pulp Fiction. Public policy, public money, and the public good require an accountability above and beyond individuals deciding whether they will give their $8 a month to Netflix, and it’s time we came to terms with that.

Analytics, yes. But no more secret sauce analytics. It’s time to open up these formulas to the light of day and get public comment on them. I understand that involves additional work on the part of institutions and corporations engaging in analytics. But it’s difficult to see how we hold institutions accountable to the public if we don’t make them show us the formulas producing their decisions.


The Original Factory Education Was a Personalized Learning Experiment

“From the perpetual agency of this System, idleness cannot exist… [T]he whole is a beautiful picture of the most animated industry, and resembles the various machinery of a cloth manufactory, completely executing their different offices, and all set in motion by one active engine.” — Rev. Cordiner, describing the popular Madras System of education in 1820.

Audrey Watters has a great summary of the recent personalization debate, followed by some excellent analysis on some of the politics and history of personalization technology.

In that article she demonstrates that personalization through technology has been an obsession since the invention of the earliest teaching machines. Such efforts may work poorly or they may work well. But they represent a continuation of how we have viewed teaching machines throughout the past century and a half, not a revolution.

Let me throw another log on the fire.

Because all the rhetoric around how we will shove off the mantle of “factory education” for the brand new world of “personalized learning” misses a point of the utmost importance:

Factory education was invented as a form of personalization.

Now, let me add disclaimers here. I am not a historian. If Sherman Dorn comes onto the blog and tells me I got this 100% wrong, I will happily redact. I will recant.

But let’s walk through this, shall we?

Factory Education

People toss around the term “factory education” so much it’s become meaningless. That debate-centered Social Studies class you had in 1987? Factory education. That rote class your grandfather had in Latin? Factory education. That project-based class that your daughter is taking in Biology? Factory education. As Salman Khan has informed us, nothing has changed in the history of education since the Prussians rode over and started running our grade schools. It’s just one big ball of factory education.

But if you’re looking for the first model of education truly derived from factory structure and informed by its values, my guess is it would be the Madras System (and its variant in the Lancaster System).

Developed in England by Andrew Bell in the last years of the 1700s, the Madras System used better performing students to teach poorer performing students. It did this by applying a factory model of division of labor and rigid mechanical instruction in a facility that was patterned directly on the factories of the day.

Unlike our schoolrooms today (which, perhaps you’ve noticed, look very little like factories?) both the Madras system and the Lancaster system took place in large warehouse or barn-like spaces where small groups of students gathered around work stations divided by ability.

At each work station, an older student tutored the younger ones. As the students practiced skill application repeatedly they could move up into more challenging groups. Students who had progressed through all the stages could then be employed as leaders of the groups. A school of 500 students could be served with one schoolmaster in this way, with all the students receiving personal tutoring from the monitors, who were trained in the system themselves. (This is why the Lancaster and Bell systems are sometimes referred to as “monitorial systems”.)

Here’s how Bell describes his system in his Manual:

The Madras System consists in conducting a school, by a single Master, THROUGH THE MEDIUM OF THE SCHOLARS themselves, by an uniform and almost insensibly progressive course of study, whereby the mind of the child is often exercised in anticipating and dictating for himself his successive lessons… a course in which reading and writing are carried on in the same act, with a law of classification, by which every scholar finds his level, is happily, busily, and profitably employed every moment, is necessarily made perfectly acquainted with every lesson as he goes along, and without the use or the need of corporeal infliction, acquires habits of method, order, and good conduct, and is advanced in his learning, according to the full measure of his capacity.

I don’t think I have to spell it out for you, but what Bell is describing here is what many folks nowadays refer to as personalized learning. It’s not the only form we see today, but in practice it looks eerily like 100 students sitting in a charter school classroom trying to level up through educational software.

For Bell and his supporters, the “system” was everything, and they saw the parallels with factory work as a point of pride. Here’s the Reverend Cordiner talking ecstatically about his visit to the Madras School:

From the perpetual agency of this System, idleness cannot exist… [T]he whole is a beautiful picture of the most animated industry, and resembles the various machinery of a cloth manufactory, completely executing their different offices, and all set in motion by one active engine.

This system was not a historical footnote. It was, in fact, the most popular system of education in the English-speaking world at the beginning of the 19th century. And it was this system that was poised to take over the English-speaking world when the Mannian System in the U.S. (and the Glasgow System in the U.K.) came into prominence in the 1840s. The question Great Britain occupied itself with in the first half of the 19th century was *which* monitorial system should form the basis of a national system of education: Lancaster’s or Bell’s?

So why was this approach unseated? Gladman in his treatment of that period sums it up rather neatly:

Failure occurred, as it always will, when masters were slaves to “the system,” when they were satisfied with mechanical arrangements and routine work, or when they did not study their pupils, and get down to the Principles of Education.

Most of my readers know the history from this point on better than I, but reading through Gladman is instructive. For Gladman, the great displacer of the Monitorial Method of Bell and Lancaster is Stow’s Glasgow system, and the great difference is its flexibility.

Stow’s method involved trained teachers, who don’t exercise step-by-step methods, but rather have a grasp of principles and techniques and use them to foster inquisitiveness and discovery in the classroom. Stow believed that if students were to get to understanding instead of simple memorization they needed high quality teachers. Turning the “machinery” metaphor of Bell on its head, Stow remarked it was useless to have “the machinery without the skilled workman, or the skilled workman without the suitable premises”. Similarly, the evil Prussians that form the villains in Khan’s worldview rejected the personalized models of Bell and Lancaster because they instilled too much thoughtless obedience in their students. Here’s Victor Cousin, writing in the book that fueled the Mannian revolution in America, Report on the State of Education in Prussia:

Our principal aim in each kind of instruction is to induce the young men to think and judge for themselves. We are opposed to all mechanical study and servile transcripts. The masters of our primary schools must possess intelligence themselves in order to be able to awaken it in their pupils; otherwise the state would doubtless prefer the less expensive schools of Bell and Lancaster.

That is those nasty-sounding Prussians agreeing with the somewhat less nasty-sounding Glaswegians that education must be reformed because it works too much like a factory. And the way to make it less like a factory is to bring in the expertise of a craftsman, in this case, the trained teachers that were the heart of the Mannian, Glasgow, and Prussian systems.

Coda

I’m not here to criticize the Madras System. In fact, there are aspects of the system which I believe in pretty strongly. Bell’s insight that students learn best when they teach each other remains as true today as then, and his focus on “doing” rather than simply listening was admirable at a time when lecture was overvalued. At the same time, Gladman’s remarks regarding the rigidity of such systems strike me as an accurate summary of the issues that have plagued such systems since then.

Similarly, I know my history in this area is limited. It’s almost wholly gained from years of watching videos of people making claims that seem odd and then executing some Google searches to see if primary materials support the claims made by smug TED lecturers.

And so I could be wrong here. But after years and years of looking up this stuff I’ve found the more I know, the more it drifts away from this Ron Paul-John Taylor Gatto history of education. And the further I get into this area, the weirder it gets. The personalizers in history are the firm believers in applying factory principles to education. The Prussians are in fact the softies, arguing for teachers as trained craftsmen who can inspire students to think for themselves.

The point Salman Khan fingers as the date factory education began is in fact the date it began to die.

I’m not arguing for the current system, or that the system as constructed isn’t overly authoritarian and geared toward compliance over creativity and inspiration.

I’m not arguing against various forms of personalization, even. I think we ought to be doing more to bring out the unique gifts of our students.

But if my history holds up (and I’ve been looking at this for enough years to think it will), the idea that the history of education is an ages-long struggle between the Mannian “factories” and the proponents of “personalization for empowerment” is odd at best, and backwards at worst.

I think history does have lessons for us. But in order to learn them, we have to engage with history in all its messiness, not the history of think tanks and TED talkers. If you’d like *that* sort of conversation, feel free to school me in the comments.
