Some Notes on DokuWiki Setup for Academic Settings: Spam

Still working with DokuWiki as an educational platform for faculty here at WSU Vancouver. I’ve found a couple of things worth mentioning and thought I’d jot them down here. This post deals with spam prevention.

The idea that DokuWiki wikis don’t get spammed as much as MediaWiki installs is true, but trivially so. You’ll get more than enough spam to clog up the series of tubes that is your website. You’re going to have to lock down the installation.

I’ve experimented with a couple of approaches to this. Here are some things you don’t want to do:

  • The common “must confirm email” approach is not a long-term winner. Plenty of spambots now happily confirm email, get user accounts, and live happily simulated lives on your wiki discussing the latest medical devices and weight-loss drugs available.
  • Corralling freshly registered users into a “non-editing” user type is also not a great idea. I registered 8 students in my class during class for a wiki project. They then waited while I fiddled around and bumped up their privileges. It’s hard to imagine that process scaling in an academic setting.
  • Similarly, deactivating registration and doing admin panel sign-ups manually is not a pleasant activity either.
  • LDAP then? Ugh. An EXCELLENT feature of DokuWiki. But not really a great option in academia for a pilot project. You’d have to coordinate with IT (which will lead to who knows what). Might be something to explore down the road, but not as you’re getting this off the ground.
  • Visual post CAPTCHAs? Yes, this is a great way to spark a multi-million dollar ADA/Section 508 lawsuit. Avoid.

So what do you do?

  • Set read permissions to “all”. Anyone can read.
  • Set edit to whatever your default confirmed registered user is.
  • The result is that everyone can read, but only registered users can edit.
  • Keep the registration link/functionality up.
  • Install the Captcha plugin. Under type, choose “question”.
  • Make sure that registered users *don’t* have to do the CAPTCHA. In this configuration, since all non-registered can do is read, the only place the CAPTCHA will be is on the registration form.

This option will ask the student a plain text question of your choice when they register. If they get it right, registration proceeds. If not, it bumps them back.

Here’s where a bit of discretion comes into play. You can take one of two approaches:

  • Make the question a piece of cultural knowledge that students should know — e.g. the name of the dining commons.
  • Make the question “Access Code?” and have them supply an access code furnished by you or the prof.

As I went through “cultural knowledge” access codes, I started to realize how fraught that process was. I can maybe talk more about that later. I also realized what I really wanted was a semi-automated process for WSU staff and faculty not available to outsiders. I decided on the access code with a twist.

Here’s how it works. If you mail from a WSU email account, an autoresponder will send you back the code. If you mail from a non-WSU account, you get nothing. I do this by setting up an autoresponder on a Gmail address with the code in it, and routing everything not from a WSU account directly to deletion.
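The routing rule itself is simple enough to sketch. This is a minimal Python illustration of the decision only, not the Gmail filter I actually use; the domain `example.edu`, the access code, and the function name `reply_for` are all placeholders made up for the example.

```python
from email.utils import parseaddr
from typing import Optional

# Hypothetical values for illustration; the real domain and code are not shown here.
ALLOWED_DOMAIN = "example.edu"
ACCESS_CODE = "sample-code"


def reply_for(from_header: str) -> Optional[str]:
    """Return the autoresponse body for an allowed sender, or None to discard."""
    _, address = parseaddr(from_header)
    if "@" not in address:
        return None  # no usable address, so treat it like any outsider
    domain = address.rsplit("@", 1)[1].lower()
    # Accept the campus domain itself and any subdomain of it.
    if domain == ALLOWED_DOMAIN or domain.endswith("." + ALLOWED_DOMAIN):
        return f"Your wiki access code is: {ACCESS_CODE}"
    return None


print(reply_for("Pat Doe <pat@example.edu>"))
print(reply_for("spammer@example.com"))
```

One caveat: a From header can be forged, so a check like this keeps out casual spambots rather than determined attackers. For a pilot wiki that trade-off seems acceptable.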

So there you go, that’s my setup. Maybe in a few days I’ll talk about my depressing struggle with various markdown plugins. Or requests… I’ll take requests too.


If Your Product Is So Data-Centric, Maybe It Should Have Data Export?

Yesterday-ish, from Justin Reich:

I was also somewhat surprised to learn that in many systems, it is actually quite difficult to get a raw dump of all of the data from a student or class. Many systems don’t have an easy “export to .csv file” option that would let teachers or administrators play around on their own. That’s a terrible omission that most systems could fix quickly.

A couple years ago, working on an LMS evaluation, I kept getting asked what reporting features each potential platform had. Can this platform generate type-of-report-X? About 8 years ago, working on an ePortfolio evaluation, the same question came up: where are the reports? Does this have report Y?

I’d always point out that we didn’t want reports, we wanted data exports and data APIs that allowed us to generate our own reports, reports that we could change as we developed new questions and theories, or launched new initiatives in need of tracking. The data solutions we’re likely to see have real impact (with no offense to Reich’s Law of Doing Stuff) are likely to come from grassroots tinkering. Data that is exportable in common formats can be processed with common tools, and solutions built in those common tools can be broadly shared. CSV-based reports developed and adopted by Framingham State can be adopted by Keene State or WSU overnight. A solution one of your physics faculty develops can be quickly applied across all entry level courses.
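To make the "common formats, common tools" point concrete, here is a small Python sketch of a do-it-yourself report built from a CSV export. The column names (`student_id`, `course`, `assignment`, `score`) and the sample rows are invented for illustration; no real product's export schema is implied.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: these column names are assumptions, not a real schema.
SAMPLE_EXPORT = """student_id,course,assignment,score
s1,PHYS101,quiz1,80
s2,PHYS101,quiz1,90
s1,PHYS101,quiz2,70
s3,CHEM101,quiz1,60
"""


def mean_score_by_course(csv_file):
    """Average the 'score' column for each value of 'course'."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["course"]] += float(row["score"])
        counts[row["course"]] += 1
    return {course: totals[course] / counts[course] for course in totals}


report = mean_score_by_course(io.StringIO(SAMPLE_EXPORT))
print(report)
```

Because the function only needs an open CSV file, the same few lines could be pointed at another campus's export with the same columns, which is the kind of sharing a canned report can't offer.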

What you want is not “reports” but sensible, easy, and relatively unfettered access to data. And if you don’t have someone on your campus that can make sense of such data, then you need to either hire that person, or give up on the idea that a canned set of reports is going to help you. When fields are mature, canned and polished reigns. But when they are nascent (as is the field of analytics), hackability is a necessity.


Hacking and Reuse: A Regrouping


Via Clay Fenlason: “Feeling like the time spent to understand WTF is talking about would be well spent, but who has that kind of time?”

Fair enough. I blog mostly for myself, to try and push on my own ideas in front of a relatively small group of people I know who push back. And part of that process is a bit manic and expansive. At some point that’s followed by a more contractive process that tries to organize and summarize. Maybe it’s time to get to that phase.

So I’ll do that soon. What I’ll say in the meantime is that all of this stuff — hybrid apps, storage-neutral apps, federated wikis, etc — is interesting to me because of my obsession with hacking and reuse. Why is reuse so darn hard? Why don’t we reuse more things? What systems would support a higher degree of reuse and sharing, of hacking and recombination? What are the cultural barriers?

There are implications to this stuff far bigger than that, but reuse (and hacking, which is a type of reuse) has been a core obsession of mine for a decade now, so that ends up being the lens.

You go to an event and there’s 50 people taking pictures of it individually on their cell phones, none of whom will share those photos with one another, yet all of whom would benefit from sharing the load of picture taking. There are psychological and social reasons why that’s the case, but there’s also technological reasons for that. Likewise there are brilliant economics teachers who have built exercises and case studies that would set your class on *fire* if you used them — but you’ll never see them.

I’ve been over the several hundred reasons why reuse doesn’t happen, over a period of ten years. It’s not just about the technology, absolutely. But occasionally I see places where reuse explodes, and the technology turns out to be a pretty big piece of that. My wife is a K-3 art teacher. And Pinterest just exploded reuse in that community. Sharing went from minimal to amazing in the space of 12 months. And suddenly she was putting together a much better art curriculum than she could have ever dreamed of in half the time, in ways that had a huge impact on her students.

So: reuse, sharing, networked learning, hacking. I’m interested in the two sides of this. First, we must teach students how to work this way. We have to. And second, we have to get our colleagues to work this way.

What does that have to do with the shift to hybrid apps? With moving from a world of reference to a world of copies and forks? With storage-neutral designs? With the pull request culture of GitHub vs. the single copy culture of OER?  With the move back to file-based publication systems? I’m still trying to work that out. But I think the answer is “a lot”, and a post is coming soon.

This is what I mean by new modes of sharing (Fedwiki meets Dropbox Carousel)

File-based sharing based around pushing copies of good stuff to others. That’s what the federated wiki is about.

For that reason I find newer efforts like this that push files around instead of references to be fascinating. This out today from Dropbox, a new product called Carousel:

Photos of events such as graduations and weddings, Houston points out, are spread over the devices and hard drives of multiple guests. It creates pervasive photo anxiety: People are no longer sure they own the best images of the most important moments in their lives. The app, which becomes available this week for iPhones and Android phones—with a version coming soon for desktops—taps into photos stored on Dropbox and allows users to cycle through them quickly and send images to friends and family, so they can add them to their collections well.

Think about how this changes notions of sharing, and you’ll see it as part of a move towards file-based copy systems, and the pull request approach of a GitHub.

Also, read that paragraph again, and tell me if that doesn’t look similar to the educational materials situation we face every day.

OK, now imagine your wiki exists in a Dropbox account, and you do the same thing: you flip through all your articles and forward the ones that you think are useful to your various federations. Those get dropped into other people’s own Dropbox wikis, and the virtuous cycle continues.

It’s a different way of thinking about things. It’s file based, and it sees copies of things as a feature, not a bug. The storage for your project is not separate from the sharing features of your project. We let the copies happen and we sort out the mess afterwards.
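As a rough sketch of what "forwarding" means when sharing is just pushing files around, here is a Python illustration. The folder names and the `forward_copy` helper are hypothetical, and the temp directories stand in for synced folders; this is not how Carousel or the federated wiki is actually implemented.

```python
import shutil
import tempfile
from pathlib import Path


def forward_copy(article, neighbor_dirs):
    """Copy one article file into each neighbor's wiki folder and return the new paths."""
    copies = []
    for folder in neighbor_dirs:
        folder.mkdir(parents=True, exist_ok=True)
        copies.append(Path(shutil.copy2(article, folder / article.name)))
    return copies


# Throwaway folders standing in for synced wiki directories (hypothetical layout).
root = Path(tempfile.mkdtemp())
mine = root / "my-wiki"
mine.mkdir()
article = mine / "reuse.md"
article.write_text("# Reuse\nCopies are a feature, not a bug.\n", encoding="utf-8")

copies = forward_copy(article, [root / "alice-wiki", root / "bob-wiki"])
print([c.name for c in copies])
```

Note that the original stays put and each neighbor ends up with a full, independent copy they can edit. Nothing here reconciles diverging copies; that is the "sort out the mess afterwards" part.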

My argument is not that Dropbox rules, but that this is part of a larger trend that rethinks how sharing and forking work on the new web. It’s also potentially a powerful rethinking of how OER could propagate through a system.

Gruber: “It’s all the Web”

Tim Owens pointed me to this excellent piece by John Gruber. Gruber has been portrayed in the past as a bit too in the Apple camp; but I don’t think anyone denies he’s one of the sharper commentators out there on the direction of the Web. He’s also the inventor of Markdown, the world’s best microformat, so massive cred there as well.

In any case, Gruber gets at a piece of what I’ve been digging at the past few months, but from a different direction. Responding to a piece on the “death of the mobile web”, he says:

I think Dixon has it all wrong. We shouldn’t think of the “web” as only what renders inside a web browser. The web is HTTP, and the open Internet. What exactly are people doing with these mobile apps? Largely, using the same services, which, on the desktop, they use in a web browser. Plus, on mobile, the difference between “apps” and “the web” is easily conflated. When I’m using Tweetbot, for example, much of my time in the app is spent reading web pages rendered in a web browser. Surely that’s true of mobile Facebook users, as well. What should that count as, “app” or “web”?

I publish a website, but tens of thousands of my most loyal readers consume it using RSS apps. What should they count as, “app” or “web”?

I say: who cares? It’s all the web.

I firmly believe this is true. But why does it matter to us in edtech?

  • Edtech producers have to get out of browser-centrism. Right now, mobile apps are often dumbed-down versions of a more functional web interface. But the mobile revolution isn’t about mobile, it’s about hybrid apps and the push of identity/lifestream management up to the OS. As hybrid apps become the norm on more powerful machines we should expect to start seeing the web version becoming the fall-back version. This is already the case with desktop Twitter clients, for example: you can do much more with Tweetdeck than you can with the Twitter web client, because once you’re freed from the restrictions of running everything through the same HTML-based, cookie-stated, security-constrained client you can actually produce really functional interfaces and plug into the affordances of the local system. I expect people will still launch many products to the web, but hybrid on the desktop will become a first class citizen.
  • It’s not about DIY, it’s about hackable worldware. You do everything yourself to some extent. If you don’t build the engine, you still drive the car. If you don’t drive the car, you still choose the route. DIY is a never-ending rabbit-hole as a goal in itself. The question for me is not DIY, but the old question of educational software vs. worldware. Part of what we are doing is giving students strategies they can use to tackle problems they encounter (think Jon Udell’s “Strategies for Internet citizens”). What this means in practice is that they must learn to use common non-educational software to solve problems. In 1995, that worldware was desktop software. In 2006, that worldware was browser-based apps. In 2014, it’s increasingly hybrid apps. If we are committed to worldware as a vision, we have to engage with the new environment. Are some of these strategies durable across time and technologies? Absolutely. But if we believe that, then surely we can translate our ideals to the new paradigm.
  • Open is in danger of being left behind. Open education mastered the textbook just as the battle moved into the realm of interactive web-based practice. I see the same thing potentially happening here, as we build a complete and open replacement to an environment no one uses anymore.

OK, so what can we do? The first thing is to get over the religion of the browser. It’s the king of web apps, absolutely. But it’s no more pure or less pure an approach than anything else.

The second thing we can do is experiment with hackable hybrid processes. One of the fascinating things to me about file based publishing systems is how they can plug into an ecosystem that involves locally run software. I don’t know where experimentation with that will lead, but it seems to me a profitable way to look at hybrid approaches without necessarily writing code for Android or iOS.

Finally, we need to hack apps. Maybe that means chaining stuff up with IFTTT. Maybe it means actually coding them. But if we truly want to “interrogate the technologies” that guide our daily life, we can’t do that and exclude the technologies that people use most frequently in 2014. The bar for some educational technologists in 2008 was coding up templates and stringing together server-side extensions. That’s still important, but we need to be doing equivalent things with hybrid apps. This is the nature of technology: the target moves.




The First Web Browser Was a Storage-Neutral App

ONE IMPORTANT NOTE: I’m just toying with this idea, not asserting it at this point. But part of me is very interested in what happens when we view the rise of the app as not a betrayal of the original vision of the web, but as a potential return to it. I don’t see many people pushing that idea, so it seems worth pushing. That’s how I roll. 😉


Apropos of both an earlier post of mine and Jim’s Internet Course. This is a screenshot of the first web browser (red annotations added by me):



The first web browser was a storage-neutral editing app. If you pointed it at files you had permission to edit, you could edit them. If you pointed it at files you had permission to read, you could read them. But the server in those days was a Big Dumb Object which passed your files to a client-side application without any role in interpreting them.

I never used the Berners-Lee browser, but even in the mid-90s, when I was hacking my first sites together, Netscape had a rudimentary editor (I was using something called HoTMeTaL at the time, but still):


This is still the case with many HTML files a browser handles, but what’s notable here is that in those days a browser worked much like a storage-neutral app would today. When I talk about having the editing functions of a markdown-wiki client-side in an app, we’re essentially returning to this model.

And think about that for a minute. Imagine what that wiki would be like — you tool around your wiki in your browser editing these Markdown files directly. When someone hits your site in their browser, it lets them know that they should install the Markdown extension, or download the Markdown app to view these things. Grabbing a file is just grabbing a file.

So what happened to this original vision? So many things, and I only saw my little corner of the world, so I’m biased.

  • Publishers: The first issue hit when the publishers moved in. They wanted sites to look like magazines. This accelerated a browser extension war and pushed website design to people slicing up sites in Adobe and Macromedia tools.
  • Databases + Template-based Design: As layouts got more complex, you wanted to be able to swap out designs and have the content just drop in; so we started putting pages in database tables that required server interpretation (this is how WordPress, Drupal, or almost any CMS works, for example).
  • Browser incompatibility, platform differences: People didn’t update browsers for years, which meant we had to serve version and platform specific HTML to browsers. This pushed us further into storing page contents in databases.
  • E-commerce. You were going to have a database anyway to take orders, so why not generate pages?
  • Viruses and Spyware. Early on, you used to download a number of viewer extensions. But lack of a real store to vet these items led to lots of super nasty browser helper objects and extensions, and the fact that you used your browser for e-commerce as well as looking at Pixies fan sites made hijacking your browser a profitable business.

In addition, there was this whole vision of the web as middleware that would pave the way to a thin-client future free from platform incompatibilities. Companies like Sun were particularly hot to trot on this, since it would make the PC/Mac market less of an issue to them. Scott McNealy of Sun started talking about “Sun Rays” and saying McLuhanesque things like “The Network is the Computer”.

In the corporate environment, thin clients are wired to company servers.

In your home, McNealy envisions Sun Rays replacing PCs.

“There’s no more client software required in the world,” he said. “There’s no need for [Microsoft] Windows.”

Sun Rays fizzled, but the general dynamic accelerated. And part of me wonders if it accelerated for the same reasons that Sun embraced it. In a thin client world, the people who own the servers make the rules. That’s good, for the people who own the servers.

This is really just a stream of consciousness post, but really consider that for a moment. In the first version of the web you downloaded a standard message format with your email client, and web pages were pages that could live anywhere (storage-neutral) and be interpreted by a multitude of software (app-neutral). In version two, your mail becomes Gmail, and your pages get locked into whatever code is pulling them from your 10 table database. And yes, your blogging engine becomes WordPress.

OF COURSE there were other reasons, good reasons, why this happened. But it’s amazing to me how much of the software I use on a daily basis (email, wikis, blogs, twitter) would lose almost nothing if it went storage neutral — besides lock-in. And such formats might actually be *more* hackable, not less.

It’s also interesting to see how much other elements of the ecosystem have solved the problems that led us to abandon the initial vision. Apps auto-update now. The HTML spec has stabilized somewhat, and browsers are more capable. The presence of stores for extensions gets rid of the “should I install random extension from unknown site” problem; people install and uninstall apps constantly. Server power is now such that most database-like features can be accomplished in a file-based system: DokuWiki is file based, but can generate RSS when needed and respond to API calls. And, interestingly, we are finally returning to a design minimalism that reduces the need for pixel-based tweaking.

In any case, this post is a bit of a thought experiment, and I retain the right to walk away from anything I say in it. But what if we imagined the rise of apps as a POTENTIAL RETURN to the roots of the web, a slightly thicker, more directly purposed client that did interpretation on the client-side of the equation? Whether that interpretation is data API calls or loading text files?

I know that’s not where we are being driven, but it seems to me it’s a place that we could go. And it’s a narrative that is more invigorating to me than the “Loss of Eden” narrative that I often hear about such things. Just a thought.

Teaching the Distributed Flip [Slides & Small Rant]

Due to a moving-related injury I was sadly unable to attend ET4Online this year. Luckily my two co-presenters for the “Teaching the Distributed Flip” presentation carried the torch forward, showing what recent research and experimentation has found regarding how MOOCs are used in blended scenarios.

Here are the slides, which actually capture some interesting stuff (as opposed to my often abstract slides — Jim Groom can insert “Scottish Twee Diagram” joke here):


One of the things I was thinking as we put together these slides is how little true discussion there has been on this subject over the past year and a half. Amy and I came into contact with the University System of Maryland flip project via the MOOC Research Initiative conference last December, and we quickly found that we were finding the same unreported opportunities and barriers they were in their work. In our work, you could possibly say the lack of coverage was due to the scattered nature of the projects (it’d be a lousy argument, but you could say it). But the Maryland project is huge. It’s much larger and better focused than the Udacity/SJSU experiment. Yet, as far as I can tell, it’s crickets from the industry press, and disinterest from much of the research community.

So what the heck is going on here? Why aren’t we seeing more coverage of these experiments, more sharing of these results? The findings are fascinating to me. Again and again we find that the use of these resources energizes the faculty. Certainly, there’s a self-selection bias here. But given how crushing experimenting with a flipped model can be without adequate resources, the ability of such resources to spur innovation is nontrivial. Again and again we also find that local modification is *crucial* to the success of these efforts, and that lack of access to flip-focussed affordances works against potential impact and adoption.

Some folks in the industry get this: the fact that the MRI conference and the ET4Online conference invited presentations on this issue shows the commitment of certain folks to exploring this area. But the rest of the world seems to have lost interest when Thrun discovered you couldn’t teach students at a marginal cost of zero. And the remaining entities seem really reluctant to seriously engage with these known issues of local use and modification. The idea that there is some tension between the local and the global is seen as a temporary issue rather than an ongoing design concern.

In any case, despite my absence I’m super happy to have brought two leaders in this area — Amy Collier at Stanford Online and MJ Bishop at USMD — together. And I’m not going to despair over missing this session too much, because if there is any sense in this industry at all this will soon be one of many such events. Thrun walked off with the available oxygen in the room quite some time ago. It’s time to re-engage with the people who were here before, are here after, and have been uncovering some really useful stuff. Could we do that? Could we do that soon? Or do we need to make absurd statements about a ten university world to get a bit of attention?


One of the great outcomes of the storage-neutral-app firefight (besides Tom’s lyrical comment) was Pat Lockley pointing me to the Unhosted site.


As we move from an era of browser-based web apps to one that is increasingly about client-side/server-backed apps, one of the very real concerns people have is whether hackability disappears. Unhosted is very fringe, and not exactly poised for wide adoption. But it’s a great example of how the technological shift we’re seeing can be seen as *more* empowering than what we live with now, if we choose to engage with that shift. Unhosted embraces the “one server per human” concept, but pushes much of the functionality of these servers up to client apps written in a browser-based, locally run JavaScript environment. Apart from the browser-based nature of the apps, the idea is similar to what I was discussing in my post: make the server piece dumb enough that apps using it don’t need special server code. The server, in this case, is just a big dumb resource, like a hard drive, CPU, etc. It doesn’t need to know what app you are running any more than your RAM needs to know you are running Microsoft Word or Reason 7.

It’s a classic separation of concerns (SoC) solution:

The unhosted web apps we use can be independent of our personal server. They can come from any trusted source, and can be running in our browser without the need to choose a specific application at the time of choosing and installing the personal server software.
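A toy Python sketch of that separation, under loud assumptions: `DumbStore` is an in-memory stand-in I made up for a personal server, not Unhosted's actual API, and the two "apps" are deliberately trivial. The point is only that the store never interprets what it holds.

```python
class DumbStore:
    """Stand-in for a personal server: it stores and returns bytes, nothing more."""

    def __init__(self):
        self._blobs = {}

    def put(self, path, data):
        self._blobs[path] = data

    def get(self, path):
        return self._blobs[path]


def list_headings(store, path):
    """One 'app': reads the stored bytes as Markdown and lists the headings."""
    text = store.get(path).decode("utf-8")
    return [line.lstrip("#").strip() for line in text.splitlines() if line.startswith("#")]


def word_count(store, path):
    """A second, unrelated 'app' using the same bytes with no change to the store."""
    return len(store.get(path).decode("utf-8").split())


store = DumbStore()
store.put("notes/reuse.md", "# Reuse\nWhy is reuse hard?\n## Barriers\n".encode("utf-8"))
print(list_headings(store, "notes/reuse.md"))
print(word_count(store, "notes/reuse.md"))
```

Adding a third app means writing another client-side function. The store stays exactly as dumb as it was.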

When you dig into it, you start to see how radical an idea storage-neutrality is. Our assumption that because we need 24/7 access to our data via servers we also need to run server code is so deeply ingrained in the public consciousness that when you challenge it people don’t tend to comprehend what you’re challenging. But it’s this idea (that because our data is on Server X our code must be as well) that is at the heart of the corporate control of what Jon Udell calls our “hosted lifebits”. And if you want the sorts of freedoms people care about, that’s the piece you have to attack.

This is not a “compromise solution”. It’s a much more radical rethinking of what needs to happen. The future is server-backed/client-based apps, one way or another. That can serve to increase our freedom or to lessen it, depending on how we approach the next several years. I don’t really know what the correct answers are, but it seems to me this is the right fight.