BBC “5 Live Investigates” on Books LLC, Sunday night 9pm UTC.

BBC 5 Live Investigates is running a piece on Sunday 9pm (this item likely to go out 9:45pm or so) on Books LLC and similar operations, which sell reprints of Wikipedia articles as print-on-demand books on Amazon.

The researcher called a few UK people for the Wikipedian viewpoint. As it turns out, we have one — more than a few Wikipedians have bought these things, thinking they were hitherto-unknown new printed sources to use, only to discover their own words on the topic! At prices like $50 for a 10,000 word pamphlet, this is a most unpleasant surprise. It’s also caught a number of slightly famous people who were surprised to find someone had “written” a book about them.

The casual reader encountering these things may not be aware of the business model. These are print-on-demand books, compiled by computer from a list of keywords. No copies exist until someone orders one, at which point a single copy is printed and sent. People aren’t generally aware that POD is very good quality these days — you can send a PDF to a machine and have it spit out an absolutely beautiful perfect-bound book for you, of a standard which previously would have been quite pricey. So Books LLC and Alphascript and whoever manage to eke out a tiny profit on single copies, having worked out a way to spam Amazon.

The books are entirely legal — you can use our stuff without permission, even commercially, and “Please, use our stuff!” is why quite a lot of us do this at all. And we have a link on each wiki to make your own PDF book, and various projects have partnerships with printers like PediaPress. So the main issue is that it’s not being made clear that these books are just Wikipedia reprints.

I tried to stay strictly descriptive of consensus, but I think I could clearly say that we would very much like the publishers and Amazon to make it clearer just what these things are. Thanks.

The problem is there’s no direct action we can really take without hampering the good reasons for reuse of our material. Or scaring people off entirely — it’s hard enough getting across the idea of freely reusable content as it is. We can use publicity about this to spread awareness that we’re all about reusing our stuff, as we introduce civilisation to the notion of reusability as being the normal order of things.

thewub points out on foundation-l:

From March 1st it might be worth contacting the UK Advertising Standards Authority, as their remit is being extended then: http://asa.org.uk/Regulation-Explained/Online-remit.aspx Amazon product descriptions almost certainly fall under “non-paid-for space online under [the marketer’s] control”. So a misleading description ought to lead to action. But the issue here is the misleading *lack* of any description. It could be an interesting conundrum for the ASA!

The problem will largely solve itself: if physical copies of Wikipedia articles ever gained any actual popularity, competition would kick in very fast. Even competition on quality would emerge, as people sought to design beautiful editions just because they could.

But they won’t. Is the essence of a book the informational content? Is the essence of a book a lump of tree pulp? Is the essence of a book the ideal synergy of the two, creating an object of beauty, wonder and love? The answer people seem to be picking is the first. The Kindle may be a hideously locked-down proprietary money trap, but it’s really quite lovely as a book reader. I read books as PDFs on my netbook, hardly ever picking up my paper copies. A printed general encyclopedia is now a ludicrous idea. “It’s from a printed book!” will soon be as relevant a criterion to sourcing as “It’s on a website!”

Anything made of atoms is a white elephant. I have a four hundred kilogram vinyl record albatross. I will never rip these things, having had a turntable four years and ripped none. Music is digital. Books are digital. Stuff is a curse.

(In his essay “Stuff”, Paul Graham explicitly excludes books from being counted as mere “stuff.” He is wrong.)

Update: Books, LLC responds. They claim Amazon removed Wikipedia links from the listings.

Single point of failure.

Monopoly wasn’t a goal for Wikipedia, it’s something that just happened.

There’s basically no way at this stage for someone to be a better Wikipedia than Wikipedia. Anyone else wanting to do a wiki of educational information has to either (a) vary from Wikipedia in coverage (e.g., be strongly specialised, which a good Wikia does superlatively), (b) vary from Wikipedia in rules (e.g., not neutral, or allow original research, like WikInfo), and/or (c) have a small bunch of people who want to do a general neutral encyclopedia that isn’t Wikipedia and who will happily persist because they want to (e.g., Knowino, Citizendium).

Competition would be good, and monopoly as the encyclopedia is not intrinsically a good thing. It’s actually quite a bad thing. It’s mostly a headache for us. Wikipedia wasn’t started with the aim of running a hugely popular website, whose popularity has gone beyond merely “famous”, beyond merely “mainstream”, to being part of the assumed background. We’re an institution now — part of the plumbing. This has made every day for the last eight years a very special “wtf” moment technically. It means we can’t run an encyclopedia out of Jimbo’s spare change any more and need to run fundraisers, to remind the world that this institution is actually a rather small-to-medium-sized charity.

(I think reaching this state was predictable. I said in 2005 that in ten years, the only encyclopedia would be Wikipedia or something directly derived from Wikipedia. I think this is the case, and I don’t think it’s necessarily a good thing.)

The next question is what to do about this. Deliberately crippling Wikipedia would be silly, of course. The only way Wikipedia will get itself any sort of viable competitor is by allowing itself to be blindsided. Fortunately, a proper blindsiding requires something that addresses structural defects of Wikipedia in such a way that others can use them.

(One idea that was mooted on the Citizendium forums: a general, neutral encyclopedia that is heavy on the data, using Semantic MediaWiki or similar. Some of the dreams of Wikidata would cover this — “infoboxes on steroids” at a minimum. Have we made any progress on a coherent wishlist for Wikidata?)

But encouraging the propagation of proper free content licences directly helps our mission, for example. (This is somewhat more restrictive than what our most excellent friends at Creative Commons do, though they’re an ideal organisation to work with on it.) The big win would be to make proper free content licences (preferably public domain, CC-by or CC-by-sa, as they’re the most common) the normal way to distribute educational and academic materials, because that would fulfil the Foundation mission statement:

“Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.”

— without us having to do every bit of it. And really, that mission statement cannot be attained unless we make free content normal and expected, and everyone else joins in.

We need to encourage everyone else to take on the goal of our mission with their own educational, scientific and academic materials. We can’t change the world all on our own.

So. How would you compete with Wikipedia? Answers should account for the failings of previous attempts. Proposals involving new technical functionality should include a link to the code, not a suggestion that someone else should write it.

Anyone who advocates advertising on Wikipedia is a drooling moron.

I used to be a big fan of ads on Wikipedia. I changed my mind a while ago, and had my opinion confirmed by what Google did to TVTropes just a couple of months ago.

I wonder what gay and lesbian employees of Google think of this. I haven’t heard one breathe a peep over the fact that any TVTropes page with the slightest gayness is behind a filter. Censorship, it’s insidious.

But! That’s okay! You’ve all got mortgages.

TVTROPES IS THE UNIVERSAL COUNTEREXAMPLE. YOU CANNOT ADVOCATE ADS ON WIKIPEDIA WITHOUT A KILLER CASE FOR WHY THIS WOULD NOT HAPPEN TO US.

Wikipedia having ads would be the worst possible move for the mission: “Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.”

No advertiser in existence would stand for that in practice.

What you see is FOR THE WIN.

I posted to foundation-l concerning the other way to get more people editing Wikipedia: the perennial wish for good WYSIWYG in MediaWiki.

This is a much bigger potential win than many people think. From mediawiki-l in May, a Canadian government official posted how adding a (locally patched) instance of FCKeditor to their intranet got eight times the participation:

In one government department where MediaWiki was installed we saw the active user base spike from about 1000 users to about 8000 users within a month of having enabled FCKeditor. FCKeditor definitely has its warts, but it very closely matches the experience non-technical people have gotten used to while using Word or WordPerfect. Leveraging skills people already have cuts down on training costs and allows them to be productive almost immediately.

The geeks refused to believe that sparing people a wade through computer guacamole actually worked, and assumed that everyone new must be idiots. The poster disabused them of this conceit:

Since a plethora of intelligent people with no desire to learn WikiCode can now add content, the quality of posts has been in line with the adoption of wiki use by these people. Thus one would say it has gone up.

In the beginning there were some hard core users that learned WikiCode, for the most part they have indicated that when the WYSIWYG fails, they are able to switch to WikiCode mode to address the problem. This usually occurs with complex table nesting which is something that few of the users do anyways. Most document layouts are kept simple.

Eight times the number of smart and knowledgeable people who just happen to be bad with computers suddenly being able to even fix typos on material they care about. Would that be good or bad for the encyclopedia?

Now, WYSIWYG has been on the wishlist approximately forever. Developer brilliance applied to the problem has been dashed on the rocks every single time. Brilliance is not enough: we’re going to need to apply money.

  • We need good WYSIWYG. The government example suggests that a simple word-processor-like interface would be enough to give tremendous results. So that’s an achievable target.
  • It’s going to cost money in programming the WYSIWYG.
  • It’s going to cost money in rationalising existing wikitext so that the most unfeasible formations can be shunted off to legacy for chewing on.
  • It’s going to cost money in usability testing. Engineers and developers are perpetually shocked at what ordinary people make of their creations.
  • It’s going to cost money for all sorts of things I haven’t even thought of yet.
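To make the “shunt it off to legacy” idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names `table_depth` and `triage`, the labels `"wysiwyg"` and `"legacy"`, and the rule that nested tables or parser functions are what push a page out of reach of a simple editor. It is not how MediaWiki or any existing editor decides anything, and real wikitext has far more corner cases than this handles.

```python
def table_depth(wikitext: str) -> int:
    """Return the deepest nesting level of {| ... |} tables in the text.

    Assumes table markers appear at the start of a line, which is the
    usual wikitext convention; anything cleverer is ignored.
    """
    depth = 0
    deepest = 0
    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith("{|"):
            depth += 1
            deepest = max(deepest, depth)
        elif stripped.startswith("|}"):
            depth = max(depth - 1, 0)
    return deepest


def triage(wikitext: str) -> str:
    """Label a page 'wysiwyg' or 'legacy' (hypothetical labels)."""
    # Nested tables are the failure case the government poster mentions.
    if table_depth(wikitext) > 1:
        return "legacy"
    # Parser functions such as {{#if: ...}} are assumed too hairy as well.
    if "{{#" in wikitext:
        return "legacy"
    return "wysiwyg"


if __name__ == "__main__":
    simple_page = "'''Bold''' text with a [[link]].\n\n{|\n| cell\n|}"
    nested_page = "{|\n|\n{|\n| inner\n|}\n|}"
    print(triage(simple_page))
    print(triage(nested_page))
```

The point of a check like this would be to let the simple editor handle the bulk of pages while the awkward remainder stays editable as raw wikitext, which matches how the government users fell back to WikiCode when the WYSIWYG failed.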

This is a problem that would pay off hugely to solve, and that will take actual money thrown at it.

How would you attack this problem, given actual resources for grunt work? What else could do with money spent on it?

Magnus Manske, in his usual manner, has coded up a quick editor whose name I’ve stolen for this post: WYSIFTW. It’s rough, but it’s a nice working example that gets some of the way there.