What does winning look like?

Freeculture.org asks us to share our vision of the future: what free culture looks like in five years.

Imagine your life after five successful years working on your free culture projects. How is your day-to-day existence different? What does a city look like? How have the lives of your parents and friends changed? What does it feel like to live in a more free culture? Does it smell different? Sound different?

They have a wiki page for the collected results. Let’s assume Moore’s Law, by the way: you’re typing your response on a 32-core Opteron with 16 gigabytes of memory on your lap. And it’s not even warm.

And, for that matter, what do the Wikimedia projects look like in 2012? When did we leave Google in the dust? Do governments cower at our name and public broadcasters release everything under CC by-sa? How did we get there? Show your working.

Edit: Dammit, the deadline was July 12! Bah. WRITE ONE ANYWAY.

The expertise problem.

English Wikipedia is allegedly anti-expert. This fails to explain why you can hardly move on the wiki without bumping into someone with multiple degrees, or how it got tagged “unemployed Ph.D. deathmatch.”

I submit that English Wikipedia does not have a bias against experts (although there are editors who clearly do), but that massive collaboration is hard. The main problem is how to work with idiots you can’t get rid of, who consider you an idiot they can’t get rid of. “Assume good faith” is not a platitude, it’s a warning that someone really can be that clueless and that sincere idiocy is ten times as hard to deal with as knowing trolling; it’s a nicer way of phrasing “don’t assume malice where stupidity will suffice.” Summary of the summary: people remain the problem.

Academia has evolved mechanisms to deal with antisocial idiots (throw them out) and antisocial experts (put them to work in a locked room and keep them away from humans); wikis are still working on the problem. Antisocial experts on a wiki — unquestionably expert, unquestionably unable to collaborate on a wiki — are really special. Thankfully they’re usually too weird to then go blogging about it …

How do other wikis cope with this? Other Wikipedias? Citizendium doesn’t seem to have had this yet that I know of, but that could just be early days. Ideas?

Edit: You’re allowed to comment, you know. The same post on my LiveJournal is going great guns!

Wikimedia UK update: we’re affiliated at last!

Wikimedia UK is go! Alison Wheeler has posted an update to wikimediauk-l. Precis: we had to have a quickie AGM so as to have had one within eighteen months of forming the company, but we’ll have a proper General Meeting in September and there will be a way for people to become members of the chapter, and we can actually start doing things as “Wikimedia UK.”

Said you were smart, said it would just take a day of your time.

No-one will ever start a serious general encyclopedia again on the “one smart person writes the whole thing” (Aristotle, Pliny the Elder) or “a bunch of smart people write the whole thing” (Britannica, Brockhaus) models: they’ll use wikis and massive collaboration.

In fact, no-one will ever start a serious specialist encyclopedia on the one-smart-person or bunch-of-smart-people models again, because wikis already do the job much better, much faster.

For general encyclopedias the earlier models are already economically unviable; for specialist encyclopedias they’re not only unviable but just can’t produce as useful results nearly as quickly.

(I haven’t posted this month because Freda has been keeping me busy. No, she doesn’t have a Wikipedia login yet.)

Quick notes.

A Manual of Style for humans.

Our Manual of Style is lengthy, comprehensive and really sucks to try to read or use. Compare to a really readable reference, like Fowler or Strunk & White. Or even Chicago. Have you ever picked up those books and thought “this is really good, I can use this stuff”? I’d hope you had. If you have aspirations to writing better, those books get your brain sizzling.

But, rather than being a guideline for thoughtful application by editors seeking guidance in writing effective encyclopedia entries, our manual of style has become a sequence of programming instructions for bots. So no-one ever looks at it unless they’re looking for (or adding) a stick to hit other editors with.

Our MOS should be something that editors will want to read.

Here’s my attempt to make the intro readable.

Anyone want to help recast the rest of the megabytes of MOS as thoughtful guidance in English, rather than programming instructions for bots and weapons to be wielded by the antisocial?

Edited to add: From my user page, my personal style guide: We’re writing articles for someone who knows nothing about a topic but needs to get up to speed really quickly. You have ten seconds.

I sometimes picture my reader as a very bright ten- to twelve-year-old. Someone with a good reading age, but who knows nothing yet. Did you used to devour encyclopaedias as a kid?

{{spoiler}} Jesus dies … False ending! He comes back! {{endspoiler}}

  • Science proves that trolls really are a bunch of dicks.

This proves Phil Sandifer’s deep evil. Superlative call, sir.

I’m slightly surprised, if pleased, at the pent-up hatred for the {{spoiler}} tag’s overapplication. It actually survived a deletion nomination last year, but the arguments for its grossly unencyclopedic nature and direct incitement to violate and defend violations of neutrality this time are much more convincing. Particularly the examples of the sort of misuse its presence fosters — did you know this thing had been placed on Anagram and Kiss? I thought this was the most unthinkingly process-over-product edit (complete with txt spk) I’d seen on the wiki yesterday, then I saw this.

I expect the tag will not be killed utterly, but I do expect its application will be severely curtailed. Someone’s already helpfully noted that if there’s a “Plot”, “Summary”, “Synopsis” or similar header, then, duh, there are going to be plot elements therein. Personally, I’d favour the German Wikipedia’s spoiler warning policy, which Babelfish and I loosely translate as:

When discussing creative works, e.g. books, music, computer games, TV series or films, an encyclopedia’s task is to give a summary of the work and its place in the overall field. Thus, it is natural that the action of a book or a film will be described and discussed in full.

Many books or films lose their attraction, however, if too many details or the ending are revealed before they are read or seen. So it became common on the Internet to put a spoiler warning before such descriptions.

In encyclopedias, however, this is rare. In the German language Wikipedia, after long discussions, consensus developed not to include spoiler warnings, and to remove existing ones. The section which contains a description of the plot should, however, always be clearly denoted, for example by the heading ==Plot summary==.

Why deal with bad policies by nominating them for deletion? Because processes are generally held responsible for their widespread misuse. If the idea is good but the process is bad, the idea doesn’t justify saving the process. (Of course, I expect IAR will quite properly continue to ignore this.) I am enormously pleased that in this case, it was done by direct attention to core policies and detailed demonstration of how it violates those.

As Doc glasgow notes: “I mean that Prince Charming marries the girl is a plot twist you’d never expect ;)”

I HAS A {{SPOILER}}

By Kat Walsh. Based on Moldy nectarines by Roger McLassus. GFDL.

Notability for deletion.

Notability is a contentious notion on Wikipedia. It originally entered Wikipedia jargon on Votes For Deletion (as was) as a euphemism for “I don’t like it.” (I was there and watched this happen. I was one of those saying “rubbish, there’s no such rule.” So of course someone wrote a rule.) It’s an obvious notion — of course we don’t want non-notable things on Wikipedia — but its application is grossly problematic, because it’s so subjective in practice and becomes a hideous source of systemic bias. So inside the wiki people argue endlessly, and outside the wiki it becomes a source of horrible public relations because it’s so obviously subjective and applied subjectively. And it trashes our usefulness for the Long Tail, thus damaging our breadth, one of our greatest strengths.

(I don’t want to seem to be minimising the Firehose Of Crap problem. There are 6,000 deletions every day at present. “Notability” is also a euphemism for a quite justifiable “WHAT THE HELL IS THIS CRAP WHAT ON EARTH ARE YOU THINKING.” Anyone who thinks they’re an inclusionist needs to read all of Special:Newpages. Once should be enough.)

Now, then. The policy on biographies of living people was written in a real hurry after the Seigenthaler fuckup: Jimbo declared “this damn well needs fixing” and it had to be swung. So I wrote the second draft based strictly on neutrality, verifiability and no original research, so as to avoid the peril of sympathetic point of view becoming mandatory. And it stuck. Because these are the three fundamental content policies of the wiki that aren’t up for a vote — if you disagree with them, you’re on the wrong project — it was easy to support an important guideline from the fundamentals.

Your assignment: Construct a useful notion of “notability” using only neutrality, verifiability and no original research. Look to the living biographies policy for how it was done previously. Note in particular: you may not use What Wikipedia is not (especially that “indiscriminate collection of information” one, which is most often explained in terms of phone books but applied in practice as a euphemism for “fancruft”). You may only use the three fundamental rules on content.

Tubgirl is Love.

An English Wikipedia admin account just got compromised and abused again, because the admin used “fuckyou” as a password. That’s the sixth most common password, I think. The main page was deleted for five minutes and Tubgirl was put in the sitenotice.

Brion and Greg are (right now) running a password cracker over the admin accounts. If you want to keep your admin bit and know, deep in your heart, that your password is a bit rubbish, I strongly suggest changing it or it will be locked. Hint: if it shows up in Google, it’s a rubbish password. Or enter it into the search box at the right of this page with your username — I have a, uh, phishing detector running there. Yes, that’s it. A note on the subject has been added to Wikipedia:Administrators.

Now we eagerly await Single Crack 0wnz0ring. Normal people just don’t get passwords. I used to do dial-up Internet tech support. “What do you want for a password?” “Oh, [username].” “I’m sorry, you can’t have it be the same.” “Oh, [username]1.” Suggestions? Assume we can’t require an RSA keyfob for all editors.
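For the curious, here is roughly the sort of check that would have caught the passwords above. It is a minimal sketch, not whatever Brion and Greg are actually running: the `is_weak` function, the tiny `COMMON_PASSWORDS` sample and the eight-character minimum are all my own assumptions for illustration.

```python
# Hypothetical sketch: flag the passwords a cracker would try first.
# COMMON_PASSWORDS is a tiny illustrative sample, not a real wordlist.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein", "fuckyou"}


def is_weak(username: str, password: str, min_length: int = 8) -> bool:
    """Return True if the password is short, common, or just the username."""
    user = username.lower()
    pw = password.lower()
    # Too short to survive a brute-force run (threshold is an assumption).
    if len(pw) < min_length:
        return True
    # Shows up on a list of well-known passwords.
    if pw in COMMON_PASSWORDS:
        return True
    # Strip trailing digits so "[username]1" is caught along with "[username]".
    if pw.rstrip("0123456789") == user:
        return True
    return False
```

A real cracker works from a wordlist of many thousands of entries plus mangling rules, so passing this check proves very little. Failing it proves quite a lot.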

TEH ILLEEEEGIL NUMBAH WILL EAT J00R AAAASSSSSS.

A flashmob of fight-the-power morons are still spamming an allegedly illegal number into every input box on the web. The Wikipedia admins collectively declared “FUCK OFF YOU SPAMMERS.” (Some have gone rabid “ZOMG LAWSUIT” and we were getting a pile of oversight requests as well — I didn’t zap, Fred did, until Erik told us not to. Mind you, it nicely short-circuited the idiotic deletion review.) Eventually it was put into the spam filter, because distributed spam is spam.
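A spam filter rule for this kind of thing has to cope with people varying the spacing and case. The sketch below is one way to do it and is only an illustration: I am not claiming this is the rule that actually went into the Wikipedia filter, and the names `KEY_PATTERN` and `contains_key` and the three-character separator limit are my own assumptions.

```python
import re

# The 16 bytes as quoted in this post, with the spaces removed.
KEY_HEX = "09F911029D74E35BD84156C5635688C0"

# Between any two hex digits, allow up to three non-hex characters
# (spaces, colons, dashes and so on). The limit of three is an assumption.
SEPARATOR = "[^0-9A-Fa-f]{0,3}"
KEY_PATTERN = re.compile(SEPARATOR.join(KEY_HEX), re.IGNORECASE)


def contains_key(text: str) -> bool:
    """Return True if the text contains the key in any common formatting."""
    return KEY_PATTERN.search(text) is not None
```

This catches the obvious variants (spaced, dashed, run together, lower case). It will not catch anyone who writes the number out in words or as an image, which is why a filter like this only slows a flashmob down.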

We’re a project to write an encyclopedia, not a public graffiti wall. You want to paint “09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0” in fifty-foot high letters on every Hollywood studio, I’ll buy brushes. You want to splatter it across Wikipedia, you can fuck off. I expect the article will contain the number in due course; I’d guess two to four weeks, any earlier would in my opinion only encourage further use of Wikipedia as a graffiti wall.

Immediatism is the greatest curse of our popularity and blatantly interferes with the far from finished encyclopedia project. Wikimedia has a newspaper. No candy for you. You come back, one month.

Update: Wikipedia:Keyspam.

Revealed! Why the community is on crack.

The problem with Internet-based projects is that they form groups of humans, and a group is its own worst enemy. That’s a marvellous essay by Clay Shirky, who’s on the Wikimedia advisory board for good reason. When I read it I was just nodding my head and going “yep” over and over. A community (Internet or not) has a life cycle. It starts, it’s good for a while, it chokes itself or falls away. I’ve seen this time and time again.

On Wikipedia, the community is not an end in itself but has grown around a purpose. The English Wikipedia’s interesting community problems are an emergent phenomenon, not Wikipedia or Jimmy Wales doing something wrong.

(Not to mention the flood of people for whom this is their first online community, who haven’t experienced the cycle even once. We have enough trouble enculturating Usenet refugees and their robust interaction style.)

Larry Sanger is trying to work around this on Citizendium, as advised by Shirky’s main source, Wilfred Bion’s Experiences In Groups: group structure is necessary. Robert’s Rules of Order, parliamentary procedure and so forth. The obvious risk is killing the best in favour of steadiness.

Shirky notes: “Constitutions are a necessary component of large, long-lived, heterogenous groups.” I’ve long spoken of Wikipedia’s fundamental policies — neutrality, verifiability, no original research; assume good faith, no personal attacks, don’t bite the newbies — as a constitution, and said that any process that violates them must be thrown out. The catch being there’s not yet a way to enforce that.

One thing Shirky strongly points out: “The third thing you need to accept: The core group has rights that trump individual rights in some situations. This pulls against the libertarian view that’s quite common on the network, and it absolutely pulls against the one person/one vote notion. But you can see examples of how bad an idea voting is when citizenship is the same as ability to log in.” You would probably believe the outrage when I applied the phrase “one moron one vote” to Requests for Adminship, the prime example on English Wikipedia at present of a group that’s being its own worst enemy. Worse than Articles for Deletion. (The reason people form into insular groups that defend one moron one vote is that the groups then attain local “core” status and feel they can get some work done. This is why new committees keep popping up.) The trouble is then squaring this with not being exclusionary toward the newbies.

(And you’ll see Shirky’s 2003 essay speaking of Wikipedia as a project that’s dodged that one. Whoops.)

The Tyranny Of Structurelessness by Jo Freeman is one of my favourite essays on emergent hierarchies: if you pretend there’s no hierarchy, one will emerge out of your sight and bite you in the backside. (I’m unconvinced its solutions, particularly electing everyone, are directly applicable here — just about every process on English Wikipedia even resembling a vote rapidly turns into an insular committee or a lynch mob.)

Some consider cabalism on English Wikipedia the source of all problems. Unfortunately, with 4,330 frequent editors and 43,000 occasional editors each month, no-one is going to know everyone. So people will cluster with those they do know just to get anything done.

The people who do work on a project will usually ignore idiocy until it gets in their face. In the Linux world, the kernel.org lists resolutely ignore the baying fanboy cat piss men, and Linus Torvalds remains project leader by acclaim. The LambdaMOO solution in Shirky’s paper may be the best option: the wizards return and lay the smackdown. Let’s start with shooting all rules that violate the above six constitutional basics. So who are the wizards?

How to keep the community focused on the point of the exercise? What level of control does one apply to keep the project on track without killing off the liveliness? How would you apply Shirky’s findings?

Is there a sociologist in the house?

(Other useful responses on the social networking site. From people I mostly know from Usenet.)

SEO spammers and Googlemancers.

Dear SEO spammers and Googlemancers: go away. We actively don’t care about your page rank.

(That TechCrunch article is really special: make several errors of fact, assume they come from malice and start a conspiracy theory.)

Our responsibility as a top 10 site is to our readers. Our responsibility is not to a third party (search engine optimisers) to make them look good to a fourth party (Google). People whose interest in Wikipedia is page rank are in no way, shape or form our constituency. Because their interest is, fundamentally, spamming.

Pagerank is not a consideration for Wikipedia — it contributes nothing to the project of writing an encyclopedia. This is why SEOs and Googlemancers find it so hard to find anyone at Wikipedia or Wikimedia who cares.

The interwiki map is for the convenience of the projects. Not for the SEO spammers.

This post is fair use under the “I wanna” clause of US copyright law.

I’m a staunch defender of fair use on the English Wikipedia: talking about things requires being able to quote them, and that applies as much to images as to text.

To this end, I’ve been removing a lot of the ridiculous abuses. Orphaning and later deleting a lot of fair abuse — one screenshot is fair use, ten is taking the piss and “fair use” galleries violate copyright, not just policy — not to mention resizing. No, you don’t need a 1500×1000 PNG for a 200×300 thumbnail. I need a bot to resize high-resolution fair abuse.
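The arithmetic such a resizing bot would need is small. Here is a minimal sketch of the size calculation only, assuming a bounding box chosen by whoever runs the bot; the `fit_within` function and the 300-pixel box in the example are hypothetical, and the actual image handling and upload are left out.

```python
def fit_within(width: int, height: int, max_width: int, max_height: int) -> tuple[int, int]:
    """Scale (width, height) down to fit inside the box, keeping the aspect ratio."""
    # Use the tighter of the two ratios, and never scale up past the original.
    scale = min(max_width / width, max_height / height, 1.0)
    # Round to whole pixels, with a floor of one pixel per side.
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 1500x1000 upload squeezed into a 300x300 box keeps its 3:2 shape.
new_size = fit_within(1500, 1000, 300, 300)
```

Images already smaller than the box come back unchanged, so the bot could run over everything in the category without making small images worse.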

Today’s grand missing the point was {{User no GFDL}}, whose text was: “This user would prefer not to use free images if there are better fair use ones available.” And never mind little details like the Wikimedia Foundation licensing policy and mission statement. Here’s the deletion discussion, before I came to my senses and zapped the horrible thing, the comment on my talk from its aggrieved creator and the ensuing deletion review.

Perhaps I should be sweeter and fluffier to people, but I find myself unable to rightly apprehend the confusion of ideas involved. How to get someone from there to here in less than geological time?

A modest proposal.

Just posted to foundation-l:

How about using the old domain, wikipedia.com, as a site for stable Wikipedia versions, with ads on? The ad money, as well as paying our comparatively small hosting and staff costs, could go toward educational programmes for those people who could benefit from our hard work but aren’t comfortable, well-fed first-world citizens.

(As far as I can tell, pretty much all opposition to ads on Wikimedia comes from people who are in fact comfortable, well-fed first-world citizens who have no problem accessing this material at all. Including opposition on the new thread. I have asked for demographics otherwise and eagerly await any.)

The thread is ticking along nicely, with ideas on how to, why not to, alternatives and of course a ton of ideas on what we could actually do with BUCKETS OF CASH.

Update: I have since changed my mind.

Cleaning up your crap.

OTRS future burnouts habitués frequently declare that the sky is falling, particularly with regard to biographies of living people — they see nothing but the complaints. (The actual problem is likely not nearly as bad, though it still needs urgent attention.) To help, Messedrocker has compiled a list of ill-referenced living bios — you are heartily invited to dive in, reference or gut and cross another name off the list. (The list is in order of article creation — start at the end.) Then you can make a smug post with the title “Sourced, bitch.” and the content being just a list of diffs. Or just wreak havoc on a string of deserving deletables.

I’ve been doing lots of admin stuff this weekend. As a staunch defender of the value of fair use — to discuss something, quoting images is as necessary as quoting text — I’ve been having lots of fun lately going the hack on abuse of the excuse in contravention of policy and indeed copyright. The kids want their candy, and it’s my job and pleasure to take it away from them. And, don’t forget: you can replace any fair-use picture of a living person on English Wikipedia with Image:Replace this image1.svg and it’ll turn into a direct invitation to upload a genuine free content image they actually own themselves.

(I’ve also just unpacked twenty years’ photos and have been scanning and uploading my own replacement free images. If I can, you can.)

Let’s you and him fight.

Wikipedia a force for good? Nonsense, says a co-founder: “The founder of the Wikipedia online encyclopaedia criticised the Education Secretary yesterday for suggesting that the website could be a good educational tool for children.”

(Larry Sanger says on his blog that this was the media going “let’s you and him fight” with an out of context quote. He meant our governance is broken … which a fair few Wikipedians agree on.)

I got calls from the BBC and the Press Association. I didn’t play up to the “let’s you and him fight,” but did note that:

  • Citizendium is more free content and therefore a good thing (per the WMF’s mission, no less) as it helps validate the model and open content in general.
  • They’ve got a good community and seem to have started well.
  • There’s certainly got to be more than one way to do this.
  • Wikipedia is not “reliable”, and the best way to use Wikipedia in schools is for the teacher to teach the kids critical reading. Wikipedia is good if you think. Same for Citizendium, Britannica, autobiographies, blogs and newspapers.

The BBC wanted a telly piece, so I went to the Borders in Oxford Circus, and Borders kindly let the BBC film there. The interviewer, Rory Cellan-Jones, asked me the same question about reliability three or four times until I got it down to a nice soundbite.

They filmed a few walking-around bits in the reference section. Oddly enough, Borders don’t sell printed encyclopedias any more. We decided the Oxford dictionaries would be suitable (I mentioned how the OED used a model like ours starting 150 years ago — volunteer contributions).

This should be on BBC1 six o’clock news this evening. Probably a seven- to ten-second clip of me. That took an hour to make. Maybe I might actually not end up cut this time!

Edit: And a call just now from Andrea from Computeractive. I’ve got it down to two minutes now, each sentence repeated twice.

Edit 2: 15 seconds of fame! About 6:22pm BST. RealVideo stream. My head is way too shiny.

Disaster recovery planning.

The Wikimedia Foundation is in no danger of collapse. There’s all sorts of deeply problematic things about it, but no more than at any other small charity. Situation normal all fouled up.

But it would be prudent to be quite sure that the Foundation failing — through external attack or internal meltdown — would not be a disaster.

The projects’ content: The dumps are good for small wikis, but not for English Wikipedia — they notoriously take ages and frequently don’t work. There are no good dumps of English Wikipedia available from Wikimedia. (I asked Brion about this and he says the backup situation should improve pretty soon, and Jeff Merkey has been putting backups up for BitTorrent.)

The English Wikipedia full text history is about ten gigabytes. The image dumps (which ahahaha you can’t get at all from Wikimedia) are huge, as in hundreds of gigabytes. It’ll be a few years before hard disks are big enough for interested geeks to download this stuff for the sake of it. What can be done to encourage widespread BitTorrenting right now?

The easiest way for a hosting organisation to proprietise a wiki, despite the license, is simply not to make dumps available or usable. And to block spidering the database fast enough to substitute. This is happening inadvertently now; it would be too easy to do deliberately.

Who are you? The user-password database is private to the Foundation, for obvious good reason. But I really hope the devs trusted with access to it are keeping backups in case of Foundation failure.

In the longer term, going to something like OpenID may be a less bad idea for identifying editors.

Hosting it somewhere that can handle it: MediaWiki is a resource hog. Citizendium got lots of media interest and their servers were crippled by the load, with the admin having to scramble to reconfigure things. Conservapedia was off the air for days at a time just from blogosphere interest. Who could put up a copy of English Wikipedia quickly and not be crippled by it?

Suitable country for hosting: What is a good legal regime for the hosting to be under? The UK is horrible. The US seems workable. The Netherlands is fantastic if you can afford the hosting fees. Others? (I fear languages going to the countries they’re spoken in would be a disaster for NPOV.)

Multiple forks: No-one will let a single organisation be the only Wikipedia host again. So we’ll end up with multiple forks for the content. In the short term we’ll have gaffer-and-string kludges for content merging … and lots of POV forking. A Foundation collapse would effectively “publish” Wikipedia as of the collapse date (or as of the previous good dump) as the final result of all this work.

(The English Wikipedia community could certainly do with a reboot. Hopefully that would be a benefit. It could, of course, get worse.)

In the longer term, for content integrity, we’ll need a good distributed database backend. (There’s apparently-moribund academic work to this end, and Wikileaks note they’ll need something similar.)

Worst case scenario: A 501(c)(3) can only be eaten by another 501(c)(3), but the assets of a dead one (domains, trademarks, logos, servers) can be bought by anyone. Causing the Foundation to implode could be a very profitable endeavour for a commercial interest, particularly if they smelt blood in the water.

Second worst case scenario: The Wikimedia Foundation’s assets (particularly the trademarks and logos) go to another 501(c)(3): Google.org. Wikipedia’s hosting problems are solved forever and Google further becomes the Internet. Google gets slack about providing database dumps …

What we need:

  • Good database dumps more frequently. This is really important right now. If the Foundation fails tomorrow, we lose the content.
    • People to want to and be able to BitTorrent these routinely.
  • Backups of the user database.
    • A user identification mechanism that isn’t a single point of failure.
  • Multiple sites not just willing but ready to host it.
  • Content merging mechanisms between the multiple redundant installations.
    • A good distributed database backend.
  • The trademarks to become generic should the Foundation fail.

I’d like your ideas and participation here. What do we do if the Foundation breaks tomorrow?

(See also the same question on my LJ.)

Correction: Google.org is not a 501(c)(3). So it couldn’t gobble up Wikimedia directly.

Hanging on the telephone.

Wikimedia UK.

I spent Thursday evening and the weekend on Wikimedia UK stuff.

  • Alison Wheeler, the chair, tendered her resignation on Thursday, in the hope of unsticking things. We had an emergency general meeting that evening where we declined the resignation and passed a vote of confidence in her.
  • The old treasurer, Jon Garrett, has been removed for inactivity (we don’t even have a bank account right now) and uncontactability.
  • Arkady Rose has been drafted to the board and tagged as treasurer for the crime of flagrant cluefulness in public places. (Note also that Alison and James Forrester did the drafting, not me … though them drafting my girlfriend enhances the cabalism nicely.)
  • James got the address and paperwork updates submitted to Companies House.

Now to start doing stuff with it again … particularly pushing the charity registration through. And starting the bank account process again. TRA LA LA LA LA! Minutes should be posted real soon, won’t they, James.

Open Wiki Blog Planet grows apace, and the official Planet Wikimedia is open for feeds.