Archive for the ‘wiki’ Category

Firefogg video transcoder plugin for Firefox.

Wednesday, January 28th, 2009

Wikimedia only accepts video in Ogg Theora format, because it’s not patent encumbered (and Dirac is not yet ready for prime time). Nothing produces this by default. Fortunately, Firefogg will do the job for you on Firefox 3.1 (which is that cool, by the way). Brianna Laugher’s posted (1, 2) a couple of useful guides to the fancy media stuff we’re doing.

The Mozilla Foundation has also given the WMF $100,000 to improve Ogg Theora. The goal is to get it as good as H.264. This is actually considered feasible.

(And MPEG LA plans to charge for H.264 encoding after December 31st, 2010. Some people are going to get a horrible lesson in why encumbered formats are a problem.)

Take that, Apple, Nokia! Lying arseholes.

Flagged revisions media hype.

Tuesday, January 27th, 2009

The media thinks the flagged revisions proposal for English Wikipedia is just the thing for the hype machines. I did Chris Evans BBC2 drivetime today. Hopefully not too oversimplified. (Cheers to Chris Down for transcript.) It behooves those of us all for it to make damn sure the edits are patrolled really fast.

Update: Mathias Schindler on BBC World Service Europe Today.

On dealing with the press.

Wednesday, December 31st, 2008

When Wikipedia was little (when I started in early 2004, we were #500 in the world. I was seriously impressed), and I was just someone who volunteered to answer a UK press enquiry then another one, we were in the technical press a lot.

The trouble with the technical press is that they are whores. Cheap diseased ones. (The press in general arguably is, but the tech press are so blatant.) Previously whores to print advertisers, now whores to ad-banner trolling. So unsubstantiable bullshit is the order of the day, because IT GETS THE CLICKS.

Some of you aren’t whores, but you know damn well you’re few and far between. The rest can fuck off, thanks.

Wikipedia should have ignored the tech press from the start. You should too. Taking someone seriously just because they pay you attention is not a good idea.

It’s so much nicer dealing with the mainstream press — at least they can spell “journalism.” They can’t work computers, but anything you can’t explain in a difficult-to-corrupt soundbite you can’t explain.

WHAT A PIP.

Wednesday, December 10th, 2008

Channel 4 was fun. I recorded a sequence of soundbites for them to pick’n’choose from. News story with video — their Flash player is crappy, crappy, crappy shit and doesn’t buffer. I look dead. “UNBLOCK WIKIPEDIA OR TEH INTARWEB ZOMBIES WILL EAT YOUR BRAINS.” The IWF head came across as a curtain-twitching weasel.

A small amount of gleeful dancing on the skulls of the IWF today. Next trick is to get a better UK press corps together for Wikimedia. Mostly it’s reserve duty. I’m really annoyed we didn’t have someone spare for Sky News (I was doing Channel 4 at the same time) as the reporter was actually technically clueful. Any Wikipedia editors here who think they could talk in soundbites on telly if needed?

By the way, the really massive IWF fail, which didn’t come out in the press coverage: they blocked the page about the album, and they blocked the page for the image, but they didn’t block the image itself.

The next question is what happens next. The filtering infrastructure melted when faced with filtering a top 10 site; they’re not going to give up and go away, so I expect them to (a) beef it up (b) make it less evident — we spotted this by the collateral damage.

I also predict a flood of helpful citizenry going to the IWF reporting page and entering any image on any top 10 website that might be “potentially illegal.” Much as the head of the IWF is “potentially a fabulous drag queen.”

I wasn’t aware before going on it just how important Radio 4 Today is. In fact, I’d barely heard of it. Do I get deported now?

Today show transcript

Monday, December 8th, 2008

Right-click, download mp3 here. MP3 and transcript below (by TRS-80, cheers!) are copyright BBC; if they object I’m sure they’ll let me know.

I’m on Channel 4 News and More4 News tonight, possibly another thing too. HOLY CRAP.

Broadcast on 2008-12-08, on BBC Radio 4’s The Today Programme between 08:54:11–08:59:56 (UTC)

James Naughtie
The time is six minutes to nine. Another curious story about censorship is spreading across the internet, it’s all about a page on the online encyclopaedia, Wikipedia, about a heavy metal band of 25 or 30 years ago, The Scorpions. The image at the centre of the story is a record cover from an album of the early eighties featuring a picture of a naked child, which the Internet Watch Foundation says could be illegal. Now the foundation is a watchdog funded by the internet services industry, but Wikipedia says that it’s unacceptable censorship. Susan Robertson speaks for the Internet Watch Foundation, and David Gerard is here, he’s a volunteer media spokesman for Wikipedia in this country. Susan Robertson, now what is it that’s led you to think that this, which after all appeared in a publicly available album cover 35, 25 years ago, is now illegal?
Susan Robertson
Good morning. We received this report last week at the Internet Watch Foundation and then assessed it according to our normal channels, which is it’s reviewed by our team of analysts, in conjunction with UK law enforcement.
Naughtie
So somebody simply said “Have a look at this, because it looks to me as if it’s over the top” ?
Robertson
Exactly, the Internet Watch Foundation is the UK hotline providing just that service; if the public are worried they’ve stumbled across content which might be illegal, they can report it to us. Our job then is to assess that content then and trace it and indeed our assessment last week was that the image in question was indeed a potentially illegal child-sexual abuse image.
Naughtie
So, which law would it contravene?
Robertson
It’s the Protection of Children Act 1978.
Naughtie
Right. So, in fact would it have been illegal, do you think, if someone had complained at the time? …I can’t remember when that album came out specifically…
Robertson
Yeah, I understand the album came out before that date, of course it’s an important issue—we’re applying today’s standards and today’s legislation to the reports we’re receiving today. Obviously, y’know, this is an old image.
Naughtie
David Gerard, speaking for Wikipedia, what do you make of that?
David Gerard
The album was issued in 1976, it’s been available continuously for 32 years. The album cover was changed because some various people told the band “this is a stupid and crass image”, which it is, and I’m not questioning it’s a tasteless image, but that’s quite different from illegal. You can still buy the record with this image in The Scorpions box set, in any high street. There are many other record covers available; Blind Faith by Eric Clapton, Houses of the Holy by Led Zeppelin, Nevermind by Nirvana which feature naked underage people. None of these albums are illegal, you can go into a high-street record shop and buy them. You can see this image—the image of the album Virgin Killer by The Scorpions—on the Amazon website right now. When we asked the Internet Watch Foundation why they blocked Wikipedia and not Amazon, apparently their decision was quote “pragmatic” unquote, which-
Naughtie
right
Gerard
-we think means that Amazon have money and would sue them, whereas we’re an educational charity and-
Naughtie
Well Susan Robertson, can I just put that point to you; if it’s proper to block Wikipedia, it’s proper to block Amazon?
Robertson
Absolutely; we only act on the reports we receive, and as I understand it, the only report we received regarding this content as of Friday was the content on Wikipedia-
Naughtie
So you’ll go for Amazon will you?
Robertson
We need to take a view today, obviously we need to look at the reports that have come in over the weekend, I know there’s been a lot of activity as you’ve said on the internet. We need to take a view with our analysts here and with our police partners.
Naughtie
Yes, indeed, but you confirm it isn’t a question of how much money somebody’s got, if it’s a principle, it’s a principle and it applies to Amazon as well as to Wikipedia.
Robertson
Absolutely. We process about 35,000 reports every year, only about a third of those are confirmed to be potentially illegal, as such they’re all treated the same.
Naughtie
What can you do, many people are concerned about you know, the consequences of the freedom which they value on the internet, and a lot of people think that the Internet Watch Foundation is a sort of guardian for them, but what can you actually do?
Robertson
What we do do is do our very best to ensure that the only content that is inaccessible is the specific content, including illegal images, so how we block… I mean, our main function is a hotline—we’re also a take-down body for illegal content when it’s hosted in the UK. But if it’s hosted abroad-
Naughtie
There’s nothing you can do.
Robertson
There is, and our industry members have asked us to provide them with a list of specific URLs, which we do. All the URLs, which as an individual webpage are live, and they’re depicting child-sexual abuse images.
Naughtie
David Gerard, everyone will know, most people will know if they use the internet, about Wikipedia, and how it works and what source of information—occasionally disinformation—it is. Do you object to the idea that there is someone out there, funded by the industry, who can take down something which is regarded as so offensive, or potentially illegal that it goes beyond the boundaries.
Gerard
Nobody objects to the IWF blocking actually illegal content; that’s what it’s for. What they object to in this case is they blocked an image that is not illegal, that has not been found illegal anywhere in the world, that has been …it was investigated in America by the FBI in May after a complaint by a fundamentalist Christian group, who told them to go away. The IWF also censored the text—what the issue in this case is they censored encyclopaedia text on the number four website in the world. This is the biggest website the IWF has ever blocked, and we think it was an experiment to see what they could get away with, without people noticing.
Naughtie
From IWF point of view, last word Susan Robertson, was it an attempt to see what you could get away with?
Robertson
It was absolutely not an experiment, we don’t experiment. Look, we do our job in good faith, we apply the Protection of Children Act, and the UK sentencing guidelines-
Gerard
-Blocking text?
Robertson
We’ve only blocked the URL that contains the page-, the image.
Naughtie
Susan Robertson, David Gerard, thank you both.

UK censorship of Virgin Killer sleeve and page on Wikipedia

Sunday, December 7th, 2008

Facebook group against this; Pledgebank ISP boycott; Wikinews story

I’m going to be on the BBC Radio 4 Today show tomorrow at 8:20am about this. IWF people present.

The technical press are swarming. The story’s being touted to the national press.

The IWF apparently sought the advice of police before blocking. Now, the police in the UK are notorious for trying it on with censorship cases, so that doesn’t mean the image is illegal.

The album was released in 1976; child porn was illegalised in the UK in 1978. If the album was distributed in the UK since 1978 with that cover, it’s probably legal.

The album cover has been reprinted in many books. Most of those books are in the British Library. Are those now obscene?

Question for all: Has this precise image ever come to court? In the UK, in the world?

The IWF had it pointed out that they were censoring encyclopedia text, which was clearly not illegal. The IWF responded that they needed to block the page to block the image effectively. This is of course utterly ludicrous bollocks, but apparently that’s the advice the IWF have received.

They were also asked if they’d be censoring Amazon as well. They said they’d have to get back on that one.

It’s the clbuttic error, but this time on a top-10 site for everyone.

Oh, and Blind Faith by Blind Faith, Houses of the Holy by Led Zeppelin and Nevermind by Nirvana, also depicting nude underage persons, are still readily available in any high street CD store in the UK.

It is clearly false that every image of an unclothed person under 18 is automatically child porn and illegal in the UK. However, that’s the rule the IWF works to.

Like DRM, if anyone works out there’s an IWF and how it works, then they’ve already lost. They’re tolerated precisely as long as they target only clearly illegal material. Here, they’re expanding their remit.

Disclaimer: I do press for Wikipedia/Wikimedia in the UK as a volunteer (and I’ve been on my email and phone all last night to about 2am and today since 9am). However, I am not a WMF employee and cannot legally claim to speak for them, only as a volunteer editor.

Selling the dream.

Monday, November 17th, 2008

The marketing sense of the word “evangelism” was popularised by Guy Kawasaki in his time at Apple. Guy wrote a book on the subject, Selling the Dream (Amazon link) about how he did it at Apple, which I highly recommend. Guy also blogs chronically about this sort of thing. I need to find my copy again … it’s in a box somewhere. (My house: “Where’s x?” “It’s IN A BOX!” One day everything will be unpacked … then we’ll probably move.)

Read this post: The Art of Evangelism. Apply all ten steps to Wikipedia and Wikimedia.

(This suggests to me that free software “gateway drugs” really work, e.g. Firefox, OpenOffice, GIMP. Users care about applications; once they’re using all-free applications, swapping the OS out from under is easy and they suddenly discover their battery life has doubled from not running an antivirus. Sysadmins are already used to swapping Windows out from under a stack, putting Linux in its place and vastly improving performance. We use Wine for this on business-critical systems at work. This suggests that Microsoft’s drive to make Windows a first-class platform for open source software will in fact shoot them in the foot. I’m sure they have a game plan that says it won’t, but I still can’t see what it might be myself.)

To unsubscribe from this list, pick up the phone and shout “STOP FOLLOWING ME!” to the dial tone.

Saturday, November 15th, 2008

You know you’ve made it when you get onto the special mailing list that includes the FBI, CIA and White House. (The place, not the band.)

DANGER!
Clicking here may induce
bleeding from the eyes.

Why we do this.

Friday, September 19th, 2008

“Can you imagine what work life would be like if one of the conditions for promotion was you had to give away everything you knew to people who could use it for their growth and development, and you had to reach out and help people to be successful, and you had to demonstrate you were open to being helped by others in your own pursuits.”

FayssalF suggests that the sentence “We agree to make this a place where we extend a hand to each other” be put at the top of every talk page.

News of the News.

Thursday, August 28th, 2008

By the way: for those who’ve missed my UnNews, I’m now writing them for (vanishingly small amounts of) money for today.com, roughly one a day. Read News of the News and join the daily alert email. Here’s one for the Wikipedians.

Update: I have moved my stuff to my own site, newstechnica.com.

Forget the writers.

Wednesday, August 27th, 2008

Knol is Google trying to recreate Squidoo or Helium, not an encyclopedia. Wikipedia is #8 on Alexa, Squidoo is #431, Helium is #4999 and only Google knows how well Knol is actually doing. I mean, I was incredibly impressed when I first joined Wikipedia in early 2004 that it was #500. But nevertheless. At least about.com makes #86.

(In fairness, Google has never pushed Knol as a Wikipedia killer; that’s entirely a media-created synthetic controversy.)

There’s hardly a “Wikipedia replacement” that hasn’t started from trying to make a welcoming environment for authors. Wikipedia, however, is popular because it’s what readers want. Writers are important, but way less so than the readers.

I’ve seen very few Wikipedia replacements or even forks that aim primarily at creating a better resource for the reader, and leave the rest to happen. Citizendium is the only one that springs to mind — CZ is very reader-oriented, and slowly accumulating lots of good stuff. It also expressly tries for good writing, unlike Wikipedia.

If readers wanted ten articles on one topic, they’d just click the first ten Google hits. It’s like metasearch engines that gave you results from ten bad pre-Google search engines in the hope you might find a damn thing, when the real answer was one search engine that didn’t suck. Tell you what, the main value of Cuil is to explain to the kids how bad search engines were before Google got it right. One good resource kills ten mediocre resources.

Leaving the editors to battle it out to collaboratively create the one article on a topic appears to have worked to give readers the simple quick reference site they actually want to use. Inherent unreliability and all. Discuss.

Freebase.com: your second hit is also free.

Tuesday, August 19th, 2008

I’m sitting here with my dear friend Kirrily Robert of Freebase. Her office is being remodelled, so she decided to work from home in London for a week. We hung out with the geeks and drank to excess on Sunday (Kirrily says she drank to “sufficient”), so today we’ve been geeking Freebase and Wikipedia and social content creation and so forth.

Freebase is a collection of structured data, with little or no notability barrier. (Spam is fine if it’s structured data!) The differences from Wikimedia are that (a) it’s all CC-by (b) it’s run by a company, not by a charity. The differences from Google Base are that (a) you can do mashups of every data table with every other data table (b) they don’t want your private data (unless you want your daily calorie counts available forever under CC-by).

I didn’t think it was way cool until she showed me David Huynh’s Freebase Parallax demo video. I most strongly urge you to watch this.

Advancing Freebase is in line with Wikimedia goals, as it’s useful free content (and the dumps work). The really good thing you can do is: if you’re getting someone to release a bunch of data, do your damnedest to get it under CC-by or public domain. That way we can have it and they can have it and everyone can have it.

The other thing we rambled about was the social structure of the thing. At the moment Freebase’s Alexa rank is about 47,000; socially it sounds like Wikipedia in 2002. The key point is that in a public participatory content production project, people are all your problems and this is not susceptible to quick fixes, technical or social. Just so she knows what they’re in for.

London readers: there’s a Freebase meetup at the Yorkshire Grey pub in Holborn from 6:30pm.

And I thought it’d be real ale.

Saturday, August 2nd, 2008

Fuel your editing! Keep a tall, cool glass of Wikipedia juice to hand. Just the thing with stir-fried Wikipedia, congo eel with Wikipedia or just a couple of slices of Wekipedia toast.

Please test Theora in Firefox nightlies.

Thursday, July 31st, 2008

Ogg Theora and Ogg Vorbis support for the HTML5 <video> element has landed in Firefox Minefield nightlies (3.1a2-pre). This is big news because it means a standard way of displaying video in web browsers will be available to all without being stuck with Flash. And Theora is the only accepted format on Wikimedia Commons. Posts: Greg Maxwell, Christopher Blizzard, Chris Double, Gervase Markham.

What we need is people to test this. So please download a copy of Minefield, test it thoroughly on Wikimedia Commons video, beat on it, thrash it, report bugs. There’s plenty. You need to load the video, click “More …” and it’ll give you the option. Wikimedia would very much like to make it a first option rather than a last one, but first it needs to be better (more functional and stable) than loading Cortado with Java.

Apple and Nokia tried some truly disgusting FUD around the topic and successfully got the words “Vorbis” and “Theora” taken out of the HTML5 spec, but Firefox adoption means 20% of Web users in short order. So we can leave them to play catchup per business needs. “You got a Nokia? No wonder you can’t watch that Wikipedia video, Nokias suck.”

WordPress 2.6, yay w00t bah.

Monday, July 21st, 2008

The Wikimedia blog is still on WordPress 2.5. I moved three blogs on my own site — this one, Cyber Chatelaine and Rocknerd — to 2.6 today. The last two were fine because they were all but unmodified from the standard install; the first has lotsa tweaks and extensions, and I had a marvellously annoying time this afternoon fixing it up and I’m still not finished. Grah. I have recommended severe beta testing for the Wikimedia blog before updating.

London Wikimeet 11: Penderel’s Oak, 11am Sunday 13 July.

Saturday, July 12th, 2008

Note the early start time! 11am. (So people going to Wikimania can catch planes.) Signup page. Arkady Rose and I (and the small child) plan to attend, particularly with the intent of seeing how to make Wikimedia UK more useful for something. I really should attend more of these …

To cover the world.

Sunday, June 1st, 2008

FritzpollBot was recently approved to create stub articles on English Wikipedia for most or all of the documented villages and towns in the world. (Example.)

In 2003, Rambot created placename articles for every census location in the United States. We were therefore able to claim complete coverage (per that “encyclo-” prefix) of one country. FritzpollBot aims to complete this coverage for the entire world.

I think this bot-assisted programme of article creation is a Good Thing for topics where we do in fact have the data. It’ll certainly help alleviate our systemic bias. The issues I can see are editorial — the Rambot articles are data in prose form that these days we’d do with a parameterised template, etc. — but Fritzpoll is quite aware of these and the planned programme includes considerable human review and the active involvement of country WikiProjects. Good.
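To illustrate the "data in prose form" versus "parameterised template" distinction, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the template name, the field names and the place record are invented, and neither function is taken from Rambot or FritzpollBot.

```python
def to_prose(record: dict) -> str:
    """Rambot-style output: the data is baked into a sentence."""
    return (
        f"{record['name']} is a village in {record['country']}. "
        f"As of the {record['census_year']} census, "
        f"its population was {record['population']}."
    )


def to_template_call(template: str, record: dict) -> str:
    """Template-style output: the data stays as named parameters."""
    lines = ["{{" + template]
    for key, value in record.items():
        lines.append(f"| {key} = {value}")
    lines.append("}}")
    return "\n".join(lines)


# Made-up record, not real census data.
place = {
    "name": "Exampleton",
    "country": "Examplia",
    "census_year": 2000,
    "population": 1234,
}

print(to_prose(place))
print(to_template_call("Hypothetical place data", place))
```

The point of the second form is that the numbers stay machine-readable: a later census update means changing one parameter rather than re-parsing a sentence.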

(May I note that people whose objections are that this will artificially inflate the article count or make Special:Random annoying appear to have forgotten that we’re here to write an encyclopedia.)

The question that springs to mind is: what else can we get complete data on for bot-assisted article creation? Every state-level or higher politician in every country ever? What else?

Update: Fritzpoll is proceeding with all due caution, and the bot will be doing nothing but preparing lists as yet. See evolving FAQ.

Why we do this.

Thursday, May 15th, 2008

If you ever wonder why you bother working on Wikimedia projects: 1, 2.

Even the Free Software Foundation doesn’t understand the GFDL.

Thursday, April 24th, 2008

Has anyone ever gotten a straight answer from licensing@fsf.org about GFDL queries? I have never even heard of an answer from them that isn’t their Magic 8-Ball imitation. “Reply hazy, read the license text and ask your own lawyer.” Our lawyer is Mike Godwin and he says it makes his head hurt. YOU WROTE THE DAMN THING. WHAT DID YOU MEAN? WHAT WERE YOU THINKING? ANSWER ME!

In fairness, the FSF contact page says licensing@fsf.org will help with “questions about the GPL and free software licensing.” Even the FSF has given up trying to make sense of the GFDL. The new version can’t happen soon enough.

(Provoked by asking for help with the reuse FAQ and the likely utter unfeasibility of audio versions of GFDL text. The latter is one of the best arguments I can think of for running screaming to CC-by-sa as absolutely soon as possible and throwing the GFDL into a fire.)

Regular expressions to EBNF?

Wednesday, April 9th, 2008

Last Thursday at London.PM, I got asked a lot why MediaWiki wikitext doesn’t have a WYSIWYG editor. The answer is that a WYSIWYG editor would need to know wikitext grammar, and there is no defined grammar. The MediaWiki “parser” is not actually a parser — it’s a twisty series of regular expressions (PHP’s version of PCREs).

So any grammar effort (and several What You See Is All You Get editors — others just forget wikitext and write HTML) requires reverse-engineering that, and lots of people have tried and gotten 90% of the way before stalling. It doesn’t help that wikitext is (I’m told) provably impossible to just put into a single lump of EBNF.
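To make "a twisty series of regular expressions" concrete, here is a toy sketch in Python (not PHP). It is emphatically not MediaWiki's code: the four patterns and the `render` function are invented for illustration and cover only bold, italic and two link forms. What it does show is the general shape: each pass rewrites the whole text, and the result depends on the order of the passes.

```python
import re

# Ordered (pattern, replacement) passes. Each pass rewrites the whole text,
# so later passes see the output of earlier ones.
PASSES = [
    (re.compile(r"'''(.+?)'''"), r"<b>\1</b>"),  # bold has to run before italic
    (re.compile(r"''(.+?)''"), r"<i>\1</i>"),
    (re.compile(r"\[\[([^\]|]+)\|([^\]]+)\]\]"), r'<a href="/wiki/\1">\2</a>'),
    (re.compile(r"\[\[([^\]|]+)\]\]"), r'<a href="/wiki/\1">\1</a>'),
]


def render(wikitext: str, passes=PASSES) -> str:
    """Apply every substitution pass in order and return the result."""
    for pattern, replacement in passes:
        wikitext = pattern.sub(replacement, wikitext)
    return wikitext


print(render("'''Bold''' and ''italic'' in [[Main Page|the wiki]]"))

# Same rules, italic applied first: the bold markup gets mangled.
print(render("'''Bold'''", passes=[PASSES[1], PASSES[0]]))
```

Nothing in this structure is a grammar. The meaning of `'''` is defined only by which substitution happens to run first, which is the kind of thing a reverse-engineering effort has to rediscover rule by rule.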

The goal is to replace the twisty series of regexps with something generated from a grammar. Tim Starling has said, more or less: “We can’t change wikitext. Go away and write something that (a) covers almost all of it (b) is comparably fast in PHP.” Harsh, but fair.

It occurred to me that there must exist tools to convert regexps into EBNF. And that if we can get it into even a few disparate lumps of hideous EBNF, there should be tools to take those and simplify them somewhat. (Presumably with steps to say what given bits mean.) Or possibly things other than EBNF, just as long as the result is parseable.
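For the strictly regular part of a regexp, the conversion is mechanical, because EBNF has alternation, optional and repetition constructs of its own. The sketch below is a hand-rolled Python converter for a deliberately tiny subset (literals, backslash-escaped literals, grouping, `|`, `*`, `+`, `?`). The function name `regex_to_ebnf` is made up for this example, it is not an existing tool, and it handles none of the PCRE extras such as character classes, backreferences or lookaround. Backreferences in particular have no direct EBNF equivalent. It also says nothing about the ordering of passes, which is likely where the real difficulty lies.

```python
def regex_to_ebnf(pattern: str) -> str:
    """Translate a tiny regex subset into an ISO-style EBNF expression."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def alternation():
        nonlocal pos
        branches = [sequence()]
        while peek() == "|":
            pos += 1
            branches.append(sequence())
        return " | ".join(branches)

    def sequence():
        items = []
        while peek() is not None and peek() not in "|)":
            items.append(repeat())
        return ", ".join(items) if items else '""'

    def repeat():
        nonlocal pos
        node = atom()
        while peek() is not None and peek() in "*+?":
            op = pattern[pos]
            pos += 1
            if op == "*":
                node = "{ " + node + " }"              # zero or more
            elif op == "+":
                node = node + ", { " + node + " }"     # one, then zero or more
            else:
                node = "[ " + node + " ]"              # optional
        return node

    def atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == "(":
            inner = alternation()
            if peek() != ")":
                raise ValueError("unbalanced parenthesis")
            pos += 1
            return "( " + inner + " )"
        if ch in "*+?":
            raise ValueError("nothing to repeat")
        if ch == "\\":
            if pos >= len(pattern):
                raise ValueError("dangling backslash")
            ch = pattern[pos]
            pos += 1
        quote = "'" if ch == '"' else '"'
        return quote + ch + quote

    result = alternation()
    if pos != len(pattern):
        raise ValueError("unexpected character at position %d" % pos)
    return result


print(regex_to_ebnf("ab*"))
print(regex_to_ebnf("(a|b)+c?"))
```

The output of something like this is still just one lump of EBNF per regexp; naming the lumps and working out what each one means would remain a human job.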

I am not (even slightly) a computer scientist, but many of you are. Does anyone have any ideas on this? Or pointers to anyone having done anything even remotely similar? Or knowledgeable friends they could point this query at?

The other approach is parserTests.php. Running maintenance scripts, the scripts (look for parserTests), the list of tests. A “parser” will be anything that passes the unit tests.
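The "anything that passes the tests is a parser" idea can be sketched as a small harness. This is a Python illustration under stated assumptions: the `Case` structure, the two test cases and `toy_parser` are all invented here, and none of it reflects the actual format or contents of the parserTests suite.

```python
import re
from typing import Callable, Iterable, List, NamedTuple


class Case(NamedTuple):
    name: str
    wikitext: str
    expected_html: str


def run_cases(parse: Callable[[str], str], cases: Iterable[Case]) -> List[str]:
    """Return the names of the cases the candidate parser fails."""
    failures = []
    for case in cases:
        try:
            ok = parse(case.wikitext) == case.expected_html
        except Exception:
            ok = False  # a crash counts as a failure
        if not ok:
            failures.append(case.name)
    return failures


# Hypothetical cases, for illustration only.
CASES = [
    Case("plain text", "hello", "<p>hello</p>"),
    Case("italic", "''hi''", "<p><i>hi</i></p>"),
]


def toy_parser(text: str) -> str:
    """A stand-in candidate: regexes, a grammar, anything with this signature."""
    return "<p>" + re.sub(r"''(.+?)''", r"<i>\1</i>", text) + "</p>"


print(run_cases(toy_parser, CASES))
```

The harness neither knows nor cares how the candidate works internally, so a grammar-generated parser and the existing regexp pile can be compared on exactly the same footing.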