Three weeks left in the Wikipedia Participation Challenge

There are still three weeks left in the Wikipedia Participation Challenge (see prior blog post)! So far, the competition has exceeded our expectations. As of this morning, 78 teams (167 individuals in total) from around the world have participated, submitting a total of 735 entries. Half of these teams have beaten the benchmark we set at the beginning of the competition, which is a testament to the quality of the teams and their submissions. We can’t wait to see what great algorithms the participants are developing.
There’s still time to jump in before the competition closes on September 20, 2011, so if you haven’t done so yet, download the data and start crunching. Those who would rather cheer from the sidelines can follow the competition on Kaggle’s leaderboard.
Howie Fung
Senior Product Manager, Wikimedia Foundation
Diederik van Liere
Research Consultant, Wikimedia Foundation

Archive notice: This is an archived post from, which operated under different editorial and content guidelines than Diff.

I’m interested in participating, but I’ve never really done something like this before. Could you give me a brief idea of the kind of code that is expected? Can I use any coding language? The dumps are huge files; which ones would I absolutely need?
Best Regards,

Most of the information about the data challenge may be found on You may use any programming language, but there are special prizes for using open source tools. You may read more about the contest rules here:
Also, the datasets are located here, including a few examples that are pretty small:
Good luck!