Swedish Wikipedia surpasses 1 million articles with aid of article creation bot

On June 15, 2013, Swedish Wikipedia hit one million articles, joining the club of English, Dutch, German, French, Italian, Russian and Spanish Wikipedias. The article that broke the barrier was the butterfly species Erysichton elaborata. There is, however, one fact that separates this million-article milestone from almost all others.
The one millionth article was not manually created by a human, but written by a piece of software (a “bot”). The bot, in this case Lsjbot, collects data from different sources and then compiles the information into a format that fits Wikipedia. Lsjbot has to date created about 454,000 articles, almost half of the articles on Swedish Wikipedia.
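To give a rough feel for what “compiling data into a format that fits Wikipedia” can mean, here is a minimal sketch in Python. It is not Lsjbot's actual code: the record fields, the function name and the sentence templates are all assumptions made up for illustration, and the sample data is a placeholder, not a real species.

```python
from dataclasses import dataclass


@dataclass
class SpeciesRecord:
    """Hypothetical merged record; the field names are illustrative assumptions."""
    name: str    # scientific name
    group: str   # informal group, e.g. "butterfly"
    family: str  # taxonomic family
    author: str  # who described the species
    year: int    # year of description
    source: str  # citation text for the dataset the facts came from


def render_stub(rec: SpeciesRecord) -> str:
    """Fill fixed sentence templates and attach a reference to every statement."""
    ref = f"<ref>{rec.source}</ref>"
    sentences = [
        f"'''''{rec.name}''''' is a species of {rec.group} "
        f"in the family [[{rec.family}]].{ref}",
        f"It was described by {rec.author} in {rec.year}.{ref}",
    ]
    return " ".join(sentences) + "\n\n== References ==\n<references/>"


# Placeholder data only: not a real species and not a real source.
example = SpeciesRecord(
    name="Examplus fictus",
    group="butterfly",
    family="Exampleidae",
    author="A. Author",
    year=1900,
    source="Example Species Database",
)
stub_text = render_stub(example)
print(stub_text)
```

Because every sentence comes from a template that carries its own reference, the output is short and uniform, but each statement is sourced.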

Lsj, Sverker Johansson, who runs Lsjbot

Bot-created articles have led to some debate, both before Lsjbot started its run, and currently. First, there was a lengthy discussion on Swedish Wikipedia after the initial proposal by Lsjbot’s operator, science teacher Sverker Johansson. The Swedish Wikipedia community was wary, having learned the lessons from previous conflicts about article-creating bots, including rambot in 2002. But there was also curiosity, so a series of test runs was made to make sure that the articles were acceptable.
After review, the Swedish Wikipedia editor community said okay. Lsjbot started by creating articles about different species of animals and plants – articles that are largely uncontroversial and that can have a similar format without feeling mechanical.
Subsequent criticism has come from prolific article writer Achim Raschka on German Wikipedia’s Kurier. Here the main complaint was that the article is short: only four sentences long. This is a valid complaint. Even if longer articles are not always better, they tend to contain more information.
Therein lies the rub. The bots use as many datasets as their operators can find, but many sources are behind paywalls or are incomplete across an entire taxon (covering only selected species). The upside is that each statement in a bot-created article is supported by a reference, which is not the case in many other articles. This means that more references are added to Wikipedia by bots than by humans. This is of course not in itself a sign of quality, but it gives human contributors a starting point when searching for more information. As with any article in Wikipedia, readers can also help make bot-created articles better.
Is this the future for Wikipedia, to let software create articles? With Wikidata, it is certainly becoming easier to use software to create articles, something that can benefit the smaller Wikipedias. But we still need more humans to help determine which sources are of high quality, what information is presented correctly and what qualifies as clear writing.
So far, bots have shown that they are much quicker to create articles. In that respect, I, for one, bow to our robot overlords.
Lennart Guldbrandsson, Swedish Wikipedia editor

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

10 Comments

Bot-created articles are being used elsewhere, such as for sports news stories where much of the article is a listing of stats intermixed with quotes from players and coaches.
I like this idea because it at least creates more meaningful stubs. If the articles created by the bot are too shallow, they at least provide a starting point for someone to edit. It can be hard to get started editing a wiki and this might encourage more people to do it.

As the author of Wikistats, I have always taken a keen interest in how Wikimedians viewed, or took part in, the friendly rivalry between language versions of our projects. After all, I helped to make the ranking between versions more visible. Too often for my taste, I witnessed on-wiki conversations noting that some language project had been surpassed in number of articles by another, and even a few times that “a bot will fix this”. So I became a little wary of mass article creation by bots, and of the true motives of their makers. When several bots on the Dutch Wikipedia started to…

From http://de.wikipedia.org/wiki/Wikipedia_Diskussion:Kurier#Schweden_feiert_1_Million
“[[sv:Erysichton palmyra]] is just as much outdated nonsense as [[sv:Erysichton]]. The genus was likewise split up three years ago. The Swedes’ millionth article would have been better created under Jameela palmyra.”
In short: the article is outdated. We celebrate masses of articles nobody can maintain or correct.

Before that, we saw the bot run on the Cebuano and then the Waray-Waray Wikipedia. I say these articles would be better placed on Wikispecies, with Wikispecies made a multilingual site. I wonder whether Swedish Wikipedians have considered the idea.

@Bjarne. That issue is discussed on the article’s discussion page. So far we have not reached the point where we think the new naming (made in 2010) has been embraced by the scientific sources. I think we have to wait and see, and in the meantime this article has gained a little weight. @Bennylin. What Lsjbot is doing is exactly that, making a Wikispecies inside a Wikipedia edition. Personally I think this is good, as Wikipedia allows for more extensive cross-linking and categorising and makes Wikisource-type material available for editing to a larger Swedish-speaking community. I’m not alien…

[…] that the free encyclopedia founded in 2011 by Jimmy Wales had announced the publication of its one-millionth article in the Swedish version. It is an entry about a butterfly species called Erysichton Elaborata and that has […]

Bots are much better at creating articles. Humans need not apply.
Wikipedia Netherlands was the first Wikipedia to break the 1 million article barrier through the use of bots.

I think eventually, most things online will be automated so there is less need for human intervention.