Wikimedia Research Newsletter, November 2015


Vol: 5 • Issue: 11 • November 2015

Do Wikipedia citations mirror scholarly impact?; co-star networks in silent films

With contributions by: Daniel Mietchen, Guillaume Paumier, Piotr Konieczny, and Tilman Bayer

“Are Wikipedia citations important evidence of the impact of scholarly articles and books?”

Reviewed by Piotr Konieczny

This paper[1] contributes to the discussion of the relationship between Wikipedia and academia, in the context of the use of academic publications on Wikipedia. The authors looked at whether articles and academic books (monographs) indexed in the Scopus database (302,328 articles and 18,735 books) are cited by Wikipedia, other articles, and books. They found that only about 5% of all academic articles are cited on Wikipedia, compared with about 33% of books. Arts, humanities and social science books are cited almost twice as often as those from natural and medical sciences. The authors conclude that Wikipedia citations are not strongly related to a work’s scientific impact, but more so to its educational and cultural impact. They further conclude that Wikipedia citations are likely a good source for understanding a work’s non-scholarly impact, particularly for books. On that note, while the authors discuss some limitations of their study, they do not address the topic of open access, which could explain the discrepancy between the use of books (many of which are at least partially available online through Google Books, a database the authors themselves also used in this study) and articles (most of which are not available to an average reader). The authors’ conclusion should therefore be qualified: Wikipedia currently does not cite the majority of academic articles, but that majority is not readily available to the project’s contributors, and further research is needed on whether Wikipedia can be used to understand the impact of scholarly open access sources. (See also the review of a related paper in our August issue: “Amplifying the Impact of Open Access: Wikipedia and the Diffusion of Science”.)

Greta Garbo and Ricardo Cortez linked together (d:Q638165#claims) in the 1926 silent film “Torrent”

Exploration of co-stardom networks between 1920 and 1930 using Wikidata

Reviewed by Guillaume Paumier

In this blog post[2], Pierre-Carl Langlais relates how he used actor–movie relationships from Wikidata to graph and examine co-stardom networks in the 1920s and 1930s. His exploration, based on the assumption that transnational collaborations were easier in silent film productions, aimed to determine whether co-stardom networks tightened by country after the move to spoken film.

The author queried actor–movie data from Wikidata, processed it with R to create actor–actor relationships, and created graphs using Gephi for two periods: 1920–29 and 1930–39. He found that (software-determined) clusters of actors did seem to follow countries for the 1930–1939 period, with some overlap for countries with the same language. However, the software had less success identifying clusters for the 1920–1929 period. Some clusters mixed different countries, and some countries were split into several clusters. The author invited readers to explain those results. He also highlighted the case of transnational actors like Greta Garbo and Maurice Chevalier.
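The step of turning actor–movie data into actor–actor relationships can be sketched as follows. This is only a minimal illustration in Python, not the author's code (the blog post used R and Gephi). The function name `costar_edges` is hypothetical, and apart from the “Torrent” pairing mentioned in the caption earlier in this post, the sample rows are placeholders rather than real Wikidata results.

```python
from collections import Counter, defaultdict
from itertools import combinations


def costar_edges(actor_film_pairs):
    """Turn (actor, film) pairs into weighted actor-actor edges.

    The weight of an edge is the number of films the two actors share.
    """
    # Group the cast of each film.
    cast_by_film = defaultdict(set)
    for actor, film in actor_film_pairs:
        cast_by_film[film].add(actor)

    edges = Counter()
    for cast in cast_by_film.values():
        # Sort so each unordered pair is always counted under the same key.
        for a, b in combinations(sorted(cast), 2):
            edges[(a, b)] += 1
    return edges


# Illustrative input only: "Film X", "Film Y" and "Actor ..." are placeholders.
pairs = [
    ("Greta Garbo", "Torrent"),
    ("Ricardo Cortez", "Torrent"),
    ("Actor A", "Film X"),
    ("Actor B", "Film X"),
    ("Actor C", "Film X"),
    ("Actor A", "Film Y"),
    ("Actor B", "Film Y"),
]

edges = costar_edges(pairs)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (shared films: {weight})")
```

An edge list of this kind could then be exported to a graph tool for clustering and layout; how the original study weighted or filtered edges is not stated in the post.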


“Editing for equality: the outcomes of the Art+Feminism Wikipedia edit-a-thons”

Reviewed by Piotr Konieczny

This article[3] discusses the Art and Feminism Wikipedia edit-a-thon, which the authors describe as the largest such event ever. Framed in the context of the importance of gendered activism and information activism for librarians, it discusses what the authors perceive as a growing collaboration between gender and information activists that also includes Wikipedia GLAM activists. The article presents an interesting overview of this developing movement.

Research at Wikimania 2016

By Daniel Mietchen

The planning for next year’s Wikimania is in full swing. The organizers are experimenting with approaches that differ from those of previous years, including asking the community more explicitly which issues they’d like to see covered and what they’d like to learn about during Wikimania. One of the topics considered for this advance feedback is research, which is understood to encompass Wikimedia coverage of research-related topics as well as research on Wikimedia-related topics, and everything in between. How should such research-related topics be addressed by the event? If you have suggestions in this regard, please put them forward here. Thanks!

Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.

  • “Do experts or collective intelligence write with more bias? Evidence from Encyclopædia Britannica and Wikipedia.”[4] From the abstract: “We evaluate these questions empirically by examining slanted and biased phrases in content on US political issues from two sources—Encyclopædia Britannica and Wikipedia. … Using a matched sample of pairs of articles from Britannica and Wikipedia, we show that, overall, Wikipedia articles are more slanted towards Democrat than Britannica articles, as well as more biased. Slanted Wikipedia articles tend to become less biased than Britannica articles on the same topic as they become substantially revised, and the bias on a per word basis hardly differs between the sources.” (See also our 2012 reviews of two related papers by the same authors, including a critique of their reliance on the method of using the US Congressional Record as a gold standard of unbiased language: “Language analysis finds Wikipedia’s political bias moving from left to right”; “Given enough eyeballs, do articles become neutral?”)
  • “Towards better visual tools for exploring Wikipedia article development – the use case of ‘Gamergate controversy’.”[5] From the abstract: “We present a comparative analysis of three tools for visually exploring the revision history of a Wikipedia article. We do so on the use case of ‘Gamergate Controversy’, an article that has been the setting of a major editor dispute in the last half of 2014 and early 2015, resulting in multiple editor bans and gathering news media attention. We show how our tools can be used to visually explore editor disagreement interactions and networks, as well as conflicted content and provenance …”
  • “What triggers human remembering of events? A large-scale analysis of catalysts for collective memory in Wikipedia.”[6] From the “Conclusions and future work” section: “we managed to identify some first pattern for event memory triggering for diverse event types including natural and manmade disasters as well as accidents and terrorism. For doing this we have combined correlation detection, analysis of the surprise aspect (unexpected change) in the distribution of the past event surrounding the peak time of the triggering event and analysis of the skewness of the distribution of the past event at the peak time of the triggering event. Our analysis confirmed the influence of closeness in time and location, but also has shown that these aspects cannot be considered in isolation and that high-impact events and semantic similarity of events also influences, which event memories are triggered by an event.”
  • “As dinâmicas do conhecimento científico e tecnológico na era da Web 2.0 : um estudo sobre a Wikipédia lusófona.”[7] In Portuguese; from the English abstract: “This investigation has been made from the analysis of “formal” Wikipedian’s structure and from semi-structured interviews with 24 collaborators users of the ‘highlighted subjects’ on scientific and technological issues of the Lusophone Wikipedia. The thesis’ conclusions are: on one hand, that the Wikipedian’s interface tends to reproduce the disciplines’ hierarchy of the S&T [science and technology] knowledge and the distribution of users’ areas of interest according to the social places occupied by them, as well as express values, representations and practices associated with modern science paradigm. On the other hand, this study observes that Lusophone Wikipedia is propitious, not only to diversify the profile of the collaborators involved in the production, dissemination and achievement of scientific and technological subjects compared to the conventional spaces of the S&T knowledge, but also opening various possibilities for participation in the site, included ownership of collective product.”
  • “Generating quizzes for history learning based on Wikipedia articles.”[8] From the abstract: “Aiming at reducing the cost of developing educational contents, this study proposes a method to generate multiple-choice history quizzes using Wikipedia articles.”
  • “Detecting spatial patterns of natural hazards from the Wikipedia knowledge base”[9] From the abstract: “Over 230,000 geo-tagged articles are […] extracted from the Wikipedia database, spatially covering the contiguous United States. … In this work, Wikipedia articles about wildfires are extracted from the Wikipedia database, forming a wildfire corpus […]. The spatial distribution of wildfire outbreaks in the US is estimated […] and mapped using GIS. To provide an evaluation of the approach, the estimation is compared to wildfire hazard potential maps created by the USDA Forest service.”
  • “Extraction of career profiles from Wikipedia”[10] From the abstract: “We describe a system that gathers the work experience of a person from her or his Wikipedia page. We first extract an ontology of profession names from the Wikidata graph. We then parse the Wikipedia pages using a dependency parser and we connect persons to professions through the analysis of parts of speech and dependency relations we extract from text.”
  • “Extracting and visualising biographical events from Wikipedia”[11] From the abstract: “This work presents a proposal for the development of a natural language processing module for event and temporal analysis of biographies as available in Wikipedia” (using DBpedia; no mention of Wikidata)
  • “The Descent of Pluto: Interactive dynamics, specialisation and reciprocity of roles in a Wikipedia debate.”[12] From the abstract: “… we performed a longitudinal analysis of a specific case-study within the French-speaking “astronomy” Wikipedia OEC [“Online Epistemic Community”], revolving around the renaming of the article on the celestial body “Pluto”, given the ‘descent’ of its scientific status from that of a planet to an asteroid. Our choice was to focus on the analysis of dialogic and epistemic roles, as an appropriate meso-level unit of analysis. We present a qualitative-quantitative method for analysis of roles, based on filtering major participants and analysing the dialogic functions and epistemic contents of their communicative acts. Our analyses showed that […] roles become gradually specialised and reciprocal over sequences of the discussion: when one participant changes role from one sequence to another, other participants ‘fill in’ for the vacant role. Secondly, we show that OECs, in the case of Wikipedia, do not function purely on a knowledge-level, but also involve, crucially, negotiation of images of participants’ competences with respect to the knowledge domain.”


  1. Kayvan Kousha and Mike Thelwall: Are Wikipedia Citations Important Evidence of the Impact of Scholarly Articles and Books? PDF
  2. “Modéliser le réseau social des acteurs de cinéma des années 1920 et 1930 avec Wikidata | Sciences communes”. Retrieved 2015-11-25. 
  3. Siân Evans, Jacqueline Mabey and Michael Mandiberg: Editing for Equality: The Outcomes of the Art+Feminism Wikipedia Edit-a-thons. Art Documentation: Journal of the Art Libraries Society of North America, Vol. 34, No. 2 (September 2015), pp. 194–203. Published by: University of Chicago Press on behalf of the Art Libraries Society of North America DOI:10.1086/683380 Closed access
  4. Greenstein, Shane, and Feng Zhu. “Do experts or collective intelligence write with more bias? Evidence from Encyclopædia Britannica and Wikipedia.” Harvard Business School Working Paper, No. 15-023, October 2014. PDF
  5. Fabian Flöck, David Laniado, Felix Stadthaus, Maribel Acosta: Towards better visual tools for exploring Wikipedia article development – the use case of “Gamergate controversy”. Wikipedia, a Social Pedia: Research Challenges and Opportunities: Papers from the 2015 ICWSM Workshop PDF
  6. Nattiya Kanhabua, Tu Ngoc Nguyen, Claudia Niederée: What triggers human remembering of events? A large-scale analysis of catalysts for collective memory in Wikipedia. In: Proceedings of the 14th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL ’14), Pages 341–350. IEEE Press Piscataway, NJ, US, 2014, ISBN 978-1-4799-5569-5 Author’s copy
  7. Lima, Leonardo Santos de: As dinâmicas do conhecimento científico e tecnológico na era da Web 2.0 : um estudo sobre a Wikipédia lusófona. Master thesis, Universidade Federal do Rio Grande do Sul. Instituto de Filosofia e Ciências Humanas. Programa de Pós-Graduação em Sociologia. 2014
  8. Yoshihiro Tamura, Yutaka Takase, Yuki Hayashi, Yukiko I. Nakano: Generating quizzes for history learning based on Wikipedia articles. DOI:10.1007/978-3-319-20609-7_32 Closed access
  9. J. Fan, K. Stewart: Detecting spatial patterns of natural hazards from the Wikipedia knowledge base. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-4/W2, 2015. International Workshop on Spatiotemporal Computing, 13–15 July 2015, Fairfax, Virginia, USA PDF
  10. Firas Dib, Simon Lindberg, Pierre Nugues: Extraction of career profiles from Wikipedia. In: Proceedings of the First Conference on Biographical Data in a Digital World 2015 (BD2015), Amsterdam, The Netherlands, 9 April 2015. PDF
  11. Irene Russo, Tommaso Caselli, Monica Monachini: Extracting and visualising biographical events from Wikipedia. In: Proceedings of the First Conference on Biographical Data in a Digital World 2015 (BD2015), Amsterdam, The Netherlands, 9 April 2015. PDF
  12. Françoise Détienne, Michael Baker, Dominique Fréard, Flore Barcellini, Alexandre Denis, Matthieu Quignard: The Descent of Pluto: Interactive dynamics, specialisation and reciprocity of roles in a Wikipedia debate. International Journal of Human-Computer Studies. Volume 86, February 2016, Pages 11–31 DOI:10.1016/j.ijhcs.2015.09.002 Closed access

This newsletter is brought to you by the Wikimedia Research Committee and The Signpost

Archive notice: This is an archived post from, which operated under different editorial and content guidelines than Diff.
