The Monthly Wikimedia Research Showcase is a public showcase of recent research by the Wikimedia Foundation’s Research Team and guest presenters from the academic community. The showcase is hosted at the Wikimedia Foundation on the third Wednesday of every month at 9:30 a.m. Pacific Time (18:30 CET) and is live-streamed on YouTube.
Theme: Editor Retention
January 18, 2023 Video: YouTube
Vital Signs: Measuring Wikipedia Communities’ Health
By Cristian Consonni, Eurecat – Centre Tecnològic de Catalunya, Barcelona
Community health in Wikipedia is a complex topic that has been at the center of discussion for Wikipedia and the scientific community for years. Researchers observed that the number of active editors for the largest Wikipedias started declining after an initial phase of exponential growth. Some media outlets took this as a death announcement for the project, but the news of Wikipedia’s death turned out to be greatly exaggerated. However, researchers and community activists still need to understand how to measure community health and describe it more accurately. In this presentation, we would like to go beyond the traditional metrics used to describe the status of the community. We propose six sets of language-independent indicators that we call “Vital Signs.” We borrow the analogy from the medical field, as these indicators represent a first step in defining the health status of a community; they can constitute a valuable reference point to foresee and prevent future risks. We present our analysis for several Wikipedia language editions, showing that communities renew their productive force even with stagnating absolute numbers; we also observe a general need for renewal in positions related to particular functions or administratorship. We created a dashboard to visualize all the indicators we have computed and hope that the communities will find it helpful for improving their health.
Learning to Predict the Departure Dynamics of Wikidata Editors
By Guangyuan Piao, Maynooth University
Wikidata, one of the largest open collaborative knowledge bases, has drawn much attention from researchers and practitioners since its launch in 2012. Because it is developed and maintained collaboratively by a large community of volunteer editors, understanding and predicting the departure dynamics of those editors is crucial, yet it has not been studied extensively in previous work. In this paper, we investigate the synergistic effect of two types of features, statistical and pattern-based, using DeepFM as our classification model to predict whether a Wikidata editor will stay on or leave the platform. DeepFM has not previously been explored in a similar context or for this problem. Our experimental results show that combining the two feature sets with DeepFM gives the best performance in terms of AUROC (0.9561) and F1 score (0.8843), a substantial improvement over using either feature set alone and over a wide range of baselines.
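For readers unfamiliar with the two metrics reported in this abstract, the following is a minimal sketch of how AUROC and F1 score are commonly computed for a binary stay/leave classifier. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(P * N), which is fine
    for a small illustration but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = editor left, 0 = editor stayed.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]  # made-up model outputs
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(f"F1 = {f1_score(labels, preds):.4f}")
print(f"AUROC = {auroc(labels, scores):.4f}")
```

AUROC is computed from the raw scores and does not depend on a threshold, whereas F1 is computed from thresholded predictions, which is one reason the two are often reported together.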