On March 14, we launched v2.0 of the Article Feedback Tool. Version 2.0 represents a continuation of the work we started last September. To quickly recap, the tool was originally launched as part of the Public Policy Initiative. In November, the feature was added to roughly 50-60 articles on the English Wikipedia, in addition to the Public Policy articles. The purpose of adding the tool to these additional pages was to gather data to help us understand the quality of the ratings themselves: do these ratings represent a reasonable measurement of article quality?
Since then, we’ve been evaluating the tool using both qualitative and quantitative research. We conducted user research on the Article Feedback Tool both to see how users actually used the tool and to better understand their motivations for rating an article. Readers liked the feature’s interactivity, its ease of use, and the ability to quickly provide feedback on an article. On the other hand, some of the labels (e.g., “neutral”) were difficult to understand. A detailed summary of the user research has been posted here.
We also did some quantitative research on the ratings data. Though the ratings do appear to show some correlation with changes in the content of the article, there is ample room for improvement (see discussion of GFAJ-1). It also appears that articles of different lengths show different ratings distributions. For example, there appears to be a correlation between the Well-Sourced and Completeness ratings and article length for articles under 50kb, but for articles over 50kb the correlation becomes far weaker (see Factors Affecting Ratings).
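As a rough illustration of this kind of analysis, the sketch below splits (length, rating) pairs at a 50kb cutoff and computes a plain Pearson correlation on each side. The data, function names, and cutoff here are made up for illustration; this is not the actual analysis code from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient (no external libraries)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_by_band(rows, cutoff_kb=50):
    """Split (length_kb, rating) rows at a cutoff and correlate each side.

    Illustrates checking whether the rating/length correlation weakens
    above ~50kb; the row data and key names are hypothetical.
    """
    short = [(l, r) for l, r in rows if l < cutoff_kb]
    long_ = [(l, r) for l, r in rows if l >= cutoff_kb]
    return {
        "under_50kb": pearson(*zip(*short)) if len(short) > 1 else None,
        "over_50kb": pearson(*zip(*long_)) if len(long_) > 1 else None,
    }

# Toy data: ratings track length below 50kb, but not above it.
rows = [(10, 1.0), (20, 2.0), (30, 3.0), (40, 4.0),
        (60, 3.0), (70, 3.1), (80, 2.9)]
print(correlation_by_band(rows))
```

With data like this, the under-50kb band shows a strong positive correlation while the over-50kb band does not, which is the shape of the effect described above.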
Based in part on the results from the first version, v2.0 of this feature was designed with two main goals in mind.
- First, we wanted to see if we could improve the correlation between ratings and change in article quality by segmenting ratings based on the rater’s knowledge of a topic. We introduced a question which asks the user whether she is “highly knowledgeable” about the topic. The answers to this question will enable us to compare ratings from users who self-identify as highly knowledgeable versus those who don’t.
- Second, we wanted to see if rating an article could lead to further participation — does rating an article provide an easy way to contribute, leading to additional participation like editing? We wanted to test this hypothesis in light of the recent participation data. We don’t know whether this will actually be the case, but we wanted to get some data. In v2.0, there is a mechanism that shows a user a message (e.g., “Did you know you can edit this article?”) after they submit a rating. We will measure how well these messages perform. (These messages are dismissible by clicking a “Maybe later” link).
We also made some UI changes based on the feedback from the user study. For example, “Neutral” was changed to “Objective” (as were some other labels) and the submit button has been made more visually obvious. There are a number of other improvements which may be found on the design page.
Finally, in an effort to get a wider variety of articles to research, we increased the number of articles with the tool. We knew from our early analysis that articles in different length bands received different rating distributions, so we created length buckets (e.g., 25-50kb) and selected a random set of articles within each bucket. User:Kaldari wrote a bot which takes the list of articles and places the tool on each article in the list. As of March 24, the tool is active on approximately 3,000 articles. We may expand this list if we can do so without impacting the performance of the site.
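The bucketed selection above amounts to stratified random sampling by article length. Here is a minimal sketch of that idea; the 25kb bucket width, the function names, and the sample data are assumptions for illustration, not the bot’s actual implementation.

```python
import random

def bucket_for(length_bytes):
    """Assign an article to a length bucket (hypothetical 25kb-wide bands)."""
    kb = length_bytes // 1024
    lower = (kb // 25) * 25
    return f"{lower}-{lower + 25}kb"

def sample_by_bucket(articles, per_bucket, seed=0):
    """Pick a random sample of article titles from each length bucket.

    `articles` is a list of (title, length_in_bytes) pairs; a fixed seed
    keeps the selection reproducible.
    """
    rng = random.Random(seed)
    buckets = {}
    for title, length in articles:
        buckets.setdefault(bucket_for(length), []).append(title)
    selection = {}
    for name, titles in sorted(buckets.items()):
        selection[name] = rng.sample(titles, min(per_bucket, len(titles)))
    return selection

# Toy list of (title, length) pairs spanning three buckets.
articles = [("A", 10 * 1024), ("B", 30 * 1024), ("C", 40 * 1024), ("D", 70 * 1024)]
print(sample_by_bucket(articles, per_bucket=1))
```

Sampling within each bucket, rather than from the whole pool, guarantees that short and long articles are both represented even though long articles are rarer.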
We’ll be publishing analysis on v2.0 in the coming weeks. In the meantime, please let us know what you think on the workgroup page. Or better yet, join the workgroup to help develop this feature!