Searching for Wikipedia: DuckDuckGo and the Wikimedia Foundation share new research on how people use search engines to get to Wikipedia

The Wikimedia Foundation and DuckDuckGo recently partnered on new research to better understand the relationship between the design of search results and readership on Wikipedia. Wikipedia depends on search engines to help readers find relevant content, and search engines depend on Wikipedia for high-quality results. Findings from this research demonstrate that readers actively seek out Wikipedia, and that search design plays a large role in this continued readership.

Wikipedia receives over 15 billion pageviews from 1.5 billion unique devices every month from nearly every country in the world. Looking back on 2020, the top articles being read on Wikipedia covered an enormous spectrum of politics, entertainment, and current events such as the pandemic. Not surprisingly, 75% of reader sessions (pages viewed by the same user) come from search engines, and 90% of these come from a single search engine, Google. While these 15 billion pageviews are a lot, they don’t necessarily capture all the people who are seeing and learning information from Wikipedia. 

Why? One big reason is changes in Search.

Overview of Wikipedia and Search

Over the past several years, search engine platforms have been evolving their designs to offer richer and more complex results pages. In the past, you may have typed “Fennec Fox” into a search engine, which would then return a list of blue links to external sources, such as Wikipedia, where you could learn more about the large-eared species. Today, the same search would likely produce a different, richer results page, one that includes an information module.

Information modules, also referred to as “knowledge panels” or “information boxes,” are the boxes on search result pages, generally to the right of the blue links. They often include a short summary of information from Wikipedia alongside images, facts, and links to relevant websites, including Wikipedia (see Figure 1).

Figure 1. Example information modules for fennec fox. Screenshot used under fair use.

Many people will click the Wikipedia link in the module to learn more. Others, however, do not click through, finding the information they are looking for contained within the module. They see and learn from Wikipedia content, but they never click through to Wikipedia.org.

Reaching readers wherever they are helps Wikipedia achieve its goal of providing free access to the sum of all human knowledge. Nevertheless, it is important that people do come to Wikipedia itself, because readers are essential to its success and sustainability. As a reader on Wikipedia, you can learn about citations and verify what you are reading. You see all the links to learn more and can go explore down rabbit holes. You see the edit button and realize you can improve the content. You see the banner asking you to donate and might become a donor.

The evolution in the design of search result pages, including Wikipedia modules, has raised important questions about the impact on readership of Wikipedia and other websites.

Research Goals

To better understand the relationship between Search and Wikipedia, the Wikimedia Foundation, the nonprofit that operates Wikipedia, teamed up with DuckDuckGo to explore how people use search engines to get to Wikipedia. 

DuckDuckGo served over 23 billion search queries in 2020. It is the fourth-largest source of traffic to Wikipedia, the second-largest search engine on mobile devices in the United States, and the sixth-largest search engine in the world. Wikimedia and DuckDuckGo share the value that access to knowledge should not require compromising on privacy. Just as reading and contributing to Wikipedia does not require an account or expose you to tracking and profiling, neither does searching on DuckDuckGo. And like many search engines, DuckDuckGo’s search provides a standard list of blue links, as well as richer search result modules that extract information from Wikipedia to show their users.

Through this shared experiment, Wikimedia and DuckDuckGo were guided by two central research questions: 1) how prominent is Wikipedia in search engines? and 2) how do rich search results like information modules affect Wikipedia traffic?

Key Findings

To answer these questions, we completed a series of analyses of aggregate data from DuckDuckGo’s search engine results pages (SERPs) that allowed us to see high-level patterns of searches and clicks without tracking any individual users. Since DuckDuckGo does not keep search histories or track users, every time a user searches for something on DuckDuckGo, it’s as if they were visiting for the first time. The results below are for mobile users of DuckDuckGo in the United States, but we verified that similar patterns (except where noted otherwise) were found across desktop readers in the United States, as well as desktop and mobile readers in Germany:

  • Wikipedia is the most common result across all DuckDuckGo searches. It shows up either as a module or one of the top five blue links in more than 15% of searches in the United States, more than any other website.
  • Wikipedia often shows up both as a blue link and information module. For mobile queries on DuckDuckGo in the United States, 13.4% of queries have a Wikipedia blue link in the top five results and an information module that links to Wikipedia. For these searches, people click on the blue link 7.9% of the time and module link 8.0% of the time (for a total clickthrough rate of 15.9%). 
  • The information module receives a larger proportion of clicks to Wikipedia as the blue link drops in position. When Wikipedia is listed as the first blue link result, the total clickthrough rate to Wikipedia is highest at 24.5% (with slightly more than half of that coming via the blue link and the rest via the module). The clickthrough rate from the blue link almost halves with each drop in the ranking, but the clickthrough rate from the information module, which stays prominent in the results, drops more slowly (by 30%) with each corresponding drop in ranking. When the Wikipedia blue link is the fifth result, the overall clickthrough rate is only 4.2%, but over two-thirds of that comes via the module.
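
The position effect described in the last bullet can be sketched as a simple geometric decay. This is an illustration only, not the analysis used in the study. The starting split at position one (12.5% via the blue link, 12.0% via the module) and the per-position multipliers (0.55 for "almost halves", 0.70 for "drops by 30%") are assumptions chosen to be roughly consistent with the figures quoted above, and the function name is hypothetical.

```python
def ctr_by_position(blue_first=12.5, module_first=12.0,
                    blue_decay=0.55, module_decay=0.70, positions=5):
    """Return (position, blue_ctr, module_ctr, module_share) rows.

    All default values are illustrative assumptions, not study data.
    CTR values are percentages of searches.
    """
    rows = []
    for pos in range(1, positions + 1):
        steps = pos - 1  # number of ranking drops below position one
        blue = blue_first * blue_decay ** steps
        module = module_first * module_decay ** steps
        rows.append((pos, blue, module, module / (blue + module)))
    return rows


for pos, blue, module, share in ctr_by_position():
    print(f"position {pos}: blue {blue:.1f}%, module {module:.1f}%, "
          f"total {blue + module:.1f}%, module share {share:.0%}")
```

Under these assumed values, the module's share of Wikipedia clicks rises from just under half at position one to more than two-thirds at position five, which is the qualitative pattern the bullet describes.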

These results raise a major question as to why people click on Wikipedia: is it just prominent on the search results page, or is Wikipedia special? 

To answer that question, we conducted a simple but powerful experiment: a small, random proportion of searches were split into two groups. For half (we’ll call this Group A), the Wikipedia information module was shown as it normally would be. For the other half (we’ll call this Group B), the Wikipedia information module was not shown in the search results. Total clicks to Wikipedia were analyzed for each group and compared to understand how the information module impacts clicks to Wikipedia.1

If removing the module led to a large drop in clicks to Wikipedia, we would know that DuckDuckGo users replace Wikipedia with other sites in the blue links. On the other hand, if the clicks that would have gone to a Wikipedia link in the module instead go to a Wikipedia link in the blue links, that tells us that Wikipedia indeed is sought after by these users and has earned its prominent status.
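
A comparison of this kind can be sketched with aggregate counts alone. The example below is a minimal illustration, assuming a two-proportion z-test as the comparison method (the study does not say which test it used). The search and click counts are hypothetical and the function names are our own.

```python
import math


def clickthrough_rate(clicks, searches):
    """Share of searches that led to a click on Wikipedia."""
    return clicks / searches


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z statistic for the difference between two clickthrough rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled rate under the null hypothesis that both groups behave the same
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


# Hypothetical aggregate counts: no per-user data is needed.
searches_a, wiki_clicks_a = 100_000, 15_900  # Group A: module shown
searches_b, wiki_clicks_b = 100_000, 15_000  # Group B: module hidden

ctr_a = clickthrough_rate(wiki_clicks_a, searches_a)
ctr_b = clickthrough_rate(wiki_clicks_b, searches_b)
z = two_proportion_z(wiki_clicks_a, searches_a, wiki_clicks_b, searches_b)
print(f"Group A CTR {ctr_a:.1%}, Group B CTR {ctr_b:.1%}, z = {z:.2f}")
```

Because only group-level totals enter the calculation, this style of analysis fits a setting where individual users are not tracked.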

Figure 2. Example A/B conditions for desktop in the United States. Screenshot used under fair use.

And the result… Wikipedia is indeed special! When looking at Group B we found that:

  • 95% of the clicks that would have gone to the Wikipedia information module instead went to Wikipedia blue links. Remember that the overall clickthrough rate to Wikipedia in Group A was 15.9%, so we might have expected most of those clicks to go to other links once the module was removed. Instead, removing the module only dropped the clickthrough rate from 15.9% to 15.0%. This indicates that the vast majority of people are not choosing Wikipedia just because it happens to be ranked high in Search and featured prominently in the information module, but because they are explicitly looking for Wikipedia.
  • High clickthrough rate remained even with lower link positions. When the Wikipedia blue link was ranked first, 98% of the clicks that would have gone to the information module instead went to the blue link.2 This did drop with link position, however, pointing to the impact of information modules on search behavior. When Wikipedia was the fifth blue link, only 65% of the clicks to the Wikipedia module were replaced with the Wikipedia blue link. This is still quite high given that the overall click-through rate when Wikipedia is in the fifth position is only 4.2%. Again, this tells us that the majority of folks who click on the Wikipedia module do so because they are searching directly for Wikipedia and will scan down the page until they find it.
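
One way to read a figure such as "95% of the clicks were replaced" is as the share of module clicks that did not disappear when the module was removed. The sketch below shows that arithmetic under our own assumed definition, which the study may or may not have used. The input values are hypothetical round numbers, not the study's data, and the function name is our own.

```python
def replacement_rate(module_ctr_a, total_ctr_a, total_ctr_b):
    """Share of module clicks that moved to blue links (assumed definition).

    module_ctr_a: CTR via the module in Group A (module shown)
    total_ctr_a:  total Wikipedia CTR in Group A
    total_ctr_b:  total Wikipedia CTR in Group B (module hidden)
    """
    lost = total_ctr_a - total_ctr_b  # clicks that vanished without the module
    return 1 - lost / module_ctr_a


# Hypothetical percentages chosen for illustration only.
rate = replacement_rate(module_ctr_a=8.0, total_ctr_a=16.0, total_ctr_b=15.6)
print(f"{rate:.0%} of module clicks were replaced by blue-link clicks")
```

A value near 100% means searchers found Wikipedia through the blue links anyway, while a value near 0% would mean the module was the only reason they clicked.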

Conclusion

From this data we can reach a number of conclusions:

  • Wikipedia is central to the success of Search, and, in turn, Search is core to how people find Wikipedia. Wikipedia is ranked highly because people are looking for it.
  • Search design matters for websites. Richer search results such as information modules do draw a large number of clicks. People in Germany and the United States have high awareness of Wikipedia, and we see that the information module has only a small impact on the clickthrough rate to Wikipedia. We hypothesize that in regions with lower awareness of Wikipedia and less content available in their local languages, the design of search would have substantially more effect on which sites readers choose to view.
  • More research is needed. We looked at Germany and the United States — how do information modules affect behavior in regions with lower awareness of Wikipedia? DuckDuckGo provides a very straightforward information module with good attribution — how do more complex search elements or other forms of attribution affect behavior?

This research demonstrates that information modules are an important aspect of Search and help bring readers to Wikipedia, where they can engage fully with the content and potentially contribute back as an editor or donor. When modules are designed in a straightforward manner, as DuckDuckGo’s are (i.e., with a simple overview and clearly attributed links to Wikipedia), we saw scant evidence to support concerns that they stop readers from clicking through. As Wikipedia content continues to grow (in English and other languages), we hope that it can continue to facilitate these rich search modules. As depicted in Figure 1, however, there are many divergent designs for information modules, and often other complementary modules crowd out the search results. Transparency about the impact of these design decisions, as evidenced by this research, is core to ensuring that Search continues to bring readers to Wikipedia so that Wikipedia can continue to grow and bring knowledge back to readers.

Acknowledgments

We would like to thank our partners at DuckDuckGo for the collaboration and work on this project.

Footnotes

1 Known as an A/B test, this design is commonly used for evaluating changes to interface design on internet platforms. Though Wikipedia’s reading interface almost never changes, when the Wikimedia Foundation wants to try out new editor tools, they are often provided to a small sample of editors first to see how they are used. Through the power of statistics and large numbers, the aggregate click data from Group A can be compared to the aggregate click data from Group B without knowing anything about who was in each group. See Figure 2 for an example of both conditions.

2 For desktop users in the United States, we actually saw a 2.9% increase in clickthrough rate when the module was removed, suggesting that some readers had been finding the information they wanted within the information module without needing to click through. The full data used in this study, including desktop results in the United States and results from both platforms in Germany, can be found at: https://analytics.wikimedia.org/published/datasets/one-off/searching-for-wikipedia/