Semantic search: making it easier to find the information readers want

How do you find what you’re looking for on Wikipedia?

If readers know the exact article they want to read (say, Cat), some may go to Wikipedia and type it into the search bar. 

But if they have a question – for example, Can cats see in the dark? – they are much more likely to go to an external search engine and ask the question. Then they might click through the results to the relevant Wikipedia article, or just read the search engine’s preview of an article without coming to the site at all. Today, an estimated 78% of Wikipedia reading sessions begin on external search engines, with about 90% of those coming from Google Search.

Wikipedia’s value is in providing trusted content, made by and for humans. If we want people to use and appreciate this content, they have to be able to find it without having it remixed and regurgitated by AI on a big tech platform. When readers consistently find it easier to access Wikipedia’s content through other websites, Wikipedia risks becoming a background data source rather than a destination for learning, curiosity, and joy.

In this post, we explain why search on Wikipedia falls short for many readers today, what our research shows about how people actually search, and how we’re exploring semantic search in partnership with editors. 

The problem: readers often can’t find what they want on Wikipedia

Figure: Results from Wikipedia’s on-site keyword search, as of January 2026.

As the example above shows, Wikipedia search is not very effective if someone has a question or wants to explore a topic that doesn’t map neatly to a single article. When readers run into this friction, they often turn to major search engines. The kind of search they offer – often called semantic search – goes beyond word matching by using machine learning to understand a user’s intent. 

Exploring semantic search on Wikipedia

To address this gap, a cross-disciplinary team at the Wikimedia Foundation, the Information Retrieval working group, is working to improve on-platform search so readers can find what they’re looking for directly on Wikipedia.

In particular, we’re exploring questions such as:

  • How often do readers arrive with questions or exploratory queries, rather than a specific article in mind?
  • In what situations does keyword search make it difficult for readers to find relevant information?
  • Could meaning-based approaches help readers discover existing Wikipedia articles and sections more effectively?
  • What risks, limitations, or trade-offs should be carefully considered before pursuing this further?

While some search engines now layer AI-generated summaries on top of search results, that is a separate feature from what we’re exploring here. This work focuses on harnessing semantic search to better surface existing, editor-created Wikipedia articles and sections, not to generate new answers or summaries.

Research results

The working group recently completed a research report drawing on design research, technical prototyping, and community feedback to test whether this problem is real and whether improving search would meaningfully help readers. Our findings confirmed both. Here is a summary of what we learned:

1. About 98% of Wikipedia reading sessions originate outside Wikipedia search.

  • The small group who do use internal search are much more likely to be editors than casual readers. Most readers move between articles by returning to external search engines, even when links exist within Wikipedia itself.

2. Roughly 80–95% of on-wiki search sessions use autocomplete suggestions.

  • The preference for autocomplete suggestions – those that appear as someone types – shows that small improvements to speed can have a large impact.

3. Between 4% and 7% of Wikipedia search queries are phrased as questions, but these queries are less likely to succeed.

  • While question-style queries are a minority of searches, they show that some readers do try asking questions, and they suggest that many others avoid doing so because they’ve learned it doesn’t work.

What’s next: experimentation and community discussion

Based on this research, we believe a hybrid search approach, combining semantic and keyword search, has the most potential to help readers find information more easily. In early tests, combining the two produced the most relevant results and the highest reader satisfaction.
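For readers curious about what "combining" two kinds of search can look like in practice, here is a minimal sketch using reciprocal rank fusion, one widely used way to merge ranked lists. It is an illustration only: this post does not say which fusion method the working group tested, and the function name, the constant k=60, and the two example result lists are all assumptions made up for the demonstration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of article titles into one ranking.

    Each title earns 1 / (k + rank) from every list it appears in, so
    titles ranked well by both searches rise to the top.
    k=60 is a conventional default, not a value taken from this post.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, title in enumerate(ranking, start=1):
            scores[title] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically.
    return sorted(scores, key=lambda title: (-scores[title], title))


# Hypothetical results for the query "Can cats see in the dark?"
keyword_results = ["Cat", "Cat senses", "Dark"]            # word matching
semantic_results = ["Cat senses", "Night vision", "Cat"]   # meaning-based

fused = reciprocal_rank_fusion([keyword_results, semantic_results])
print(fused)
```

In this toy example, "Cat senses" ends up first because both lists rank it highly, while titles found by only one method fall lower. A production system would involve much more than this, such as tuning and evaluation against real queries.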

Our next step is a small-scale experiment that tests a hybrid search experience on Wikipedia. The results of this experiment, along with feedback from editors, will help determine whether and how semantic search should become part of Wikipedia’s search tools. 

We will document progress on Meta-Wiki and MediaWiki and share updates through Village Pumps, newsletters, community calls, and events. Editor input, especially on risks and trade-offs, will directly shape whether and how semantic search is developed, tested, or paused.

Wikipedia’s value is rooted in human-created, trusted knowledge. By improving search, we can help more people find that knowledge, explore what they love, and keep coming back.
