Clarity on the future of Wikimedia search

Photo by RobH, freely licensed under CC BY-SA 3.0.
Over the past few weeks, the Wikimedia community has engaged in a discussion of the Wikimedia Foundation’s plans for search and discovery on the Wikimedia projects. More recently, there has been confusion in the press and among community members about the Foundation’s plans and intentions. Although we’ve participated in those discussions in other places and other ways, we want to clarify what we are, and are not, doing at the Foundation.
Since Wikipedia was founded in 2001, the amount of freely available knowledge on Wikimedia projects has grown exponentially. Today, Wikipedia and its sister projects have more than 35 million articles across nearly 300 languages, nearly 30 million images, and terabytes more across the other Wikimedia sites and projects. As the quantity, quality, and diversity of this content have grown, so too has the challenge of helping people find the most relevant information.
At the Wikimedia Foundation, we see a clear need to improve search and discovery on the Wikimedia projects. Improvements will help our users access and share in the knowledge they seek, and in doing so, bring us closer to reaching our mission and helping open knowledge stay accessible and relevant.
Search engineering is not new to the Wikimedia projects. We originally relied on a “homegrown” search engine, written by Wikimedia volunteers and based on Apache Lucene. It was later replaced with Elasticsearch and the CirrusSearch extension, which is integrated with MediaWiki sites.
These improvements were significant, and we felt encouraged to continue improving in other areas such as relevance, user experience, multi-language support, multi-project search, and incorporating new data sources for our projects. The Wikimedia community has generally been supportive throughout, indicating through surveys, support tickets, and direct feedback that improvements should continue. And the Wikimedia Foundation has communicated about our plans and improvements in many places, including on the Discovery team’s portal and on the Wikimedia blog.
We created the Discovery team in April 2015 to focus resources on improving this core part of the Wikimedia experience and making knowledge even more accessible, connected, diverse and discoverable in an open and transparent way. To support these efforts, we received a grant from the Knight Foundation for initial research.
These grant terms—and our plans—are straightforward. We intend to research how Wikimedia users seek, find, and engage with content. This essential information will allow us to make critical improvements to discovery on the Wikimedia projects. And in keeping with our values, we will make our findings public, in order for the world to better understand the way we all engage with free, open knowledge.
What are we not doing? We’re not building a global crawler search engine. We’re not building another, separate Wikimedia project. We’re committed to our mission of helping the world access and interact with free knowledge.
Despite headlines, we are not trying to compete with other platforms, including Google. As a non-profit we are noncommercial and support open knowledge. Our focus is on the knowledge contributed on the Wikimedia projects. We strive to make all Wikimedia content available under an open license, which can then be used by anyone for any purpose, including commercial and non-commercial use. This means everyone, including commercial entities, can and does use the results of our work.
As we thought about the future of discovery and open knowledge, we continued to explore ideas. We have a responsibility towards the open web, the future of free content, and our allies in the open culture community. We may pursue some of these ideas, such as including other sources of open knowledge. Other ideas we have explored and modified or rejected. This is a normal, if sometimes confusing, part of the process of thinking creatively and widely about the shared future of open knowledge.
What we do know is that open dialogue with the Wikimedia community is critical to finding meaningful solutions that match our movement’s unique needs.
Community feedback was planned as part of the Knowledge Engine grant, and is essential to identifying the opportunities for improvement in our existing search capacity. It has been vital to helping the Discovery team develop their product roadmap and projects. It is a major component of the deliverables in our grant-funded research. We intend for this to become even more apparent in the coming months, as we begin further surveys and research.
You can learn more about this work on the Discovery team’s page on MediaWiki.org. We look forward to sharing these findings and to working together, through information sharing and conversation, to identify the best path forward.
It is true that our path to this point has not always been smooth, especially through the ideation phase. By sharing our publicly available product goals, planning, status updates and the Knowledge Engine grant agreement, we are making an effort to improve our collaboration. A dashboard for this project has also been publicly available. We invite you to continue to engage with us: we will be releasing the results of the recently concluded Wikimedia Foundation strategy consultation, in which more than 500 people have participated. In 2015 we made real progress in our technical work, and we intend to do more in 2016. We welcome your comments and thoughts, and your continued guidance as we progress.
Wes Moran, Vice President of Product
Lila Tretikov, Executive Director

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

15 Comments

That doesn’t explain why you considered Google to be a risk. This doesn’t explain why the grant application was so ambiguously worded. You repeatedly said it was for “the Internet”. Your grant application reads that “today, commercial search engines dominate search engine use of the Internet, and they’re employing proprietary technologies to consolidate access to the Internet’s knowledge and information. Their algorithms obscure the way the Internet’s information is collected and displayed”. How on earth would an internal search engine help in this regard? And why is this wording so broad? The kicker is the line in the grant that…

Hello other Chris. I’m a volunteer and recently joined the WMF on the discovery team. The foundation hired me to help improve the collaboration with the community on issues related to search and discovery. I’d like to help clarify, and would be happy to continue the conversation with you. The grant agreement with the Knight Foundation states, “Risks: Two challenges could disrupt the project: 1. Third-party influence or interference. Google, Yahoo or another big commercial search engine could suddenly devote resources to a similar project, which could reduce the success of the project. This is the biggest challenge, and an…

I know that former VP of Engineering Damon Sicore secretly shopped around grandiose ideas about a free knowledge search engine, which eventually evolved into the reorg creating the Discovery team. From leaked documents, we know at least some of those grandiose plans went into the early drafts of the Knight Foundation grant request, which eventually became a smallish grant to support Wikipedia’s search capabilities. What we don’t know is to what degree Executive Director Lila Tretikov was supporting the secretive “compete with Google” plan without putting it into WMF’s public plans. It seems this would all be a simple case…

Copying a comment I made elsewhere: Disappointing post. I guess I was expecting something more aimed at the community than the press. It leaves a lot of questions unanswered and has a bit of a “yeah this is how these things always go, stop being so alarmed by nothing” tone.

How on Earth should it be possible “to advance new models for finding information by supporting stage one development of the Knowledge Engine by Wikipedia, a system for discovering reliable and trustworthy public information on the Internet”[1] with $250,000? This is enough money for one single average computer scientist to work for a year (including tax and social insurances). One programmer for a year! That sounds like all of a), b) or c) from Chris’ comment and I am not quite sure which one the WMF would prefer. Lila, may I also remind you that last year at WMCON in…

You mention that you published the grant agreement. Please add a link to the grant agreement to this blog post, so that readers can refer to it.

Implementing the ideations of Mythbusters I see:
https://www.youtube.com/watch?v=yiJ9fy1qSFI

+ 1 to GorillaWarfare.
This issue, among other things, is getting bigger because of poor internal communication, and the tone of this post doesn’t help to improve it. It reads more like a “press release clarification” than an “OK, let’s talk”.

To clarify:
* Yes, there were plans of making an internet search engine. I don’t understand why we’re still trying to avoid giving a direct answer about it.
* There has never been any actual technical work on this project.
* The whole project didn’t live long and was ditched soon after the Search team was created, after FY15/16 budget was finalized, and it did not have the money allocated for such work (umm, was it in April? in such case, this should have been soon after the leaked document was created).
* I don’t think anybody but the certain…

“The Wikimedia community has engaged in a discussion of the Wikimedia Foundation’s plans
Community feedback was planned as part of the Knowledge Engine grant”
The process should be the other way around. Plans should be discussed and approved by the community before accepting grants.

[…] As a foundation embedded in modern search rankings, Wikimedia could be forgiven for wanting to branch out on their own. Despite speculation, they came out and denied the rumours in a recent blog post: […]

[…] Criticism of Tretikov’s opaque leadership intensified further after she had nonetheless given assurances in a blog post that no search engine was being developed […]

[…] Clarity on the future of Wikimedia search « Wikimedia blog […]

[…] latter was the initial suggestion put out by the Wikimedia Foundation in a blogpost after the news broke, but was contradicted by the terms of a grant from the Knight […]