Wikipedia builds a digital library system

Photo by Stewart Butterfield, CC BY 2.0.

We’ve always thought that the world’s largest encyclopedia should have a world-class library. Through the Wikipedia Library program, the encyclopedia’s editors have free access to a collection of over 80,000 unique periodicals, like journals, magazines, newspapers, newsletters, pamphlets, and series, in addition to an untallyable number of books. This access has been facilitated by over 60 partners, including many of the world’s leading publishers and aggregators.
The library of resources available to Wikipedians continues to grow, allowing these editors to use the best sources available to improve Wikipedia.
Why do we do this? Because facts matter.
That said, while an enormous amount of content is available, our current distribution and access processes leave a lot of room for improvement. Editors currently sign up for each partner individually, which makes approval and distribution of access slow: on average it takes three weeks from application to access, which is far too long. We’ve been limited in the number of accounts we can give out for most publishers, and accounts are generally granted for exactly one year at a time, whether editors need longer-term access or only want to grab a couple of references. Lastly, there hasn’t been a way to search through the vast content across all our separate partners.
This year, we’re working hard to solve these problems. We’re excited to share our plans with you, from the full rollout of the Wikipedia Library Card Platform to adding proxy access using Wikipedia logins to piloting a “bundle” of resources that could be accessed by thousands of qualifying editors at any time, as well as making Wikipedia’s citations far more open and accessible for readers.

“You build a thousand castles, a thousand sanctuaries, you are nothing; you build a library, you are everything!” –Mehmet Murat ildan


Wikipedia Library card

At the center of our plans for increasing and improving access is the Wikipedia Library Card platform. We rolled out phase one last quarter, addressing the slow signup and approval challenges while beta testing and improving it with our latest new partners. Already delivering access, it should improve signup speed from three weeks on average to closer to one week, won’t require editors to provide all their details every time they sign up for a new resource, and will be translated into as many languages as possible. By the end of March, we plan to move all partner signups over to the platform.

Proxy access

Our next step (phase two) is to integrate a proxy authentication method, which will allow users to use their single Wikipedia login for direct access to partners who can accept authentication through the Library Card. This will greatly improve the ease of access for editors, should reduce the workload for us and our partners, and will hopefully translate to increased usage and citation of available resources. We are aiming to have this ready in approximately six months.
Proxy integration alone isn’t a major departure from the current setup. The same individually approved users would have access to one partner’s content per application; they would just be accessing it directly through a single authenticated login proxy rather than through a username and password distributed for each website.

Wikipedia Library bundle

A very exciting addition to our signup model, which currently depends on per-user approvals by volunteer coordinators, will be the Wikipedia Library Bundle. It would give any editor who meets account age, edit count, and recent activity criteria automatic access through the platform to a certain set of Wikipedia Library (TWL) partners, effectively replacing the account coordinator approval step and covering approximately 25,000 editors across all language Wikipedias.
The Library Bundle will provide immediate access to participating partner resources for eligible Wikipedians, without having to sign up and with no need to worry about only using their access for a handful of sources at a time. We’re really excited about the opportunities and accessibility this access method will provide.
To automate the account coordinator check for recent activity and good standing in the community, we will be implementing requirements beyond the current 500 edits and account age of 6 months. These automated checks would include recent activity (e.g. 10 edits in the past month) and not currently being blocked. The requirements aren’t yet finalised and there may be other restrictions, such as a limit on total concurrent users, but we don’t aim to make the requirements more restrictive than the current checks carried out by account coordinators.
This will run on an opt-in model that some partners will choose to be a part of, and we have had encouraging responses from a number of publishers who are already excited to use this distribution method.
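To make the proposed automated checks concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the platform's actual code: the `Editor` record and its field names are hypothetical, the thresholds are the provisional ones mentioned above (which aren't finalised), and "6 months" is approximated as 182 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Provisional thresholds quoted in the post; they are not finalised.
MIN_EDITS = 500
MIN_ACCOUNT_AGE = timedelta(days=182)  # assumption: "6 months" as 182 days
MIN_RECENT_EDITS = 10  # example figure from the post ("10 edits in the past month")


@dataclass
class Editor:
    """Hypothetical editor record; the field names are illustrative only."""
    registered: date
    total_edits: int
    recent_edits: int  # edits made in the past month
    blocked: bool


def is_bundle_eligible(editor: Editor, today: date) -> bool:
    """Return True only if the editor passes every automated check."""
    account_age = today - editor.registered
    return (
        editor.total_edits >= MIN_EDITS
        and account_age >= MIN_ACCOUNT_AGE
        and editor.recent_edits >= MIN_RECENT_EDITS
        and not editor.blocked
    )
```

An editor who registered two years ago, has 1,200 edits, made 15 edits in the past month, and is not blocked would pass; the same editor would fail while blocked, or with only three recent edits. Other possible restrictions, such as a cap on concurrent users, are left out of this sketch.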

Integrated search

Phase three of the Library Card Platform will seek to solve the issue of editors needing to browse partner-by-partner for needed resources. We will be implementing an integrated search tool which will index partner resources and provide search via a single interface. Not only will editors not need to log in separately to each of TWL’s partners’ websites, they will be able to search all their content from one place too.
Integrated search should pair powerfully with proxy authentication. If an editor finds a search result from a partner they’ve individually signed up for, they’ll be able to click directly through to it from the Wikipedia Library Card platform. And, if that partner is a member of the Wikipedia Library Card Bundle, then they’ll be able to access it automatically even if they’ve never signed up for it—just because they meet the basic criteria for account age, edits, and recent activity.

Open citations

Our publishing partners have further empowered Wikipedia by sharing access to their scholarly and news sources. We are always experimenting with and evolving our publisher relationships to improve the ability of Wikipedia editors to do rigorous research more easily and more impactfully. We also care about our readers and their ability to access full text.
OABot is the next step in bringing that openness to readers. OABot, technically approved but still pending community consensus, scans closed Wikipedia citations and finds free-to-read links available in open web repositories; it then adds a link to the open version into the existing citation. Ideally, this added link will be tagged with an icon indicating that it’s free-to-read.
Open Access Button is another useful tool that pings paper authors when readers hit a paywall to their work. OAButton is working on a batch request infrastructure to contact thousands of authors simultaneously.
Since OABot scans Wikipedia to determine where a free-to-read link is missing but available elsewhere, it can conversely determine where a free-to-read link is missing and not available elsewhere. Thus, we can generate a batch of hundreds of thousands of Wikipedia citations that aren’t free to read, and send them to OAButton to contact authors with the specific use case of making their work readable from Wikipedia. Completing the feat, we could have OAButton simultaneously instruct authors how to deposit a free version of their paper in a way and place where OABot will definitely find it and add that newly open link directly into Wikipedia.
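The split described above, between citations that already have a free copy somewhere and those that don't, can be sketched in a few lines of Python. This is a hedged illustration rather than OABot's or OAButton's actual code: `partition_citations` and the `find_open_copy` lookup are hypothetical names, and the lookup stands in for whatever repository search the bot really performs.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def partition_citations(
    citations: Iterable[str],
    find_open_copy: Callable[[str], Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Split closed citations into linkable ones and ones to request.

    `find_open_copy` is a hypothetical lookup that returns the URL of a
    free-to-read copy, or None when no such copy is known.
    """
    to_link: Dict[str, str] = {}  # citation -> open URL a bot could add
    to_request: List[str] = []    # citations whose authors could be contacted
    for citation_id in citations:
        url = find_open_copy(citation_id)
        if url is not None:
            to_link[citation_id] = url
        else:
            to_request.append(citation_id)
    return to_link, to_request
```

The first result is the set of links a bot could add to existing citations; the second is the batch that could be handed to an author-outreach service. Once an author deposits a copy, the same citation would move into the first group on the next scan.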
A virtuous circle from empowered editors to informed readers. That is what we’re building.
Jake Orlowitz, Head of the Wikipedia Library
Sam Walton, Partner and Metrics Coordinator
Wikimedia Foundation

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

8 Comments

Err, this is not a digital library. Let’s not be blasphemous towards those who build actual digital libraries.

Jake and Sam, have you talked to publishers about setting up an ebooks library? It will cost money but WMF does have money, and I can’t think of a better use of their money. Same applies to journals, actually. Why not just pay for a service that rivals the best university journal collections?
I guess I’m saying, why do you have to go cap in hand, begging for freebies from the publishers? Isn’t this exactly the kind of thing the WMF should be spending donations on?

Hi Anthony, it’s a fair question and there are several good reasons. The first is that we aim to make the most effective use of donor money, and not paying for resources is a lot cheaper than paying for them. Our staff time spent arranging and managing these partnerships is easily one or two orders of magnitude less than what it would cost to furnish access for the 20-30 thousand editors we’re aiming to serve with over 100,000 unique journals. Subscriptions and licenses are often in the millions for a world-class research library, and yet we have many (but not…

I’m aware of the order of cost involved. But I think our editors deserve efficient and comprehensive access to journals and textbooks. I don’t think that’s what you’re providing here. Perhaps I’m wrong but isn’t your provision of medical journal access and medical textbook access piecemeal, at best? Nothing remotely like the access offered by top medical schools? It was the last time I looked. I’m happy to be corrected.

The OABOT link should lead to https://en.wikipedia.org/wiki/User:OAbot

Thanks for flagging this mistake! We have fixed it, Anon.

I am absolutely opposed to the suggestion that we should use donor funds that were given to grow a free knowledge project to become customers of an industry that sells the work of scientists, work that was largely paid for with public funds. That would mean that instead of using our influence to open up knowledge to the community, we would feed and legitimise an industry that is fundamentally opposed to our mission.