Improving Wikidata-Wikisource Integration

This blog post is the first of a two-part series about integration between Wikidata and Wikisource. This part covers an initiative by the Wikimedia community that makes use of Wikidata's data in Wikisource; the next part will focus on the opportunities for further work in this space.

Bibliographic metadata is used on Wikisource in various forms and processes, including but not limited to verifying copyright status, populating index pages, and categorization. However, in the majority of language editions, most of this data is currently entered manually on Wikisource.

Though this is not a problem in itself, it is not the ideal situation. For instance, if there is a mistake in the data, correcting it on either Wikidata or Wikisource will not fix it on the other platform unless that platform is updated as well. This is where Wikidata can be put to its best use, serving one of the main purposes it was created for: centrally curating data for the various Wikimedia projects. Here is one such intervention that makes use of Wikidata's data in Wikisource.

Modules to display data from Wikidata on Wikisource Index pages

An example index page from French Wikisource.

An index page is the main page for each proofreading project on Wikisource. It also shows the progress of the proofreading and a quick summary of the text’s details (such as title and author).

Lua-based modules placed in the MediaWiki: namespace of a wiki help to display metadata about a book by just adding a Wikidata ID to the index page form. They were developed largely by Tpt from French Wikisource, with further improvements made by Bodhisattwa from Bengali Wikisource and Tshrinivasan as part of the WikiCite Project Grant. These modules also help to generate various maintenance and parameter-based categories for the page.
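To make the idea concrete, here is a rough sketch of the kind of lookup such a module performs. It is not the modules' actual code (those are written in Lua and run on the wiki); it is a Python illustration under stated assumptions. The sample entity is hand-written and hypothetical, laid out in the shape of Wikibase entity JSON, and the choice of properties (P1476 title, P50 author, P577 publication date) as well as the category names are assumptions made for the example.

```python
from typing import Any, Optional

# Hypothetical, hand-written entity in the shape of Wikibase entity JSON.
# The IDs and values are placeholders, not data taken from Wikidata.
SAMPLE_ENTITY: dict[str, Any] = {
    "id": "Q_EXAMPLE",
    "claims": {
        "P1476": [{"mainsnak": {"snaktype": "value", "datavalue": {
            "type": "monolingualtext",
            "value": {"text": "Example Book", "language": "en"}}}}],
        "P50": [{"mainsnak": {"snaktype": "value", "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "id": "Q_AUTHOR"}}}}],
        "P577": [{"mainsnak": {"snaktype": "value", "datavalue": {
            "type": "time",
            "value": {"time": "+1911-00-00T00:00:00Z", "precision": 9}}}}],
    },
}


def first_value(entity: dict[str, Any], prop: str) -> Optional[Any]:
    """Return the value of the first statement for `prop` that has one."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]
    return None


def index_fields(entity: dict[str, Any]) -> dict[str, Optional[str]]:
    """Collect the fields an index page summary might show."""
    title = first_value(entity, "P1476")
    author = first_value(entity, "P50")
    date = first_value(entity, "P577")
    return {
        "title": title["text"] if title else None,
        "author_id": author["id"] if author else None,
        # Time values look like "+1911-00-00T00:00:00Z"; keep the year only.
        "year": date["time"][1:5] if date else None,
    }


def maintenance_categories(fields: dict[str, Optional[str]]) -> list[str]:
    """Hypothetical category names for fields that could not be filled."""
    return [f"Index pages missing {name}"
            for name, value in fields.items() if value is None]
```

Calling `index_fields(SAMPLE_ENTITY)` gives the summary fields, and `maintenance_categories` lists what is still missing, which mirrors how one Wikidata ID can drive both the displayed metadata and the maintenance categories.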

The language editions currently using these modules include Bengali/Bangla, French, Indonesian, Punjabi, and Tamil. The following screenshots show how these modules work.

The Index page of the book Paadri Sergai (English: Father Sergius) on Punjabi Wikisource
retrieves its data from Q107273983 on Wikidata.

Various maintenance categories on the Indonesian Wikisource,
auto-generated with the help of the above-mentioned Lua modules.

Categorisation of works based on author information on Bengali Wikisource,
auto-generated with the help of the above-mentioned Lua modules.

Are you interested in deploying these modules on your Wikisource? If so, read the documentation to get started, and if you need any help, please post a message on the talk page of the documentation.

While this is a good start, leveraging Wikidata’s database across Wikimedia projects is an exciting space that presents us with a lot of opportunities. The next post in this series will talk about some of these opportunities for Wikisource.



WikiCite is a Wikimedia initiative to develop open citations and linked bibliographic data to serve free knowledge. Initially a series of conferences and workshops in support of that goal, WikiCite is now a community of people and an ecosystem of projects focused on source metadata. It builds on the Wikidata platform, a free and open knowledge base that can be read and edited by both humans and machines and that serves as central storage for the structured data of its Wikimedia sister projects.