#LD42023 IV: Wikidata Tools everyone is talking about

For the past few months, Silvia Gutiérrez and Giovanna Fontenelle (from the Culture and Heritage team at the Wikimedia Foundation) have been analyzing the results of a workshop they organized during the 2023 LD4 Conference on Linked Data. The result is a series of Diff posts about what they discovered. This is the fourth post in the series, and it takes a deep dive into the third slide of the workshop.

So far, we have covered the workshop itself, how the participants got to know each other, and examples of libraries using Wikidata. In the next post, we will talk about the concerns regarding some of the more difficult tools and Wikidata’s problematic aspects. For now, however, we are going to have a good time with the Wikidata tools everyone is talking about and the Wikidata aspects that are useful for the participants’ work.

For this activity, we asked participants to add their suggestions of good tools and Wikidata aspects using the green post-its, their reasoning on the blue post-its, as well as emojis of hearts (❤️) or toolboxes (🧰).

This is the ranking for the tools:

1 – OpenRefine (4 ❤️, 1 🧰):

  • “Reconciliation helps us to match strings and other data points to Wikidata items” (3 ❤️)
  • “Creating Quickstatements batches easily” (2 ❤️)
  • “Ability to connect with OpenRefine to enable batch editing and creation of datasets, and Google Sheets to enable collaborative working” (2 ❤️)

According to its page on Meta-Wiki, OpenRefine is “…a free data wrangling tool that can be used to process, manipulate and clean tabular (spreadsheet) data and connect it with knowledge bases (…) It is widely used by librarians, in the cultural sector, by journalists and scientists, and is taught in many curricula and workshops around the world.”
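To make the reconciliation step a bit more concrete, here is a minimal Python sketch of the kind of request a reconciliation client could send when matching a column of names against Wikidata. It only builds the payload and does not contact any server. The payload shape follows our understanding of the Reconciliation Service API, and the endpoint URL, the function names, and the sample names are assumptions chosen for illustration, not something taken from the workshop.

```python
import json
import urllib.parse

# Assumption: a public Wikidata reconciliation endpoint. Check the OpenRefine
# documentation for the address your installation actually uses.
RECON_ENDPOINT = "https://wikidata.reconci.link/en/api"


def build_reconciliation_queries(names, type_qid=None, limit=3):
    """Build one reconciliation query per name, keyed q0, q1, and so on."""
    queries = {}
    for index, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_qid:
            # Restrict candidates to one class, e.g. Q5 (human).
            query["type"] = type_qid
        queries[f"q{index}"] = query
    return queries


def encode_request_body(queries):
    """Encode the batch as a form body with a single 'queries' field."""
    return urllib.parse.urlencode({"queries": json.dumps(queries)})


# Hypothetical spreadsheet column with two author names.
queries = build_reconciliation_queries(
    ["Ursula K. Le Guin", "Octavia E. Butler"], type_qid="Q5"
)
body = encode_request_body(queries)
print(json.dumps(queries, indent=2))
```

In OpenRefine itself none of this has to be written by hand: the reconciliation dialog builds and sends these requests for you and shows the candidate items next to each cell.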

OpenRefine is heavily used by Wikimedians, traditionally for uploading and editing items on Wikidata. It even won the “Editing” category at the WikidataCon Award 2019. For a few months now, OpenRefine has also been available for editing and uploading files to Wikimedia Commons.

Temporary photo installation in a public space to award OpenRefine in the category “Editing” during WikidataCon Award 2019 (OpenRefine Birgit Müller, Mouna Assali, CC BY-SA 4.0, via Wikimedia Commons)

If you want to discover more about OpenRefine, you can check its official website and its pages on Meta-Wiki, Wikimedia Commons, or Wikidata. If you want to learn how to use OpenRefine, there are some interesting training courses available:

  • Library Carpentry: OpenRefine – A course about Wikidata and OpenRefine
  • OpenRefine for Wikimedia Commons: the basics* – An introductory course on how to use OpenRefine with Wikimedia Commons

* This course is also available in Spanish, French, and Italian (Portuguese coming soon!)

2 – Scholia (3 ❤️):

  • “Because I can showcase the research at my university” (1 ❤️)
  • “It makes our data about researchers in Wikidata more discoverable on the open web”

Scholia is one of the most popular tools on Wikimedia projects thanks to its great visualizations, so it is not surprising that librarians named it as one of their favorites. Scholia uses Wikidata to create visual scholarly profiles for many kinds of items, including topics, people, organizations, and species.

For example, here you can check the profile of Denny Vrandečić (co-creator of Wikidata and Semantic MediaWiki) and learn more about his publications, such as the number of works and pages he published per year, the topics he researched the most, and the authors he collaborated with the most.

In this other example, you can learn about the Technical University of Denmark: the topics its researchers have published on, recent publications, awards, and much more. It’s also possible to learn which WikiProjects have the most items (spoiler: it involves Mathematics!) and even the global distribution or the gender distribution of recipients of awards such as the Nobel Prize in Chemistry.

The global distribution of Nobel Prize in Chemistry recipients (via Scholia)

3 – QuickStatements (2 ❤️):

Like OpenRefine, QuickStatements (or QS) is one of the major Wikimedia tools and is heavily used by Wikimedians, especially for batch edits on Wikidata, although it also works with Wikimedia Commons. It is based on a simple set of text commands and can add and remove statements, labels, descriptions, aliases, qualifiers, and sources on Wikidata.
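To give a feeling for what those text commands look like, here is a small Python sketch that writes a batch in the tab-separated QuickStatements “V1” style: one command to create an item, then lines that give it a label, a description, and statements. The helper names and the sample label and description are hypothetical, and the output is only printed, never sent to Wikidata. Treat it as an illustration of the format as we understand it, and test real batches on the Wikidata Sandbox item first.

```python
def quoted(text):
    """Wrap a plain string in double quotes, as V1 expects for text values."""
    return f'"{text}"'


def new_item_commands(label, description, statements, lang="en"):
    """Return V1 commands that create one item and describe it.

    `statements` is a list of (property, value) pairs such as ("P31", "Q5").
    Values are written out exactly as given.
    """
    rows = [
        ["CREATE"],                                  # start a new item
        ["LAST", f"L{lang}", quoted(label)],         # label in `lang`
        ["LAST", f"D{lang}", quoted(description)],   # description in `lang`
    ]
    rows += [["LAST", prop, value] for prop, value in statements]
    return "\n".join("\t".join(row) for row in rows)


def add_statement_command(item, prop, value, source_url=None):
    """Return one V1 line that adds a statement to an existing item."""
    row = [item, prop, value]
    if source_url:
        # S854 attaches a "reference URL" source to the statement.
        row += ["S854", quoted(source_url)]
    return "\t".join(row)


# Hypothetical example: a new item that is an instance of (P31) human (Q5).
batch = new_item_commands("Jane Example", "hypothetical librarian", [("P31", "Q5")])
print(batch)

# Q4115189 is the Wikidata Sandbox item, a safe target for experiments.
print(add_statement_command("Q4115189", "P31", "Q5", "https://example.org/source"))
```

The printed lines can be pasted into the QuickStatements interface, which is roughly what OpenRefine and other tools do for you when they export a batch.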

As the comment highlighted, one of the ways Wikidatians use QuickStatements is together with Zotero, a tool for managing bibliographic data and related research materials. Since 2017, Zotero has had a Wikidata translator, and metadata from Zotero can be exported to QuickStatements. This way, users can add works from their Zotero library to Wikidata, or pull a publication’s information from Wikidata into their Zotero library.

An example of the export translator from Zotero to QuickStatements (Firefox, Zotero, zotkat (GPL or AGPL), via Wikimedia Commons)

4 – Mix’n’Match (2 ❤️):

  • “ability to link multiple identifiers from multiple name authority files in one item and benefit from associated wikidata properties to enhance research value of name data”
  • “Also lets us reconcile names with what is in Wikidata and easily create new items in Wikidata that we later enhance.”

“Red link lists on steroids” is how Mix’n’Match defines itself. The tool allows users to match entries from thousands of external databases with Wikidata items.

For example, Mix’n’Match offers 60 thesauri at several levels of completeness, and contributors can choose to help complete their mappings to Wikidata. It’s possible to help match entries from Musical Instrument Museums Online or even from the Art & Architecture Thesaurus by the Getty Research Institute.

According to the Meta-Wiki page about Mix’n’Match, these are some of the top groups of missing entries you can help to complete:

  • Illnesses, diseases
  • Comics artists and authors
  • Croatian biographies and encyclopedia articles
  • Czech biographies, people
  • Italian biographies and encyclopedia articles
  • Polish authors, politics, playwrights, theater people
  • Silent/early films
  • Women writers

Here’s a video tutorial of Mix’n’Match recorded for the “Graphic Possibilities Workshop 2020 Wikidata Edit-a-thon”:

19-minute video tutorial of Mix’n’Match, during the “Graphic Possibilities Workshop 2020 Wikidata Edit-a-thon”

5 – SPARQL (1 ❤️):

  • “its helpful docs and helper tools, but also the ability to just download the whole database in bulk if all else fails and I can’t get a consistently working SPARQL query. https://query.wikidata.org/” (1 ❤️)
  • “check work, demonstrate value quickly”

SPARQL is a semantic query language for databases, and it is used by both the Wikidata Query Service and the Wikimedia Commons Query Service (beta).

One can use it to extract data from Wikidata or Wikimedia Commons (via structured data on Commons) by applying a “query composed of logical combinations of triples.”

Queries can be difficult to write, and it can be hard to get exactly the data you want out of them. To help you, check this page, the Wikidata Query Builder tool, or watch this short video on how to build a query from scratch:

Short video on how to build a query from scratch (Jonas Kress (WMDE), CC BY-SA 4.0, via Wikimedia Commons)
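If you prefer to see the idea of “logical combinations of triples” written down, here is a short Python sketch that assembles a query from two triple patterns and, optionally, sends it to the Wikidata Query Service. The function names and the User-Agent string are our own assumptions for illustration. The identifiers are real Wikidata ones: P31 (instance of), Q7075 (library), P17 (country), and Q35 (Denmark). Building the query needs no network access, so `run_query` is defined but not called.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(triples, select_vars, limit=10):
    """Combine (subject, predicate, object) patterns into one SELECT query.

    Every pattern must match at the same time, which is the logical AND
    between triples.
    """
    patterns = "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)
    variables = " ".join(select_vars)
    return f"SELECT {variables} WHERE {{\n{patterns}\n}}\nLIMIT {limit}"


def run_query(query):
    """Send the query to the endpoint and return the result rows (needs network)."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # A descriptive User-Agent is good practice; this value is hypothetical.
    request = urllib.request.Request(
        url, headers={"User-Agent": "ld4-workshop-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


# Libraries (instance of: library) located in Denmark (country: Denmark).
query = build_query(
    [
        ("?library", "wdt:P31", "wd:Q7075"),
        ("?library", "wdt:P17", "wd:Q35"),
    ],
    ["?library"],
    limit=5,
)
print(query)
```

You can paste the printed query into https://query.wikidata.org/ to run it in the browser, or call `run_query(query)` from a script once you are happy with it.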

Other mentions:

  • Wikidata gadgets (1 ❤️) – “They make my work go faster”
  • LearnWiki – “a tool to learn Wikidata for librarians!”

And here is the ranking for the aspects of Wikidata:

  1. “Out-of-the-box data visualizations.” (2 ❤️)
    • “They make the data look pretty and show value at a glance.”
  2. “One aspect I like is being able to create a schema or vocabulary from the existing properties to describe what you’re working on. It’s very flexible.” (1 ❤️)
  3. “Connect with external databases.” (1 ❤️)
  4. “The ability to just download the whole database in bulk if all else fails and I can’t get a consistently working SPARQL query. Haven’t had to do that for a while, though.”
  5. “Recoin and ORES.”

As these post-its show, Wikidata can be used for a myriad of different activities, and librarians are making use of these possibilities not only to support their day-to-day work but also to contribute to the Open Access ecosystem. Their experience and input are valuable. Now that we have reviewed what librarians find positive about Wikidata, we are ready to learn how we can improve from their perspective and to explore, in the next post in this series, the Main Challenges of Wikidata for Librarians.

This is the fourth of six blog posts! Do you want to read the series from the beginning? Here’s the full list of posts:

  1. #LD42023 I: The Future of Wikidata + Libraries (A Workshop)
  2. #LD42023 II: Getting to Know Each Other, Librarians in the Wikidata World
  3. #LD42023 III: The Examples, Libraries Using Wikidata
  4. #LD42023 IV: Wikidata Tools everyone is talking about (this post!)👈
  5. #LD42023 V: Main Challenges of Wikidata for Librarians 
  6. #LD42023 VI: Imagining a Wikidata Future for Librarians, Together
