At the latest edition of Wikimania, held in Katowice, the Basque Wikipedian (wikilari) Galder González gave a presentation titled “Abstract Wikipedia and the dream of a Universal Language”, in which he pointed out some things that tend to go unnoticed about human-machine collaboration and about the production and organization of knowledge. Inspired by his magnetism, we began to shape these reflections, later refining them in conversations and lectures and connecting them to dialogues and workshops on the role that generative artificial intelligence plays in the Wikimedia ecosystem.
The product and not the process
Just like electricity or tap water, Wikipedia strikes us as a finished object, accessible at will: a commodity of sorts, tidy and carefully ordered, available online, free of surprises and in any language. The success of the result (the encyclopedia, the product) makes us forget the process: the community and its volunteer labour. The Wikimedia community is certainly not oblivious to the ripple effects of generative artificial intelligence on any area involving the processing and transformation of textual information. We have participated in several forums that went beyond examining the capability of applications based on large language models (LLMs) to produce convincing results when writing or improving articles. These discussions aimed to understand, before it is too late, the potential impacts on a delicate and unique ecosystem like that of free collaborative encyclopedias. The mere existence of so many solid Wikipedias, one per language, should already strike us as an inexplicable accident. Its foundations, radical trust and large-scale decentralized collaboration, often lead in other projects to zombie pseudo-sites filled with spam and noise, yet in Wikipedia they produce an elegant and monumental handcrafted work of meaning and utility that continues to be refined and to gain significance.
Machine readers
The interest of digital corporations in extracting knowledge from the Wikipedias, and from their structured counterpart, the open database Wikidata, is well known. Search engines and algorithms use them to identify concepts, entities and people and to improve their grounding in the real world, and the GPTs of the world have read and butchered their content to learn how to talk and how to explain things to us. Until now, Wikipedia’s machine readers fulfilled a mediation and search role, one in which the zero-click effect was already hurting the search-engine economy. Nowadays even Google’s Knowledge Graph seems like harmless mischief because, fast as lightning, these automated collectors of data, text and information can also go on writing the encyclopedia themselves.
Our experience with ChatGPT, an exploratory and manual conversation, is just the tip of the iceberg compared to what a pipeline can do: a program structured in increasingly refined stages, with a sophisticated network of inputs and outputs, in which several artificial agents agree among themselves on an outline for an article, locate sources and produce an entry step by step. The recent prototype STORM, the result of research supported by the Wikimedia Foundation, does just that, with results that can only be described as overwhelming. Another line of development, Abstract Wikipedia, seems to face a harder path, because its purpose is to produce a universal language model: one that drives the generation of content from factual data, applies it to minority languages, and negotiates the trade-offs between knowledge produced in different languages.
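To picture the difference between a manual conversation and a pipeline, here is a minimal sketch in Python of what such a staged, multi-agent process might look like. It is only an illustration of the idea described above, not STORM’s actual architecture or code; the llm() helper and every function name in it are hypothetical placeholders.

```python
# A toy sketch of a staged, multi-agent article pipeline, loosely in the
# spirit of prototypes like STORM. This is NOT STORM's actual code: the
# llm() helper and every function below are hypothetical placeholders.

def llm(prompt: str) -> str:
    """Stand-in for a call to any large language model API."""
    raise NotImplementedError("wire this up to a model of your choice")

def draft_outline(topic: str) -> list[str]:
    # Stage 1: one agent proposes section headings for the article.
    return llm(f"Propose an encyclopedia outline for: {topic}").splitlines()

def gather_sources(heading: str) -> list[str]:
    # Stage 2: another agent suggests sources to consult for each section.
    return llm(f"List reliable sources covering: {heading}").splitlines()

def write_section(heading: str, sources: list[str]) -> str:
    # Stage 3: a third agent drafts prose grounded in the gathered sources.
    return llm(
        f"Write an encyclopedic section on '{heading}' "
        f"citing these sources: {'; '.join(sources)}"
    )

def write_article(topic: str) -> str:
    # Each stage refines the previous one's output: outline -> sources -> prose.
    sections = [
        write_section(heading, gather_sources(heading))
        for heading in draft_outline(topic)
    ]
    return "\n\n".join(sections)
```

The point is not the code itself but the structure: once the stages are chained, no human hand is needed between the outline and the finished entry.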
Natural readers for a slow encyclopedia
Language technologies can already produce functional results, especially when used in sophisticated production lines. But can we speak of satisfactory results? Beyond the concept of “competence without comprehension” coined by Daniel Dennett and highlighted by the artificial intelligence researcher Ramón López de Mántaras in his reflections on common sense and general intelligence, the question that makes us uncomfortable is whether we are really interested in an encyclopedia that produces itself, and therefore ceases to function as a community that reads and doubts, filters and searches for sources, motivated by the desire to build a compact map of knowledge. The automatic encyclopedia is feasible, but it will surely also lead us to a scenario of artificial reading, in which generative machines compile information to produce encyclopedia articles that other machines will then read in order to learn to produce texts with apparent meaning.
Is it really in our interest to delegate the production of the Wikipedias to artificial text generators? The temptation of productivity is undeniable; we all want “more, bigger and better”. But we would be forgetting that it is precisely the desire to write and revise the encyclopedia that motivates thousands of contributors to read, learn and contrast information. The energy spent building, “by hand and without permission”, an organized and reliable content infrastructure such as Wikipedia has a tangible result: a dam against the tide of insubstantial, tendentious, ephemeral and disjointed content.
Periodically we are alerted to the difficulty of attracting new editors and of maintaining the motivation of those of us who already edit. Now several voices are warning of a certain change in our self-perception as Wikipedians. Alek Tarkowski, director of strategy at the Open Future Foundation, pointed out in the Wikimania panel entitled “Wikimedia & GenAI: A 360 movement panel one year later” a gradual shift towards the feeling of merely “feeding a dataset” rather than participating in the collective adventure of knowledge. Wikipedia serves as an excuse to keep searching for and reading reliable sources of information, and to mobilize our competence to construct meaning and draw the world to our liking. The Internet is a huge organism; Wikipedia is just a skeleton that helps give it shape, and a safe harbor for humans who read and write. Natural readers, writing and conversing at a slow pace; natural intelligences that relate to information and to their environment with a sense of responsibility.
Does Wikipedia dream of electric readers? Wikipedians sometimes dream of electric encyclopedias, but once awake they find that crafting their little piece of Wikipedia gives meaning to reading and to keeping on reading, and to passing on knowledge acquired with effort and pleasure. We don’t need the best automatic encyclopedia; we just want to do it well and enjoy the journey before reaching the destination.
Originally published in Spanish on 19 September 2024 by Tomás Saorín, professor of Library and Information Science at the University of Murcia (Spain), and Florencia Claes, president of the Wikimedia Spanish chapter and coordinator of Free Culture at Rey Juan Carlos University (Spain).