This is the second part of a two-part update on the MinT translation service. In the first part, we covered what MinT does and how it impacts translations. In this post, we’ll share volunteers’ experiences of using the service.
Translators have shared their appreciation for MinT. Many of them feel that the quality of MinT’s translations is better than that of other translation services they have used. Underrepresented languages such as Kashmiri, Santali, Tumbuka, Sardinian, and several others lacked machine translation services until MinT was introduced. Below are the experiences and thoughts of three translators who work with underrepresented languages.
- Iflaq, a Kashmiri native speaker who has been a Wikipedian since 2020, found his inspiration to edit the English Wikipedia in a volunteer’s Facebook post. He clicked on the link in the post, and it took him on a learning adventure in how to edit Wikipedia. Eventually, he created his account with the unusual username “511KeV,” meaning five hundred and eleven kiloelectronvolts (a name borrowed from physics). He started his editing journey on the English Wikipedia. Six months later, while exploring, he discovered the Kashmiri Wikipedia. At first, he was disheartened to find that parts of the user interface were in English rather than Kashmiri, a language spoken by over 7.1 million people in India. Nevertheless, he was encouraged to see that the Kashmiri Wikipedia had a few active contributors and 300 articles.
The discovery shifted his attention from the English Wikipedia to the Kashmiri Wikipedia. By then, he was well established on the English Wikipedia and had attained extended user rights. Despite those achievements, he changed his focus to Kashmiri, mastered the Translate extension, and set himself the task of translating the Kashmiri Wikipedia user interface from English to Kashmiri. He then moved on to making more content available on the Kashmiri Wikipedia by translating articles from English.
In addition, 511KeV actively engaged in outreach activities to recruit more volunteer translators for the Kashmiri Wikipedia. In his experience of working with newcomers and translating content, translation has been time-consuming for Kashmiri Wikipedia contributors because the language is underrepresented and lacked the initial machine translation support available on other Wikipedias. He further explained that they translated from scratch, without a “dictionary” embedded in the tool to suggest Kashmiri words that they could compare, review, and edit, as contributors to other language Wikipedias can, to make the most of their volunteer time.
For 511KeV and other Kashmiri contributors, it is a huge relief to have the long-awaited translation support service MinT on the Kashmiri Wikipedia; the numbers say it all. The community has published a record-breaking 330 translations in just four months, 15% short of last year’s cumulative total. The Kashmiri community openly expresses its delight in having this valuable assistance. Iflaq has used MinT as an aid to publish over 120 sections from different articles in three months, including the biographies of Ahmed Raza Khan Barelvi and Aziz Hajini. He is optimistic that the quality of MinT’s initial translations will improve until they are 100% accurate. At that point, the community would have to request that the translation limit be adjusted to accommodate the improvement, while ensuring that translators still review and edit their translations before publishing them.
- Prasanta Hembram, who goes by the username Rocky 734, is a volunteer translator based in Odisha, India. Although Santali has 7.6 million speakers around the world, its Wikipedia has only 3,388 articles. To address this gap, Prasanta has been using the Content Translation tool to translate articles from English to Santali since 2019.
For the past several years, machine translation into Santali was one of the features most requested by the volunteer community. For a long time, Prasanta used a low-quality translation service that was also very tedious, requiring manual copying and pasting, followed by review.
Finally, four months ago, the Wikimedia Foundation deployed the MinT translation service, and Santali Wikipedia was one of the first Wikipedia communities to use it. Prasanta has now used MinT to translate approximately 200 articles, including articles about locations, plants, and biographies. The service has helped him speed up the translation of articles into Santali. He is impressed with the overall translation quality for simple sentences, though it is not yet as effective at translating complex ones. Other volunteer translators he has worked with have also noticed that translating science and technology topics is still difficult, and they wish the service supported source languages other than English. Even so, the creation of MinT is an important step forward. Almost all major contributors to Santali Wikipedia are using MinT as well as the section translation tool, which was also recently released. He is hopeful that MinT will continue to improve and that the number of articles created with the translation service will double. Overall, he shared, “We are extremely grateful to the developers for creating a machine translation tool in our language. We never expected such a tool to be available to us, and we are truly appreciative of their efforts.”
- Adrià Martin is a Wikimedian who lectures in Translation Technologies at California State University Long Beach. In his teaching, he introduces his university students to the Content Translation tool and encourages them to save their translations in their personal sandboxes for subsequent review and publishing. His work focuses on Catalan and Sardinian translations. Adrià had previously collaborated with a team that created a machine translation system for the Sardinian language, and he was thrilled when someone sent him the link to test MinT through the web. “At that time, I was supervising a PhD thesis, and I used MinT to generate a machine translation of some of the statistical content from Italian to Sardinian; MinT did a great job in the translation,” Adrià explained.
He has yet to use MinT support in the Content Translation tool, but he often uses the test instance for his personal work. He commended the remarkable translation quality achieved in Catalan and Sardinian and highlighted the tool’s ability to support a broader range of language pairs. This is particularly notable given Sardinian’s under-resourced status and the lack of a universally accepted standardized model for the language. Moreover, the diverse ways in which Sardinian is written make developing language technology for it a complex challenge.
You can get started with MinT by visiting Special:ContentTranslation on Wikipedia in any language, on translatewiki.net, or directly in a test instance.
Keep track of the WMF Language team’s plans for MinT on this page, and stay informed about recent developments by subscribing to the Language team’s quarterly newsletter. If you have questions about MinT, don’t hesitate to ask on the project talk page.