A few days ago, the article about the 1950s song “Crying, Waiting, Hoping,” by Buddy Holly, was written for the Spanish Wikipedia using content translation, the Wikipedia article translation tool.
What makes this article special? It was the one hundred thousandth new Wikipedia page written with assistance from content translation since the tool’s introduction as a beta feature in January 2015, a mark that shows how heavily it is already being used by Wikipedia editors. Every day we, content translation’s developers, make new improvements to iron out problems and introduce changes that help editors write high-quality articles conveniently and reduce post-publication cleanup.
From the time content translation was designed, we have given priority to what users, particularly Wikipedia editors, have highlighted as important features for a translation tool like this. Before actual development work started, we conducted research sessions with users, and later, as content translation was gradually made available to more users, we continued this practice of collecting feedback and integrating their views into the development plans.
The Catalan Wikipedia community was the earliest group of users to use content translation. Most speakers of Catalan are bilingual, and they quickly adopted this tool to write new articles by translating them from the Spanish Wikipedia. Editors in many other languages have similarly used the tool to translate high-quality articles from other wikis.
Ravishankar Ayyakkannu, who edits the Tamil Wikipedia, mentioned that articles are often translated into Tamil from the English Wikipedia. While content translation has not yet been used extensively in Tamil Wikipedia translathons, Nahid Sultan coordinated a successful online translation campaign for the Bangla Wikipedia to celebrate Wikipedia 15: 600 new articles were translated from the English Wikipedia’s list of good articles. A total of 550 users participated in the event, and content translation was heavily used to show new editors how to write new Wikipedia articles.
Mehtab Solangi, a Sindhi Wikipedia editor from Pakistan, started contributing in August 2015. He came across content translation while looking around the preference settings and liked the tool so much that he introduced it widely to other editors. He had never translated an article before, but he prefers using the tool as it offers a clear structure for the new article, and adapts links and categories that may have been difficult for new users who are still learning wikitext.
During the many months of design and development, we spoke to many editors who liked the interface and the ease of translation that the tool provides. Automatic translation support through machine translation services has been a major advantage for many. The Catalan Wikipedia editors have been using Apertium heavily to translate from Spanish and have observed a significant reduction in the time taken to translate a new article. Àlex Hinojo has previously noted that he could write a new article of about 20 lines in less than 5 minutes, with the tool taking care of many of the required wikitext changes.
Like many other editors, Olena, who mainly edits the Ukrainian Wikipedia, said that she would also recommend content translation to new editors. However, she adds a caveat: editors should be aware that the tool has its shortcomings, and it is important to check the accuracy of the content and correct any other errors. She emphasized that, to ensure the exchange of knowledge, there is a need for better machine translation support between languages, so that editors can translate from many different wikis content that is topical for a region or culture and is often less represented in other wikis.
While many new articles are being written with content translation, several users have requested that the tool be extended to support translation of existing articles, especially stubs that could be improved from well-written articles about the same topic in other languages. Better template handling is another common request. While the former needs more thought, the Language team is planning a major overhaul of template support in the coming months.
Aside from the editors, content translation has also been able to connect with developers who are enthusiastic about supporting the tool. Kevin Brubeck Unhammer, an editor on the Norwegian Nynorsk Wikipedia, proposed a project under the Wikimedia Foundation’s Individual Engagement Grants (IEG) and improved machine translation support in the content translation tool for Danish, Swedish, Norwegian Bokmål, and Norwegian Nynorsk. Integrating these with the core product was a step up from his previous efforts, when he used custom hacks. Kevin suggests that content translation should be introduced to more people working in the field of language technology, to encourage other developers to come forward and contribute.
As we continue to improve content translation, we also aim to reach out and introduce the tool to more Wikipedia users and help grow the sum of all human knowledge across languages. The Medical Translation Project is successfully using the tool to expand its coverage of essential health content in many languages; they have observed a productivity improvement of about 17% in the effort needed to coordinate and complete translations, which helps spread medical information faster around the world. We plan to continue supporting similar initiatives and to connect with more individual users and groups who are working on topical projects, both short and long term. We also hope to connect better with editathon and translathon events where content translation can be used.
Content translation is developed by the Wikimedia Language team. You can reach us via the content translation project page or follow @WhatToTranslate on Twitter for updates on the tool.
Runa Bhattacharjee, Language team