Content translation tool helps create one million Wikipedia articles

The origin of the tool/Initial adoption

The Content Translation tool, which was developed by the Wikimedia Foundation Language team in 2014 to simplify translating Wikipedia articles, recently reached a massive milestone of supporting the creation of one million articles.

The tool plays a key role in closing knowledge gaps on Wikipedia by making it easier to translate Wikipedia’s knowledge into new languages. It has evolved steadily over the past seven years. It is available by default in 90 Wikipedias, and it exists as a beta feature in the rest. It is used to translate an article every three minutes, and the articles created with the tool are deleted less often than those created from scratch.

We are excited to celebrate this remarkable milestone with the more than 70,000 Wikimedia contributors who helped us get here! As we celebrate, we also want to reflect on the tool’s journey so far and look back at its beginnings and other major moments…

In 2014, the tool was tested in Wikimedia Labs, with a focus on translation from Spanish to Catalan, and it was deployed after receiving positive feedback. This was just the beginning of our success story. The decision to test the tool with these two languages was influenced by the availability of robust open-source machine translation support for them through Apertium, and by the passionate Catalan community of contributors who were eager to participate in the testing and feedback process. The Spanish and Catalan communities were the backbone of the tool, and its story would be incomplete without them.

Getting established: A more solid tool

Following the successful deployment in the Spanish and Catalan Wikipedias, the tool was enabled as a beta feature in six other Wikipedias in January 2015: Danish, Esperanto, Indonesian, Malay, Norwegian (Bokmål) and Portuguese. The deployment was further extended to 22 Wikipedias at the request of most communities. After three months, 260 users had translated with the tool, and 1,000 users had manually enabled it from the beta features on their Wikipedia. This success motivated the team to deploy the tool in beta for all Wikipedias, a decision influenced by the positive reception and usage of the tool less than six months after it was enabled in eight languages. The results by mid-2015 confirmed our assumption that the tool would be well received: 1,300 new translators and 3,000 new translations were recorded. That year was undoubtedly a busy one for the development team and our ardent translators, who also reported dozens of bugs.

Another remarkable, eventful period for the Wikimedia Foundation Language team began in 2018: the revision phase of the Content Translation tool. Based on feedback from translators across different languages about the tool, its impact and its use over the years, the tool was ready for a revamp. The change focused on incorporating the more solid VisualEditor editing surface and other major improvements, resulting in Content Translation version 2. By the end of 2019, the Wikimedia Foundation Language team had significantly updated the Content Translation tool, and it could boast of the following:

  • Better guidance for newcomers
  • Improved artificial intelligence to enhance automated steps
  • Quality control mechanisms for machine translation
  • Extended machine translation support from Yandex, Google Translate, Youdao, Matxin (since replaced by Elia), and Lingocloud
  • Independent customised systems to improve the quality of content in different Wikipedia communities
  • A milestone of 500,000 articles

Despite these achievements, Content Translation’s developers had more work to do, with a big theme being to help more communities utilise the translation tool and to attract newcomers in emerging communities. Energised by what they had achieved with the tool and eager to support the volunteer communities that are ready to make the sum of all knowledge available to all, the team initiated the Content Translation Boost project.

New ways to translate: Sections and mobile

We started research to explore more ways to translate and to make the tool more prominent. This resulted in the launch of the Section Translation initiative and a process to enable Content Translation by default (out of beta) in Wikipedias that had fewer than 100,000 articles and the potential to grow through translation. With these plans, the Wikimedia Foundation Language team was about to take translation to another dimension.

Section Translation became the main focus of the Boost project. It expands the capabilities of Content Translation to address key limitations of the tool:

  • prioritising a mobile-friendly tool for phone and tablet users
  • allowing the collaboration of many users to translate articles section by section
  • attracting new contributors by lowering the entry barrier from translating an entire article to just a section
  • making it possible to improve existing articles, not only create new ones

In simple terms, Section Translation is still a translation tool, one that helps mobile device users translate articles easily, a section at a time. Here is how this phase unfolded.

In early 2020, before the COVID-19 pandemic, the project supported a design exploration that gathered interview data to test the assumptions behind Section Translation. Prototype development then started, and by the middle of the pandemic, development of the tool was in full swing. In January 2021, an initial version was ready to be tested in a testing instance by Bengali Wikipedia editors. The Bengali community was chosen because of its interest in the initiative and its participation during the design exploration. The community tested the tool and provided feedback, some of which was adopted immediately.

In February, the Wikimedia Foundation Language team was ready to deploy Section Translation. This marked the beginning of another tool that will further bridge the content gap in smaller Wikipedias.

Since it was first enabled in Bengali Wikipedia, the tool has been improved based on community feedback and on takeaways from user research conducted after that deployment. The improvements include new entry points to increase discoverability and the ability to search for an article of interest.

We are still evolving the Section Translation tool and learning from the outcomes. Currently, after a feedback and validation process, it is enabled in five more Wikipedias: Igbo, Yoruba, Hausa, Thai, and Kurdish. We are excited about its future and impact. While we evaluate the tool’s impact and users’ experience and continue to improve it, we welcome members of other Wikipedias to test Section Translation, provide feedback, and indicate their interest in having it enabled.

Congratulations and thank you to everyone who has been part of the journey to this one million article milestone!