Wikimedia wikis are available in nearly 300 languages, and some of their pages contain mixed-script content. An example is the page on the writing systems of India on the English Wikipedia. We expect users to be able to view this page in full and not see meaningless squares, also known as tofu. Each tofu square stands in for a character that the web browser on the reader’s computer cannot render. This can happen for several reasons:
- The device does not have the font for the particular script;
- The operating system or the web browser does not support the technology to render the character;
- The operating system or the browser supports the script only partially. For instance, because characters have gradually been added for several scripts in recent Unicode versions, older fonts may not support the new characters.
Fonts for most languages written in the Latin script are widely available on a variety of devices. However, languages written in other scripts often face obstacles when fonts on operating systems are unavailable, outdated, bug-ridden or aesthetically sub-optimal for reading content.
Using Webfonts with MediaWiki
To alleviate these shortcomings, the WebFonts extension was first developed and deployed to some wikis in December 2011. The underlying technology downloads fonts automatically to the reader’s device if they are not already present, similar to how images in web pages are downloaded.
The WebFonts extension was later converted into the jquery.webfonts library, which was included in the Universal Language Selector, the extension that replaced it. Webfonts are applied using the jquery.webfonts library, and on Wikimedia wikis it is configured to use the fonts in the MediaWiki repository. Before a webfont can be applied, two questions need to be answered:
- Will the user need webfonts?
- If yes, which one(s)?
Webfonts are provided when:
- Users have chosen to use webfonts in their user preferences.
- The font is explicitly selected in CSS.
- Users viewing content in a particular language do not have the fonts on their local devices, or the devices do not display the characters correctly, and the language has an associated default font that can be used instead. Before the webfonts are downloaded, a test currently known as “tofu detection” is done to ascertain that the local fonts are indeed not usable. The default fonts are chosen by the user community.
Webfonts are not applied:
- when users choose not to use webfonts, even if there exists a valid reason to use webfonts (see above);
- in the text edit area of the page, where the user’s preference or browser settings are honored.
See image (below) for a graphical description of the process.
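The rules listed above can be read as a small decision function. The following is only a minimal sketch of one possible reading, not the actual jquery.webfonts logic: the names `FontContext` and `should_apply_webfont` are hypothetical, the font name is illustrative, and it assumes that the user's preference acts as a gate while an explicit CSS font or a failed tofu check acts as a trigger.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FontContext:
    """Hypothetical inputs to the decision; none of these names come from ULS."""
    user_enabled_webfonts: bool            # the user's webfonts preference
    in_edit_area: bool                     # text edit area honors user/browser settings
    css_font_family: Optional[str]         # font explicitly selected in CSS, if any
    local_fonts_usable: bool               # outcome of "tofu detection"
    language_default_font: Optional[str]   # community-chosen default, if any


def should_apply_webfont(ctx: FontContext) -> bool:
    # Assumption: opting out always wins, even if there is a valid reason to use webfonts.
    if not ctx.user_enabled_webfonts:
        return False
    # Webfonts are not applied in the text edit area.
    if ctx.in_edit_area:
        return False
    # A font explicitly selected in CSS is provided as a webfont.
    if ctx.css_font_family is not None:
        return True
    # Otherwise a webfont is used only if local fonts failed the tofu check
    # and the language has an associated default font.
    return (not ctx.local_fonts_usable) and ctx.language_default_font is not None
```

For example, a reader who has webfonts enabled, whose device cannot render Hindi text, and for whose language a default font exists would get the webfont, while the same reader typing in the edit area would not.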
The font to be applied is chosen either by the name of the font family or, if the designated font family is not available, according to the language. In the latter case, the language’s default font takes precedence. However, negotiating more complex selection options, such as font inheritance and fallback, adds to the challenge. For projects like Wikimedia, selecting appropriate fonts for inclusion is also a concern. The challenges include the absence of well-maintained fonts, the limited number of freely licensed fonts, and the rejection of fonts by users for being sub-optimal.
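The selection order described above (font family first, then the language's default) could be sketched roughly as follows. This is an illustration under stated assumptions rather than the library's implementation: `AVAILABLE_FONTS`, `LANGUAGE_FONTS` and `choose_font` are hypothetical names, and the font-to-language assignments are examples only, not the defaults actually chosen by the communities.

```python
from typing import Optional

# Hypothetical stand-in for the font repository: families available as webfonts.
AVAILABLE_FONTS = {"Lohit Devanagari", "AnjaliOldLipi", "Abyssinica SIL"}

# Hypothetical per-language font lists; the first entry plays the role of the default.
LANGUAGE_FONTS = {
    "hi": ["Lohit Devanagari"],
    "ml": ["AnjaliOldLipi"],
    "am": ["Abyssinica SIL"],
}


def choose_font(font_family: Optional[str], language: str) -> Optional[str]:
    """Return the webfont to apply, or None if no suitable webfont is known."""
    # 1. Prefer the designated font family when it is available.
    if font_family in AVAILABLE_FONTS:
        return font_family
    # 2. Otherwise fall back to the language's fonts, default first.
    for candidate in LANGUAGE_FONTS.get(language, []):
        if candidate in AVAILABLE_FONTS:
            return candidate
    # 3. Nothing suitable: leave the choice to the browser.
    return None
```

The sketch deliberately ignores font inheritance between nested elements, which is one of the complications the text mentions.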
Challenges to Webfonts
Serving the webfont is not the only challenge that this technology faces. The complexities are compounded for the languages of South and South-East Asia and of Ethiopia, and for a few other scripts with nascent internationalization support. Font rendering and support for these scripts vary across operating system platforms. The inconsistency can stem from the underlying technology, such as the rendering engines, which can produce widely different results across browsers and operating systems. Santhosh Thottingal, a senior engineer on Wikimedia’s Language Engineering team who has been participating in recent developments to make webfonts more efficient, outlines this in greater detail.
Delivering webfonts to millions of users adds overhead that has a major impact on bandwidth consumption and page load time. A recent consequence was a change to the Universal Language Selector (ULS) to prevent pages from loading slowly, particularly where bandwidth is at a premium. A checkbox now lets users decide whether they would like webfonts to be downloaded.
Several solutions are currently in use to mitigate the known challenges. The webfonts are prepared with the aim of keeping their footprint comparatively small. For instance, Google’s sfntly tool, which uses MicroType Express for compression, is used to create the fonts in EOT format (WOFF being the other widely used webfont format). However, the inherent demands of scripts with larger character sets cannot always be overcome efficiently. Caches are used to reduce unnecessary webfont downloads.
FOUT, or Flash Of Unstyled Text, is an unavoidable consequence: while waiting for the webfonts to load, the browser displays the text in a different style or displays no text at all. Different web browsers handle this differently, and optimizations are in the making. A possible solution in the near future may be the in-development WOFF2 webfont format, which is expected to further reduce font size and to improve performance and font load events.
Special fonts like the Autonym font are used in places where known text, such as a list of language names, has to be displayed in multiple scripts. The font carries only the characters that are necessary to display the predefined content.
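The idea of carrying only the necessary characters can be illustrated by computing the set of code points that a predefined list of strings actually uses. This is a simplified sketch, not the process used to build the Autonym font: `required_code_points` and `as_unicode_labels` are hypothetical helpers, the sample names are illustrative, and a real subsetting step would also have to keep the glyphs and shaping rules that complex scripts need.

```python
from typing import Iterable, List


def required_code_points(strings: Iterable[str]) -> List[int]:
    """Return the sorted, de-duplicated code points used by the given strings."""
    return sorted({ord(ch) for text in strings for ch in text})


def as_unicode_labels(code_points: Iterable[int]) -> List[str]:
    """Format code points in the conventional U+XXXX notation."""
    return [f"U+{cp:04X}" for cp in code_points]


# Illustrative sample of language names written in their own scripts.
LANGUAGE_NAMES = ["English", "Deutsch", "हिन्दी"]

if __name__ == "__main__":
    points = required_code_points(LANGUAGE_NAMES)
    print(len(points), "distinct characters needed")
    print(" ".join(as_unicode_labels(points)))
```

Because characters shared between names are counted once, the resulting set grows much more slowly than the list of names, which is what keeps such a special-purpose font small.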
Several technical solutions are being explored within Wikimedia Language Engineering and in collaboration with organizations with similar interests. Wikipedia’s sister project Wikisource attempts to digitize and preserve copyright-expired literature, some of which is written in ancient scripts. In these as well as other cases like accessibility support, webfonts technology allows fonts for special needs to be made available for wider use. The clear goal is to have readable text available for all users irrespective of the language, script, device, platform, bandwidth, content and special needs.
For more information on implementing webfonts in MediaWiki, we encourage you to read and contribute to the technical document on mediawiki.org.
Runa Bhattacharjee, Outreach and QA coordinator, Language Engineering, Wikimedia Foundation