Malay Wikisource, ten thousand edits after: the story so far


By the grace of God, the Malay Wikisource entered the main Wikimedia space on 30th April 2024, and it is the first Wikimedia project ever launched in the history of the Wikimedia Community User Group Malaysia (WCUGM). Ever since the launch, I’ve been a regular on the wiki, primarily for two reasons: 1) creating “models” of finished pages as future references for format editing and transcription, including creating formatting templates and foundational categories; and 2) promoting the wiki to the public.

On 28th June, around the two-month anniversary, the Malay Wikisource reached its 10,000th edit. Here’s the story so far: what we’ve done, what we’re building, and the future prospects for the good of the wiki, WCUGM, and the field of Malay literature in general. It is the story of ten thousand edits of transcribing, trial and error, discussions, and the ideas that came out of them.

This piece aims to give an idea of what to expect when launching a new wiki (especially a Wikisource), and also to record the history of our beloved wiki for the future. It is a first “newsletter”, so to speak, and also a report on the early stages of Malay Wikisource.

What we’ve done

The content

After the launch of the standalone wiki on 30th April, various texts have been in transcription. The current focus seems to be on classical Malay texts, including the UNESCO-recognized Sulalatus Salatin (a.k.a. Sejarah Melayu) and Hikayat Hang Tuah, epics interwoven with mythology that revolve around the old Malay kingdoms and their development.

The first pages of the Sejarah Melayu manuscript.

Going forward to the colonial, early modern age, we can see another focus: Malay linguistic books. Numerous Malay dictionaries covering several languages have been uploaded and are being transcribed, ranging from the local trilingual Kamus Kecil (Malay-Arabic-Sundanese) to the magnum opus Malay-English dictionary by the British officer R. O. Winstedt. (One note: foreign-language material is accepted in Malay Wikisource if it pertains to Malay linguistic matters.) In my personal experience, transcribing these books is much easier than transcribing the classical literature of the past. In this period, colonial officers and Western linguists attempted to romanize Malay from the common Arabic-derived Jawi script, as explained in Winstedt’s Malay Grammar.

First main page of R.O. Winstedt’s Malay-English dictionary.

Moving to the modern era, Wikisource content switches from books to documents as we pass the “universal” public domain limit of around 100 years ago. For this period, works are concentrated on government-issued documents that are designated as public domain material by local law, such as early issues of the weekly Pelita Brunei, and various Malaysian acts such as the Copyright Act (Akta Hak Cipta 1987).

Front page of Pelita Brunei, edition of 1st July 1956.

Sharing the content

A tool is of no use if no one uses it, and the same goes for a wiki. Beyond the bytes and edits that make up the Wikisource, publicity plays a key role in building it, through both WCUGM and individual efforts. On WCUGM’s side, we’ve built up social media platforms specifically for Wikisource: check out our X/Twitter account, by the way!

Individual efforts go hand in hand with WCUGM’s, and are perhaps even stronger, given that our current WCUGM team has little capacity to update the social media … typical for a small group like us. Anyway, this is something I myself have contributed to. On my personal X/Twitter account, I shared my experience of transcribing Winstedt’s dictionary mentioned above.

Another example is from my fellow Wikisource editor, Hadith Fajri, who incorporates Malay Wikisource into his #pelikpelikwikisumber (“the oddities of Wikisource”) series, uncovering unconventional tidbits from old Malay manuscripts in Wikisource.

Such sharing has a slight positive effect on awareness of the wiki. Besides the definite example of being noticed by the local community, it also attracts other interested users to contribute to Wikisource … only for them to realise that we don’t really have a starting point for beginners yet. So, that’s one problem … and just one of the many things we’ve faced in the opening stage of the wiki, as I’ll explain next.

Building the foundations

Over the past two months, Malay Wikisource has had two main focuses: increasing content, and laying the foundations for it. Within our ten thousand edits, several issues have been identified and discussed among our small but dedicated members, making constructive noise at the tea house (Kedai Kopi) and outside it.

The quality

Content quality should be the utmost priority on any wiki, and that’s what I hold onto in all of my wiki-works. As such, it’s imperative that the basics of what counts as quality content be set up early. In this case, the first discussions in Kedai Kopi were about what should be included in our Wikisource. In the end, it was generally decided that the following content is eligible for inclusion in Wikisource:

  • any Malay-language source that is not self-written (i.e. not a “vanity publication”) and has achieved public domain status; and
  • any public domain material regarding Malay linguistics, regardless of its main language, including dictionaries, among others.

Furthermore, another concern is to what extent the formatting style of the original content should be replicated in the digital Wikisource format. My position is to follow the English Wikisource principle “to give an authentic digital transcription of the content, not an imitation of a printed page; to produce a type facsimile rather than a photographic facsimile,” while some of the community wanted to go further. Other issues discussed include the variable, then-unstandardized Jawi spellings in manuscripts, and the lower quality of older transcriptions done during the multilingual Wikisource era.

Eventually, foundations started to take shape: formatting shouldn’t go to the photographic extreme, Jawi writing should be copied as far as our computer keyboards allow, some letter variations are merely calligraphic and can be ignored, and older transcriptions must be revised.

All in all, this highlights the need for the community to prioritize quality from the very start, and how important it is to set up foundations at an early stage, or perhaps even before a wiki is made official. Gathering the community in formal discussions is also crucial to making the foundations solid, together with the help of wiki association(s) aiding its development.

Outside of Kedai Kopi, discussions on building Malay Wikisource go beyond the online walls of text. Here, WCUGM had the wonderful luck of hosting the ESEAP Conference 2024. A few presentations pertained to Malay Wikisource. I myself, PeaceSeekers, also touched on the opportunities of Wikisource in my presentation on the future directions of Malay wiki projects.

The technicals

Technical issues were among the early troubles, such as in script display and formatting. One example is the display of Arabic-based Jawi writing, which was deemed “too small”; the current workaround is to use the individual work’s style page to increase the Jawi font size, and to use an alternate font to improve readability. Another peculiar example is a work using the elusive Rejang script, as in Syair Perahu from 1700, where current operating systems lack the support needed to display the writing properly. Once again, the style page is our workaround: simply set a proper font to display it.

The first written plank of Syair Perahu of Hamzah Fansuri.

Speaking of the Rejang script (a rare, classical Malay script that none of us knows how to read), one other principle we’ve discussed concerns the transliteration of non-Roman scripts for readers’ reference, and the need for the translation namespace. After a brief discussion, the community decided that transliteration can be a good addition to the Wikisource, although it should not be a main priority while we wait for the translation namespace to arrive. The current workaround is simply to use sections to “divide” a work into the original script and a Roman transliteration; this is merely a temporary and non-obligatory measure. A new Wikisource that would like to incorporate transliterations in its early stages does not need to do it this way.

The community

The lack of awareness and popularity of Wikisource is one matter. Another is that, as far as we’re concerned, Wikisource is far more technical than Wikipedia or Wiktionary. The multiple important namespaces, the technicality of transcribing old documents, and the formatting make Wikisource a platform with a steep learning curve. And the problem is … beginners don’t really have a place to start. This thought dawned on me when a question was put to me on Twitter: an interested person wanted to contribute, but there was no place to start.

At least for me, the solution I gave was to transcribe something relatively easy, with only basic-level formatting (in this case, I suggested Pelita Brunei), and to replicate my transcription steps. This is something to consider when building up a new Wikisource: prepare simple texts as references for beginners to understand how Wikisource works. In the future, a beginners’ page akin to the one linked here on our Wikipedia is plausible.

The content

As of now, there are many Malay texts available on Commons for us to transcribe; the “Collections of Leiden University Library (uncategorized)” Commons category, for example, has plenty of Malay documents up for grabs. Beyond them, however, many more manuscripts and documents are still unavailable. There are approximately 20 thousand Malay manuscripts scattered across Malaysia and worldwide, plus thousands more acts of law, writings, and governmental reports already designated as public domain material under the Copyright Act. Some of them are freely available on the Internet but take some hunting to find, such as this one I salvaged from an archived, now-defunct Malay manuscript database.

Now, empowering Wikisource is becoming the responsibility of local Wikimedia associations: outreach programs and cooperation should be pursued at all levels of authority, including advocacy to raise awareness of public domain material, and finding interested individuals.

Our future

The wiki-world

Wikisource in general envisions itself as a free, digital library for the world. In my vision, having our own “personal library” in our local wiki-sphere gives us a significant pool of centralized resources and references for other projects such as the Malay Wikipedia and Wiktionary. It’s like being a Wikipedia editor and getting the news that your municipality just opened a public library at your doorstep, full of old texts. You get excited that you have an easy place to find references for your projects, only to find out that it’s a small library, but one you can also contribute to.

It’s a bit of a cycle: Wikisource materials can, in theory, be used to catalyze articles on manuscripts in Wikipedia, and Wikipedia users can go to Wikisource to see the raw texts for themselves, kind of like a textual museum. Dictionaries in Wikisource, meanwhile, can be referenced in Wiktionary and the lexeme part of Wikidata. The one thing that matters here is to materialize these bridges in a visible manner, something that should involve the community of each wiki.

The movement world

Launching this Wikisource, as I said earlier, is a big step for our association, as it is the first wiki launched under WCUGM. As such, much trial and error and many experiments have been carried out to see what’s best for our new wiki. Here, I have tried to present a report on this milestone as a piece of history, and also as a guideline for future wiki launches, whether from WCUGM or beyond. I do hope this report achieves the objective I’ve stated here, and that wiki launches in the future go more smoothly.

The outside world

Perhaps those who benefit most from this Wikisource are not just us, but the outside world. Malaysia, or to be specific, the Malay language, hasn’t really had an easily accessible website for obtaining Malay manuscripts online; some of what is available sits behind a paywall or restricted access, such as the holdings of the national archive or the judicial texts behind Lawnet. Having a Wikisource will be a monumental step in the world of Malay linguistics, where researchers and ordinary individuals alike can finally gain access to free, unrestricted Malay manuscripts. This sounds like I’m pulling off a Sci-Hub here, but nevertheless, isn’t this our mission: to free knowledge? Well … in a legal manner, of course.
