Odia Wikisource digitizes classic books to create large Unicode text library

Group photo of the faculty and students of KISS Bhubaneswar who took part in the campus program to digitize books.
Group Photo by Subhashish Panigrahi, under CC BY-SA 4.0

In January 2013, some of the active Wikimedians from the Odia Wikipedia community submitted a request for the approval of the Odia Wikisource. The Odia language is one of the six Indian classical languages, and it is spoken by more than 40 million people worldwide. After nearly two years of persistent effort by the community, the project finally went live on October 20, 2014. This online library project aims to archive text from early literature and old books now out of print, under a license that allows reproduction, even for commercial use. Odia Wikisource surpasses conventional archives with its features: it is lightweight, completely text-based, searchable, and accessible on both computers and mobile devices. Texts from books are retyped to make sure that they appear in search engines. With thousands of books printed so far in this language, Odia Wikisource opens up a whole new world to readers and book lovers.

The incubation

Like other new Wikimedia projects, “Odia Wikisource” was first created as an incubator project. No community existed to digitize books for it. Existing Odia Wikipedians doubled the time they spent on the wiki to keep the project growing. For someone like Mrutyunjaya Kar, a veteran editor on many Wikimedia projects in four languages, it was never easy to devote so much time while balancing life and work.

Our language and literature are rich, and I think the Internet is the best place to open them to the entire world. Those who are in need of Odia books often don’t get to access the books of their choice. The Odia Wikisource could be a platform for making the valuable texts available to people of all age groups.

— Mrutyunjaya Kar

Open Access To Oriya Books (OAOB), a book digitization project launched by the Odisha-based non-profit Srujanika in collaboration with the National Institute of Technology, Rourkela, and the literary organization Pragati Utkal Sangha, became even more valuable after Odia Wikisource took off. Currently, OAOB houses more than 200 books, a majority of which are in the public domain. A few of these books, which were too old to be put through OCR (Optical Character Recognition, a technique used to create text from images of typed or written text), were retyped in Unicode on Odia Wikisource.
The author has been privileged to be part of this great journey, which took a new shape when the Centre for Internet and Society’s Access To Knowledge program (CIS-A2K) began an initiative to relicense copyrighted books under a Creative Commons Share-Alike license. To begin with, thirteen books from three authors were relicensed under CC-BY-SA 4.0 in the first phase. This required permission for 67 more books from seven different authors. Needless to say, Mrutyunjaya played a significant role in acquiring permission from two of these authors. This has been the highest number of resources ever relicensed under a Creative Commons license to advance the open access movement in the Odia language.

File:Odia-Silalekharu mobile (Documentary).webm

An “Odia Wikisource Handbook” for new contributors that gives a brief introduction to enabling typing in Odia, input methods, and digitizing books on Odia Wikisource.
© Subhashish Panigrahi, freely licensed under CC-by-SA 4.0.

Digitizing the classic Odia book Bhagabata

Odia Bhagabata is one of the early writings that has reached millions of readers over the centuries, beginning with the Bhagabata Tungi culture in Odisha. Authored by Jagannatha Dasa in the 14th century, this twelve-volume work had never before been available in Unicode on the Internet. Bhagabata has gone beyond being just a book; people even read the text aloud to the dying. A version typed in several legacy fonts was available on the portal Odia.org, which came in handy while looking for a digital version. Many took up the digitization of the book in response to an emotional call. New encoding converters were built, and old converters were modified, to cater to the needs of this voluminous work. After encoding conversion, proofreading, and formatting by at least eight new Wikisourcers, the classic work was digitized.
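
The kind of encoding converter mentioned above can be sketched roughly as follows. This is a minimal illustration, not the community's actual tool: the `LEGACY_TO_UNICODE` table and the `convert_legacy` function are hypothetical, and the ASCII keys merely stand in for whatever codes a real legacy font uses. It shows two steps such converters typically need: a glyph-to-Unicode lookup, and moving a vowel sign that legacy fonts store before the consonant to its Unicode position after the consonant.

```python
# Hypothetical mapping from legacy-font codes to Unicode characters.
# A real legacy font needs its own, much larger table.
LEGACY_TO_UNICODE = {
    "K": "\u0B15",  # ORIYA LETTER KA
    "M": "\u0B2E",  # ORIYA LETTER MA
    "a": "\u0B3E",  # ORIYA VOWEL SIGN AA (follows the consonant)
    "e": "\u0B47",  # ORIYA VOWEL SIGN E (typed before the consonant in legacy fonts)
}

# Vowel signs that appear before the consonant in legacy text
# but must follow it in Unicode order.
PRE_BASE_SIGNS = {"\u0B47"}


def convert_legacy(text: str) -> str:
    """Convert legacy-encoded text to Unicode, reordering pre-base vowel signs."""
    out = []
    pending = None  # a pre-base sign waiting for the next character
    for ch in text:
        uni = LEGACY_TO_UNICODE.get(ch, ch)  # unmapped characters pass through
        if uni in PRE_BASE_SIGNS:
            pending = uni
            continue
        out.append(uni)
        if pending is not None:
            out.append(pending)  # place the sign after the character it belongs to
            pending = None
    if pending is not None:
        out.append(pending)  # trailing sign with nothing to attach to
    return "".join(out)


if __name__ == "__main__":
    print(convert_legacy("eK"))  # KA followed by VOWEL SIGN E
```

A production converter would also have to handle conjuncts, multi-glyph sequences, and signs followed by spaces or punctuation, which this sketch does not attempt.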

Odia Wikisource@campus, Kalinga Institute of Social Sciences, Bhubaneswar, Odisha, India

To engage with the students of Kalinga Institute of Social Sciences (KISS), an institution in Bhubaneswar, the capital of the Indian state of Odisha, and to enrich Wikimedia projects in South Asian languages, CIS-A2K signed a Memorandum of Understanding in January 2014. This materialized when a three-month campus program began on September 9.

Contributing to Odia Wikisource was really helpful for us. This will also help us to document more about our own communities. Stories of our linguistic and cultural heritage have never been told to the world.

— Susanta Majhi, student and Wikisourcer, KISS

Faculty members, led by a coordinator, were trained in digitizing books on Odia Wikisource. They then formed nine teams, each with four to five students from undergraduate and master's classes. Most of the students and some of the faculty had never typed in Odia before taking part in the program. Despite holidays and examinations, these nine teams digitized about four books by Odia author Dr Jagannath Mohanty. It is important to note that all the students speak various aboriginal languages as their native tongues, and Odia is a link language for them. Because Odia is the official language of the state, they also learn Odia and are educated in it. Learning to type in Odia should improve their job opportunities in, for instance, state government offices.

Public gathering “Odia Wikisource Sabha 2014”

Odia Wikimedians with invited guests during “Odia Wikisource Sabha 2014”.
Photo by Saroj Kumar Behera (ସରୋଜ କୁମାର ବେହେରା), Subhashish Panigrahi, under CC-by-SA 4.0
To educate more people about the Odia Wikisource project, the Odia Wikimedia community from Odisha organized a public gathering, “Odia Wikisource Sabha 2014”, on November 28, 2014. Speaking during the event, poet and thinker Haraprasad Das suggested being selective in accepting books for relicensing and digitization rather than making a blanket move to accept all books. Das also emphasized creating a team of language experts to help with curation, and having computers in every literary center to teach Odia typing and Wikipedia/Wikisource editing. “Being part of this historical moment of seeing so many aboriginals contributing to Odia language is my good luck,” Das said. Soumya Ranjan Patnaik, founder and editor of Odia daily The Sambad, who joined as the chief speaker, announced a collaborative project starting in the new year: a competition among school students, who will receive awards based on their Odia Wikipedia article writing skills. “Language should never be a barrier for anyone. Odia Wikisource is a democratic library, unlike the conventional libraries set up by the government,” Patnaik told the audience.
Subhashish Panigrahi, Wikimedian, and Programme Officer, Access To Knowledge.

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.
