How to help save endangered languages in India – a project on oral culture digitization

Open knowledge and Oral Culture

Wikimedia hosts several projects that aim to facilitate, to quote the Wikimedia website, “a world in which every single human being can freely share in the sum of all knowledge.” India, being a multilingual country, offers vast possibilities in this context. It is with this vision that the Oral Culture Transcription Toolkit and the currently running grant were launched. Both are in sync with the UNESCO International Decade of Indigenous Languages, and are especially relevant to a country where 197 languages out of approximately 700 are in danger. Empowering people to digitize and popularize their own language and culture is a way to address this worrying situation.

Wikimedia projects – tools for strengthening the presence of linguistic diversity online

Through this project we aim to use two Wikimedia projects, Wikimedia Commons and Wikisource, as repositories of oral culture. Other Wikimedia sister projects may also be included as the project evolves. Wikipedia, which exists in 326 languages, has also been used by people as a means of asserting their language. We aim to expand the variety of media used by volunteers: apart from text, we aim to include audio, video, and transcriptions. This can be useful for language communities that do not have sufficient volunteers at present. Languages that do not have many written texts would likewise benefit from representation of their oral culture online. Additionally, this would enable the documentation of languages by bringing them and their cultures onto digital platforms. Increased presence and creation of content from diverse languages on Wikimedia projects may in turn contribute to the revitalization of endangered languages.

Wikimedia projects as platforms for non-conventional forms of knowledge

We are interested in using wiki platforms because they are proponents of free knowledge. Since we want to understand the needs of indigenous language speakers and find ways to enable them to document their own language and culture, open knowledge platforms are well suited to the task. Wikimedia sister projects have the potential to support languages and cultures in various media. We believe that we can reach this goal through the joint effort of volunteers and language activists.

It is with these thoughts that the Wikimedia-funded project Oral Culture Transcription Toolkit was carried out. Angika folk songs uploaded to Commons and transcribed on Wikisource were the proof of concept for this toolkit project. While testing the toolkit, the trainees also recorded folk songs and oral history in their languages. The toolkit covers preliminary research, navigating the wiki platforms, audio-video documentation, and interview questions. The latter two sections are based on two toolkits: the OpenSpeaks Toolkit and the Jewish Culture Elicitation Protocol. The toolkit is designed to provide consolidated steps to document oral culture and history.

Tej Kaur – Punjabi speaker sharing her life experiences. Video by Gill jassu, CC BY-SA 4.0

In the current project, we are looking forward to employing and improving this toolkit. We are going to conduct a series of workshops to train people in using Wikimedia platforms for the digitization of oral culture. Through these workshops we will introduce the toolkit and help attendees bring content from their culture onto wiki platforms. To achieve this goal, we are carrying out the current project: Needs Assessment for documentation and revitalization of Indian languages using Wikimedia projects. Under this project we are assessing the needs for language documentation and revitalization by interviewing and holding discussions with three groups of people: Wikimedians, indigenous language speakers, and language experts. Through these conversations we will evaluate the general scenario in India regarding language documentation, identify barriers to the inclusion of languages in digital spaces, and explore how Wikimedia projects can be used to support various indigenous languages.

We also aim to expand the Oral Culture Transcription Toolkit so that it is more widely applicable to various languages. Another portion of the project is introducing the toolkit to a larger audience, measuring its effectiveness at the current stage, and making changes based on feedback and observation. As a part of this, we are arranging workshops for August 2022 in which both groups, indigenous language speakers and Wikimedians, are welcome to participate. The latter can participate in the capacity they choose: as participants, supporters, or volunteers. If you are an interested Wikimedian or know someone who might be, here is the survey form for you to fill out or share.

And here is the general form designed for non-Wikimedians.

There are several other ongoing Wikimedia-funded projects for languages and cultures. Here are a few projects of a similar nature: a project on Indigenous Audiovisual Tools via Wiki tools in Cambodia, and a project on Grassroots Language Documentation in Nigeria for Wikipedia and its sister projects.

How can you contribute?

As mentioned earlier, you can contribute as a supporter, translator, trainer, or participant in the workshops being conducted under this project. Apart from the workshops, we are also doing research based on surveys and interviews with language experts, Wikimedians, and indigenous language speakers. We are beginning with this survey designed for Wikimedians; it will take a few minutes to fill out. If you are interested in being interviewed, you can also provide your contact details in the survey so that we can contact you. Your contributions will be acknowledged in the report that we will create as one of the outcomes of this project, which is separate from the general project report.

FAQs:

Q: When are the workshops happening?

A: The workshops will be held in August and September 2022.

Q: What would be the duration of the workshops?

A: The workshops will last 6 hours in total, divided into three 2-hour sessions held on three separate days. There will be a gap of 1 week between sessions.

Q: What are the activities being conducted under the workshops?

A: The participants will be trained in digitizing the oral cultural component of languages using the Oral Culture Transcription Toolkit. They will be given small tasks after each workshop, to be finished before the next one. They will also have the opportunity to share their experiences and insights about language digitization during the meetings.

Q: Where will the workshops be organized?

A: The workshops will be conducted both online and in person within India. The first two sessions will be online and the third will be held in person. The online sessions will be conducted via Google Meet/Zoom. Those who cannot attend the third session in person will be offered an online session.

Editor's note: The original publication stated there were 297 languages. The correct count is 197.