Spell checker tool for South Africa’s indigenous official languages released

6th October 2022 By: Rebecca Campbell - Creamer Media Senior Deputy Editor

The Department of Science and Innovation (DSI) announced on Thursday that a spell-checking and hyphenation-checking software tool for 10 of the country’s 11 official languages had been released and could be downloaded free of charge from the South African Centre for Digital Language Resources (SADiLaR). The languages concerned are Afrikaans, isiNdebele, isiXhosa, isiZulu, Sesotho, Sesotho sa Leboa, Setswana, Siswati, Tshivenḓa and Xitsonga. (The eleventh official language is English, for which there are plenty of spell checkers available.)

The development of the spell checkers was funded by the DSI and undertaken by the North-West University’s Centre for Text Technology (CTexT). The project was implemented as an element of the South African Research Infrastructure Roadmap, which is a DSI programme to create research infrastructure, built on current capabilities and strengths, across the complete public research system, to meet future needs.

“By working in close collaboration with linguists at South African universities and the national language bodies, we developed spelling checkers that evaluate words according to the official orthography of each language,” reported CTexT head Dr Martin Puttkammer. “We hope that making the spelling checkers freely available will help strengthen the digital presence of all our indigenous languages, facilitate the production of more digital texts in these languages, and provide access to information technology to all our citizens.”

The new spell and hyphenation checkers work with the Microsoft Office suite. The tool lets users choose the South African language they want to work in, and it detects and corrects typing, spelling and hyphenation mistakes. If the spell checker does not recognise a word, it offers alternatives.

Because languages are constantly evolving, and because South African indigenous languages are under-resourced in terms of the data needed to optimise spell checkers, the tool has a ‘custom dictionary’, to which users can add words that they frequently use but which are not in the current wordlists. Users can, if they wish, share these words with the tool developer, so that they can be verified and included in future upgrades of the wordlists. SADiLaR will welcome such information.

“SADiLaR is a national research infrastructure mandated to support research and development in the domains of language technologies and language-related studies,” explained SADiLaR executive director Professor Langa Khumalo. “It is thus a great triumph for us to be able to make available such a valuable tool to support multilingualism in South Africa and build up the necessary technological resources to ensure our languages remain relevant in the Fourth Industrial Revolution.”