Leveraging on Cross Linguistic Similarities to Reduce Grammar Development Effort for the Under-Resourced Languages: a Case of Kenyan Bantu Languages

dc.contributor.author Kituku, Benson
dc.contributor.author Nganga, Wanjiku
dc.contributor.author Muchemi, Lawrence
dc.date.accessioned 2022-02-07T11:02:06Z
dc.date.available 2022-02-07T11:02:06Z
dc.date.issued 2022-01
dc.identifier.uri 10.1109/ICT4DA53266.2021.9672222
dc.identifier.uri http://repository.dkut.ac.ke:8080/xmlui/handle/123456789/4939
dc.description.abstract Rule-based grammar development is labor-intensive in both time and expertise, especially for morphologically complex and under-resourced languages. Nevertheless, such grammars are needed for deep natural language processing, for generating well-formed output, or both. To address this challenge, this paper seeks to develop a shared, multilingual, wide-coverage grammar for a subset of Kenyan Bantu languages in Grammatical Framework (GF), leveraging cross-linguistic similarities through two grammar engineering strategies: grammar porting and grammar sharing. The shared grammar was developed using a morphology-driven approach, in which the lexicons are defined first, followed by the inflection regular expressions and finally the syntax production rules. The resulting congruent Bantu parameterized grammar achieved shareability of 100%, 68.75%, 65.3% and 89.57% for category linearizations, parameters, paradigms, and syntax rules, respectively, while portability (modification) was exhibited in paradigms, parameters, and syntax rules at 14.29%, 18.75% and 10.43%, respectively. The research concludes that leveraging cross-linguistic similarities of principles and parameters significantly reduces the effort of developing multilingual grammars. Its contribution is the Bantu parameterized grammar, which demonstrates how the effort of developing the rule base can be significantly reduced for languages where data is scarce. en_US
dc.language.iso en en_US
dc.publisher 2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA) en_US
dc.title Leveraging on Cross Linguistic Similarities to Reduce Grammar Development Effort for the Under-Resourced Languages: a Case of Kenyan Bantu Languages en_US
dc.type Article en_US

