Sheng - Urban Mixed Language

Permanent URI for this community: https://kencorpus.ke/handle/123456789/27

Sheng is a dynamic urban contact variety that fuses Kiswahili, English, and local Kenyan languages, emerging primarily among youth (mainly in Nairobi, but also in other towns) as a vehicle of identity and in-group solidarity. Its pervasiveness across daily conversation, mass media, and digital communication, even among speakers who deny knowing it, reflects both its social vitality and its contested status. Advocates see Sheng as evidence of organic linguistic creativity and societal change; critics, including educationists and researchers, point to empirical evidence that Sheng morphosyntactic structures appear in learners' written and oral compositions in ways that impede attainment of the objectives of the standard English and Kiswahili syllabi. For corpus linguistics and Natural Language Processing, Sheng presents a particular challenge: its pervasive code-switching, the alternation between two or more languages within a single conversation or sentence, strains conventional language model architectures trained on monolingual or cleanly bilingual data. Any comprehensive Kenyan language corpus must include Sheng data: documenting it is challenging but unavoidable.

Materials are available under CC BY 4.0. Hip-hop lyric texts are archived with artist consent; commercial reproduction requires artist agreement. For enquiries or to learn more, contact the respective dataset or artefact issuer.