International Journal of Advances in Computer Science and Its Applications
Author(s) : G. A MALEMA , M. LEFOANE , N.P MOTLOGELWA
Lemmatization is a pre-processing stage in several natural language processing applications such as data retrieval. There are a few attempts on Setswana word lemmatization. Developed Setswana lemmatizers do not show in details where lemmatization fails to work well leading to reduced performance. This paper presents a detailed rule-based Setswana verb lemmatizer. Challenges in verb lemmatization are pointed out by word category. The overall results show that rule based Setswana verb lemmatization gives a good performance of 87%. However, reflexive verbs have a significant large percentage of exceptions.