Abstract—Morphological analysis is a major component of many Natural Language Processing (NLP) applications. The performance of a general-purpose morphological analyzer (GMA) degrades when it is applied to a particular domain. In this paper we present our effort in developing a domain-specific morphological analyzer (DMA) whose architecture extends the existing paradigm-based GMA. The method involves identifying domain-specific words in raw text and assigning a paradigm class to each of them. The proposed method is language independent and has been tested on domain-specific Hindi data. The results show 90.60% coverage, which is an increase of 6% over the GMA and accounts for 25.39% of the unanalyzed words.
Index Terms—Morphology, domain adaptation, paradigm table.
The authors are with the International Institute of Information Technology, Hyderabad, India (e-mail: prathyusha.k@research.iiit.ac.in, chandukhyathi.raghavi@research.iiit.ac.in, dipti@iiit.ac.in, nelakuditi.kovida@research.iiit.ac.in).
Cite: Prathyusha Kuncham, Chandu Khyathi Raghavi, Kovida Nelakuditi, and Dipti Misra Sharma, "Domain Adaptation in Morphological Analysis," International Journal of Languages, Literature and Linguistics, vol. 1, no. 2, pp. 127-130, 2015.