Khaliq, Bilal and Carroll, John (2013) Induction of root and pattern lexicon for unsupervised morphological analysis of Arabic. In: 6th international joint conference on natural language processing (IJCNLP), 14-18 October 2013, Nagoya, Japan.
This is the latest version of this item.
PDF (247kB). Available under License Creative Commons Attribution-NonCommercial-ShareAlike.
Abstract
We propose an unsupervised approach to learning non-concatenative morphology, which we apply to induce a lexicon of Arabic roots and pattern templates. The approach is based on the idea that roots and patterns may be revealed through mutually recursive scoring based on hypothesized pattern and root frequencies. After a further iterative refinement stage, morphological analysis with the induced lexicon achieves a root identification accuracy of over 94%. Our approach differs from previous work on unsupervised learning of Arabic morphology in that it is applicable to naturally-written, unvowelled text.
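The idea of mutually recursive scoring can be illustrated with a toy sketch. This is not the authors' algorithm, which the abstract does not specify; it substitutes a simple HITS-style mutual reinforcement over candidate (root, pattern) splits. The function names, the fixed three-letter root length, the `-` slot marker, and the transliterated toy word list are all assumptions made for illustration, and the iterative refinement stage is not modelled.

```python
from collections import defaultdict
from itertools import combinations


def candidates(word, root_len=3):
    """Yield every hypothetical (root, pattern) split of `word`.

    A split picks `root_len` character positions as the root; the pattern
    is the word with those positions replaced by the slot marker '-'.
    """
    for idx in combinations(range(len(word)), root_len):
        chosen = set(idx)
        root = "".join(word[i] for i in idx)
        pattern = "".join("-" if i in chosen else ch for i, ch in enumerate(word))
        yield root, pattern


def induce(words, iterations=10):
    """Score roots and patterns by mutual reinforcement (illustrative only)."""
    roots_of = defaultdict(set)     # pattern -> roots it was hypothesized with
    patterns_of = defaultdict(set)  # root -> patterns it was hypothesized with
    for word in words:
        for root, pattern in candidates(word):
            roots_of[pattern].add(root)
            patterns_of[root].add(pattern)

    root_score = {r: 1.0 for r in patterns_of}
    pattern_score = {p: 1.0 for p in roots_of}
    for _ in range(iterations):
        # A pattern scores highly if it co-occurs with highly scored roots ...
        pattern_score = {p: sum(root_score[r] for r in rs) for p, rs in roots_of.items()}
        total = sum(pattern_score.values())
        pattern_score = {p: s / total for p, s in pattern_score.items()}
        # ... and a root scores highly if it co-occurs with highly scored patterns.
        root_score = {r: sum(pattern_score[p] for p in ps) for r, ps in patterns_of.items()}
        total = sum(root_score.values())
        root_score = {r: s / total for r, s in root_score.items()}
    return root_score, pattern_score


def analyse(word, root_score, pattern_score):
    """Return the best-scoring (root, pattern) split, or None if the word is too short."""
    return max(
        candidates(word),
        key=lambda rp: root_score.get(rp[0], 0.0) * pattern_score.get(rp[1], 0.0),
        default=None,
    )


# Toy unvowelled word list in a rough Latin transliteration (assumed data).
words = ["ktb", "kAtb", "mktb", "mktwb", "drs", "dArs", "mdrs", "mdrws"]
root_score, pattern_score = induce(words)
print(analyse("kAtb", root_score, pattern_score))
```

On this toy list the splits that share both a root and a pattern with other words reinforce each other, so `kAtb` is analysed as root `ktb` with pattern `-A--`. Real text would need far more data, and roots of other lengths, than this sketch handles.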
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Schools and Departments: | School of Engineering and Informatics > Informatics |
Subjects: | Q Science > QA Mathematics > QA0075 Electronic computers. Computer science |
Depositing User: | John Carroll |
Date Deposited: | 24 Apr 2015 08:39 |
Last Modified: | 24 Apr 2015 08:39 |
URI: | http://sro.sussex.ac.uk/id/eprint/53737 |
Available Versions of this Item
- Induction of root and pattern lexicon for unsupervised morphological analysis of Arabic. (deposited 28 Feb 2014 09:30)
- Induction of root and pattern lexicon for unsupervised morphological analysis of Arabic. (deposited 24 Apr 2015 08:39) [Currently Displayed]