Using linguistic data for English and Spanish verb-noun combination identification

Iñurrieta, Uxoa, Díaz de Ilarraza, Arantza, Labaka, Gorka, Sarasola, Kepa, Aduriz, Itziar and Carroll, John (2016) Using linguistic data for English and Spanish verb-noun combination identification. In: COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan.

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

We present a linguistic analysis of a set of English and Spanish verb+noun combinations (VNCs), and a method that uses this information to improve VNC identification. First, a sample of frequent VNCs is analysed in depth and tagged along lexico-semantic and morphosyntactic dimensions, yielding satisfactory inter-annotator agreement scores. Then, a VNC identification experiment is undertaken, in which the analysed linguistic data is combined with chunking information and syntactic dependencies. A comparison between the results of the experiment and those obtained by a basic detection method shows that VNC identification can be greatly improved by using linguistic information, as a large number of additional occurrences are detected with high precision.

Item Type: Conference or Workshop Item (Paper)
Keywords: Natural language processing
Schools and Departments: School of Engineering and Informatics > Informatics
Research Centres and Groups: Data Science Research Group
Subjects: Q Science > QA Mathematics > QA0075 Electronic computers. Computer science
Depositing User: John Carroll
Date Deposited: 11 Oct 2016 13:24
Last Modified: 03 Apr 2017 11:56
URI: http://sro.sussex.ac.uk/id/eprint/64696
