Detecting a continuum of compositionality in phrasal verbs

McCarthy, Diana, Keller, Bill and Carroll, John (2003) Detecting a continuum of compositionality in phrasal verbs. In: Workshop on Multi-Word Expressions: Analysis, Acquisition and Treatment (ACL 2003), Sapporo, Japan.

Full text not available from this repository.


We investigate the use of an automatically acquired thesaurus for measures designed to indicate the compositionality of candidate multiword verbs, specifically English phrasal verbs identified automatically using a robust parser. We examine various measures using the nearest neighbours of the phrasal verb and, in some cases, the neighbours of its simplex counterpart, and show that some of these correlate significantly with human rankings of compositionality on the test set. We also show that, whilst the compositionality judgements correlate with some statistics commonly used for extracting multiwords, the relationship is not as strong as that obtained using the automatically constructed thesaurus.
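As an illustration of the kind of measure the abstract describes, the sketch below counts the neighbours shared by a phrasal verb and its simplex counterpart and rank-correlates that score with human compositionality judgements. It is a minimal sketch under stated assumptions, not the measures from the paper: the function names, the toy thesaurus entries, the human scores, and the choice of a Spearman rank correlation are all hypothetical and invented for this example.

```python
from typing import Dict, List, Sequence


def neighbour_overlap(phrasal: Sequence[str], simplex: Sequence[str], n: int = 30) -> int:
    """Count neighbours shared by the top-n lists of a phrasal verb and its simplex verb."""
    return len(set(phrasal[:n]) & set(simplex[:n]))


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy thesaurus: each verb maps to its nearest neighbours, closest first.
# These entries are invented for illustration only.
THESAURUS: Dict[str, List[str]] = {
    "eat up": ["consume", "devour", "eat", "swallow"],
    "eat": ["consume", "devour", "swallow", "drink"],
    "blow up": ["explode", "detonate", "inflate", "blast"],
    "blow": ["blast", "puff", "breathe", "sweep"],
    "give up": ["abandon", "quit", "surrender", "renounce"],
    "give": ["offer", "provide", "grant", "hand"],
}

# Hypothetical human compositionality scores (higher = more compositional).
HUMAN: Dict[str, float] = {"eat up": 8.0, "blow up": 4.0, "give up": 1.0}

phrasal_verbs = list(HUMAN)
scores = [
    neighbour_overlap(THESAURUS[pv], THESAURUS[pv.split()[0]])
    for pv in phrasal_verbs
]
rho = spearman(scores, [HUMAN[pv] for pv in phrasal_verbs])
print(dict(zip(phrasal_verbs, scores)), rho)
```

The intuition behind the overlap score is that a compositional phrasal verb should keep much of the distributional behaviour of its simplex verb, so the two neighbour lists should share members. A rank correlation is used here because human judgements of this kind are ordinal, so only the ordering of items is compared.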

Item Type: Conference or Workshop Item (Paper)
Additional Information:
  Originality: Describes an original approach to determining the degree to which multi-word expressions (phrasal verbs) are compositional in meaning, based on an automatically acquired thesaurus. Proposes a continuum of compositionality.
  Rigour: Evaluated on a novel dataset with human judgements of compositionality showing a highly significant figure for inter-annotator agreement. Highly significant correlations were obtained between the human judgements and measures proposed in the paper.
  Significance: The methodology and dataset have been taken up by other researchers, though to date several of the measures proposed have not been outperformed on this data. Other researchers have adapted the methodology to detect compositionality of other multiword constructions.
  Impact: 38 Google Scholar citations (not counting two cites by co-authors). The dataset has been made publicly available and several international researchers have used it in subsequent experiments.
  Outlet: Appeared in the first of a series of four workshops to date in the burgeoning field of multiword expressions. The workshop forms part of the ACL conference.
Schools and Departments: School of Engineering and Informatics > Informatics
Depositing User: Bill Keller
Date Deposited: 06 Feb 2012 18:28
Last Modified: 27 Mar 2012 08:10