University of Sussex

Data augmentation for hypernymy detection

conference contribution
posted on 2023-06-09, 22:57, authored by Thomas Kober, Julie Weeds, Lorenzo Scott Bertolini, David Weir
The automatic detection of hypernymy relationships represents a challenging problem in NLP. The successful application of state-of-the-art supervised approaches using distributed representations has generally been impeded by the limited availability of high quality training data. We have developed two novel data augmentation techniques which generate new training examples from existing ones. First, we combine the linguistic principles of hypernym transitivity and intersective modifier-noun composition to generate additional pairs of vectors, such as “small dog - dog” or “small dog - animal”, for which a hypernymy relationship can be assumed. Second, we use generative adversarial networks (GANs) to generate pairs of vectors for which the hypernymy relation can also be assumed. We furthermore present two complementary strategies for extending an existing dataset by leveraging linguistic resources such as WordNet. Using an evaluation across 3 different datasets for hypernymy detection and 2 different vector spaces, we demonstrate that both of the proposed automatic data augmentation and dataset extension strategies substantially improve classifier performance.
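
The first augmentation technique described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on word strings rather than on the distributional vectors the paper uses, and the function name `augment_pairs` and the toy inputs are hypothetical. It only shows how intersective modifier-noun composition and hypernym transitivity together yield new pairs such as "small dog - dog" and "small dog - animal".

```python
from itertools import product


def augment_pairs(hypernym_pairs, modifiers):
    """Generate extra (hyponym, hypernym) pairs from existing ones.

    hypernym_pairs: iterable of (hyponym, hypernym) strings, e.g. ("dog", "animal")
    modifiers: iterable of modifiers assumed to be intersective, e.g. "small"
    """
    augmented = set()
    for (hypo, hyper), mod in product(hypernym_pairs, modifiers):
        phrase = f"{mod} {hypo}"
        # Intersective composition: a "small dog" is still a "dog".
        augmented.add((phrase, hypo))
        # Transitivity: "small dog" -> "dog" -> "animal".
        augmented.add((phrase, hyper))
    return sorted(augmented)


# Toy data, assumed for illustration only.
new_pairs = augment_pairs([("dog", "animal")], ["small"])
for hyponym, hypernym in new_pairs:
    print(f"{hyponym} - {hypernym}")
```

In a vector-based setting, each phrase would additionally need a composed representation; the choice of composition function is left open here because the abstract does not specify it.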

History

Publication status

  • Published

File Version

  • Accepted version

Journal

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Publisher

Association for Computational Linguistics

Page range

1034-1048

Event name

16th Conference of the European Chapter of the Association for Computational Linguistics (EACL)

Event location

Kyiv / online

Event type

conference

Event date

19-23 April 2021

Department affiliated with

  • Informatics Publications

Full text available

  • Yes

Peer reviewed?

  • Yes

Legacy Posted Date

2021-02-03

First Open Access (FOA) Date

2021-04-27

First Compliant Deposit (FCD) Date

2021-02-02
