University of Sussex

Corpus annotation as a scientific task

presentation
posted on 2023-06-08, 11:44 authored by Donia Scott, Rossano Barone, Rob Koeling
Annotation studies in computational linguistics (CL) are generally unscientific: they are mostly not reproducible, make use of too few (and often non-independent) annotators, and rely on guidelines that are often something of a moving target. Additionally, the notion of ‘expert annotators’ invariably means only that the annotators have linguistic training. While this can be acceptable in some special contexts, it is often far from ideal. This is particularly so when subtle judgements are required or when, as is increasingly common, one makes use of corpora originating from technical texts that have been produced by, and are intended to be consumed by, an audience of technical experts in the field. We outline a more rigorous approach to collecting human annotations, using as our example a study designed to capture judgements on the meaning of hedge words in medical records.

History

Publication status

  • Published

File Version

  • Published version

Presentation Type

  • paper

Event name

The Eighth International Conference on Language Resources and Evaluation (LREC'2012)

Event location

Istanbul, Turkey

Event type

conference

Event date

23-27th May, 2012

Department affiliated with

  • Informatics Publications

Full text available

  • Yes

Peer reviewed?

  • Yes

Legacy Posted Date

2012-06-01

First Open Access (FOA) Date

2012-06-01

First Compliant Deposit (FCD) Date

2012-05-31
