University of Sussex
W14-3411.pdf (169.98 kB)

Chunking clinical text containing non-canonical language

presentation
posted on 2023-06-08, 20:35 authored by Aleksandar Savkov, John Carroll, Jackie Cassell
Free text notes typed by primary care physicians during patient consultations typically contain highly non-canonical language. Shallow syntactic analysis of free text notes can help to reveal valuable information for the study of disease and treatment. We present an exploratory study into chunking such text using off-the-shelf language processing tools and pre-trained statistical models. We evaluate chunking accuracy with respect to part-of-speech tagging quality, choice of chunk representation, and breadth of context features. Our results indicate that narrow context feature windows give the best results, but that chunk representation and minor differences in tagging quality do not have a significant impact on chunking accuracy.
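Two of the variables named in the abstract, chunk representation and breadth of context features, can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: the function names, the feature naming scheme, the `<PAD>` symbol, and the example tokens are all hypothetical, and IOB2 and IOBES are used simply as two common chunk label schemes (the abstract does not say which representations were compared).

```python
from typing import Dict, List


def window_features(tokens: List[str], pos_tags: List[str],
                    i: int, width: int) -> Dict[str, str]:
    """Collect word and POS features for positions i-width .. i+width.

    A smaller `width` corresponds to a narrower context window.
    Positions outside the sentence are filled with a padding symbol.
    """
    feats: Dict[str, str] = {}
    for offset in range(-width, width + 1):
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"w[{offset}]"] = tokens[j].lower() if inside else "<PAD>"
        feats[f"pos[{offset}]"] = pos_tags[j] if inside else "<PAD>"
    return feats


def iob2_to_iobes(labels: List[str]) -> List[str]:
    """Convert IOB2 chunk labels to IOBES (S = single-token, E = chunk end)."""
    out: List[str] = []
    for k, label in enumerate(labels):
        if label == "O":
            out.append(label)
            continue
        prefix, kind = label.split("-", 1)
        nxt = labels[k + 1] if k + 1 < len(labels) else "O"
        continues = nxt == f"I-{kind}"  # does the same chunk go on?
        if prefix == "B":
            out.append(f"B-{kind}" if continues else f"S-{kind}")
        else:
            out.append(f"I-{kind}" if continues else f"E-{kind}")
    return out


# Hypothetical note fragment with made-up tags and chunk labels.
tokens = ["pt", "c/o", "chest", "pain"]
pos_tags = ["NN", "VB", "NN", "NN"]
iob2 = ["B-NP", "B-VP", "B-NP", "I-NP"]

narrow = window_features(tokens, pos_tags, 0, 1)
iobes = iob2_to_iobes(iob2)
```

With `width=1` each token contributes six features; widening the window adds two features per extra position on each side, which is the kind of breadth the abstract reports as unhelpful for this text.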

Funding

The ergonomics of electronic patient records: an interdisciplinary development of methodologies for understanding and exploiting free text to enhance the utility of primary care electronic patient records; G0011; WELLCOME TRUST; 086105/Z/08/Z

History

Publication status

  • Published

File Version

  • Published version

Page range

613-623

Presentation Type

  • paper

Event name

13th Workshop on Biomedical Natural Language Processing (BioNLP)

Event location

Baltimore, MD

Event type

workshop

Event date

26-27 Jun 2014

Department affiliated with

  • BSMS Publications

Full text available

  • Yes

Peer reviewed?

  • Yes

Legacy Posted Date

2015-04-24

First Open Access (FOA) Date

2015-04-24

First Compliant Deposit (FCD) Date

2015-04-23
