Savkov, Aleksandar, Carroll, John and Cassell, Jackie (2014) Chunking clinical text containing non-canonical language. In: 13th Workshop on Biomedical Natural Language Processing (BioNLP), 26-27 Jun 2014, Baltimore, MD.
PDF (Published Version), available under a Creative Commons Attribution-NonCommercial-ShareAlike license.
Abstract
Free text notes typed by primary care physicians during patient consultations typically contain highly non-canonical language. Shallow syntactic analysis of free text notes can help to reveal valuable information for the study of disease and treatment. We present an exploratory study into chunking such text using off-the-shelf language processing tools and pre-trained statistical models. We evaluate chunking accuracy with respect to part-of-speech tagging quality, choice of chunk representation, and breadth of context features. Our results indicate that narrow context feature windows give the best results, but that chunk representation and minor differences in tagging quality do not have a significant impact on chunking accuracy.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Schools and Departments: | Brighton and Sussex Medical School > Brighton and Sussex Medical School; Brighton and Sussex Medical School > Primary Care and Public Health; School of Engineering and Informatics > Informatics |
Subjects: | Q Science > QA Mathematics > QA0075 Electronic computers. Computer science |
Depositing User: | John Carroll |
Date Deposited: | 24 Apr 2015 08:36 |
Last Modified: | 24 Apr 2015 08:36 |
URI: | http://sro.sussex.ac.uk/id/eprint/53736 |

Project Name | Sussex Project Number | Funder | Funder Ref |
---|---|---|---|
The ergonomics of electronic patient records: an interdisciplinary development of methodologies for understanding and exploiting free text to enhance the utility of primary care electronic patient records | G0011 | WELLCOME TRUST | 086105/Z/08/Z |