Can we evaluate the quality of generated text?

Scott, D; Hardcastle, D

File(s) not publicly available

presentation

posted on 2023-06-08, 07:02 authored by D Scott, D Hardcastle

Evaluating the output of natural language generation (NLG) systems is notoriously difficult, and performing assessments of text quality even more so. A range of automated and subject-based approaches to the evaluation of text quality has been taken, including comparison with a putative gold standard text, analysis of specific linguistic features of the output, expert review and task-based evaluation. In this paper we present the results of a variety of such approaches in the context of a case study application. We discuss the problems encountered in the implementation of each approach in the context of the literature, and propose that a test based on the Turing test for machine intelligence offers a way forward in the evaluation of the subjective notion of text quality.

History

Publication status

Published

Publisher URL

http://www.sussex.ac.uk/Users/drs22/publications/HardcastleScott-lrec08.pdf

Presentation Type

paper

Event name

6th Language Resources and Evaluation Conference (LREC'08)

Event location

Marrakech, Morocco

Event type

conference

Department affiliated with

Informatics Publications

Full text available

No

Peer reviewed?

Yes

Legacy Posted Date

2012-02-06

Licence

Copyright not evaluated
