University of Sussex

Automatic identification of information quality metrics in health news stories

journal contribution
posted on 2023-06-09, 22:33 authored by Majed Al-Jefri, Roger Evans, Joon Lee, Pietro Ghezzi
Objective: Many online and print media outlets publish health news of questionable trustworthiness, and laypersons may find it difficult to judge the information quality of such articles. The purpose of this work was to propose a methodology for automatically assessing the quality of health-related news stories using natural language processing and machine learning.

Materials and Methods: We used a database from the website HealthNewsReview.org, which aims to improve the public dialogue about health care. HealthNewsReview.org developed a set of criteria to critically analyze claims about health care interventions. In this work, we attempt to automate the evaluation process by identifying the indicators of those criteria using natural language processing-based machine learning on a corpus of more than 1,300 news stories. We explored features ranging from simple n-grams to more advanced linguistic features and optimized the feature selection for each task. Additionally, we experimented with the pre-trained language model BERT.

Results: For some criteria, such as mention of costs, benefits, harms, and “disease-mongering,” the evaluation results were promising, with an F1 measure reaching 81.94%, while for others the results were less satisfactory owing to the dataset size, the need for external knowledge, or the subjectivity of the evaluation process.

Conclusion: The criteria used here are more challenging than those addressed by previous work, and our aim was to investigate how much more difficult the machine learning task was, and how and why it varied between criteria. For some criteria, the obtained results were promising; however, automated evaluation of the other criteria may not yet replace the manual evaluation process, in which human experts interpret text senses and make use of external knowledge in their assessment.
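As a rough illustration of two ideas named in the abstract, word n-gram features and the F1 measure, the following minimal Python sketch may help. It is not the authors' pipeline: the cue list, the toy stories, their labels, and the helper names (`word_ngrams`, `mentions_cost`, `f1_score`) are all hypothetical, and a simple cue lookup stands in for the trained classifiers and BERT model described in the paper.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count lowercase word n-grams for n = 1..n_max (whitespace tokenization)."""
    tokens = text.lower().split()
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


# Hypothetical cue n-grams for a "mentions costs" criterion (an assumption,
# not taken from the paper or from HealthNewsReview.org).
COST_CUES = {"cost", "costs", "price", "per month"}


def mentions_cost(text):
    """Toy baseline: predict 1 if any cue n-gram occurs in the story, else 0."""
    grams = word_ngrams(text)
    return int(any(cue in grams for cue in COST_CUES))


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy stories with invented gold labels (1 = criterion satisfied).
stories = [
    "The new drug costs 500 dollars per month",
    "Researchers say the pill cures headaches",
    "Patients pay a high price for the treatment",
    "The therapy is expensive",
]
gold = [1, 0, 1, 1]
predicted = [mentions_cost(s) for s in stories]
score = f1_score(gold, predicted)
```

The last toy story is labeled as satisfying the criterion but contains none of the cue n-grams, so the baseline misses it. This is the kind of case where surface features fall short and interpretation of the text is needed.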

History

Publication status

  • Published

File Version

  • Published version

Journal

Frontiers in Public Health

ISSN

2296-2565

Publisher

Frontiers Media

Volume

8

Page range

1-10

Article number

a515347

Department affiliated with

  • Clinical and Experimental Medicine Publications

Full text available

  • Yes

Peer reviewed?

  • Yes

Legacy Posted Date

2020-12-18

First Open Access (FOA) Date

2020-12-18

First Compliant Deposit (FCD) Date

2020-12-18
