Automatic identification of information quality metrics in health news stories

Al-Jefri, Majed, Evans, Roger, Lee, Joon and Ghezzi, Pietro (2020) Automatic identification of information quality metrics in health news stories. Frontiers in Public Health, 8. a515347 1-10. ISSN 2296-2565

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Objective: Many online and print media publish health news of questionable trustworthiness, and it may be difficult for laypersons to determine the information quality of such articles. The purpose of this work was to propose a methodology for the automatic assessment of the quality of health-related news stories using natural language processing and machine learning.

Materials and Methods: We used a database from HealthNewsReview.org, a website that aims to improve the public dialogue about health care. HealthNewsReview.org developed a set of criteria for critically analyzing claims about health care interventions. In this work, we attempt to automate the evaluation process by identifying indicators of those criteria using natural language processing-based machine learning on a corpus of more than 1,300 news stories. We explored features ranging from simple n-grams to more advanced linguistic features and optimized the feature selection for each task. Additionally, we experimented with the pre-trained language model BERT.
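
As a rough illustration of the simplest feature type mentioned above, the following sketch counts word n-grams in a piece of text. It is only a minimal example under stated assumptions: the function name `word_ngrams`, the lowercasing and whitespace tokenization, and the sample sentence are all hypothetical and are not taken from the paper, which does not describe its feature extraction in this abstract.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Count word n-grams of length n_min..n_max.

    Assumption: tokens are produced by lowercasing and splitting on
    whitespace; a real pipeline would likely use a proper tokenizer.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of n tokens across the token list.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical sentence, not drawn from the HealthNewsReview.org corpus.
features = word_ngrams("The drug costs about 300 dollars per month")
```

Counts like these could serve as input features for a per-criterion classifier, for example one that predicts whether a story mentions costs.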

Results: For some criteria, such as mention of costs, benefits, harms, and “disease-mongering,” the evaluation results were promising, with an F1 measure reaching 81.94%; for others, the results were less satisfactory because of the dataset size, the need for external knowledge, or the subjectivity of the evaluation process.
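
For readers unfamiliar with the F1 measure reported above, the sketch below computes it for a binary task as the harmonic mean of precision and recall. The labels are invented for illustration, and the abstract does not say which averaging scheme (for example, per-class or macro-averaged F1) the authors used, so this shows only the basic single-class definition.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and predictions (1 = criterion satisfied).
gold = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 1]
score = f1_score(gold, predicted)
```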

Conclusion: The criteria used here are more challenging than those addressed by previous work, and our aim was to investigate how much more difficult the machine learning task was, and how and why the difficulty varied between criteria. For some criteria, the results were promising; however, automated evaluation of the other criteria may not yet replace the manual evaluation process, in which human experts interpret the meaning of the text and draw on external knowledge in their assessment.

Item Type: Article
Keywords: information online, health information, News, Natural Language Processing, Machine Learning
Schools and Departments: Brighton and Sussex Medical School > Clinical and Experimental Medicine
SWORD Depositor: Mx Elements Account
Depositing User: Mx Elements Account
Date Deposited: 18 Dec 2020 12:41
Last Modified: 18 Dec 2020 12:45
URI: http://sro.sussex.ac.uk/id/eprint/95917
