Using machine learning to extract information and predict outcomes from reports of randomised trials of smoking cessation interventions in the Human Behaviour-Change Project

Abstract

Background

Using reports of randomised trials of smoking cessation interventions as a test case, this study aimed to develop and evaluate machine learning (ML) algorithms for extracting information from study reports and predicting outcomes as part of the Human Behaviour-Change Project. It is the first of two linked papers, with the second paper reporting on further development of a prediction system.

Methods

Researchers manually annotated 70 items of information (‘entities’) in 512 reports of randomised trials of smoking cessation interventions, using the Behaviour Change Intervention Ontology. The entities covered intervention content and delivery, population, setting, outcome and study methodology. These annotations were used to train ML algorithms to extract the information automatically. The information extraction algorithm was a named-entity recognition system built with the ‘FLAIR’ framework. The manually annotated intervention, population, setting and study entities were also used to develop a deep-learning algorithm, with multiple layers of long short-term memory (LSTM) components, to predict smoking cessation outcomes.
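To make the annotation-to-training step concrete, the following is a minimal sketch of how annotated entity spans are commonly turned into token-level BIO tags for training a named-entity recognition model. It is an illustration under stated assumptions, not the project's actual pipeline: the BIO scheme, the example sentence, the span offsets and the labels `DOSE` and `DELIVERY` are all hypothetical and do not come from the study or from the Behaviour Change Intervention Ontology.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, label) token spans into BIO tags.

    Tokens outside every span are tagged "O"; the first token of a span is
    tagged "B-<label>" and the remaining tokens of that span "I-<label>".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical sentence and annotations, for illustration only.
tokens = "Participants received four sessions of telephone counselling".split()
spans = [(2, 4, "DOSE"), (5, 7, "DELIVERY")]

for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

A sequence tagger trained on such token and tag pairs can then propose spans in unseen reports, which are compared against the human annotations.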

Results

The F1 evaluation score, derived from the false positive and false negative rates (range 0–1), for the information extraction algorithm averaged 0.42 across different types of entity (SD=0.22, range 0.05–0.88) compared with an average human annotator’s score of 0.75 (SD=0.15, range 0.38–1.00). The algorithm for assigning entities to study arms (e.g., intervention or control) was not successful. This initial ML outcome prediction algorithm did not outperform prediction based just on the mean outcome value or a linear regression model.
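For readers unfamiliar with the metric, the sketch below shows the standard definition of F1 as the harmonic mean of precision and recall, computed from counts of true positives, false positives and false negatives. The counts in the example are invented purely to produce a value of 0.42; they are not the study's data, and the function name `f1_score` is an illustrative choice.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall.

    tp: entities extracted that match a human annotation
    fp: entities extracted that match no human annotation
    fn: human annotations the extractor missed
    """
    if tp == 0:
        # No correct extractions: precision and recall are zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 42 correct, 58 spurious, 58 missed.
print(round(f1_score(tp=42, fp=58, fn=58), 2))
```

Because F1 ignores true negatives, it suits extraction tasks where almost all of the text is not an entity.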

Conclusions

While some success was achieved in using ML to extract information from reports of randomised trials of smoking cessation interventions, we identified major challenges that could be addressed by greater standardisation in the way that studies are reported. Outcome prediction from smoking cessation studies may benefit from development of novel algorithms, e.g., using ontological information to inform ML (as reported in the linked paper³).

Most cited references 33

Long Short-Term Memory

Sepp Hochreiter, Jürgen Schmidhuber (1997)

Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

GloVe: Global Vectors for Word Representation

Jeffrey Pennington, Richard Socher, Christopher Manning (2014)

Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature

T. Chai, R. Draxler (2014)

Both the root mean square error (RMSE) and the mean absolute error (MAE) are regularly employed in model evaluation studies. Willmott and Matsuura (2005) have suggested that the RMSE is not a good indicator of average model performance and might be a misleading indicator of average error, and thus the MAE would be a better metric for that purpose. While some concerns over using RMSE raised by Willmott and Matsuura (2005) and Willmott et al. (2009) are valid, the proposed avoidance of RMSE in favor of MAE is not the solution. Citing the aforementioned papers, many researchers chose MAE over RMSE to present their model evaluation statistics when presenting or adding the RMSE measures could be more beneficial. In this technical note, we demonstrate that the RMSE is not ambiguous in its meaning, contrary to what was claimed by Willmott et al. (2009). The RMSE is more appropriate to represent model performance than the MAE when the error distribution is expected to be Gaussian. In addition, we show that the RMSE satisfies the triangle inequality requirement for a distance metric, whereas Willmott et al. (2009) indicated that the sums-of-squares-based statistics do not satisfy this rule. In the end, we discussed some circumstances where using the RMSE will be more beneficial. However, we do not contend that the RMSE is superior over the MAE. Instead, a combination of metrics, including but certainly not limited to RMSEs and MAEs, are often required to assess model performance.
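The difference between the two metrics discussed above can be seen in a short sketch. The function names and the example values are illustrative assumptions, not taken from the cited note; the point is only that squaring gives large errors more weight in the RMSE than in the MAE.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error: square root of the mean squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: mean of the absolute differences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical observed and predicted values; the last prediction is far off.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 50.0]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

For any set of errors the RMSE is at least as large as the MAE, and the gap widens as the errors become more unequal, which is one reason to report both.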

Author and article information

Contributors

Robert West. Roles: Conceptualization, Formal Analysis, Methodology, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

ORCID: https://orcid.org/0000-0001-6398-0921

Francesca Bonin. Roles: Data Curation, Formal Analysis, Methodology, Software, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

James Thomas. Roles: Conceptualization, Data Curation, Formal Analysis, Funding Acquisition, Methodology, Software, Supervision, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

ORCID: https://orcid.org/0000-0003-4805-4190

Alison J. Wright. Roles: Data Curation, Formal Analysis, Supervision, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

Pol Mac Aonghusa. Roles: Conceptualization, Data Curation, Software, Supervision, Visualization, Writing – Review & Editing

ORCID: https://orcid.org/0000-0002-7640-9668

Martin Gleize. Roles: Formal Analysis, Software, Writing – Review & Editing

Yufang Hou. Roles: Formal Analysis, Software, Writing – Review & Editing

Alison O'Mara-Eves. Roles: Formal Analysis, Writing – Review & Editing

ORCID: https://orcid.org/0000-0002-0359-6423

Janna Hastings. Roles: Conceptualization, Data Curation, Formal Analysis, Software, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

ORCID: https://orcid.org/0000-0002-3469-4923

Marie Johnston. Roles: Conceptualization, Writing – Review & Editing

ORCID: https://orcid.org/0000-0003-0124-4827

Susan Michie. Roles: Conceptualization, Funding Acquisition, Project Administration, Supervision, Validation, Writing – Original Draft Preparation, Writing – Review & Editing

ORCID: https://orcid.org/0000-0003-0063-6378

Journal

Journal ID (nlm-ta): Wellcome Open Res

Journal ID (iso-abbrev): Wellcome Open Res

Title: Wellcome Open Research

Publisher: F1000 Research Limited (London, UK)

ISSN (Electronic): 2398-502X

Publication date (Electronic): 12 October 2023

Publication date Collection: 2023

Volume: 8

Electronic Location Identifier: 452

Affiliations

[1] Research Department of Behavioural Science and Health, University College London, London, England, UK

[2] IBM Research Europe, Dublin, Ireland

[3] EPPI-Centre, Social Research Institute, University College London, London, England, UK

[4] Institute of Pharmaceutical Science, King's College London, London, England, UK

[5] Institute for Implementation Science in Health Care, Faculty of Medicine, University of Zurich, Zürich, Zurich, Switzerland

[6] School of Medicine, University of St Gallen, St. Gallen, St. Gallen, Switzerland

[7] Aberdeen Health Psychology Group, University of Aberdeen, Aberdeen, Scotland, UK

[8] Centre for Behaviour Change, University College London, London, England, UK

[1] Department of Philosophy, University at Buffalo, Buffalo, New York, USA

[1] University of Queensland, Herston, Queensland, Australia

[1] Department of Health Law, Policy and Management, Boston University, Boston, Massachusetts, USA

Author notes

[a] robertwest100@gmail.com

Competing interests: RW and SM are unpaid directors of the Unlocking Behaviour Change Community Interest Company.

Competing interests: No competing interests were disclosed.

Article

DOI: 10.12688/wellcomeopenres.20000.1

PMC ID: 11109593

PubMed ID: 38779058

SO-VID: b1d41baf-e85d-4ce0-83db-b7085fb667e2

License:

This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date accepted: 18 September 2023

Funding

Funded by: Wellcome Trust

Award ID: 201524

This work was supported by Wellcome (201524, https://doi.org/10.35802/201524; a collaborative award to the Human Behaviour-Change Project (HBCP): Building the science of behaviour change for complex intervention development).

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
