
Blog post

Rigorous clinical validation of AI and machine learning technologies is vital to ensure devices are fit for clinical use

By Keith Errey, CEO, Isansys Lifecare

“Amazon, can you find me something to watch?” I ask clearly. Instantly, a list of recommendations flashes up on the screen thanks to a state-of-the-art recommendation engine. Everything Amazon, Netflix and NOW TV do is driven by data and powered by smart AI algorithms. The algorithms direct what you see next based on your previous viewing, buying and browsing behaviour. The companies use customer viewing data, search history and rating data, as well as the time, date and type of device being used, to predict what should be recommended. Everything is “personalised”. It’s all about “you”.
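
For illustration, the core idea is simple enough to sketch in a few lines of Python: score each candidate item by how well its attributes overlap with a user’s history. The catalogue, tags and scoring rule below are entirely invented; real engines operate at vastly larger scale with far richer signals.

```python
# Toy content-based recommender: rank unwatched titles by how much their
# tags overlap with the tags of everything the user has already watched.
# Catalogue, titles and tags are invented for illustration.
from collections import Counter

CATALOGUE = {
    "Space Docs": {"documentary", "science"},
    "Cosy Bakes": {"cooking", "competition"},
    "Star Chase": {"sci-fi", "action"},
    "Deep Ocean": {"documentary", "nature"},
}

def recommend(watched, top_n=2):
    """Rank unwatched titles by tag overlap with the viewing history."""
    profile = Counter(tag for title in watched for tag in CATALOGUE[title])
    candidates = [t for t in CATALOGUE if t not in watched]
    return sorted(candidates,
                  key=lambda t: sum(profile[tag] for tag in CATALOGUE[t]),
                  reverse=True)[:top_n]

print(recommend(["Space Docs"]))  # ['Deep Ocean', 'Cosy Bakes']
```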

The same goes for online shopping, search engines, social media, streaming music and browsing video content on YouTube. This pursuit of personalisation can bring enormous benefits, but some predict that companies like Amazon will eventually learn so much about you through smart AI algorithms that you won’t actually need to know what you want to buy: it will simply turn up on your doorstep. Even items you didn’t know you “need” but perhaps “desired” will be delivered without you ever placing an order. For busy individuals this may be great, but it can have unintended consequences, and these personalised recommendations can go horribly wrong. YouTube’s recommendation algorithm has actively pushed thousands of users towards suggestive videos that are completely inappropriate, especially for children. Other online platforms and search engines have in the past promoted terrorist content, foreign state-sponsored propaganda, extreme hatred and innumerable conspiracy theories.

One area in which AI has grown rapidly is healthcare. AI can be used to diagnose cancer[1], predict suicide[2] and assist in surgery[3], and the case for AI and machine learning in the industry grew even stronger as the world grappled with the global coronavirus pandemic. The data-intensive nature of healthcare makes it one of the most promising fields for the application of AI and machine learning algorithms[4], [5], [6]. However, as the use of technology and AI proliferates across the healthcare industry, mistakes become inevitable. This raises the question: what happens if, like consumer-focussed AI, it goes horribly wrong? Who is liable when AI makes a mistake that harms a patient, and what are the consequences? If the mistake came from a clinical decision support tool (CDST) assisting physicians, nurses, patients or other caregivers to make better decisions, the consequences could be disastrous.

Although CDSTs are built on very small datasets compared to those of the commercial world, they have the ability to analyse relatively large volumes of data in order to suggest next steps for treatment, flag potential problems and enhance care team efficiency.
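
The simplest CDSTs are little more than codified clinical rules. A toy sketch in Python might look like the following; the thresholds are illustrative placeholders, not clinically validated values.

```python
# Minimal rule-based decision support: flag out-of-range vital signs.
# All thresholds below are illustrative only, not clinically validated.

def flag_vitals(heart_rate, resp_rate, spo2):
    """Return human-readable alerts for a single set of observations."""
    alerts = []
    if heart_rate > 120 or heart_rate < 50:
        alerts.append(f"Heart rate out of range: {heart_rate} bpm")
    if resp_rate > 25:
        alerts.append(f"Elevated respiration rate: {resp_rate} breaths/min")
    if spo2 < 92:
        alerts.append(f"Low oxygen saturation: {spo2}%")
    return alerts

print(flag_vitals(heart_rate=130, resp_rate=18, spo2=90))
```

Real CDSTs go far beyond fixed thresholds, but even this toy version shows why clinical validation matters: whether an alert helps or harms depends entirely on how well its logic matches real patients.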

The Patient Status Engine is validated using real-world and clinical data

These systems can add significant value to the healthcare industry in terms of patient safety, clinical management, cost containment, administrative functions and diagnostic support. However, if they are poorly designed and not effectively validated in the real, and complex, world of healthcare, users may end up with a system that provides poor or compromised guidance. This is particularly true for CDSTs, where clinical validation is essential. It is important to note that this is a different and much lengthier process than mathematically validating the algorithm itself. For instance, a system designed for use at the point of care may, when applied in real-world environments, not be utilised properly or may not even work at all[7]. For this reason, most CDSTs must now be considered medical devices, even if they are “just software”, to ensure safety and efficacy, which is the essential function of medical device regulation.

For AI to flourish in healthcare, developers must establish a firm foundation of trust in their algorithms’ accuracy, objectivity, and reliability. And, for it to be adopted in clinical use, AI algorithms must demonstrate improvement in quality of care and patient outcomes[8], and increasingly, a tangible benefit in terms of health economics.

However, researchers reviewing the results of published studies over the past ten years have found that some work reported as valid and effective was not good enough: sample sizes were too small, and there was almost no clinical validation.

A recent study published in Nature Medicine found that as AI becomes embedded in more medical devices (the FDA approved over 65 AI devices last year), the accuracy of these algorithms is not necessarily being rigorously demonstrated[9].

The same study reported that, of 130 AI devices cleared by the FDA over the last five years, the vast majority (126 of 130) had only been tested retrospectively, making it difficult to know how well they operate in clinical and other real-world settings.
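
The distinction is easy to make concrete. A retrospective test scores a model against a held-out slice of archived data, as in the minimal sketch below (synthetic data; scikit-learn assumed to be installed). A prospective study instead locks the model and evaluates it on patients who arrive after development, under real clinical conditions, something no split of archived data can demonstrate.

```python
# Retrospective evaluation: training and test sets are both drawn from the
# same historical archive. Data here is entirely synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))              # synthetic "patient features"
y = X[:, 0] + rng.normal(size=1000) > 1.0   # synthetic outcome labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
print("Retrospective AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# A prospective study would freeze `model` here and score it only on new,
# unseen patients in live clinical use.
```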

A report published in the BMJ earlier this year said: “Proposed models are poorly reported and at high risk of bias, raising concern that their predictions could be unreliable when applied in daily practice[10].”

While machine learning holds great promise to find solutions for many healthcare problems, proper validation is critical before any claims of utility are made and before widespread adoption in clinical practice. If these goals can be achieved, the benefits for patients, care teams and provider organisations are likely to be transformational.

A ground-breaking study demonstrating how clinical validation can be achieved is the RAPID project at the Birmingham Women’s and Children’s NHS Foundation Trust (BWCH). The overall aim of the RAPID project is to establish that a new predictive indicator, using machine learning methods and data collected from body-worn wireless biosensors, can identify deterioration in paediatric patients in an acute care setting more quickly and precisely than current methods. This would allow more timely initiation of treatment and thereby reduce mortality and morbidity, together with the associated human and financial cost[11].

BWCH and Isansys Lifecare have placed efficacy, safety, and clinical relevance at the centre of the RAPID project

In this study, the hospital used the Patient Status Engine™, designed and developed by Isansys Lifecare, to enable clinicians to collect patients’ physiological data, including heart rate, respiration rate and oxygen saturation, automatically, continuously and wirelessly. Instead of being attached to cables and wires, patients wore unobtrusive Lifetouch™ smart patches and pulse oximeters, allowing them to move around more freely. The small sensors transmitted data wirelessly over Bluetooth to bedside monitors displaying real-time patient charts, which were in turn connected to central monitoring stations and the hospital IT network by Wi-Fi.
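
Schematically, that data flow can be sketched as below. This is a generic illustration of the pattern described, not Isansys’s actual protocol or software; the class and field names are invented.

```python
# Schematic of the described data flow: wearable sensors emit timestamped
# readings, a bedside gateway collects them, and a central station sees the
# merged stream. Names and structure are invented for illustration.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class VitalsReading:
    patient_id: str
    sensor: str        # e.g. "chest_patch" or "pulse_oximeter"
    metric: str        # e.g. "heart_rate", "resp_rate", "spo2"
    value: float
    timestamp: datetime

class BedsideGateway:
    """Stand-in for the bedside monitor: receives readings, forwards them."""
    def __init__(self, central_station):
        self.central_station = central_station  # stand-in for the Wi-Fi uplink

    def receive(self, reading):
        # In the real system this hop is Bluetooth in, Wi-Fi out.
        self.central_station.append(reading)

station = []
gateway = BedsideGateway(station)
gateway.receive(VitalsReading("pt-001", "chest_patch", "heart_rate", 98.0,
                              datetime.now(timezone.utc)))
print(len(station), "reading(s) at the central station")
```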

More than 1,000 patients were monitored, and around 100,000 hours of continuous patient data were recorded into a database in the hospital. An advanced predictive algorithm was developed by mathematicians and data scientists at BWCH and the nearby Aston University using this large physiological database, substantial parts of which (over 7 million minutes) served as the “training” data for the machine learning methods employed by the Aston and BWCH team.

This algorithm became the central core of the RAPID Index, a new personalised early warning score that provides, on an individual basis, the likelihood of that patient trending towards deterioration.
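
To make the general approach concrete, the sketch below shows the kind of pipeline a personalised early warning index could be built on: summarise a window of continuous vitals into features, then train a classifier whose predicted probability becomes the risk score. This is emphatically not the RAPID Index itself; the data, features and model choice are all invented for illustration.

```python
# Illustrative deterioration-risk pipeline (NOT the RAPID Index):
# window of vitals -> summary features -> classifier -> risk probability.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)

def window_features(hr, rr, spo2):
    """Collapse one monitoring window into a fixed-length feature vector."""
    return [hr.mean(), hr.std(), rr.mean(), rr.std(), spo2.min()]

# Synthetic training set: label 1 when an invented "deterioration" pattern
# (raised heart and respiration rates, falling SpO2) is present.
X, y = [], []
for _ in range(500):
    deteriorating = rng.random() < 0.2
    hr = rng.normal(140 if deteriorating else 110, 10, size=60)
    rr = rng.normal(30 if deteriorating else 22, 4, size=60)
    spo2 = rng.normal(93 if deteriorating else 98, 1.5, size=60)
    X.append(window_features(hr, rr, spo2))
    y.append(int(deteriorating))

model = GradientBoostingClassifier().fit(X, y)

# The "index" for a new window is the predicted probability of deterioration.
new_window = window_features(rng.normal(135, 10, 60),
                             rng.normal(28, 4, 60),
                             rng.normal(94, 1.5, 60))
print(f"Deterioration risk: {model.predict_proba([new_window])[0, 1]:.2f}")
```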

A key part of the study was the establishment of an accurate and comprehensive clinical history of each patient for subsequent analysis and input into the development of the mathematical model. This was carried out by research nursing and clinical staff working alongside their colleagues delivering standard care in the wards. The clinical history data set, based on standard observations, clinical knowledge and clinical judgement, was absolutely essential in order for the patterns emerging from the large physiological data sets to have clinical meaning.

Data being collected in real-time and continuously with the Patient Status Engine

Validation using real-world and clinical data is key to ensure that AI in healthcare can provide genuine insights. In the real world, health data is unstandardised, patient populations are diverse, and unconscious bias can often be reflected in the data that is used to build the AI models. Because most AI models are built on correlations, biased input data can lead to predictions that fail for certain populations or settings, and might even exacerbate existing inequalities and biases. As the AI industry tends to be gender and race imbalanced, and health professionals are already overwhelmed by other digital tools, there is little capacity to catch such errors arising from bias and lack of evidence. Knowing this, many clinicians rightly remain wary of AI and push back against “black-box” systems that purport to make clinical decisions using proprietary or opaque methods. Hence the need for transparency and validation of AI tools and methods through clinical trials and peer-reviewed publications.
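
One concrete step that follows from this is a subgroup audit: reporting a model’s performance separately for each patient population rather than as a single overall figure. The sketch below shows the idea on synthetic data in which the model’s scores are deliberately made noisier, and therefore less reliable, for an under-represented group.

```python
# Subgroup performance audit on synthetic data: the overall AUC can hide
# a large gap between a well-represented and an under-represented group.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 2000
group = rng.choice(["A", "B"], size=n, p=[0.8, 0.2])  # two demographics
y_true = rng.integers(0, 2, size=n)

# Model scores made deliberately noisier for the minority group "B".
noise = np.where(group == "A", 0.5, 1.5)
y_score = y_true + rng.normal(0.0, noise)

print(f"Overall AUC: {roc_auc_score(y_true, y_score):.2f}")
for g in ("A", "B"):
    mask = group == g
    print(f"Group {g} AUC: {roc_auc_score(y_true[mask], y_score[mask]):.2f}")
```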

BWCH and Isansys Lifecare have placed efficacy, safety, and clinical relevance at the centre of the RAPID project. Using the Patient Status Engine to monitor thousands of patients has ensured that wireless monitoring and the RAPID Index smart alarm are reliable, accurate and safe for young patients and provide earlier indications of deterioration than current early warning scoring methods. As it moves into its second phase, RAPID2, the clinical team and Isansys now aim to clinically and economically validate the system on 10,000 young patients, setting a new care standard for children in hospitals everywhere.

The continuing appetite for digital change and the utilisation of AI in healthcare is incredibly exciting. But it is only provider organisations like Birmingham Women’s and Children’s Hospital and companies like Isansys, those prepared to invest the time and considerable resources needed to drive the change, that will really make a difference to patients, their families, their care teams and to society as a whole.


Contacts

Georgina Horton
Press contact, Head of Communications
01235 436225

Providing extraordinary new digital medical tools to transform patient data into valuable clinical insight

Isansys provides remote monitoring solutions which improve patient care, drive clinical performance, ensure economic value and enhance patient safety.

Headquartered in Oxfordshire, with subsidiaries in Germany, India and the US, Isansys is a private limited company founded in 2010 by Keith Errey and Rebecca Weir.

Our business is to work with health professionals, insurance companies and healthcare organisations to provide more efficient patient monitoring against a backdrop of staff shortages, rising patient numbers and costs, and limited budgets.

All our products and devices are fully certified and qualified medical devices.

Isansys
8C Park Square
Oxford OX14 4RR
United Kingdom