A study published earlier this month in BMJ Open found that primary care practitioners outperformed eight symptom-checking apps on both diagnostic accuracy and the safety of their advice.
The study found that the apps varied substantially in performance, but noted that the best-performing ones came close to general practitioners in including the correct diagnosis among their top-three and top-five suggestions.
“The nature of iterative improvements to software suggests that further improvements will occur with experience and additional evaluation studies,” wrote the research team.
WHY IT MATTERS
To evaluate the apps and the providers, the researchers created 200 clinical vignettes designed to include both common and less-common conditions relevant to primary care practice. The vignettes were written to represent real-world situations in which someone might seek medical information or advice from an app or a physician.
The vignettes included a patient’s age and sex, previous medical history, the primary complaint, current symptoms, and information to be provided “if asked” by the app or the provider. They were externally reviewed by two separate panels of three primary care practitioners, who set the “gold-standard” main diagnosis and triage level for the conditions described.
Based on the information provided in the vignettes, the general practitioners being tested were asked to provide a main diagnosis, up to five other differential diagnoses, and a triage level.
Meanwhile, each vignette was entered into eight symptom-checking apps. If an app did not allow entry of the vignette – such as if a hypothetical patient was not in its acceptable age range – the reason for this was recorded.
The practitioners outperformed the apps when it came to accuracy and safety. The researchers found that one app, Ada, was comparable to the providers when it came to including the gold-standard diagnosis among its top three and top five suggestions. Ada, Babylon and Symptomate also had the highest performance when it came to safe advice regarding the next steps a patient should take.
It is worth noting that the lead authors on the study are affiliated with Ada, which is based in Berlin. “[F]uture research by independent researchers should seek to replicate these findings and/or develop methods to continually test symptom assessment apps,” read the paper. Ada employees were also involved in the vignette creation process.
In addition, the team noted that some of the vignettes may have had a U.K. bias, and some of the apps – Buoy, K Health and WebMD – are primarily used in the United States.
“Future research should evaluate the performance of the apps compared with real-patient data – multiple separate single-app studies are a very unreliable way to determine the true level of the state of the art of symptom-assessment apps,” read the paper.
THE LARGER TREND
The coronavirus pandemic triggered a wave of symptom-checking apps, with a number of organizations launching chatbots or other tools to help users differentiate between ailments and connect with a healthcare provider if need be.
As members of the public grew more familiar with common COVID-19 symptoms, some companies began turning to symptom-checking apps to help ease their workforces back into the office.
Of course, such apps are effective only if users are symptomatic. Given that many people with COVID-19 don't have symptoms, the apps may not wholly prevent spread.
ON THE RECORD
“Against the background of an aging population and rising pressure on medical services, the last decade has seen the internet replace general practitioners as the first port of call for health information,” wrote the researchers. However, “online search tools like Google or Bing were not intended to provide medical advice and risk offering irrelevant or misleading information.”