KardiaBand More Accurate Than Apple Watch 4 in Diagnosing AF

The difference likely lies in how software interprets the ECG tracings. How busy clinicians act on the information is unclear.

Wearable devices promise the ability to promptly alert patients about potentially dangerous changes in heart rhythm, but new data from the SMART WARS study suggest that not all technologies are created equal.

Led by Christopher Ford, MD (Eastern Health Clinical School, Box Hill, Australia), researchers found that KardiaBand (AliveCor) outperformed the Apple Watch 4 (Apple) when it came to accuracy and sensitivity for detecting atrial fibrillation (AF) in a group of 125 patients tested with both devices. The differences stem not from the ECG-reading capability, they say, but appear to result from the proprietary algorithms used to interpret the information.

Their findings were published online recently in JACC: Clinical Electrophysiology.

“As cardiologists in the current era of readily-available smart devices, we are not infrequently presented with data collected by patients from these devices, but have no objective scientific studies to guide us on their accuracy or how the data should be interpreted, and whether action can be taken based on the results provided by them,” senior author Andrew W. Teh, MBBS, PhD (Eastern Health Clinical School and Austin Hospital Clinical School, Melbourne, Australia), told TCTMD.

While the idea of using these devices as adjunctive tools for diagnosing AF is appealing, their accuracy requires objective validation, he noted in an email.

Teh said that, going into their study, he’d expected to see similar performance by the two products they tested. Apple Watch 4 and KardiaBand, an accessory that attaches to Apple Watch, are each capable of obtaining single-lead ECG recordings. Each manufacturer’s software analyzes these recordings to produce an automatic rhythm assessment and reports the result to the user.

Exactly why they differed is a “complex question,” he said. But given the fact that ECG tracings were obtained in a consistent way, and “given the cardiologists who reviewed tracings from both devices were able to accurately diagnose AF and normal rhythm (sinus rhythm), the software algorithm used by each device seems most likely to be the distinguishing feature that caused the lower accuracy of the Apple Watch.”

Kalyanam Shivkumar, MD, PhD (University of California, Los Angeles), editor-in-chief of JACC: Clinical Electrophysiology, said the journal was drawn to the SMART WARS study because its topic is timely and “of great interest to everyone.” Ahead of the comparison between the two devices, there was no reason to think one would perform better than the other, he commented to TCTMD. “We didn’t have a bias going in.”

Shivkumar stressed that it’s important to know, irrespective of the specific brands, whether these “over-the-counter medical technologies in the diagnostic space” offer reliable information. “More and more of these devices are going to be used, so are we getting complete garbage?” he asked, adding that these data show that “it’s not garbage.”

KardiaBand’s automated algorithm did indeed outshine that of the Apple Watch 4, however, said Shivkumar. “That’s a useful point to file away in the back of our mind.”

The “joy of what will happen in the future,” he noted, is seeing how artificial intelligence, machine learning, and other approaches inform innovation in this area. “Automated rhythm-detection algorithms are a work in progress.”

Potential for Missed Diagnoses

For the SMART WARS study, researchers took consecutive outpatient clinic recordings from 125 patients (mean age 76 years; 62% men) using Apple Watch 4 and KardiaBand as well as 12-lead ECG. They analyzed both automated diagnoses and two cardiologists’ blinded interpretation of the wearables’ results.

A diagnosis of AF was confirmed in 27 patients (24 persistent and three paroxysmal) using 12-lead ECG, and atrial flutter in four patients, all of whom had preexisting diagnoses. The remaining patients were in sinus rhythm.

When assessing all recordings done by the wearable devices, 66% of those obtained by Apple Watch 4 and 74% of those obtained by KardiaBand were accurate. Excluding the inconclusive readings from analysis increased the diagnostic accuracy to 93% and 94%, respectively. Introducing clinician adjudication to the automated readings if the device offered no diagnosis—a hybrid approach—resulted in accuracy rates of 87% and 91%, respectively.

Diagnostic Accuracy vs 12-Lead ECG for AF Detection

                 Sensitivity   Specificity   PPV     NPV
Apple Watch 4
    Automated    50%           100%          100%    92%
    Hybrid       68%           93%           75%     90%
KardiaBand
    Automated    96%           93%           84%     99%
    Hybrid       94%           90%           76%     98%
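For readers who want to see how the four measures in the table relate to one another, the sketch below derives them from a 2x2 comparison of device readings against 12-lead ECG. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from SMART WARS, and the names `ConfusionCounts` and `diagnostic_metrics` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Device result vs. the reference standard (here, 12-lead ECG)."""
    tp: int  # device says AF, reference says AF
    fp: int  # device says AF, reference says no AF
    tn: int  # device says no AF, reference says no AF
    fn: int  # device says no AF, reference says AF


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV as fractions.

    Assumes every denominator is nonzero, i.e. there is at least one
    reference-positive, reference-negative, device-positive, and
    device-negative reading.
    """
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true AF that is detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-AF correctly cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of AF alerts that are real
        "npv": c.tn / (c.tn + c.fn),          # share of "no AF" results that are right
    }


# HYPOTHETICAL counts for illustration only, not study data.
example = ConfusionCounts(tp=8, fp=2, tn=18, fn=2)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

A low sensitivity paired with a high specificity, as reported for the Apple Watch 4 automated readings, corresponds to a large `fn` count and a small `fp` count in this framing: few false alarms, but many missed cases.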


Agreement between the KardiaBand algorithm and 12-lead ECG for AF diagnosis was “excellent,” the researchers say, with a kappa (κ) value of 0.82. Between the Apple Watch 4 algorithm and ECG, agreement was “fair,” with a κ value of 0.64. Adding clinician input on top of these readings produced κ values of 0.75 and 0.78, respectively.
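Assuming the agreement statistic reported here is Cohen's kappa, which is the usual choice for comparing two raters on a yes/no diagnosis, it can be computed from the same kind of 2x2 table. The sketch below is a minimal illustration with hypothetical counts, not study data; the function name `cohens_kappa` is illustrative.

```python
def cohens_kappa(tp: int, fp: int, tn: int, fn: int) -> float:
    """Cohen's kappa for two binary raters (e.g. device vs. 12-lead ECG).

    Kappa compares observed agreement with the agreement expected by
    chance given each rater's overall rate of positive calls.
    Assumes chance agreement is below 1 (both raters use both labels).
    """
    n = tp + fp + tn + fn
    observed = (tp + tn) / n

    # Chance agreement: both say "AF" by chance plus both say "no AF" by chance.
    device_pos, device_neg = tp + fp, tn + fn
    reference_pos, reference_neg = tp + fn, tn + fp
    expected = (device_pos * reference_pos + device_neg * reference_neg) / (n * n)

    return (observed - expected) / (1 - expected)


# HYPOTHETICAL counts for illustration only, not study data.
kappa = cohens_kappa(tp=8, fp=2, tn=18, fn=2)
print(f"kappa = {kappa:.2f}")
```

Because kappa discounts agreement that would occur by chance, it can be noticeably lower than raw accuracy when one rhythm, such as sinus rhythm, dominates the sample.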

Overall, they conclude, “these findings suggest that although these devices’ tracings are of sufficient quality, automated diagnosis alone is not sufficient for making clinical decisions about atrial fibrillation diagnosis and management.”

Ford et al highlight the mere 50% sensitivity of Apple Watch 4 with the automated readings, citing the risk of missed diagnoses. The only remedy when the device doesn’t offer a clear diagnosis is manual interpretation of tracings, they say. “This has potential to create a high workload for clinicians, as they may not be able to rely upon the device’s automated diagnosis of sinus rhythm.”

It would seem that patients agree on the need for physician input. A survey of study participants found 87% said they’d consider seeking medical advice if their device had persistently abnormal readings, while most expected their cardiologist (85%) or general practitioner (74%) to review their smartphone results. Moreover, they expected feedback within 16 days on average (median 1 day).

Shivkumar also drew attention to Apple Watch 4’s lack of sensitivity in this study, though he said this is less concerning in a screening tool than it would be with a device designed to monitor patients with a known disease. “Some of these kinds of wearables will get to the point where you can actually even use it for medical indications, where you’re really nervous to monitor arrhythmia burden and so forth, where you’re going to make therapeutic decisions,” he said.

As of now, “what these technologies do is bring a problem to medical attention. But once it comes to attention . . . how you act on that information will probably require a higher threshold,” he noted.

Shivkumar says he frequently hears from patients with questions about their smartwatch results. “All the time people come with this, and our #1, #2, [and] #3 response when we see [them] come in is: are you really having an arrhythmia? . . . That’s where the specificity number becomes interesting. There’s generally something that triggered it,” he explained, but often that something turns out to be normal sinus rhythm.

“Ultimately,” said Shivkumar, “we as physicians are going to act on the information, and what matters for us are the true positives.”

Evolving Software, Hardware

AliveCor stopped selling the KardiaBand watch accessory in 2019, several months after the Apple Watch 4’s ECG feature entered the market, but the company continues to sell various KardiaMobile devices that interact with smartphone apps. In April 2021, AliveCor also “filed a complaint with the US International Trade Commission (ITC), alleging Apple’s infringement of three AliveCor patents,” a company press release notes. An ITC investigation commenced in May 2021. Late last month, a federal judge refused to dismiss an antitrust lawsuit alleging Apple has tried to monopolize the market for heart rate analysis.

Looking to the future, Teh predicted that with further advances in hardware and software, diagnostic accuracy of wearables will improve. “However, the ability for a healthcare professional to review the information will remain important when key clinical decision-making is required such as the diagnosis of AF, which may lead to significant implications for treatment. For now, the devices can’t be relied upon to routinely replace standard diagnostic tools such as 12-lead ECGs and 24-hour Holter monitoring,” he added.

Teh said he respects the investment involved in developing sophisticated technologies for heart rate assessment, and as such appreciates why the details would be proprietary. Still, he added, “I think our study provides some objective data, which the companies may use to further refine their algorithms.”

If software can be updated without the need for consumers to purchase new devices for each version, it’s possible that algorithms will be optimized more regularly, the paper points out. “However, this in turn makes real-world validation of these devices extremely difficult, as each software iteration would require its own validation study, which is costly and time-consuming.”

With the latest-generation smartwatches arriving, too, on a yearly basis, it can be hard to keep up, the investigators write. “All of this leaves clinicians with much uncertainty surrounding these devices’ accuracy, despite a growing expectation from our patients to tailor their management to the results.”

Caitlin E. Cox is News Editor of TCTMD and Associate Director, Editorial Content at the Cardiovascular Research Foundation. She produces the…

Disclosures
  • The study received funding from the Eastern Health Foundation.
  • Ford and Teh report no relevant conflicts of interest.
