COMMENTARY

Mar 22, 2024 This Week in Cardiology Podcast

John M. Mandrola, MD

March 22, 2024

Please note that the text below is not a full transcript and has not been copyedited. For more insight and commentary on these stories, subscribe to the This Week in Cardiology podcast, download the Medscape app or subscribe on Apple Podcasts, Spotify, or your preferred podcast provider. This podcast is intended for healthcare professionals only.

In This Week’s Podcast

For the week ending March 22, 2024, John Mandrola, MD, comments on the following news and features stories: Obesity drugs as atherosclerotic cardiovascular disease (ASCVD)-modifiers, heart rate (HR) monitors, when journals publish obvious facts, and effect scores and sorting out signals from randomized controlled trials (RCTs).

An Obesity Treatment that reduces CV Outcomes

Last week, the US Food and Drug Administration (FDA) approved semaglutide to reduce major adverse cardiac events (MACE) in patients with obesity and established heart disease.

This decision stems from the 17,000-patient SELECT trial. It’s a big deal because FDA approval begins the process of clearing this drug for insurance coverage.

One of my six lectures during my trip to Denmark this past week was titled, “Should Doctors Prescribe Weight-Loss Therapies to Patients With HF?”

I reviewed SELECT, and it’s worth briefly going over now, because it is a true paradigm-shifting trial.

  • Patients were about 62 years old; 72% were male. The mean body mass index was 33. Two-thirds qualified for the study due to previous myocardial infarction (MI); 18% qualified due to previous stroke.

  • 90% of patients were on lipid-lowering meds, 86% on antiplatelet drugs. This is notable because the gains from semaglutide happened while patients were already on disease-modifying drugs.

  • After a mean follow-up of 3.3 years, a primary outcome event of cardiovascular (CV) death, MI, or stroke occurred in 6.5% of those on semaglutide vs 8.0% of those on placebo. The hazard ratio (HR) was 0.80, with a confidence interval (CI) of 0.72-0.90 and a highly significant P-value. (A quick back-of-the-envelope calculation of the absolute benefit appears after this list.)

  • The first component of that primary outcome, and the first confirmatory secondary endpoint, was CV death. It was 2.5% (semaglutide) vs 3.0% (placebo); HR 0.85, with a CI of 0.71-1.01. That’s kind of important, because the reduction in CV death was quite small and the upper bound of the CI was greater than 1. The P-value was 0.07, and since this was a confirmatory endpoint, the prespecified P threshold for superiority was 0.023. So, while the primary endpoint was significant, the first confirmatory secondary endpoint was not.

  • I mention this technical detail because the rules of the trial said that if a confirmatory endpoint did not reach significance, superiority testing was not performed for the remaining confirmatory secondary endpoints. That’s important because the HRs for the heart failure composite and all-cause death both looked statistically significant in favor of semaglutide. Death occurred in 4.3% (semaglutide) vs 5.2% (placebo), an absolute difference larger than that for CV death.
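To make the size of that primary-endpoint benefit concrete, here is a quick back-of-the-envelope calculation from the published event rates. This is a minimal arithmetic sketch, nothing more.

```python
# Back-of-the-envelope arithmetic on the SELECT primary endpoint,
# using the published event rates over a mean 3.3 years of follow-up.
placebo_rate = 0.080      # 8.0% had a primary outcome event on placebo
semaglutide_rate = 0.065  # 6.5% on semaglutide

arr = placebo_rate - semaglutide_rate  # absolute risk reduction
nnt = 1 / arr                          # number needed to treat
rrr = arr / placebo_rate               # relative risk reduction

print(f"ARR: {arr:.1%}")                  # 1.5%
print(f"NNT over ~3.3 years: {nnt:.0f}")  # about 67
print(f"RRR: {rrr:.1%}")                  # about 19%, consistent with HR 0.80
```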

A subgroup analysis showed no obvious heterogeneous treatment effects. The weight reduction was about 8.5%, and it was remarkably consistent throughout the trial.

Adverse effects leading to stopping the medication or placebo were 16% vs 8% and were dominated by gastrointestinal disorders. There were no differences in cancer diagnoses.

Comments. The interpretation here is easy. Semaglutide, in this patient population, is a disease-modifying treatment. Period.

It reduced MACE on top of other disease-modifying therapy. The risk reduction is modest but clinically meaningful; in fact, a 20% relative reduction in MACE is similar to that of statins. And it was statistically persuasive.

There is debate about whether the benefit stems from pleiotropic effects of GLP-1 agonism or comes simply through weight loss.

  • Use of these drugs generates a lot of discussion. Appropriate concerns are long-term side effects (remember, trials only go for about 3 years), the high costs, and, of course, the matter of weight regain after stopping the drug.

  • One of the lecturers I listened to this week asked us to think of weight regain after stopping the GLP-1 agonists as similar to the rise of LDL cholesterol after stopping a cholesterol-lowering drug.

  • Another doctor remarked that how we think of weight regain after stopping the drug turns on whether we consider obesity a medical condition or a behavioral condition. If it’s a medical condition, we simply use GLP-1 agonists as we use lipid-lowering drugs: long-term.

  • But if we think of obesity as a behavioral condition, we hope to use the drugs to change behavior in ways that might allow patients to come off them. However, if the GLP-1 agonist has important non-weight-related effects, then perhaps it should be used long-term. There is uncertainty here.

To be honest, tomorrow in clinic, I don’t think there is much mental struggle in this case. These are 60-ish-year-old patients who have established heart disease. If your patient fits this study’s entry criteria and has obesity, it’s likely that semaglutide gives them a statin-like risk reduction.

Translating these data to younger, healthier patients, or to children, is unwise. We know that GLP-1 agonists reduce weight in those groups, but we don’t know whether that translates to fewer CV events. SELECT enriched its population by recruiting patients with established CVD. A CV outcomes trial in a group with a lower placebo event rate would surely require many more patients.

Wearable HR Monitor Accuracy

Canadian authors from Toronto and Calgary (first author Ryan Quinn) have a short but important research letter in the Journal of the American College of Cardiology (JACC) regarding the accuracy of wearable HR monitors in sinus rhythm (SR) and atrial fibrillation (AF).

This should be one of the foundational principles in cardiology, so I hesitate to even give it a sentence, but when you measure a pulse at the periphery of the body — usually the wrist — during AF with higher rates, there is likely to be what’s called a pulse deficit.

Meaning, when you feel a pulse, you are measuring the arterial pulse wave that stems from cardiac contraction. In SR and at reasonable rates, it’s a great surrogate for HR, because one pulse equals one QRS complex, or heartbeat.

In AF, the rhythm is irregularly irregular, and when there are two tightly coupled heartbeats, the second one may not generate a good pulse in the periphery.

Yet wrist smartwatches that display HRs are common. Fitness trackers most often rely on photoplethysmography (PPG) to calculate HR; in other words, they measure a pulse at the wrist.

  • The research team, led by Paul Dorian, wondered how the PPG monitors compared with a standard ECG during exercise.

  • They studied patients who were scheduled for routine treadmill tests; 81 patients had device HRs compared with the ECG. They studied six different wearable HR monitors.

  • It wasn’t good for the fitness trackers.

  • At rest, the mean of absolute differences between the ECG-measured HR and the device-displayed rate was 4.6 ± 8.4 beats/min in SR and 7.0 ± 11.8 beats/min in AF.

  • At peak exercise, these differences were substantially larger: 13.8 ± 18.9 beats/min in SR and 28.7 ± 23.7 beats/min in AF (P < 0.01).

  • As for correlation, this too was bad. At rest, the correlation coefficient was 0.931 for SR and 0.504 for AF.

  • During exercise, the correlation coefficient was 0.726 for SR and 0.301 for AF. As the HR measured by ECG increases, the device-detected HR becomes less accurate, especially for subjects in AF and during exercise. (For the curious, a short sketch of how these two accuracy metrics are computed follows this list.)
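For readers who want the metrics spelled out, below is a minimal sketch of how a mean absolute difference and a Pearson correlation coefficient between paired ECG and device readings would be computed. The arrays are made-up illustration data, not the study’s.

```python
import numpy as np

# Hypothetical paired readings in beats/min -- illustration only,
# NOT the study's data.
ecg_hr = np.array([72, 95, 110, 128, 143, 156])     # reference ECG
device_hr = np.array([70, 90, 118, 115, 160, 139])  # wrist PPG device

# Mean of absolute differences (reported in the letter as mean +/- SD)
abs_diff = np.abs(ecg_hr - device_hr)
print(f"mean abs diff: {abs_diff.mean():.1f} +/- {abs_diff.std(ddof=1):.1f} beats/min")

# Pearson correlation coefficient between the two measurements
r = np.corrcoef(ecg_hr, device_hr)[0, 1]
print(f"correlation coefficient: {r:.3f}")
```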

Comments. This is nice work. It’s a modest experiment, but its systematic nature allows us to confirm what we all suspected: watch-based HRs during AF are not reliable enough to act on.

If your patient with AF brings you one of those graphs from a fitness app that shows good HR control, it would be wise to confirm this with a proper ECG tracing.

The authors don’t write this, but I suspect a chest strap monitor would do better, as would a watch capable of recording an actual ECG.

I am pretty negative about many aspects of digital health, because of the medicalization of normal life, but being mad about smartwatches, etc., is like being mad at rain. So, studies like this help us better understand new technology.

When Top Journals Publish Established Facts

Journals must be struggling to publish science. JACC is one of the top journals in cardiology. It has now published an observational study that finds — sit down for this — among patients with atrial fibrillation (AF), treatment with antiarrhythmic drugs (AADs) is associated with higher rates of bradycardia, syncope, and pacer implantation.

And the authors’ concluding sentence reads: “Precise evaluation of such risk should be undertaken before prescription of AADs.”

  • I mean, come on. We learned in medical school, before there was an Internet, that nearly every AF rate or rhythm control drug predisposes to bradycardia.

  • But even more core to the job of clinical medicine is to always balance benefits and harms of a proposed intervention.

  • My friends, bradycardia is the chief adverse effect of nearly every heart rhythm drug. Dofetilide, a pure class III drug, is the only exception I can think of.

  • Flecainide, propafenone, sotalol, dronedarone, amiodarone, all predispose to bradycardia. You could almost say bradycardia is an effect as well as an adverse effect. 

  • Brady-effects are literally the number one thing we think about when using AF drugs. And it has been this way for at least half a century.

  • I mean no malice to the authors, and I totally agree with their findings, as would hundreds of thousands of doctors worldwide, now and in 1990.

My questions are why do such a study, and why would JACC publish this and promote it?

This podcast would not be worth much if I did not say what I thought. So, the thought entered my mind — I am not saying it is true, it is just a thought — that AADs are a competitor to AF ablation. And there are no industry ads for AADs but there are tons of ads for ablation.

Anyway, I hope the business model had nothing to do with publishing a study that equates with finding that walking in the rain is associated with being wet.

Do all Patients in an RCT Get the Average Effect?

The number one job of an evidence-based medicine clinician is applying trial data to the person in your exam room.

RCTs provide treatment effects of an intervention. We then take that effect and apply it to our patient. But. There are oodles of complexities along this path, from the effect size in a trial to our individual patient.

For one, trials produce an average effect size. Say there are 10,000 patients in a trial; there could be groups of patients benefiting a lot and groups that are harmed.

  • The best example of this is the DANISH trial of implantable cardioverter defibrillators (ICDs) in patients with nonischemic cardiomyopathy; 1100 patients were enrolled with half randomly assigned to ICD and half to standard care.

  • The average effect size was a nonsignificant 13% relative reduction. The Kaplan-Meier curves were close. Essentially, it was a negative trial.

  • But the subgroup analysis showed a strong benefit for younger patients and a trend toward harm in older patients. Benefit and harm average to null.

  • We say there was likely heterogeneity of treatment effect (HTE). That’s jargon for saying the same intervention can exert different effects based on patient characteristics. (A toy simulation of how benefit and harm can wash out appears after this list.)
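Here is a tiny simulation of the averaging-to-null idea. The event probabilities are invented for illustration; this is not DANISH data, just a sketch of the arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy simulation, roughly DANISH-sized; event probabilities are invented.
# Suppose the ICD truly helps younger patients and truly harms older ones.
n_per_arm = 550
p_control = 0.25      # event probability without an ICD
p_icd_young = 0.18    # real benefit in the younger half of the ICD arm
p_icd_old = 0.32      # real harm in the older half of the ICD arm

icd_young = rng.random(n_per_arm // 2) < p_icd_young
icd_old = rng.random(n_per_arm // 2) < p_icd_old
control = rng.random(n_per_arm) < p_control

icd_rate = (icd_young.sum() + icd_old.sum()) / n_per_arm
print(f"ICD arm event rate:     {icd_rate:.3f}")
print(f"Control arm event rate: {control.mean():.3f}")
# The two arms land close together: real benefit in one subgroup and
# real harm in the other average out to a near-null overall effect.
```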

Experienced clinicians can sometimes inherently sense that our patient in the office is different in important ways from the average person in the trial.

Subgroup analyses (which are usually Table 3 or 4 in papers) attempt to get at this, but a subgroup analysis only uses one thing, such as diabetes or no diabetes. Or a blood pressure (BP) cutoff. One thing.

The problem is that trials are powered to sort signal from noise in the overall population, but subgroups are smaller slices of that population. Hence, leaning too hard on subgroups is going to give lots of false negatives and false positives (like the astrological-sign signals seen with aspirin in ISIS-2). The toy simulation below shows how easily chance alone produces such signals.
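This is a minimal sketch, assuming a truly null treatment sliced into 12 arbitrary subgroups, in the spirit of the ISIS-2 zodiac example. All numbers are invented.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(42)
n = 10_000  # simulated trial in which the treatment truly does nothing

treated = rng.integers(0, 2, n).astype(bool)
event = rng.random(n) < 0.10       # 10% event rate regardless of arm
sign = rng.integers(0, 12, n)      # 12 arbitrary subgroups ("zodiac signs")

false_positives = 0
for s in range(12):
    m = sign == s
    t, e = treated[m], event[m]
    table = [[np.sum(t & e), np.sum(t & ~e)],
             [np.sum(~t & e), np.sum(~t & ~e)]]
    _, p, _, _ = chi2_contingency(table)
    if p < 0.05:
        false_positives += 1

# With 12 subgroup tests at alpha = 0.05, chance alone produces at least
# one "significant" subgroup in roughly half of simulated trials like this.
print(f"{false_positives} of 12 truly null subgroups crossed p < 0.05")
```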

JAMA has published perhaps a better way to look for these special patients who may benefit more, or less, than average.

It’s a complicated paper, and it involves computer science, machine learning, and modeling, but the idea is truly beautiful. I admit to being intrigued, perhaps because I have so little experience with the inner workings of machine learning.

Here’s the deal.  It’s called an effect score. A group of authors did it with two separate trials of O2 saturation targets in patients with critical illness requiring intubation and mechanical ventilation.

One paragraph of background: the ideal O2 saturation target for a patient on a ventilator is a matter of great debate. When I did critical care medicine back in training, we thought higher sats were better because, of course, more oxygen in the body must be good.

But trials have not borne that out. Titrating to lower O2 saturation targets has proved similar to higher targets in trials. But. Are these trials negative because some patients are harmed by lower targets while others benefit, and, consequently, should clinicians try to individualize targets? Or should we just assume most or all patients in the trial got the average (or null) treatment effect, and then translate that to our intensive care units?

Well, observational studies have suggested that the optimal oxygen targets may differ — say, in those with hypoxic brain injury vs those with septic shock.

The study in JAMA, from a vast group of authors (first author Kevin Buell, University of Chicago; senior author Mathew Churpek, Melbourne, Australia), set out to use machine learning on patient characteristics and effect sizes in two oxygen-target RCTs.

  • I will surely oversimplify it, but the way I understand it is that they took one trial and used machine learning to associate groups of patient characteristics with treatment effect, forming three strata: patients predicted to do better with a lower saturation target, those predicted to have no difference, and those predicted to do better with a higher target.

  • Basically, this was a subgroup analysis but instead of one feature, they used many features, determined by machine learning. I guess you’d say they ‘derived’ the groups from one trial.

  • Then came validation. They took these “effect scores” and applied them in a similar but different RCT. Lo and behold, the effect scores derived from one trial predicted HTE in the second trial.

The specific results were:

The predicted effect of treatment with a lower vs higher Spo2 target for individual patients ranged from a 27.2% absolute reduction to a 34.4% absolute increase in 28-day mortality. For example, patients predicted to benefit from a lower Spo2 target had a higher prevalence of acute brain injury, whereas patients predicted to benefit from a higher Spo2 target had a higher prevalence of sepsis and abnormally elevated vital signs.

That is a huge spread. It’s enticing.
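Before going on, here is how I picture the derive-in-one-trial, validate-in-another logic, reduced to a toy. To be clear, this is my oversimplification with made-up data and an off-the-shelf approach (a so-called T-learner), not the authors’ actual pipeline; every variable name here is hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical stand-ins for two oxygen-target RCTs (invented data).
# X: patient characteristics; arm: 1 = lower SpO2 target; y: 28-day death.
rng = np.random.default_rng(1)

def fake_trial(n):
    X = rng.normal(size=(n, 5))              # e.g., age, labs, vital signs
    arm = rng.integers(0, 2, n)
    # Invented truth: feature 0 flips who benefits from the lower target.
    p = 0.3 + 0.05 * arm * np.sign(X[:, 0])
    return X, arm, rng.random(n) < p

X1, arm1, y1 = fake_trial(2000)   # "derivation" trial
X2, arm2, y2 = fake_trial(2000)   # "validation" trial

# T-learner: model outcome risk separately in each arm of trial 1.
m_lo = GradientBoostingClassifier().fit(X1[arm1 == 1], y1[arm1 == 1])
m_hi = GradientBoostingClassifier().fit(X1[arm1 == 0], y1[arm1 == 0])

# Effect score = predicted mortality difference (lower minus higher target).
score = m_lo.predict_proba(X2)[:, 1] - m_hi.predict_proba(X2)[:, 1]

# Validation: within score-defined strata of trial 2, compare observed arms.
for name, m in [("predicted benefit from lower", score < -0.02),
                ("predicted no difference", np.abs(score) <= 0.02),
                ("predicted harm from lower", score > 0.02)]:
    obs = y2[m & (arm2 == 1)].mean() - y2[m & (arm2 == 0)].mean()
    print(f"{name}: observed mortality difference {obs:+.3f}")
```

If the score carries real signal, the observed arm differences should track the predicted strata in the second trial, which is essentially the pattern the JAMA authors report.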

To test the waters, I put a Tweet out Wednesday saying that this looked cool. Boom.

The evidence-based medicine clinicians I interact with on X all cautioned me to be very skeptical, not only of this method, with its fancy computer science, but of the whole idea of trying to find such signals in trials. The message I heard from some very senior doctors was to assume that most patients in trials get the average effect. And if you find potential HTE, you should not act on it until it is proven in future trials.

The authors of the oxygen saturation effect-score study say this explicitly in the paper: proof of non-average effects in trials requires confirmation in future trials.

Still though, I am not so negative. I want to remind you of a nice paper in the European Journal of Preventive Cardiology in 2018.

Lafflin and colleagues studied individual patient data from the ACCORD and SPRINT trials (both BP-target trials) and beautifully showed that you could use patient characteristics in ACCORD to create a SPRINT trial score — a measure of how far an ACCORD trial subject was from the average SPRINT subject, based on six characteristics.

They did this and learned that in ACCORD, which found no difference with lower BP goals, there were plenty of patients like those in SPRINT, and for those patients, lower BP was better. Take a look at this paper. It’s really cool. A toy version of such a score appears below.
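I don’t know the exact construction Lafflin and colleagues used, but one plausible version of a “distance from the average SPRINT subject” score is a mean standardized difference across the six characteristics. In this sketch, the characteristics, means, and SDs are all invented for illustration.

```python
import numpy as np

# Hypothetical means/SDs of six SPRINT characteristics (illustrative only):
# age, systolic BP, eGFR, total cholesterol, HDL, urine ACR.
sprint_mean = np.array([68.0, 140.0, 72.0, 190.0, 53.0, 42.0])
sprint_sd   = np.array([ 9.0,  16.0, 20.0,  41.0, 14.0, 166.0])

def sprint_distance_score(patient):
    """Mean absolute z-score distance from the average SPRINT subject.

    Lower scores mean the patient looks more like the SPRINT average,
    where intensive BP lowering proved beneficial.
    """
    z = (np.asarray(patient) - sprint_mean) / sprint_sd
    return np.abs(z).mean()

# An ACCORD-like patient (values invented for illustration).
patient = [62.0, 139.0, 91.0, 183.0, 42.0, 30.0]
print(f"distance from average SPRINT subject: {sprint_distance_score(patient):.2f}")
```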

I am a novice here, so let me know what you think: is the average effect size all we should assume, or can we find important signals of HTE in trials?

Teaser

I have run out of time but want to preview a study I will discuss next week. It is a comprehensive meta-analysis of only RCTs that studies the interaction of heart failure and AF with use of mineralocorticoid receptor antagonist (MRA) drugs. The study’s first author is Alireza Oraii and the senior author is William McIntyre, from McMaster. It’s in the European Heart Journal. It’s good. And I call MRAs my ‘secret weapon.’
