Recorded March 6, 2023. This transcript has been edited for clarity.
Robert A. Harrington, MD: Hi. This is Bob Harrington on theheart.org | Medscape Cardiology, and I'm here at the American College of Cardiology meetings in New Orleans, having a great time, by the way. It's really fun to be back live, in person, getting to see friends and colleagues, seeing live presentations, etc. If you've not been to a live meeting yet over the course of the past couple of years, please do start coming again, whether it's American College of Cardiology, American Heart Association, or European Society of Cardiology. It's fantastic.
Putting that aside, I've been learning many things at this meeting, particularly around machine learning, artificial intelligence (AI), and some of the advanced computational tools that people in the data-science space are using.
I'm fortunate to have an expert and, really, a rising thought leader in this field, Dr Jenine John. Jenine is a machine-learning research fellow at Brigham and Women's Hospital, working in Calum MacRae's research group.
What she talked about on stage this morning is what you have to know about this whole field. I thought we'd go through some of the basic concepts of data science, what machine learning is, what AI is, and what neural networks are.
How do we start to think about this? As practitioners, we're going to be faced with how to incorporate some of this into our practice. You're seeing machine-learning algorithms put into your clinical operations. You're starting to see ways that people are thinking about, for example, Can the machine read the echocardiogram as well as we can? What's appropriate for the machine? What's appropriate for us? What's the oversight of all of this?
We'll have a great conversation for the next 12-20 minutes and see what we can all learn together. Jenine, thank you for joining us here today.
Jenine John, MD: Thank you for having me.
From Epidemiology to Machine Learning
Harrington: Before we get into the specifics of machine learning and what you need to know, give me a little bit of your story. You obviously did an internal medicine residency. You did a cardiology fellowship. Now, you're doing an advanced research fellowship. When did you get bitten by the bug to want to do data science, machine learning, etc.?
John: It was quite late, actually. After cardiology fellowship, I went to Brigham and Women's Hospital for a research fellowship. I started off doing epidemiology research, and I took classes at the public health school.
Harrington: The classic clinical researcher.
John: Exactly. That was great because I gained a foundation in epidemiology and biostatistics, which I believe is essential for anyone doing clinical research. In 2019, I was preparing to write a K grant, and for my third aim, I thought, Oh, I want to make this complex model that uses many variables. This thing called machine learning might be helpful. I basically just knew the term but didn't know much about it.
I talked to my program director who led me to Dr Rahul Deo and Dr Calum MacRae's group that's doing healthcare AI. Initially, I thought I would just collaborate with them.
Harrington: Have their expertise brought into your grant and help to elevate the whole grant? That's the typical thing to do.
John: Exactly. As I learned a bit more about machine learning, I realized that this is a skill set I should really try to develop. I moved full-time into that group and learned how to code and create machine-learning models specifically for cardiac imaging. Six months later, the pandemic hit, so everything took a shift again.
I believe it's a shift for the better because I was exposed to everything going on in digital health and healthcare startups. There was suddenly an interest in monitoring patients remotely and using tech more effectively. I also became interested in how we are applying AI to healthcare and how we can make sure that we do this well.
Harrington: There are a couple of things that I want to expand on. Maybe we'll start this way. Let's do the definitions. How would you define AI and its role in medicine? And then, talk about a subset of that. Define machine learning for the audience.
AI vs Machine Learning
John: Artificial intelligence and machine learning, the two terms are used pretty much synonymously within healthcare, because when we talk about AI in healthcare, really, we're talking about machine learning. Some people use the term AI differently. They feel that it's only if a system is autonomously thinking independently that you can call it AI. For the purposes of healthcare, we pretty much use them synonymously.
Harrington: For what we're going to talk about today, we'll use them interchangeably.
John: Yes, exactly.
Harrington: Define machine learning.
John: Machine learning is when a machine uses data and learns from the data. It picks up patterns, and then, it can basically produce output based on those patterns.
Harrington: Give me an example that will resonate with a clinical audience. You're an imager, and much of the work so far has been in imaging.
John: Imaging is really where machine learning shines. For example, you can use machine learning on echocardiograms, and you can use it to pick up whether this patient has valvular disease or not. If you feed an AI model enough echocardiograms, it'll start to pick up the patterns and be able to tell whether this person has valvular disease or not.
Harrington: The group that you're working with has been very prominent in being able to say whether they have hypertrophic cardiomyopathy, valve disease, or amyloid infiltrative disease.
There are enough data there that the machine starts to recognize patterns.
John: Yes.
Harrington: You said that you were, at the Harvard School of Public Health, doing what I'll call classic clinical research training. I had the same training. I was a fellow 30-plus years ago in the Duke Databank for Cardiovascular Diseases, and it was about epidemiology and biostatistics and how to then apply those to the questions of clinical research.
You were doing very similar things, and you said something this morning in your presentation that stuck with me. You said you really need to understand these things before you make the leap into trying to understand machine learning. Expand on that a little bit.
John: I think that's so important because right now, what seems to happen is you have the people — the data scientists and clinicians — and they seem to be speaking different languages. We really need more collaboration and getting on the same page. When clinicians go into data science, I think the value is not in becoming pure data scientists and learning to create great machine-learning models. Rather, it's bringing that clinical thinking and that clinical research thinking, specifically, to data science. That's where epidemiology and biostatistics come in because you really need to understand those concepts so that you understand which questions you should be asking. Are you using the right dataset to ask those questions? Are there biases that could be present?
Harrington: Every week, as you know, we all pick up our journals, and there's a machine-learning paper in one of the big journals all the time. Some of the pushback you'll hear, whether it's on social media or in letters to the editors, is why did you use machine learning for this? Why couldn't you use classical logistic regression?
One of the speakers in your session, I thought, did a nice job of that. He said that often, standard conventional statistics are perfectly fine. Then there are some instances where the machine is really better, and imaging is a great example. Would you talk to the audience a little bit about that?
John: I see it more as a continuum. I think it's helpful to see it that way because right now, we see traditional biostatistics and machine learning as completely different. Really, it's a spectrum of tools. There are simple machine-learning methods where you can't really differentiate much from statistical methods, and there's a gray zone in the middle. For simpler data, such as tabular data, maybe.
Harrington: Give the audience an example of tabular data.
John: For example, if you have people who have had a myocardial infarction (MI), and then you have characteristics of those individuals, such as age, gender, and other factors, and you want to use those factors to predict who gets an MI, in that instance, traditional regression may be best. When you get to more complex data, that's where machine learning really shines. That's where it gets exciting because they are questions that we haven't been able to ask before with the methods that we have. Those are the questions that we want to start using machine learning to answer.
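To make the tabular-data case concrete, here is a minimal Python sketch of a traditional logistic regression fitted to a handful of rows. Everything in it is an assumption for illustration: the rows are invented, "smoker" stands in for the "other factors" mentioned above, and the function names and centering constants are hypothetical. It is not a model discussed in this conversation.

```python
import math

# Invented toy rows: (age in years, smoker flag, had_MI outcome).
# These are NOT real patient data; they only illustrate the table shape.
ROWS = [
    (42, 0, 0), (48, 0, 0), (51, 1, 1), (55, 0, 0),
    (58, 1, 0), (63, 0, 1), (67, 1, 1), (72, 1, 1),
]

def features(age, smoker):
    # Intercept term, age centered at 57 and scaled per decade, smoker flag.
    return (1.0, (age - 57.0) / 10.0, float(smoker))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, lr=0.1, epochs=5000):
    """Fit an intercept and two coefficients by plain gradient descent."""
    w = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for age, smoker, y in rows:
            x = features(age, smoker)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j in range(3):
                grad[j] += (p - y) * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def predict_risk(w, age, smoker):
    """Predicted probability of MI for one hypothetical individual."""
    x = features(age, smoker)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

weights = fit_logistic(ROWS)
```

With a few columns per person, each fitted coefficient can be read directly as the contribution of one variable, which is part of why regression remains a reasonable default for this kind of data.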
Harrington: We've all seen papers again over the past few years. The Mayo Group has published a series of these about information that you can derive from the EKG. You can derive, for example, potassium levels from the EKG. Not the extremes that we've all been taught, but subtle perturbations. I think I knew this, but I was still surprised to hear it when one of your co-speakers said that there are over 30,000 data points in the typical EKG.
There's no way you can use conventional statistics to understand that.
John: Exactly. One thing I was a little surprised to see is that machine learning does quite well with estimating the age of the individual on the EKG. If you show a cardiologist an EKG, we could get an approximate estimate, but we won't be as good as the machine. Modalities like EKG and echocardiogram, which have so many more data points, are where the machine can find patterns that even we can't figure out.
Harrington: The secret is to ingest a huge amount of data. One of the things that people will ask me is, "Well, why is this so hot now?" It's hot now for a couple of reasons, one of which is that there's an enormous amount of data available. Almost every piece of information can be put into zeros and ones. Then there's cloud computing, which allows the machine to ingest this enormous amount of information.
You're not going to tell the age of a person from a handful of EKGs. It's thousands to millions of EKGs that machines evaluated to get the age. Is that fair?
John: This is where we talk about big data because we need large amounts of data for the machine to learn how to interpret these patterns. It's one of the reasons I'm excited about AI because it's stimulating interest in multi-institution collaborations and sharing large datasets.
We're annotating, collecting, and organizing these large multi-institutional datasets that can be used for a variety of purposes. We can use the full range of analytic approaches, machine learning or not, to learn more about patients and how to care for them.
Harrington: I've heard both Calum and Rahul talk about how they can get echocardiograms, for example, from multiple institutions. As the machine gets better and better at reading and interpreting the echocardiograms or looking for patterns of valvular heart disease, they can even take a more limited imaging dataset and apply what they've learned from the larger expanded dataset, basically to improve the reading of that echocardiogram.
One of the things it's going to do, I think, is open up the opportunity for more people to contribute their data beyond the traditional academics.
John: Because so much data are needed for AI, there's a role for community centers and other institutions to contribute data so that we can make robust models that work not only in a few academic centers but also for the majority of the country.
Hype vs Hope of AI
Harrington: There are two more topics I want to cover. We've been, in some ways, talking about the hope of what we're going to use this for to make clinical medicine better. There's also what's been called the hype, the pitfalls, and the perils. Then I want to get into what you need to know, particularly if you're a resident, fellow, or junior faculty member.
Let's do the perils and the hype. I hear from clinicians, particularly clinicians of my generation, that this is just a black box. How do I know it's right? People point to, for example, the Epic Sepsis Model, which failed miserably, with headlines all over the place. They worry about how they know whether it's right.
John: That's an extremely important question to ask. We're still in the early days of using AI and trying to figure out the pitfalls and how to avoid them. I think it's important to ask along the way, for each study, what is going on here. Is this a model that we can trust and rely on?
I also think that it's not inevitable that AI will transform healthcare just yet because we are so early on, and there is hype. There are some studies that aren't done well. We need more clinicians understanding machine learning and getting involved in these discussions so that we can lead the field and actually use the AI to transform healthcare.
Harrington: As you push algorithms into the healthcare setting, how do we evaluate them to make sure that the models are robust, that the data are representative, and that the algorithm is giving us, I'll call it, the right answer?
John: That's the tough part. I think one of the tools that's important is a prospective trial. That means not just creating an algorithm and implementing it right away, but studying how it performs. Is it actually working prospectively before we implement it?
We also need to understand that in healthcare, we can't necessarily accept the black box. We need explainability and interpretability, to get an understanding of the variables that are used, how they're being used within the algorithm, and how they're being applied.
One example that I think is important is that Optum created a machine-learning model to predict who was at risk for medical complications and high healthcare expenditures. The model did well, so they used the model to determine who should get additional resources to prevent these complications.
It turns out that African Americans were utilizing healthcare less, so their healthcare expenditure was lower. Because of that, the algorithm was saying these are not individuals who need additional resources.
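The mechanism described above, where spending is used as a stand-in for medical need, can be sketched in a few lines of Python. The patients, groups, and dollar figures below are invented for illustration and are not taken from the actual model or its data. The only assumption built in is that group B has the same illness burden as group A but lower spending.

```python
# Invented patients: (id, group, chronic_conditions, annual_cost_usd).
# Group B mirrors group A's illness burden but spends less on care,
# mimicking lower utilization. None of these values are real.
PATIENTS = [
    ("a1", "A", 5, 12000), ("a2", "A", 3, 7000), ("a3", "A", 1, 2500),
    ("b1", "B", 5, 6500),  ("b2", "B", 3, 4000), ("b3", "B", 1, 1500),
]

def top_k(patients, key_index, k):
    """Return the ids of the k patients ranked highest on one column."""
    ranked = sorted(patients, key=lambda p: p[key_index], reverse=True)
    return [p[0] for p in ranked[:k]]

# Ranking on cost (the proxy) selects only group A patients.
by_cost = top_k(PATIENTS, 3, 2)   # ['a1', 'a2']

# Ranking on illness burden selects the sickest patient from each group.
by_need = top_k(PATIENTS, 2, 2)   # ['a1', 'b1']
```

In this toy setup, patient b1 is as sick as a1 but is passed over when cost is the ranking target, which is the pattern the example above describes.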
Harrington: It's classic confounding.
John: There is algorithmic bias that can be an issue. That's why we need to look at this as clinical researchers and ask, "What's going on here? Are there biases?"
Harrington: One of the papers over the past couple of years came from one of our faculty members at Stanford, which looked at where the data are coming from for these models. It pointed out that there are many states in this country that contribute no data to the AI models.
That's part of what you're getting at, and that raises all sorts of equity questions. You're in Massachusetts. I'm in California. There is a large amount of data coming from those two states. From Mississippi and Louisiana, where we are now, much less data. How do we fix that?
John: I think we fix it by getting more clinicians involved. I've met so many passionate data scientists who want to contribute to patient care and make the world a better place, but they can't do it alone. They can't recruit health centers in Mississippi. We need clinicians and clinical researchers who will say, "I want to help with advancing healthcare, and I want to contribute data so that we can make this work." Currently, we have so many advances in some ways, but AI can open up so many new opportunities.
Harrington: There's a movement to assure that the algorithm is fair, right? That's the acronym that's being used — to make sure that the data are representative of the populations that you're interested in and that you've eliminated the biases.
I'm always intrigued. When you talk to your friends in the tech world, they say, "Well, we do this all the time. We do A/B testing." They just constantly run algorithms through A/B testing, which is a randomized study. How come we don't do more of that in healthcare?
John: I think it's complicated because we don't have the systems to do that effectively. If we had a system where patients come into the emergency room and we're using AI in that manner, then maybe we could start to incorporate some of these techniques that the tech industry uses. That's part of the issue. One is setting up systems to get the right data and enough data, and the other is how do we operationalize this so that we can effectively use AI within our systems and test it within our systems.
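For readers unfamiliar with the A/B testing mentioned above, the core idea is random assignment to two arms followed by a comparison of outcome rates. The sketch below is a simplified illustration under stated assumptions: the arm names, the fixed seed, and the helper functions are hypothetical, and it leaves out everything a real trial needs, such as consent, sample-size planning, and statistical testing.

```python
import random

def randomize(patient_ids, seed=0):
    """Shuffle ids with a fixed seed and split them into two arms."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    # Hypothetical arm names: one arm sees the algorithm's output, one does not.
    return {"algorithm": ids[:half], "usual_care": ids[half:]}

def event_rate(arm_ids, outcomes):
    """Fraction of patients in an arm who had the outcome (1 = event)."""
    return sum(outcomes[i] for i in arm_ids) / len(arm_ids)

# Assign 20 hypothetical patient ids to the two arms.
arms = randomize(range(1, 21))
```

Once outcomes have been recorded per patient in a dictionary, the difference between `event_rate(arms["algorithm"], outcomes)` and `event_rate(arms["usual_care"], outcomes)` is the quantity such a comparison would examine.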
Harrington: As a longtime clinical researcher and clinical trialist, I've always asked why it is that clinical research is separate from the process of clinical care.
If we're going to effectively evaluate AI algorithms, for example, we've got to break down those barriers and bring research more into the care domain.
John: Yes. I love the concept of a learning health system and incorporating data and data collection into the clinical care of patients.
Harrington: Fundamentally, I believe that the clinicians make two types of decisions, one of which is that the answer is known. I always use the example of aspirin if you're having an ST-segment elevation MI. That's known. It shouldn't be on the physician to remember that. The system and the algorithms should enforce that. On the other hand, for much of what we do, the answer is not known, or it's uncertain.
Why don't we allow ongoing randomization to help us decide what is appropriate? We're not quite there yet, but I hope before the end of my career that we can push that closer together.
No Coding Required
All right. Final topic for you. You talked this morning about what you need to know. Cardiology fellows and residents must approach you all the time and say, "Hey, I want to do what you do," or, "I don't want to do what you do because I don't want to learn to code, but I want to know how to use it down the road."
What do you tell students, residents, and fellows that they need to know?
John: I think all trainees and all clinicians, actually, should understand the fundamentals of AI because it is being used more and more in healthcare, and we need to be able to understand how to interpret the data that are coming out of AI models.
I recommend looking up topics as you go along. Something I see is clinicians avoid papers that involve AI because they feel they don't understand it. Just dive in and start reading these papers, because most likely, you will understand most of it. You can look up topics as you go along.
There's one course I recommend online. It's a free course through Coursera called AI in Healthcare Specialization. It's a course by Stanford, and it does a really good job of explaining concepts without getting into the details of the coding and the math.
Other than that, for people who want to get into the coding, first of all, don't be afraid to jump in. I recently talked to a friend who is a gastroenterologist, and she said, "I'd love to get into AI, but I don't think I'd be good at it." I asked, "Well, why not?" She said, "Because men tend to be good at coding."
I do not think that's true.
Harrington: I don't think that's true either.
John: It's interesting because we're all affected to some extent by the notions that society has instilled in us. Sometimes it takes effort to go beyond what you think is the right path or what you think is the traditional way of doing things, and ask, "What else is out there? What else can I learn?"
If you do want to get into coding, I would say that it's extremely important to join a group that specializes in healthcare AI because there are so many pitfalls that can happen. There are mistakes that could be made without you realizing it if you try to just learn things on your own without guidance.
Harrington: Like anything else, join an experienced research group that's going to impart to you the skills that you need to have.
John: Exactly.
Harrington: The question about women being less capable coders than men, we both say we don't believe that, and the data don't support that. It's interesting. At Stanford, for many years, the most popular major for undergraduate men has been computer science. In the past few years, it's also become the most popular major for undergrad women at Stanford.
We're starting to see, to your point, that maybe some of those attitudes are changing, and there'll be more role models like you to really help that next generation of fellows.
Final question. What do you want to do when you're finished?
John: My interests have changed, and now I'm veering away from academia and more toward the operational side of things. As I get into it, my feeling is that currently, the challenge is not so much creating the AI models but rather, as I said, setting up these systems so that we can get the right data and implement these models effectively. Now, I'm leaning more toward informatics and operations.
I think it's an evolving process. Medicine is changing quickly, and that's what I would say to trainees and other clinicians out there as well. Medicine is changing quickly, and I think there are many opportunities for clinicians who want to help make it happen in a responsible and impactful manner.
Harrington: And get proper training to do it.
John: Yes.
Harrington: Great. Jenine, thank you for joining us. I want to thank you, the listeners, for joining us in this discussion about data science, artificial intelligence, and machine learning.
My guest today on theheart.org | Medscape Cardiology has been Dr Jenine John, who is a research fellow at Brigham and Women's Hospital, specifically in the data science and machine learning realm.
Again, thank you for joining.
Robert A. Harrington, MD, is chair of medicine at Stanford University and former president of the American Heart Association. (The opinions expressed here are his and not those of the American Heart Association.) He cares deeply about the generation of evidence to guide clinical practice. He's also an over-the-top Boston Red Sox fan.
© 2023 WebMD, LLC
Any views expressed above are the author's own and do not necessarily reflect the views of WebMD or Medscape.
Cite this: Robert A. Harrington, Jenine John. AI and Machine Learning in Healthcare for the Clueless - Medscape - Apr 10, 2023.