To me, it's all about precision medicine. If we know the right test to order for you the first time, then you shouldn't need a prior authorization. There won't be a delay in diagnosis. If this is the right med for your hypertension because we understand you and people like you, and we understand your pharmacogenomic profile so we know the dose to give you, then we don't have to wait six weeks for your next appointment to see if it works. We know it's going to work. I mean, automating coding and scribing and all this stuff is really important, but it's not the most important thing. The most important thing is that we take care of people the right way the first time with real-time evidence that's specific. So to me, that's what I'm really excited about that we could never do without these kinds of tools. To me, that's the holy grail. I mean, that's what we're excited about. That's what I'm excited about. That's where I want all this to go.
Welcome to How I Doctor, where we're bringing joy back to medicine. For two decades, healthcare got digitized, which turns out to be a very different thing from getting transformed. Dr. Mike Pfeffer is trying to close that gap. He's the chief information officer at Stanford Healthcare and my alma mater, the Stanford University School of Medicine. Stanford is one of the most AI-forward health systems in the country. I'm excited to ask him what it means to be at the bleeding edge of the bleeding edge of healthcare technology. His teams have put real generative AI into the hands of Stanford clinicians: ambient documentation, a tool that lets you query the patient's chart in plain language, and AI drafts of replies to patients. And, unusually, they've published all the outcomes, including the parts that didn't go the way maybe people suspected. He's also still a practicing hospitalist and, as everybody on this podcast knows, you can't be a good ER doctor without a good hospitalist as a partner. Mike is, I think, like me, the person that likes to build and shape things, but also has to use them at 2:00 AM. Today we'll get into what Stanford has actually built, what the data actually shows, and the difference between a health system that deploys AI responsibly and one that just deploys it. I'll be right back with Dr. Pfeffer after a word from our sponsor. Dr. Pfeffer, welcome to How I Doctor.
Thank you, Graham.
I'm really excited to talk to you today. You're one of the only people in American medicine who rounds on inpatient wards in the morning and then goes to work building AI tools with those same physicians who are going to use them in the afternoon. Mike, what does it feel like to be at one of, I would say, the top one, two, three most cutting-edge health systems that is, I think, building and using AI, but also kind of experimenting with it at the same time?
Yeah, thanks, Graham. It's great to be here and it is a really exciting time and very lucky to be at a place like Stanford that is really about innovation. I was actually just on service last week and I had three amazing residents and two medical students, and it's just so much fun to see how excited they are to embrace the new technology, but also love going and talking to the patients and really sharing that experience with them. So it's a lot of fun to see the future generations of physicians doing some amazing things and leveraging these new technologies. It's just really exciting to be able to think about ways to leverage this technology in real life and then do it in a responsible way and then measure it and learn from it. And it takes an enormous team of people to really think through this. Everyone from really incredible data engineers, data scientists, research faculty, informaticists to come together and say, "How are we going to do this in a way that makes sense for our clinicians, our patients? And also we can measure it and see how we can make it better."
I mean, we'll get into the paper that you guys just published a few days ago in the New England Journal, but the sense I got was A, this is a ton of work. This is not, oh, you just set it and forget it, or you just deploy it and things just magically work. This whole idea that AI is just going to make us spend less in medicine or spend less in technology I think couldn't be further from the truth. Maybe we're going to move how we spend or divert resources from this place to that place, but it's a lot of work to deploy all of the models that you guys deploy at Stanford, whether you are doing them or you're partnering with a company that's deploying them.
I agree. There's actually a really interesting graph I saw where the spend on AI tools from healthcare far outweighs, it's 3X every other industry. And so, one could look at that graph and say, "Wow, health systems are really embracing this. Healthcare is really embracing AI." Or it's that, like you said, we've been digitizing the way we've been doing things forever, that we're just using AI to further automate the things that really need to change. And so other industries have already gone down that road and they're really trying to say, "Where can I find value in AI?" Where I think in healthcare there's a ton of just out of the box stuff, writing a prior authorization letter or whatever, that's great. AI can do a good job with that, but should we even be doing that is really the question. And so I think we have to undergo that transformation before we're going to see the true value of AI in terms of, at least, cost, I think, in healthcare.
I'm always envisioning this really sad, kind of depressing WALL-E version of the future where there's just a little robot fax machine and it's just putting out a TPS report and nobody knows why, but it just is, and we are so abstracted away from why this report exists that the robot just... humanity has died out and it's just continuing to fax things back and forth. I've never thought of it that way, that some of the investment isn't necessarily, "Oh, healthcare is first or healthcare is the most excited." It's almost like we're trying to automate stuff that should have been automated a long time ago, and that's actually maybe an indicator of lag, that healthcare's always 10 years behind, and we're trying to catch up to be five years behind or something like that.
I mean, if you think about it, AI scribes are amazing, huge adoption very, very quickly, and we could argue whether that's good or bad or whatever. I think it's actually really good because I think we're going to get better data and better notes, but it really just digitized the same process we've been doing forever, which is talking to patients and gathering a history. And so is there a better way to do that at scale rather than just digitizing what we've always been doing? Going from the simple digitization of "I'm going to listen and create a note" to a whole ecosystem around what that encounter is going to be is really going to be the transformation from digitization to digital.
How do you think about what to deploy in terms of an AI model or a predictive model, a generative model when it comes to how good it needs to be compared to a human?
Yeah, I think it varies depending on what the problem is we're trying to solve and what tools, then, we're going to go solve it with. In fact, there are plenty of problems that we could solve in healthcare that don't even require AI. They just require people to get in a room and decide on, this is how we're going to do it. Scheduling is a great example of that. The percent of appointments actually scheduled online is quite low across the country. And so that doesn't really need AI, but if we could get to that level where it's the first way appointments are scheduled, now you can start to layer AI on in really creative ways. So it really goes back to what problem are you trying to solve? Can it and should it be solved with AI? And then what is the risk of solving that problem with AI? And to do that, you really need a framework. And we use the FURM framework here, which is fair, useful, reliable models, that was developed by the lab of Nigam Shah, our chief data scientist, and that allows us to objectively determine how useful the model is going to be in many different dimensions. And that also helps us determine risk. If it's a very low-risk use case, then we don't have to send it through all the governance and all the pieces of the framework. We can say, okay, well, this is going to really have benefit. It's low risk. Let's move forward, versus things that are going to be much more high risk, like clinical care decision making.
I like the FURM, fair, useful, reliable models framework. It's similar to how I think as the lowest bar we would all expect for healthcare. And the way I've said it before, without as good of an acronym, is like it's unbiased or less biased, it is accurate and it is safe or not harmful. I think that's the bar that we all expect and that I think are kind of probably the shared views of physicians and nurses that we would expect any tool, whether it's a CT scanner or an ultrasound or a new medicine to have those values behind them. Stanford has access to, I would say, many of the very top researchers, academics, data scientists in the country. From your perspective as the CIO and also as a colleague of all these smart people, do you have a sense for how much of clinical work, clinical tasks that a hospitalist and ER doctor does, will in, say, I don't know, five years be automated?
There are definitely administrative tasks that I think and hope go away like billing.
Yes.
If you write a note or dictate a note or scribe a note or however it's going to be, so there's more flexibility for clinicians now than I think there ever was. But I think billing, choosing the diagnosis, coding, our favorite queries that we get three weeks later, where we're like, "Did that patient really have hypochloremia?"
Was this decompensated congestive heart failure, Dr. Pfeffer?
Yeah. That should all go away. I mean, hopefully that will all go away because that really can be done, I think, fairly accurately with AI. And the stakes are low. I mean, if you miss a level three versus a level two once in a while, you're not going to have a patient harm, and that piece can be audited. And then it's going to be really delivering much more precision-focused insights to clinicians in real time for them to make better decisions. I think you're going to see a lot of that. And then we could dream a little bit about agents, and could you order an agent? Could you order the DVT prophylaxis agent that then goes and analyzes the patient, knows the guidelines, makes a determination what anticoagulant they need to be on for DVT prophylaxis, if any, orders it, knows that, "Oh, there's a surgery coming up. I better hold that anticoagulant." That's actually really hard to do, to create an agent to do that. But my hope is that we start to offload some of these kinds of tasks that are really guideline-based and are really beneficial to patients, that can be ordered by clinicians, and then the agent goes and does its thing.
You're right. Things like anticoagulating in a hospitalized patient are mostly guideline and algorithm. And then occasionally there's a little medical factor that's like, "Maybe I won't put this person on blood thinners or maybe I will." So a task specific agent that is just kind of owning just this little carve out that we think would be really helpful to the humans. Certainly humans don't do inpatient anticoagulation perfectly either.
Right, exactly. I'm pretty excited about how you start to build that into the toolbox that clinicians have to offload kind of... I'm not going to call it busy work because it's really important, but more guideline-driven work that should be done for every patient, as in the case of DVT prophylaxis, or really accurate dosing of certain medications for particular patients. And pharmacogenomics is another great example that's not necessarily AI, but really can be incorporated into how we think about prescribing for patients. I think it's really going to have a lot of benefit.
Well, in your New England Journal Catalyst paper, I guess you guys are running a model that will say which lab tests you think can get safely canceled?
Yeah.
That seems like another outstanding use like, oh, let's draw less blood, let's cause less iatrogenic anemia in our patients, and we don't think it's going to change the outcome anyway.
Yeah, I mean, that's a great example. I mean, who wants to be stuck more than they have to be? I'm supposed to get my annual labs and I'm like three weeks out because I don't like that either. And so anything we can do to make... If we don't need the blood, it's not going to change management, and it's going to be better for the patients. These are really good examples where I think daily labs are ordered and it's meant well, but people forget and then they don't get canceled, and you know how that works. So I think there's opportunities around that. It's cost savings, obviously. Phlebotomists need to come and do that work, but more importantly, it's better for patients. So I think we'll see a lot of those things kind of start to happen in the next five years for sure.
Let me ask you about ChatEHR. I think it was really cool when you guys announced it. Can you just describe what ChatEHR is and where it came from?
The story goes that... So early on we deployed something we called SecureGPT, which was, we basically took a bunch of models, made it HIPAA-compliant, we wrapped it in a shell and deployed it out to Stanford Medicine. We said, "Go ahead and use it." And this was pretty early on, but we really wanted to give people a playground to really start to experience these things and how they can be used in clinical care. And then we learned what people were doing. We said, "Don't use it for clinical decision making." And of course people were curious how these things were going to perform. Anyway, it was clear that by combining the capabilities that required moving data from one system to another system. So it was dreamed up, okay, well, let's bring this stuff together. And it was really like an incredible, again, collaboration between many, many people. I mean, it's a really fascinating project that led to what we're calling ChatEHR, and it basically has two major functions. One is, within a patient, if you're caring for a patient, you can ask ChatEHR questions about the patient like, summarize the hospital course for me, or when's the last time the patient got an echocardiogram? Can you trend the ejection fractions? I mean, the list goes on and on and on, you can imagine. But the other side is this automation platform where you can use highly complex rules or criteria and apply that in real time to any set of patients you want, like the lab example that you talked about. And that's really powerful because whenever you in the past were doing clinical decision support or reporting or whatever, it was very much rules-based. If this, then this, display this, whatever. That's just an amazing capability of using large language models to do these very complex tasks that we couldn't do before.
So now it's live for anybody who has access to the electronic health record at Stanford Healthcare, and they just need to watch a training video, because you have to learn how to use these things, and do an attestation, and then they get access to it.
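The traditional "if this, then this, display this" style of decision support described above can be sketched in a few lines. This is a minimal, hypothetical illustration in Python: the `LabOrder` fields, the `STABLE_DAYS` threshold, and the rule itself are assumptions invented for the example. It is not ChatEHR's logic, not Stanford's implementation, and not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LabOrder:
    # Hypothetical fields chosen only for this illustration.
    test: str
    recurring: bool
    consecutive_normal_days: int


@dataclass
class Rule:
    name: str
    condition: Callable[[LabOrder], bool]  # the "if this" part
    message: str                           # the "display this" part


# Assumed threshold for the example; not a clinical recommendation.
STABLE_DAYS = 3

RULES = [
    Rule(
        name="stable-recurring-lab",
        condition=lambda o: o.recurring and o.consecutive_normal_days >= STABLE_DAYS,
        message="Recurring lab has been normal for several days; "
                "consider whether it is still needed.",
    ),
]


def evaluate(order: LabOrder, rules: List[Rule] = RULES) -> List[str]:
    """Return the message of every rule whose condition matches the order."""
    return [rule.message for rule in rules if rule.condition(order)]


if __name__ == "__main__":
    for message in evaluate(LabOrder("CBC", recurring=True, consecutive_normal_days=4)):
        print(message)
```

Each rule pairs one hard-coded condition with one fixed message. That rigidity is what the conversation contrasts with applying more complex criteria through a language model.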
Is Stanford trying to be a test bed, an experimental place that helps push the industry forward? Or do you view it as like, oh, we're trying to set a vision for how we think other health systems should work or both? Stanford's obviously unique for lots of reasons, but how do you see your vision or mission for what you're deploying?
No matter where you go for healthcare, you should get amazing care. So what we're pushing forward is trying to learn from these tools, what works, what doesn't work, how do you measure it? What is the infrastructure you need in place, and then share all that knowledge. And of course, we're not the only ones. There's many health systems, my peers, doing amazing stuff, but I think this is one opportunity where together we're going to be way stronger than if we try to do this apart. I think we're going to have to continue to learn from what everybody's doing. There's a ton of vendors in the space now. So what works well, what doesn't work well, what should you do? What should you build yourself? What shouldn't you build yourself? These are all ongoing questions that I think are really important. AI is not cheap. This stuff costs a lot of money. So you really want to make good decisions that are going to benefit your patients, your clinicians and your health system, the win-win-win basically. And to do that, we need to collaborate and learn. So yes, it's about pushing the boundaries, but then learning from that and sharing. And that's why we do a lot of these publications because we want people to learn from what we're doing, give us feedback, we learn from others, keep iterating. And I think we're going to end up in a much better place if we all do that together.
I know in your paper it says Stanford Healthcare runs 1,500 software applications with 3,100 interfaces.
Yes.
Is that you guys being extra or is that standard for an academic medical center?
It's probably a little bit more extra than we'd like to be, and it all depends on what you count as an application, and we can have a debate on that. Definitions are going to have to change. So we're actually working through this process right now. Our chief applications officer, Heidi Collins, who's amazing, is really thinking through this, like, well, what is an agent? Is that an application? Is an AI model producing a result an application? What's an application anymore, and what is this going to look like? And are we really talking about platforms, applications on those platforms, agents and apps as separate? I mean, people are vibe coding apps now. What are those? Are those applications? Are they apps? So redefining what this space is going to look like is also really important as times are changing, so we better understand where we need to go, what we need, what we don't need, and where we can consolidate.
Mike, you have lots of very smart people at Stanford who all have great ideas and they all have titles about how incredible they are, and they're all NIH funded, and they're all brilliant, and truly they are. But with how you guys have 1,500 applications, I bet you every attending at Stanford thinks they want one for just the way they want to do things, and that their approach to surgical site risk, or whatever, is the right way and everyone else is wrong. How do you herd the cats of Stanford Healthcare, with everybody wanting to build their own no-code, low-code solution?
I would say yes, there are brilliant people and sometimes I'm like, "What am I doing here?" There's a Nobel laureate walking down the street. And I mean, it's really fun. It's really fun to be in a place where people are really challenging you in very positive ways. But at the same time, there's an understanding, I think, here of collaboration that has to happen, and that not every idea is going to make it. And so I think people are really understanding of, back to the, is it a problem that needs to be solved? And even if it is, is there any value in solving that problem? Because there's lots of problems that need to be solved.
Totally. Yeah.
So I think people do really understand that here and really want the best thing for the patients and the clinicians. Just an example, our radiologists and pathologists agreed on a single PACS platform. So now we have everyone working together on one platform, which gives us great capabilities. And that level of collaboration does happen here, and it's really amazing to see just how much leadership, and both chairs of radiology and pathology are amazing, really led that to say, "Let's do this together. We're better together. We're better if we consolidate some of these things."
I was listening to... Ethan Mollick gave a lecture at a conference I was at, and one of the things he pointed out that is going to be an interesting challenge in every industry, but also healthcare, is just that the lines are going to be blurring around what someone's job description entails, or what does it mean that a physician that works for Stanford can start coding, not diagnostic coding, can just start programming through vibe coding and these tools. And similarly, what does it mean for an engineer to start having data science or marketing expertise through leveraging these tools as well? What do you think the future holds for healthcare when it comes to generative AI in the next few years? What are you excited by?
Well, I mean, ultimately, to me, it's all about precision medicine. If we know the right test to order for you the first time, then you shouldn't need a prior authorization. There won't be a delay in diagnosis. If this is the right med for your hypertension because we understand you and people like you, and we understand your pharmacogenomic profiles so we know the dose to give you, then we don't have to wait six weeks for your next appointment to see if it works. We know it's going to work. I mean, to me, that's ultimately what all this is about. I mean, automating coding and scribing and all this stuff is really important, but it's not the most important thing. The most important thing is that we take care of people the right way the first time with real-time evidence that's specific. So to me, that's what I'm really excited about that we could never do without these kinds of tools because the data sets are massive. The last I heard is if you're in internal medicine, you want to keep up with the literature, you have to read articles 27 hours a day. So of course that's impossible. So how do we bring all that so that the best decisions can be made for the patients? And to me, that's the holy grail. I mean, that's what we're excited about. That's what I'm excited about. That's where I want all this to go. And on the way there, let's get rid of all these tasks that we could probably automate, that we don't really need to do anymore.
Mike, I ask this of everybody that works with trainees, students and residents and fellows, how do you think medical training changes? And I guess how do you view AI, generative AI predictive models impacting the way that people learn, how their brains work, how they fundamentally practice medicine? If the AI model says, "Oh, 99% chance of admission," when they walk in the door at the ER, how hard is the ER doctor going to think about it? They're probably just going to say, "Hey, Mike, I've got a patient for you. Why are you admitting them? I don't know. The model just-"
Does my hospitalist model then say the ED model is wrong and we just have the models fight on whether the patients [inaudible 00:25:09]?
Yeah, that's the specialty version of the health system versus the insurer prior auth.
So we kind of know a little bit about this, I think, with decision support of order sets. Remember order sets. A great example I remember is if you have a community-acquired pneumonia order set, back in the day when we would prescribe fluoroquinolones, line one would be ceftriaxone and azithromycin, and the next line would be levofloxacin, and then 70% of people always chose the first one. And then we know with drug-drug alerts and everything, 95% of them are overridden. So we kind of know a little bit of how people interact with decision support.
That's interesting. Yeah.
... In the past and there always have been tools, more recently, where you can look things up about the medical literature and get advice on what to do. So that's kind of always been there. Now we're making it, I think, a little more easy to get that information. I think we're going to be providing more and more kind of predictions to help steer you. I guess I'm not as worried about it as a lot of people are because we still have a long way to go and there's still so much work that happens at the bedside and on rounds and thinking through patients. And I mean, of course, our surgeons who are in the OR, they're still doing the surgeries. I think it's really, how do we embrace the technology to make better decisions, more accurate decisions, and still be curious. I mean, if we lose our curiosity, then I think we got a problem. But I got to tell you, I was on service last week and there was a ton of curiosity, and there was a ton of respect for what it means to be a doctor and take care of people. And to see that with our residents and med students makes me feel like we're going to be okay.
Yeah, I love that. That's great. That's great optimism that I would say I mostly share. Mike, you're exposed to probably the most AI technology pitches and have all the brightest minds at your disposal. In maybe, let's say, five years, you're still practicing as a hospitalist. What does the ideal perfect hospitalist day look like because it's all technology enabled to let you do the stuff you enjoy, and that you think is most valuable for you and your patients?
The best part is getting to talk with the patients, understand how we can help them, and then focusing on that as opposed to all of the insurance-related tasks, billing-related tasks, DME forms, all of the stuff that really doesn't add to the care, I think, is what I'd love to see go away so we can really focus on listening to the patients, providing that really excellent care, right test first time, making those diagnoses. And then I think a lot of care is going to continue to transition to the home, and that may be after they get admitted and then we finish up care at home, or right from the ED or whatever. But that partnership, I think, especially between ED and internal medicine and how we do that, I think is going to continue to grow and the lines are going to blur. I'm still going to print my list out. So don't take [inaudible 00:28:43].
You can take the fax machine, but not the printer.
Take the fax. Faxes should be done. Yeah, I still need to... Maybe we have one printer in the hospital and we just line up to get the list for today [inaudible 00:28:53].
Finish this sentence. The biggest mistake medicine could make with AI is...
Buy a lot of stuff with not understanding the value.
Yeah. Dr. Pfeffer, we opened on digitized isn't the same as transformed. If we sit down in five years, what's the one thing that'll tell us that Stanford actually transformed the work that their clinicians are doing and didn't just build a nicer interface?
I think we're doing that already, and it's because some of the emails I get from our clinicians saying just how they love the tools we have and how it's changed their life. I think the harder thing is going to be, we can transform within our four walls, but it's transforming healthcare. I mean, I can't get rid of prior authorizations at Stanford, and I think there's so many opportunities that we need to look at as a country to say, how are we going to do things differently that we can all take advantage of? And that includes payers, includes government, includes health systems. But I think we'll be able to transform internally up until a point and then obviously we're going to hit the wall on what we can do. So on the flip side, I think drug discovery and what's happening on the basic science side here, I mean, sky's the limit. We can talk about AI scribes and we'll automate billing, but drug discovery, and new molecules and peptides, and cell-based therapies and all this stuff, that is going to be, I think, far more impactful to healthcare than getting rid of prior authorization.
What's the one thing that you wish other stakeholders or other health systems would adopt from Stanford?
I think it's having a framework that you really abide by and you continue to monitor these things, and to just keep asking really great questions and share. So if something works really well that you've done at your health system, and there's tons of amazing stuff going on at health systems across the country, share it, publish it, say why it worked, and how we can adopt it. That would be amazing because again, we're all here for the same reason, to benefit our patients. So if we can share some really amazing things of what's working really well, that's fantastic. So it really is about sharing things that work and sharing things that don't work. And that's actually kind of a problem because we don't really publish negative studies. So how do we do that better, I think, is a learning point for medicine when things don't work really well. And there's some good papers on the sepsis algorithm and how it doesn't work or does work or whatever. And we could argue about sepsis algorithms all day long, but the point is, there were some really good publications on, "Hey, maybe this isn't the right thing. Maybe this isn't actually working, and we really need to ask why isn't it working?" And so I think we need more of that. I'm just floored by the amazing innovation going on at so many health systems, and it doesn't have to be academic medical centers either. It's just great innovations. I love the fact that clinicians are diving into this and taking the reins because we all know what happens when clinicians aren't at the forefront of technology in healthcare. Let's continue to be deeply invested in this and drive where this technology needs to go. That's really exciting to me, just how many people are engaged. So I'm very hopeful that we're going to see significant wins by sharing across different systems when it comes to AI.
Well, what a perfect way to end it there. Dr. Michael Pfeffer, thank you so much for being here on How I Doctor.
Thank you, Graham. It's been a pleasure.
Thanks for joining me today. For interviews with physicians creating meaningful change, check out offcall.com/podcast. You can find How I Doctor on Apple, Spotify, or wherever you listen to podcasts. We'll have new episodes weekly. This has been and continues to be Dr. Graham Walker. Stay well, stay inspired, and practice with purpose.
Dr. Michael Pfeffer is Stanford Healthcare's Chief Information Officer. He is also a practicing hospitalist, which means he builds AI tools in the morning and uses them on patients that afternoon. That combination is rarer than it should be, and it shapes everything about how he thinks about what gets built, what gets deployed, and what actually changes care.
Stanford is one of the most AI-forward health systems in the country. It runs 1,500 software applications with 3,100 interfaces, has put generative AI into the hands of its clinicians through ambient documentation and a plain-language chart querying tool called ChatEHR, and has published the outcomes of all of it, including the parts that did not go as expected. The paper Mike's team recently published in the New England Journal Catalyst is not a success story. It is a working document from a health system trying to figure out what responsible deployment actually looks like at scale.
The conversation starts with a number that stops you cold. Healthcare spends three times more on AI than any other industry. Mike's read on that statistic is not flattering. Other industries, he argues, have already gone down the road of automating their existing processes and are now asking harder questions about where AI actually creates value. Healthcare is still automating things that should have been redesigned, or eliminated, years ago. Writing prior authorization letters is the example he uses. AI can do it. But the more important question is whether prior auth should exist at all. Digitizing a broken process is not transformation.
The episode covers what Stanford has actually built, the governance framework behind every deployment decision, the early results from ChatEHR, and the vision for what precision medicine looks like when the data sets are finally large enough to tell you the right test, the right drug, and the right dose for a specific patient before they leave the room. Mike is careful to distinguish between the work that is already happening inside Stanford's walls and the harder, system-level transformation that requires payers, government, and health systems to change together. The internal wins are real. The wall at the edge of those walls is also real.
The episode closes on two questions Graham puts to every guest who works with trainees. Is AI going to deskill the next generation of physicians? And what does medicine look like if it gets this right? Mike's answer to the first is less alarmed than most. Physicians have always had decision support tools. What matters is whether the people using them stay curious. His answer to the second is the one he opened the episode with: the right test, ordered once, for the right patient, with real-time evidence that is specific to them. That is the goal everything else is working toward.
Thank you to our wonderful sponsors for supporting the podcast:
Evidently - Leading AI-powered clinical data intelligence https://evidently.com/
Here is what Mike's experience building and using these tools at one of the country's most AI-forward health systems actually adds up to.