27 February 2025

A new era of ambient intelligence in healthcare

Next time you’re in a public place, stop and look around. Notice how many people are heads down, staring at their phones. This is one of the unintended consequences of technology: while the intent is to connect us more to the world, it often distracts us from what’s actually happening around us.

This unintended technological distraction has also had a negative impact in healthcare. Over the last decade, increasing regulations and mounting administrative burdens placed upon doctors, nurses, and radiologists have come at a high cost to those who have dedicated their lives to caring for others. The effects have been well documented: rising job dissatisfaction and burnout rates, increasing staffing shortages as clinicians leave the workforce, and the continued erosion of the doctor-patient connection.1

As a technologist who has been working on cracking some of the thorniest problems in healthcare, it’s painful to know that for years, despite our best efforts, technology has seemed one step behind in being able to restore the joy of caring for patients while simultaneously providing a more connected digital experience. 

That is, until the introduction of GPT. With generative AI, we’ve seen an incredibly positive and disruptive force in healthcare, and these gains will only increase as this critical innovation is applied to some of the most complex problems in healthcare. In fact, over the next three years, we will begin to see a tectonic shift in the entire user experience, moving from technology that is injected into various use cases to the pervasive infusion of AI that is seamlessly embedded into the ways we live and work.

In healthcare, ambient intelligence will be the driving force for restoring the joy of practicing medicine and providing a better experience for patients. 

The real story of ambient intelligence  

There’s a lot written about technology curves and AI in healthcare, but I want to tell you the story that isn’t in the history books. The real story of how ambient intelligence was born. 

Some of us are old enough to remember the original Star Trek from the 1960s, where a computer would listen to the crew have a conversation and then weigh in with guidance related to the situation at hand. It wasn’t trying to take over, and it wasn’t replacing the captain and officers on the bridge. It was just supporting the team by adding insights in real time to augment the decision-making process.

Most of us saw this as a cool sci-fi idea until one day, during a meeting with Epic, we talked about finding a way to make healthcare more intuitive, like the AI in Star Trek. The gauntlet had been thrown, and we were in.

Charting a new course in healthcare technology 

Inherent in ambient intelligence are two equally important tasks: accurately transcribing a conversation between the doctor and patient into text, and then turning that transcript into a clinical note.

That was back in 2014, when there were no large language models, patient data wasn’t widely available, systems were extremely siloed, and there wasn’t even a way to capture the recording. Even if those other aspects had been possible, speech recognition for clinical conversations was running at about a 50% word error rate (WER). This meant that the speech recognition system was correctly capturing only about half of the words spoken. That was essentially the state of the art for ambient medical speech recognition and, simply put, it didn’t work.
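For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the system’s transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the evaluation code used in the work described here. The function name and the sample sentences are hypothetical, and real evaluations typically normalize punctuation and numerals more carefully than this lowercase-and-split approach.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: tokenization is a plain lowercase split,
    which is an assumption, not a clinical-grade normalizer.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two of four reference words are misrecognized.
print(word_error_rate("the knee is swollen", "the me is swelling"))
```

Because insertions also count as errors, WER is not strictly the share of words missed, and it can exceed 100%. Treating a 50% WER as "about half the words wrong" is a reasonable shorthand when most errors are substitutions or deletions.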

We weren’t sure if and when we’d ultimately be successful, but we knew the first challenge that we needed to tackle was getting more data to feed our models so that we could understand this emerging ambient workflow. We started a research program to boost recognition performance for ambient conversational medical speech because at that time, the major breakthroughs were being made in neural computing.

We then turned our attention to abstractive summarization, or essentially trying to figure out how to convert the conversational transcript between the doctor and patient into a structured clinical note, which is subject to a variety of constraints and requirements necessary for appropriate documentation.

Back then, summarization was in its infancy, but the new neural summarization technology showed a lot of promise when large in-domain data sets comprising millions of input and summarized output pairs were available. Although these data sets didn’t exist yet, there were virtual scribing workflows, where doctor-patient conversations were recorded and manually processed by human scribes. So, we made the decision to use the work of clinical scribes to train the increasingly powerful models that were tailored to the task and then observe how their application accelerated clinical documentation. Essentially, the scribes were generating in-domain data that was then used to train neural summarization models for ambient summarization.

Given the complexities of a clinical encounter, we started with medical specialties that had highly-repetitive scenarios, like orthopedics, and then expanded to cover all ambulatory specialties across a larger population of doctors.

While we were making gains, they were incremental. To give you a sense of what this looked like, here is a chart that plots each new model revision as a point. It shows the percent of clinical encounters processed by AI and the resulting human-in-the-loop edit rates, versus our forecast of where those figures would be.

Image source: HLS Solutions Research, January 2025

The dawn of a new era  

It’s inevitable that anyone who’s tried to tackle an extremely thorny problem at some point will hit a wall where they ask themselves the question: Are we beating the problem or is the problem beating us? Although we had parity in converting a doctor-patient conversation to text, converting transcripts into customized clinical notes across specialties was challenging, and progress was slower than we would have liked.  We were using a human-in-the-loop to improve the quality of our model output, which wasn’t a scalable long-term solution, and we had stalled at an error rate that would not produce automation. We didn’t know the exact formula to make the problem yield.

Then, GPT happened.

Overnight, the scaling laws of AI changed. Major technological gains went from happening every one-and-a-half years to happening four times a year. While at the time it had felt like we were hitting a wall, in hindsight, that time allowed us to deeply understand the requirements of how this technology would show up in the doctors’ workflow, and we partnered with the EHR companies to work through the technical details and optimize the user experience.

We immediately put a stake in the ground and began leveraging this new AI.

We used GPT as a shortcut to fine-tune models and customize output, which allowed us to move faster while dramatically improving outcomes. We were also getting real-time feedback from clinicians who let us know what was working well and, most importantly, where the experience wasn’t optimized. It’s that latter feedback that is always the most helpful, because it enables us to triangulate the problems and work on ways to fine-tune and improve the experience.

Based on the foundational models, we could see we would have a prototype in six months, but the challenge was that out-of-the-box GPT, while good, was not as performant as our bespoke models. That’s when we decided to combine generative AI and our unique training corpus. Within six months of a blistering R&D cycle, the team delivered a level of automation that had been unachievable in the prior six years. It was one of the first times in history that GPT-4 had been fine-tuned for healthcare.

The new scaling laws were bending the curve of innovation. We were at the dawn of a new era: The ambient AI market.

Image source: Epoch, ‘Parameter, Compute and Data Trends in Machine Learning’

Over the course of 11 months, we went from zero users to creating the first clinical ambient intelligence experience for doctors, one that is trusted by more than 600 major healthcare systems and documents more than 3 million episodes of care per month and growing.

We had achieved human parity and a level of performance that enabled automation, providing doctors with a draft clinical note that required minimal editing. The automation problem had begun to yield.

The future is now 

The future that we had classified as science fiction is here today, and ambient listening has already become table stakes. In fact, we release AI improvements weekly to our speech and listening technologies, which have been trusted and used by hundreds of thousands of clinicians for years.   

But more than that, we are witnessing a massive pivot unlike anything we’ve seen before: a new form of user experience—the combination of natural interaction and the infusion of real-time intelligence. 

As exciting as this all is, the true promise of addressing clinician burnout, improving the patient experience, and delivering better health outcomes hinges on collaboration and partnership. Every company operating in this space is limited by the laws of single company physics, which is why it’s an exciting time to be at a partner-led company. By opening up our ecosystem, we are harnessing the power of the Microsoft platform and extending it to thousands of companies worldwide that are focused on building applications and capabilities to improve the doctor-patient experience and positively impact the episode of care.   

We are enabling partners in the ecosystem to publish their capabilities directly into our ambient dial tone—the power of thousands of incredible minds all working to help clinicians, and solving for high-value use cases ranging from clinical condition diagnosis, autonomous clinical coding, and automating outbound healthcare consumer messaging, to enhancing data analytics and interpretation, medical literature discovery, autogenerating personalized patient educational materials, and automating clinical trial patient identification. These are just a few of the thousands of areas of innovation that are being actively worked on by healthcare companies worldwide. And this is the power of the platform. This is the ecosystem that will transform the way care is delivered, enhance patient experiences, support better outcomes across the health and life science ecosystem, and restore the joy of practicing medicine to clinicians around the world.   

Trust above all else 

No conversation about generative AI should happen without talking about responsibility, and no technology should be deployed without a detailed examination of what is contained in the data and how it is being used. Key responsible AI standards around fairness, reliability and safety, privacy and security, inclusiveness, and transparency must take center stage in every discussion. AI is like a massive power tool, and data is the current powering it, so everyone handling it needs to be trained properly and aware of any unintended consequences or potential harm it could cause.

Creating high-value use cases that deliver real outcomes 

In the end, the real testament to building outcomes-based technology comes down to one simple fact: does using it empower the person to do and be the best version of themselves? To that end, we carefully track the performance of all our solutions to make sure we’re building technology that is living up to its promise and exceeding expectations. I recommend that anyone who is advancing an AI agenda should do the same, because this is the real path to advancing human abilities and improving the healthcare ecosystem.   

Not every day is a win, and that’s okay—this is a marathon, not a sprint—but we continue to see powerful outcomes reported back by the people we serve. We’re seeing:  

  • 70% improvement in work-life balance for clinicians and reduced feelings of burnout and fatigue.2
  • 80% feel it reduces cognitive burden.3
  • 5 minutes saved per clinician per encounter (on average).4
  • 93% of patients say their physician is more personable and conversational.5

As great as these results are, we’re not settling. We’re going to keep pushing ahead, refining our models, and working with doctors, nurses, radiologists, and leaders across the healthcare and life sciences ecosystem to deliver the best technologies for those who continue to dedicate their lives to helping others. We’re just at the beginning of our journey, and we will continue to relentlessly innovate and find new ways to streamline documentation, surface information, and automate tasks for clinicians worldwide.

Learn more 

Microsoft Cloud for Healthcare

Accelerate innovation and improve healthcare experiences


1 AMA, Burnout benchmark: 28% unhappy with current health care job, May 17, 2022.

2 Microsoft survey of 879 clinicians across 340 healthcare organizations using DAX Copilot; July 2024.

3 Microsoft survey of 879 clinicians across 340 healthcare organizations using DAX Copilot; July 2024.

4 Microsoft survey of 879 clinicians across 340 healthcare organizations using DAX Copilot; July 2024.

5 Survey of 413 patients conducted by multiple healthcare organizations whose clinicians use DAX Copilot; June 2024.

The post A new era of ambient intelligence in healthcare appeared first on Microsoft Industry Blogs.


Source: Microsoft Industry Blog