Artificial Intelligence / Machine Learning Demonstration Projects 2025

Crowdsourcing ideas to bring advances in data science, machine learning, and artificial intelligence into real-world clinical practice.

Enhancing IBD Flare Inpatient Management: Integrating AI Tools into the ApeX EHR System for Improving Protocol Adherence, Patient Outcomes, and Reducing Healthcare Costs

The UCSF Health problem

Inflammatory bowel disease (IBD), including Crohn's disease and ulcerative colitis, is a chronic inflammatory condition affecting 1.5 million people in the United States, with a significant hospitalization rate of 9.24 per 100 IBD patients annually [1]. Flares of the disease are a common cause of these hospitalizations. Optimal management of acute IBD flares necessitates timely surgical consultations, endoscopic evaluations, initiation of anticoagulant therapy, and possible surgical interventions [2]. However, adherence to these protocols is often compromised by heavy clinical workloads and inadvertent oversights, leading to delays that diminish patient care quality, extend hospital stays, and increase healthcare costs [3].

UCSF, a tertiary medical center with a comprehensive IBD program, is committed to providing extensive care for patients with complex IBD who require hospitalization for flares. Strict adherence to clinical protocols is vital for enhancing the quality of patient care, reducing the duration of hospital stays, and decreasing healthcare costs [3]. Despite these high standards, adherence to these protocols at UCSF can fall short of expectations, particularly at newly integrated sites such as Saint Francis Memorial Hospital and St. Mary's Medical Center, and when the clinical fellows who assist the service are absent. This is largely due to heavy clinical workloads and the oversights that result from them.

How might AI help?

Our goal is to enhance the Advancing Patient-Centered Excellence (APeX) electronic health record (EHR) system at UCSF to improve monitoring and management of patients admitted with IBD flares. Large language model (LLM) tools, such as VERSA at UCSF, can process both the structured data (e.g., orders, lab values) and the unstructured data (e.g., physician notes) collected during an admission. The tool will actively monitor adherence to the established IBD management protocol and alert gastroenterology/IBD providers and primary team providers, such as hospitalists, in two ways: a reminder window displayed when they open a patient’s chart in APeX, and SmartPhrases available when they write notes or sign out to the next provider in the APeX hand-off. These prompts aim to prevent delays in care by supporting timely completion of necessary clinical actions.

Key Areas Where AI Is Essential:

  1. Imaging Interpretation: Determining whether imaging studies indicate colonic dilation requires parsing radiology reports. AI, particularly natural language processing (NLP) tools, can extract such findings from unstructured text, facilitating timely surgical consultations when necessary.
  2. Treatment Response Assessment: Evaluating a patient's response to intravenous glucocorticoids or biologic therapies involves synthesizing symptom descriptions, lab trends, and provider assessments documented in narrative form. AI models can integrate these data points to identify non-responders and prompt appropriate management adjustments.
  3. Abscess Monitoring: Assessing the resolution of a Crohn’s-related abscess after antibiotic therapy and drainage requires tracking symptom improvement and imaging findings over time. AI can correlate these unstructured data elements to determine whether further intervention is warranted, especially when drainage is inadequate.
  4. Discharge Planning: Ensuring readiness for discharge encompasses verifying smoking cessation counseling, monitoring stool characteristics, and confirming nutritional tolerance, all typically noted in free-text clinical documentation. AI can collate this information to identify potential safety gaps before discharge and to prompt early initiation of discharge planning.
  5. Admission Reasoning: Identifying admissions in which an IBD flare was not the primary reason for hospitalization, or during which an IBD flare developed, requires analyzing provider notes before ICD codes become available from the medical coding team. AI can detect such cases and start the IBD flare management monitoring described above at a very early stage.
  6. External Data Integration: Patients often receive care at multiple institutions, leading to fragmented records. AI can reconcile external documents, such as vaccination records or prior treatments, that are embedded in unstructured formats.
  7. Reducing Unnecessary Alerts and Enhancing Clinical Workflow: Traditional electronic health record (EHR) systems often generate numerous alerts based solely on structured data, leading to alert fatigue among clinicians. AI can mitigate this by analyzing unstructured data to provide context-aware alerts. For instance, if a vaccination was administered at an out-of-network facility and documented only in clinical notes, AI can recognize this and prevent redundant alerts. By tailoring notifications to the physician's preferences and clinical context, AI enhances the usability of EHR systems, reducing the cognitive burden caused by irrelevant or excessive alerts.
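
As a rough illustration of how a single milestone check might combine both kinds of data, the sketch below pairs a free-text finding (colonic dilation in a radiology report) with a structured order (a surgical consult). Everything in it is hypothetical: the `Admission` fields, the order name `"surgery_consult"`, and in particular the keyword-based `report_suggests_dilation` function, which is only a naive stand-in for the LLM extraction step and would not handle real clinical language reliably.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Admission:
    """Hypothetical container for one inpatient stay."""
    radiology_reports: list[str] = field(default_factory=list)  # unstructured text
    orders: set[str] = field(default_factory=set)               # structured order names


# Naive stand-in for the LLM step: find a dilation mention that is not
# preceded by a negation word earlier in the same sentence.
DILATION = re.compile(r"\b(colonic dilat\w*|dilat\w* (of the )?colon|megacolon)\b", re.I)
NEGATION = re.compile(r"\b(no|without|negative for)\b[^.]*$", re.I)


def report_suggests_dilation(report: str) -> bool:
    for sentence in report.split("."):
        match = DILATION.search(sentence)
        if match and not NEGATION.search(sentence[: match.start()]):
            return True
    return False


def surgical_consult_reminder_needed(adm: Admission) -> bool:
    """Remind only if a report suggests dilation and no consult is ordered."""
    dilation = any(report_suggests_dilation(r) for r in adm.radiology_reports)
    return dilation and "surgery_consult" not in adm.orders
```

In this sketch the reminder is suppressed as soon as the structured order appears, which mirrors the idea of alerting only on milestones that remain unmet.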

How would an end-user find and use it?

Our objective is to enhance the Advancing Patient-Centered Excellence (APeX) electronic health record (EHR) system at UCSF to improve the monitoring and management of patients admitted with IBD flares. By incorporating large language model (LLM) AI tools, such as VERSA at UCSF, we can process both structured data (e.g., orders, lab values) and unstructured data (e.g., physician notes) collected during patient admissions. This AI tool will actively monitor adherence to the established IBD management protocol. It will alert gastroenterology/IBD providers and primary team providers, such as hospitalists, by displaying a reminder window when they open a patient’s chart in APeX. Additionally, SmartPhrases will be available to providers when writing notes and during the hand-off to the next provider, so that milestone status is communicated consistently. These prompts are designed to prevent delays in care by supporting timely completion of necessary clinical actions.

Embed pictures of what the AI tool might look like

Figure 1 shows the reminder window for an unmet IBD flare inpatient milestone: failure to respond to first-line therapy with no second-line medical or surgical therapy started. The LLM recognizes the treatment failure from unstructured data, namely the symptoms documented in provider and nursing notes. The absence of second-line medical or surgical therapy is determined from structured data (medication orders) and from notes documenting surgical evaluation and procedure plans. A link to the most recent guideline on the milestones is included for user education. If the provider accepts the reminder, the corresponding order set opens directly for convenience. The reminder does not force the provider to take any action, in case it is a false alarm; it serves as a prompt for the provider to check. The “Decline” button lets the user leave a comment to help improve the tool.

Figure 2 depicts a SmartPhrase that shows the current status of the IBD flare inpatient milestones. It will contain menus that let the provider select options as needed and edit the text. The SmartPhrase will also work in the Handoff section, both for signing out and for informing other teams, such as the overnight on-call team or other consult teams.

What are the risks of AI errors?

AI tools may occasionally misinterpret data or generate errors, leading to false positives or false negatives. For instance, a milestone may not be met, yet the AI might incorrectly indicate that it has been met (false positive). However, this is unlikely to cause additional harm, since a provider who would have recognized the unmet milestone independently would not be misled into overlooking it simply because the AI suggested otherwise. The only scenario in which a milestone would still be missed is if both the provider and the AI tool fail to recognize it—essentially the same outcome as not using the AI tool at all. Conversely, the AI might fail to recognize that a milestone has been met, resulting in a false negative. However, these errors can often be corrected through provider review. Given that these milestones are generally straightforward for providers to verify, but may be overlooked due to heavy clinical workloads, the AI tool primarily functions as a reminder rather than a decision-maker.
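
To make the two error types concrete, the sketch below treats "milestone met" as the positive class, as in the paragraph above, and maps each combination of AI output and chart truth to its practical consequence. The function name and the wording of the labels are illustrative only.

```python
def classify_ai_output(ai_says_met: bool, truly_met: bool) -> str:
    """Label one AI judgement, with 'milestone met' as the positive class."""
    if ai_says_met and not truly_met:
        # Missed reminder: same situation as having no AI tool at all.
        return "false positive: no reminder shown, unmet milestone relies on provider"
    if not ai_says_met and truly_met:
        # Unneeded reminder: the provider can review and decline it.
        return "false negative: unnecessary reminder, provider can decline"
    if ai_says_met:
        return "true positive: no reminder needed"
    return "true negative: reminder correctly shown"
```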

How will we measure success?

Implementing and evaluating the AI tool for IBD flare management requires a strategy that measures its effectiveness while minimizing disruption of clinical workflow and maximizing provider adoption and satisfaction. The evaluation will have two primary components:

  1. Evaluate the Effectiveness of Protocol Adherence Improvement: Prospective cohort study. Patients admitted with IBD flares before the tool is deployed, and those admitted after deployment whose providers use the tool, will form two groups, allowing us to directly compare outcomes between them.
    • Primary Outcome, Protocol Adherence: Adherence will be quantitatively assessed using a specifically designed scoring system that evaluates how closely care teams follow established management protocols.
    • Secondary Outcomes, Health Costs and Hospital Stay Duration: We will measure the economic impact by considering potential costs associated with AI-related errors, such as false negatives that could lead to unnecessary workups, and potential savings from fewer complications caused by delays in care. Additionally, length of hospital stay will be tracked to assess any efficiencies gained through improved protocol adherence.
  2. Evaluate Provider Satisfaction:
    • User Feedback and Scoring System: The acceptance and satisfaction of inpatient providers using the AI tool are critical for its long-term success. We will collect Net Promoter Scores (NPS) to quantify user satisfaction. Additionally, we will gather detailed feedback for further improvements, capturing both qualitative and quantitative aspects of user experience.
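
The two quantitative measures above could be computed along the following lines. This is a minimal sketch under stated assumptions: the proposal does not define its adherence scoring system, so `adherence_score` simply assumes the score is the fraction of applicable milestones that were met, and the milestone names are hypothetical. `net_promoter_score` follows the standard NPS definition (percentage of 9-10 ratings minus percentage of 0-6 ratings on a 0-10 scale).

```python
from typing import Optional


def adherence_score(milestones: dict[str, Optional[bool]]) -> float:
    """Assumed score: fraction of applicable milestones met (None = not applicable)."""
    applicable = [met for met in milestones.values() if met is not None]
    if not applicable:
        raise ValueError("no applicable milestones")
    return sum(applicable) / len(applicable)


def net_promoter_score(ratings: list[int]) -> float:
    """Standard NPS: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("no ratings")
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical admission: one milestone met, one missed, one not applicable.
example = {"anticoagulation_started": True, "surgical_consult": False, "endoscopy": None}
print(adherence_score(example))               # 0.5
print(net_promoter_score([10, 9, 9, 7, 4]))   # 40.0
```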

Existing APeX Data Metrics: Protocol Adherence, Hospital Stay Duration.

Additional Ideal Metrics: Health Costs, User Feedback, and Scoring.

Describe our qualifications and commitment

This project is led by Yuntao Zou, MD, a seasoned hospitalist with extensive experience in inpatient medical management. Dr. Zou is also an accomplished AI researcher, with a clinical and research focus on leveraging AI tools, such as large language models (LLMs), to enhance healthcare delivery and clinical decision-making. Dr. Zou is responsible for designing and overseeing the entire project. If the proposal is selected, Dr. Zou will devote at least 10% effort, or more as required, for 1 year to ensure its success.

Vivek Rudrapatna, MD, PhD, serves as the co-lead for this initiative, with a specific focus on enhancing the IBD management protocol and developing the AI tool. As a physician-scientist and specialist in inflammatory bowel disease, Dr. Rudrapatna brings a wealth of experience in IBD management. He also leads a research group dedicated to developing methods for analyzing healthcare data, aiming to enhance clinical decision-making processes.

References:

1. Buie, M.J., S. Coward, A.A. Shaheen, J. Holroyd-Leduc, L. Hracs, C. Ma, et al., Hospitalization Rates for Inflammatory Bowel Disease Are Decreasing Over Time: A Population-based Cohort Study. Inflamm Bowel Dis, 2023. 29(10): p. 1536-1545.

2. Lewin, S. and F.S. Velayos, Day-by-Day Management of the Inpatient With Moderate to Severe Inflammatory Bowel Disease. Gastroenterol Hepatol (N Y), 2020. 16(9): p. 449-457.

3. Burisch, J., M. Zhao, S. Odes, P. De Cruz, S. Vermeire, C.N. Bernstein, et al., The cost of inflammatory bowel disease in high-income settings: a Lancet Gastroenterology & Hepatology Commission. Lancet Gastroenterol Hepatol, 2023. 8(5): p. 458-492.

 

Summary of Open Improvement Edits

  • Changed Figure 1 and its explanation to an example that better illustrates the need for AI tools.
  • Added the key features requiring AI tools to the "How might AI help?" section. Thanks to Dr. Pletcher for the reminder!
  • Added the impact of false-positive results (missing unmet milestones) to the "What are the risks of AI errors?" section. Thanks to Dr. Xue for the reminder!

Comments

This seems like it has the potential to be useful for clinicians. What aspects of this require AI? It seems like many of the required milestones are assessable using structured data already in the chart. Could you design an algorithmic approach that doesn't need any sort of AI?

Dr. Pletcher—thank you for your insightful question. While certain milestones in IBD management, such as lab values or medication orders, can be assessed using structured data, many critical clinical decisions rely on nuanced information embedded within unstructured data sources. These include imaging reports, provider notes, and consult documentation, which are not readily accessible through traditional algorithmic approaches.

Key Areas Where AI Is Essential:

  1. Imaging Interpretation: Determining whether imaging studies indicate colonic dilation requires parsing radiology reports. AI, particularly natural language processing (NLP) tools, can extract such findings from unstructured text, facilitating timely surgical consultations when necessary.

  2. Treatment Response Assessment: Evaluating a patient's response to intravenous glucocorticoids or biologic therapies involves synthesizing symptom descriptions, lab trends, and provider assessments documented in narrative form. AI models can integrate these data points to identify non-responders and prompt appropriate management adjustments.

  3. Abscess Monitoring: Assessing the resolution of a Crohn’s-related abscess post-antibiotic therapy and drainage requires tracking symptom improvement and imaging findings over time. AI can correlate these unstructured data elements to determine if further intervention is warranted.

  4. Discharge Planning: Ensuring readiness for discharge encompasses verifying smoking cessation counseling, monitoring stool characteristics, and confirming nutritional tolerance—all typically noted in free-text clinical documentation. AI can collate this information to support safe and timely discharge decisions.

  5. Admission Reasoning: Identifying whether an IBD flare is the primary reason for admission, especially when not explicitly stated, necessitates analyzing provider notes. AI can detect such nuances, ensuring accurate diagnosis coding and appropriate care pathways.

  6. External Data Integration: Patients often receive care at multiple institutions, leading to fragmented records. AI can reconcile external documents, such as vaccination records or prior treatments, that are embedded in unstructured formats, ensuring comprehensive patient histories.

Reducing Unnecessary Alerts and Enhancing Clinical Workflow:

Traditional electronic health record (EHR) systems often generate numerous alerts based solely on structured data, leading to alert fatigue among clinicians. AI can mitigate this by analyzing unstructured data to provide context-aware alerts. For instance, if a vaccination was administered at an out-of-network facility and documented only in clinical notes, AI can recognize this and prevent redundant alerts. By tailoring notifications to the physician's preferences and clinical context, AI enhances the usability of EHR systems, reducing the cognitive burden caused by irrelevant or excessive alerts.

 

So, incorporating AI into clinical workflows is not merely advantageous but essential for capturing the full spectrum of patient information, particularly when dealing with the intricacies of IBD management. By leveraging AI, clinicians can access synthesized, comprehensive insights, ultimately leading to improved patient outcomes and more efficient care delivery.

This is a well-structured proposal, and the inclusion of SmartPhrases and milestone tracking tailored for hand-offs is a particularly thoughtful touch, showing attention to real-world workflows.

Below are some questions I have about this proposal:

1) Will this AI tool be piloted on one service or on multiple teams concurrently?

2) How will alerts be presented in APeX without contributing to alert fatigue?

3) Will any training or onboarding be offered beforehand (such as tip sheets or demo sessions)?

4) What is the long-term plan for tool maintenance and ownership post-pilot?

5) Are there any risks of unintended consequences (potential harm related to over-reliance on AI prompts)?

Thank you for reviewing the proposal and raising these valuable questions!

1) Will this AI tool be piloted on one service or on multiple teams concurrently?

-- The tool would first be piloted with a small number of patients on one or a few teams before being extended to more teams.

2) How will alerts be presented in APeX without contributing to alert fatigue?

-- We have several strategies to avoid alert fatigue: 1. One alert per provider per day. 2. When the provider clicks "Decline", they can choose to receive no more alerts for the same milestone during this admission. 3. Traditional alerts can only use structured data, which may produce many unnecessary alerts, whereas AI tools such as LLMs can read unstructured data such as notes, recognize milestones that have already been met, and avoid those unnecessary alerts.
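
For illustration, strategies 1 and 2 could be expressed as a small suppression rule like the one below. This is only a sketch: the class and method names are hypothetical, it assumes the daily cap is counted per provider and per admission, and it assumes a "no more alerts" choice applies only to the provider who declined. Neither detail is specified above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AlertThrottle:
    """Hypothetical suppression rules for the reminder window."""
    # (provider, admission, day) combinations that already received an alert
    shown_on: set[tuple[str, str, date]] = field(default_factory=set)
    # (provider, admission, milestone) combinations muted via "Decline"
    muted: set[tuple[str, str, str]] = field(default_factory=set)

    def should_show(self, provider: str, admission: str, milestone: str, today: date) -> bool:
        if (provider, admission, milestone) in self.muted:
            return False  # strategy 2: provider opted out for this admission
        if (provider, admission, today) in self.shown_on:
            return False  # strategy 1: daily cap already reached
        self.shown_on.add((provider, admission, today))
        return True

    def decline(self, provider: str, admission: str, milestone: str, mute: bool) -> None:
        """Record a decline; mute=True stops this milestone for the admission."""
        if mute:
            self.muted.add((provider, admission, milestone))
```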

3) Will any training or onboarding be offered beforehand (such as tip sheets or demo sessions)?

-- Yes, we plan to offer a demo session and a summary of the milestones that need to be met during the admission, along with their timeline.

4) What is the long-term plan for tool maintenance and ownership post-pilot?

-- After the pilot phase, we plan to extend the tool to more teams. We will compare clinical outcomes and other outcomes, such as length of stay and cost, before and after introducing the AI reminder tool, and we will periodically collect provider feedback to improve the tool.

5) Are there any risks of unintended consequences (potential harm related to over-reliance on AI prompts)?

-- This AI tool may produce false positives, for example, indicating that a milestone has been met when it has not. However, this is unlikely to cause additional harm, since a provider who would have recognized the unmet milestone independently would not be misled into overlooking it simply because the AI suggested otherwise. The only scenario in which a milestone would still be missed is if both the provider and the AI tool fail to recognize it, which is essentially the same outcome as not using the AI tool at all. Conversely, the AI might fail to recognize that a milestone has been met, resulting in a false negative. However, these errors can often be corrected through provider review. Given that these milestones are generally straightforward for providers to verify, but may be overlooked due to heavy clinical workloads, the AI tool primarily functions as a reminder rather than a decision-maker.