Department of Medicine 2016 Tech Challenge

New Uses of Information Technology to Advance the Missions of the Department of Medicine

Protecting Patients from Unnecessary Emergency Room Visits and Hospitalizations: Harnessing Big Data to Directly Improve Clinical Care at UCSF


Authors: Alvin Rajkomar, MD and Sara Murray, MD

Background: As many as 70% of emergency room visits may be preventable, and a proportion of these result in hospitalizations that might also have been avoided. These preventable escalations of care, which we define as decompensations, cause personal and financial stress to patients and drive use of costly services by health systems that are increasingly focused on high-value care. There is a pressing need to identify patients at greatest risk for imminent decompensation, with the intent of intervening before emergency or hospital care is needed. Prior to use of our unified electronic health record (EHR), the data that contained the clinical status of patients was locked up in paper charts or disparate electronic databases that could not be analyzed without time-consuming manual chart review. Our EHR now houses a wealth of clinical data in a single database that can be used to help clinicians identify these high-risk patients through high-throughput algorithms. Much of the assessment of clinical status and risk of decompensation is contained within the clinical notes as unstructured free text (e.g. “The patient is calling to report worsening fever and chest pain.”). Therefore, algorithms must not only be able to quickly gather information about patients but also draw upon modern machine learning techniques to extract meaning from structured and unstructured data to assess patient status. Here we propose developing a novel algorithm that leverages the EHR to improve outcomes for our highest-risk patients across the medical center.

Proposal: We propose building a machine-learning model that employs deep learning (e.g. deep neural networks), including processing of free text from clinical notes/encounters, to predict a given patient’s risk of emergency room visit or hospitalization within the next 7 days. To do this, we will first assemble a data repository of all patients receiving care at UCSF by extracting the subset of the EHR that includes clinically relevant structured data (demographics, laboratory data, encounters, problem lists, etc.) as well as all of the clinical notes for each individual patient. Using this data repository, we will develop an algorithm that accounts for the patterns and content of an individual’s interactions with the healthcare system. We plan to deploy this algorithm on a bi-weekly basis to identify UCSF patients at greatest risk for decompensation and feed that information back to key stakeholders (including primary care clinics, the accountable care organization, and ideally a call system to check in on these patients).
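The prediction target described in the proposal can be framed as a binary label per patient and scoring date: did an emergency room visit or hospitalization occur in the following 7 days? The sketch below shows one possible way to derive that label from encounter records. The `Encounter` record shape, the `kind` values, and the function name are assumptions made for illustration and do not reflect the actual Clarity schema or the authors' pipeline.

```python
from datetime import date, timedelta
from typing import Iterable, NamedTuple


class Encounter(NamedTuple):
    # Hypothetical record shape; a real EHR extract will look different.
    patient_id: str
    day: date
    kind: str  # assumed values, e.g. "office", "phone", "ed", "inpatient"


# Encounter kinds treated as a decompensation (assumed coding).
ACUTE_KINDS = {"ed", "inpatient"}


def label_decompensation(
    encounters: Iterable[Encounter],
    patient_id: str,
    scoring_day: date,
    horizon_days: int = 7,
) -> int:
    """Return 1 if the patient has an ED visit or admission within
    horizon_days after scoring_day (the scoring day itself is excluded),
    otherwise 0."""
    window_end = scoring_day + timedelta(days=horizon_days)
    for enc in encounters:
        if (
            enc.patient_id == patient_id
            and enc.kind in ACUTE_KINDS
            and scoring_day < enc.day <= window_end
        ):
            return 1
    return 0
```

Excluding the scoring day keeps information from the outcome window out of the features; whether that boundary is the right one would depend on how the model is ultimately deployed.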

Feasibility: This project is feasible for our team, as it builds upon and synthesizes prior work we have done in validating data extraction algorithms from the EHR, streamlining storage and processing of large amounts of EHR data, and building machine-learning algorithms to be used in predictive modeling. Both primary investigators are Clarity certified and have direct access to generate this data repository.  Dr. Murray has already built a similar data repository containing structured data and unstructured free text for use with machine learning algorithms in lupus patients.  Dr. Rajkomar has already built and is using a computational server that pulls and analyzes EHR data in real-time and has created deep learning algorithms on high-throughput computational clusters.  Both primary investigators have collaborated with Epic build-team members in prior projects and understand how to push data from an algorithm back into the EHR.


This would be a highly novel application of health IT in which we synthesize big data - nearly the entirety of the EHR - and use machine learning to directly improve clinical care. We anticipate that this project will not only help us reduce emergency room visits and hospitalizations at UCSF, but also serve as a model that fundamentally changes how we use EHR data to affect patient care. The fundamental premise of real-time processing of clinical data at scale has multiple applications for the Department of Medicine and UCSF Health, as the same pipeline could be used to predict many other outcomes that are important to our health system and the patients we serve.


This is a very exciting and innovative project that could have important implications for quality of care, patient satisfaction, and education. If possible, it would be very interesting to incorporate patient MyChart messages into the algorithm (although this may raise consent issues). Another potential application would be identifying patients at high risk of rehospitalization after discharge, as our current tools for assessing readmission risk are imperfect.

Hi Sumant,

Thank you for your kind comments. We will, of course, seek IRB/CHR approval, but we do plan to use the rich information in MyChart messages to help with the prediction. For example, a patient with repeated MyChart messages indicating that they are having trouble obtaining a high-risk medication may have an elevated risk of seeking resolution in the emergency room rather than with their outpatient team. Can we train an algorithm to identify this high-risk situation and flag it for further review by the health system? Other industries do this quite well; for example, credit card companies use very similar mechanisms to temporarily suspend cards when they suspect fraud: an algorithm flags purchases that are likely fraudulent and then routes the account to a human who investigates further.
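The flag-then-review pattern described above could look something like the following minimal sketch. It uses a simple keyword rule as a stand-in for the trained text model we actually intend to build. The pattern, the 14-day window, the two-message threshold, and the `needs_review` name are all illustrative assumptions.

```python
import re
from datetime import date, timedelta
from typing import List, Tuple

# Illustrative pattern only; a trained text model would replace this rule.
MED_ACCESS_PATTERN = re.compile(
    r"\b(refill|prescription|pharmacy|prior auth\w*)\b",
    re.IGNORECASE,
)


def needs_review(
    messages: List[Tuple[date, str]],
    today: date,
    window_days: int = 14,   # assumed look-back window
    min_matches: int = 2,    # assumed threshold for "repeated" messages
) -> bool:
    """Return True when at least min_matches messages sent within the
    look-back window mention trouble words related to medication access.
    A True result only queues the patient for human review."""
    cutoff = today - timedelta(days=window_days)
    matches = sum(
        1
        for sent_day, text in messages
        if cutoff <= sent_day <= today and MED_ACCESS_PATTERN.search(text)
    )
    return matches >= min_matches
```

As in the credit card analogy, the algorithm only nominates cases; a person on the care team decides what, if anything, to do next.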

Predicting high risk of rehospitalization is a natural extension of our project, as I suspect we'll find that patients who have been recently hospitalized are at elevated risk of seeking hospital care again. With the granularity of data we are looking at, I also suspect that we'll find that different factors predict readmission for patients discharged from the surgical services compared to those discharged from the general medicine service.



This is an exciting and innovative project that will harness data from the EMR to potentially identify patients at high risk of readmission. This has important implications for our health system, patients and families, physicians and trainees. If successful, this model could be used for a number of different clinical and patient situations. I would like the authors to carefully consider how the data from their algorithm will be fed back to various stakeholders (our organization, clinical teams, individual providers, patients). How will the data be presented and delivered, and how will you ensure it is actionable?

Hi James,

Thanks for your excellent comments.  We agree that feedback of the results of our algorithms into the health system is a critical point.  We need to balance the need to give timely and important alerts to care teams with the need to avoid alert fatigue.  One of our first tasks will be to ensure that our predictions are timely and actionable, which we can evaluate ourselves as physicians as well as with important stakeholders like primary care physicians and the accountable care organization. Because care teams have many competing priorities, we need to tailor the types of feedback depending on urgency and capacity to intervene.
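One simple way to balance timely alerts against alert fatigue is to cap the number of alerts at what a team can realistically act on. The sketch below is a hypothetical illustration, not a decided design: the `select_alerts` name, the 0.5 risk floor, and the idea of a per-team `capacity` number are assumptions we would need to work out with stakeholders.

```python
import heapq
from typing import Dict, List


def select_alerts(
    risk_by_patient: Dict[str, float],
    capacity: int,
    min_risk: float = 0.5,  # assumed floor below which no alert is sent
) -> List[str]:
    """Return up to `capacity` patient ids, highest predicted risk first,
    ignoring anyone whose risk is below min_risk."""
    eligible = [
        (risk, patient_id)
        for patient_id, risk in risk_by_patient.items()
        if risk >= min_risk
    ]
    # nlargest returns the entries sorted from highest to lowest risk.
    top = heapq.nlargest(capacity, eligible)
    return [patient_id for _, patient_id in top]
```

A team that can make two outreach calls a day would pass `capacity=2`; lower-risk patients would be left for a less intrusive channel such as a dashboard.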

We are currently running a randomized clinical trial of delivering data to UCSF clinicians using electronic dashboards. This data comes from the Clarity (UCSF EHR) database and runs on servers within UCSF, so we have expertise with the dashboard process and development cycle. Our team also has significant experience working with the Epic build teams and can work with physicians to find manageable ways to intervene in Apex (e.g. we have the practical know-how to create Apex alerts and inbox messages).

We are fortunate that our team already has the capacity to make data available and presentable to clinical teams. Before we commit to exactly how we will intervene, though, we will need more discussion with stakeholders to make sure that we come up with a solution that streamlines care rather than just adding work to clinicians' plates. It's possible that we need health professionals from outside a patient's primary care team to do the initial outreach to high-risk patients. We would love to hear from others if they have suggestions.



I really like this idea - clinically focused, a direct patient outcome, cost-saving, and utilizing technology to enhance human decision making.

Thanks, Henry!

Alvin - like others, I think this is an exciting idea. I humbly suggest that as you move this forward, you will need primary care partners to help you make decisions from beginning to end, including algorithm creation and testing, and implementation work-flow development and testing.

Hi Leah,

Thanks, and I totally agree.  If we are selected to move forward to round 2, we will definitely seek out primary care partners to be part of the final project proposal.  Please let us know if you know anyone who might be interested!  



Sara, Alvin--I, too, love this idea, and agree with the comments Leah, James, and Sumant have made about thinking carefully about how you get data back to the stakeholders, especially including the PCP and the patient. As you may know, MyChart and Inbasket both have APIs that you can leverage when it comes time to deliver the actionable data to the key stakeholders. Other deeper integrations with the EHR may also be possible--we should talk more about these as you ramp up this exciting project!

Awesome proposal. I think chipping away at the high number of preventable emergency room visits is an important cause. I especially like that you are thinking not only about reducing visits to the ED, but how to use data to improve patient care. I agree that MyChart messages would be a great additional set of data for this algorithm.

Good luck!

Thanks, Rhiannon!

Thanks Rhiannon! One of the really exciting things about processing the free text in MyChart messages is that - since in many cases they aren't read immediately (although we know many providers do!) - our algorithm will have the potential to incorporate/process information that is transiently unknown even to the providers. Initially we will run the algorithm on a schedule as detailed above, but once it is successful we could imagine scaling it up to a near-daily process that identifies very high-risk patient situations and can feed that back to providers very quickly.

Great idea - really pushing the edge of what is known about preventing adverse events. Another interesting input would be the post discharge phone calls. We are currently analyzing these calls to determine the association between answering the calls, participating in the phone survey and selecting the option to talk to a nurse or someone from patient relations and both readmissions and patient satisfaction. There are future plans to integrate this data into APeX.

Looking forward to seeing what you find! 

Hi Michelle,

Yes - that's a fantastic idea.  Occasionally I have seen notes from the amazing 14M/L nurses who have documented information about these conversations (which we will be able to extract directly), but we will definitely think about extracting some of that structured data, as well.  If Cypher has an API (application programming interface) we could think about pulling the data from their system directly as needed, as we have done that for some of our analysis before.

Thanks for your support,


I also love this idea, particularly if we extend the project to readmissions which is a natural next step.  Let me know if I can help with the cipher APeX integration!
