Each year over 100,000 diabetes patients are readmitted early to U.S. hospitals at a cost of $15 billion. Evidence suggests that even relatively simple interventions can reduce 30-day readmissions by up to a third. With insight into patient readmission risk, hospitals can make the right decisions in order to realize significantly improved health outcomes and reduced costs.
We use machine learning trained on real-world examples to predict which patients are at highest risk of early readmission and to offer insight into why their risk is so high. Our solution places the right information into the hands of healthcare professionals at the right time so they can intervene appropriately and effectively.
Our clinical decision-making tool provides discharge planners, diabetic case managers and physicians with the data they need to assess readmission risk at the patient level and target appropriate interventions to mitigate that risk.
The tool is utilized at two key decision points in a diabetic hospital stay: transfer to inpatient and pre-discharge. Upon transfer to inpatient, typically from the Emergency Department, medical staff can make preliminary risk assessments and determine appropriate interventions for the inpatient setting.
Pre-discharge, hospital staff make a follow-up risk assessment, this time with data collected during the inpatient stay. The more accurate predictions made at this juncture inform the decision-making process around more intensive outpatient interventions.
The primary objective of our model is to predict the risk (%) of a patient being readmitted within a 30-day period. This risk percentage is also mapped to a scale running from very low to very high to characterize the risk level. The second objective is to provide a ranked list of the factors that contributed to the risk percentage, giving medical professionals insight into what is driving the readmission risk.
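A minimal sketch of how a risk percentage might be mapped to a named risk level, and how contributing factors might be ranked. The cut points, the five band names between "very low" and "very high", and the factor names in the usage note are all assumptions made for illustration; none of them come from the model described here.

```python
from __future__ import annotations

from bisect import bisect_right

# Hypothetical cut points in percent. These are illustrative only.
BAND_EDGES = [5.0, 10.0, 20.0, 35.0]
# Assumed labels; only "very low" and "very high" are named in the text.
BAND_LABELS = ["very low", "low", "moderate", "high", "very high"]


def risk_band(risk_percent: float) -> str:
    """Map a 30-day readmission risk (0-100 %) to a named risk level."""
    if not 0.0 <= risk_percent <= 100.0:
        raise ValueError("risk_percent must be between 0 and 100")
    # bisect_right counts how many cut points are <= the risk value,
    # which is exactly the index of the band the value falls into.
    return BAND_LABELS[bisect_right(BAND_EDGES, risk_percent)]


def rank_factors(contributions: dict[str, float]) -> list[tuple[str, float]]:
    """Order factors by the size of their contribution, largest first."""
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

For example, `risk_band(42.0)` falls in the top band under these assumed cut points, and `rank_factors({"prior_visits": 0.30, "age": 0.05})` lists `prior_visits` first.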
Our models are trained and tested using ~250,000 diabetic cases from the state of California's 2011 Healthcare Cost and Utilization Project (HCUP) database. The database contains over 800,000 inpatient diabetes cases across 450,000 unique diabetes patients. Additionally, we used ~60,000 de-identified cases from Cerner's Health Facts database in the earlier stages of our model development.
Our model consists of two single, rule-based decision trees. We train one tree with data available at the time of admission and a second that also includes data available at discharge. While ensemble methods, such as random forest, provide greater precision in risk assessments, a single tree allows us to provide medical staff with a clear decision path for each risk assessment.
To generate readmission risk assessments, we use a single decision tree. A decision tree is a structure in which each internal, “parent” node (shown as a circle in the graphic) represents a decision split, each branch represents the outcome of that test, and each external, “leaf” node represents a classification. The graphic below gives an interactive example of a decision tree being used to classify diabetes readmission.
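The way a single tree yields both a risk estimate and a readable decision path can be sketched in a few lines. This is a toy illustration only: the feature names, thresholds and leaf risks below are invented, and the `Split`, `Leaf` and `assess` names are hypothetical rather than taken from the actual tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """External node: holds the readmission risk for cases that land here."""
    risk: float


@dataclass
class Split:
    """Internal node: tests one feature against a threshold."""
    feature: str
    threshold: float
    left: Union["Split", Leaf]   # branch taken when value <= threshold
    right: Union["Split", Leaf]  # branch taken when value > threshold


def assess(node: Union[Split, Leaf], patient: Dict[str, float]) -> Tuple[float, List[str]]:
    """Walk from the root to a leaf, recording every test along the way."""
    path = []
    while isinstance(node, Split):
        value = patient[node.feature]
        if value <= node.threshold:
            path.append(f"{node.feature} = {value} <= {node.threshold}")
            node = node.left
        else:
            path.append(f"{node.feature} = {value} > {node.threshold}")
            node = node.right
    return node.risk, path


# Toy tree with made-up features and numbers, for illustration only.
TOY_TREE = Split(
    "prior_inpatient_visits", 1,
    left=Leaf(0.08),
    right=Split("length_of_stay_days", 6, left=Leaf(0.19), right=Leaf(0.34)),
)
```

Calling `assess(TOY_TREE, {"prior_inpatient_visits": 3, "length_of_stay_days": 4})` returns the leaf's risk together with the two tests that led to it, which is the kind of decision path a single tree can expose and an ensemble cannot.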
We evaluate the performance of our models relative to the LACE index, a widely used tool for quantifying the risk of early readmission upon hospital discharge. The LACE index is completely transparent in terms of how patient features contribute to the risk assessment. Our goal is to maintain this transparency while improving upon the reliability of the assessment. We implemented the LACE algorithm and cast the results as probabilities using a linear function so that we could compare the results on our test data directly with the probabilities generated by our decision tree models.
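For readers unfamiliar with LACE, the sketch below computes the index from its four published components: Length of stay, Acuity of admission, Comorbidity (Charlson index) and Emergency department visits in the prior six months. The point values follow the commonly cited LACE scoring table. The linear rescaling in `lace_probability` is an assumption: the text above does not say which linear function was used, so dividing by the maximum score of 19 is only one plausible choice.

```python
def lace_score(length_of_stay_days: int, acute_admission: bool,
               charlson_index: int, ed_visits_6mo: int) -> int:
    """Return the LACE index (0-19) for one hospital stay."""
    if min(length_of_stay_days, charlson_index, ed_visits_6mo) < 0:
        raise ValueError("inputs must be non-negative")

    # L: length of stay in days
    if length_of_stay_days <= 3:
        l_points = length_of_stay_days      # 0, 1, 2 or 3 points
    elif length_of_stay_days <= 6:
        l_points = 4
    elif length_of_stay_days <= 13:
        l_points = 5
    else:
        l_points = 7

    # A: acute (emergent) admission
    a_points = 3 if acute_admission else 0

    # C: Charlson comorbidity index, with 4 or more scored as 5 points
    c_points = charlson_index if charlson_index <= 3 else 5

    # E: emergency department visits in the previous 6 months, capped at 4
    e_points = min(ed_visits_6mo, 4)

    return l_points + a_points + c_points + e_points


def lace_probability(score: int) -> float:
    """Assumed linear mapping of a LACE score onto [0, 1]; illustrative only."""
    return score / 19
```

Because AUC depends only on how cases are ranked, any increasing linear rescaling of the LACE score gives the same AUC; the choice of linear function matters only when the values are read as probabilities.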
Model performance is measured using ROC AUC: the area under the receiver operating characteristic (ROC) curve. AUC is a standard metric for evaluating readmission risk models. With AUC, the entire range of risk quantification is taken into account, not just a binary classification into high/low risk categories. A useful real-world interpretation of the AUC score is the probability that a randomly chosen patient who was readmitted is assigned a higher risk than a randomly chosen patient who was not.
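The ranking interpretation of AUC can be computed directly by comparing every readmitted case with every non-readmitted case. The sketch below does exactly that, counting ties as half a correct ranking. It is meant to illustrate the definition, not to describe how the evaluation was actually run; the function name `roc_auc` is hypothetical, and the quadratic pairwise loop would be slow on hundreds of thousands of cases, where a rank-based or library implementation would be preferable.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (readmitted, not readmitted) pairs ranked correctly.

    labels: 1 for readmitted within 30 days, 0 otherwise.
    scores: predicted risk for each case, in the same order as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0   # readmitted case ranked above the other
            elif p == n:
                correct += 0.5   # tie counts as half
    return correct / (len(positives) * len(negatives))
```

A score of 0.5 corresponds to ranking pairs no better than chance, and 1.0 to ranking every readmitted case above every non-readmitted case.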
Looking forward, we aim to improve model performance by using more advanced ensemble modeling techniques. While this approach does not allow the same level of visibility into the specific risk factors, we hope to use the refined model in concert with our single-tree model to give greater predictive precision without losing insight into the underlying risk factors.
We also anticipate that model performance can be improved by modeling specific sub-groups such as age brackets.
Director of Product, Big Data/Analytics
Software/Data Engineer, Analyst and Consultant
Principal Engineer, Technical Operations
Senior Business Analyst
Advisory Board Company