Machine learning predicts unpredicted deaths with high accuracy following hepatopancreatic surgery

Kota Sahara, Anghela Z. Paredes, Diamantis I. Tsilimigras, Kazunari Sasaki, Amika Moro, J. Madison Hyer, Rittal Mehta, Syeda A. Farooq, Lu Wu, Itaru Endo, Timothy M. Pawlik

Abstract

Background: The use of machine learning to predict morbidity and mortality, especially in a population traditionally considered low risk, has not been previously examined. We sought to characterize the incidence of death among patients with a low estimated risk of morbidity and mortality based on the National Surgical Quality Improvement Program (NSQIP) estimated probability (EP), and to develop a machine learning model to identify individuals at risk for “unpredicted death” (UD) among patients undergoing hepatopancreatic (HP) procedures.
Methods: The NSQIP database was used to identify patients who underwent elective HP surgery between 2012 and 2017. The estimated risk of morbidity and mortality was stratified into three tiers (low, intermediate, or high) using a k-means clustering method with bin sorting. A machine learning classification tree and multivariable regression analyses were used to predict 30-day mortality, with 10-fold cross-validation. C-statistics were used to compare model performance.
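As an illustration of the tiering step, the following is a minimal sketch of one-dimensional k-means (Lloyd's algorithm) that groups estimated probabilities into three ordered tiers. It is not the authors' implementation: the abstract does not specify how the "bin sorting" was done, so the sketch simply sorts the cluster centers and labels them low, intermediate, and high. The function names `kmeans_1d` and `assign_tier`, the evenly spaced initialization, and the probability values are all hypothetical.

```python
def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on scalar values; returns cluster centers in ascending order.

    Assumption: centers are initialized at evenly spaced positions of the
    sorted data so that the result is deterministic.
    """
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[i * (n - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new_centers = sorted(
            sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)
        )
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def assign_tier(value, centers, labels=("low", "intermediate", "high")):
    """Label a value by its nearest center; centers must be in ascending order."""
    nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
    return labels[nearest]


# Hypothetical estimated probabilities, for illustration only (not NSQIP data).
estimated_probs = [0.01, 0.02, 0.015, 0.03, 0.10, 0.12, 0.11, 0.40, 0.45, 0.50]
centers = kmeans_1d(estimated_probs, k=3)
tiers = [assign_tier(p, centers) for p in estimated_probs]
```

Because the centers are sorted before labeling, the tier names always follow increasing estimated risk, regardless of the order in which the clusters were found.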
Results: Among 63,507 patients who underwent an HP procedure, median patient age was 63 (IQR: 54–71) years. Patients underwent either pancreatectomy (n=38,209, 60.2%) or hepatic resection (n=25,298, 39.8%). Patients were stratified into three tiers of predicted morbidity and mortality risk based on the NSQIP EP: low (n=36,923, 58.1%), intermediate (n=23,609, 37.2%), and high risk (n=2,975, 4.7%). Among the 36,923 patients with a low estimated risk of morbidity and mortality, 237 patients (0.6%) experienced a UD. According to the classification tree analysis, age was the most important factor in predicting UD (importance: 16.9), followed by preoperative albumin level (importance: 10.8), disseminated cancer (importance: 6.5), preoperative platelet count (importance: 6.5), and sex (importance: 5.9). Among patients deemed to be low risk, the c-statistic for the machine learning derived prediction model was 0.807, compared with only 0.662 for the NSQIP EP.
Conclusions: A prognostic model derived using machine learning methodology performed better than the NSQIP EP in predicting 30-day UD among low-risk patients undergoing HP surgery.
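For a binary outcome such as 30-day death, the c-statistic equals the area under the ROC curve: the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen patient who survived, with ties counted as one half. The sketch below computes it by direct pairwise comparison and compares two sets of predictions. The function name `c_statistic` and all scores and outcomes are hypothetical, chosen only to illustrate the comparison; they are not study data and do not reproduce the reported values.

```python
def c_statistic(scores, outcomes):
    """Concordance statistic for a binary outcome (1 = event, 0 = no event)."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical outcomes and predicted risks from two models (illustration only).
outcomes = [0, 0, 0, 0, 1, 1]
model_a_scores = [0.1, 0.2, 0.3, 0.6, 0.5, 0.9]
model_b_scores = [0.1, 0.5, 0.6, 0.7, 0.4, 0.8]

c_a = c_statistic(model_a_scores, outcomes)
c_b = c_statistic(model_b_scores, outcomes)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a small illustration; on a cohort of tens of thousands, a rank-based calculation would likely be preferable.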