Clinical validation of scoring systems of postoperative pancreatic fistula after pancreatoduodenectomy: applicability to Eastern cohorts?
Original Article

Jae Seung Kang1#, Taesung Park2#, Youngmin Han1, Seungyeon Lee3, Jae Ri Kim1, Hongbeom Kim1, Wooil Kwon1, Sun-Whe Kim1, Jin Seok Heo4, Seong Ho Choi4, Dong Wook Choi4, Song Cheol Kim5, Tae Ho Hong6, Dong Sup Yoon7, Joon Seong Park7, Sang Jae Park8, Sung-Sik Han8, Sae-Byeol Choi9, Joo Seop Kim10, Chang-Sup Lim11, Jin-Young Jang1

1Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea; 2Department of Statistics and Interdisciplinary Program in Biostatistics, Seoul National University, Seoul, Republic of Korea; 3Department of Mathematics and Statistics, Sejung University, Seoul, Republic of Korea; 4Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea; 5Division of Hepatobiliary and Pancreatic Surgery, Department of Surgery, Ulsan University College of Medicine and Asan Medical Center, Seoul, Republic of Korea; 6Department of Surgery, Seoul St. Mary’s Hospital, The Catholic University of Korea, College of Medicine, Seoul, Republic of Korea; 7Pancreatobiliary Cancer Clinic, Department of Surgery, Gangnam Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea; 8Center for Liver Cancer, National Cancer Center, Gyeonggido, Republic of Korea; 9Department of Surgery, Korea University Guro Hospital, Korea University College of Medicine, Seoul, Republic of Korea; 10Department of Surgery, Kangdong Sacred Heart Hospital, Hallym University College of Medicine, Seoul, Republic of Korea; 11Department of Surgery, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea

Contributions: (I) Conception and design: JS Kang, T Park, JY Jang; (II) Administrative support: Y Han, S Lee, JR Kim, H Kim, W Kwon, SW Kim; (III) Provision of study materials or patients: JS Kang, Y Han, DW Choi, SC Kim, TH Hong, JS Park, SJ Park, SB Choi, JS Kim, CS Lim; (IV) Collection and assembly of data: JS Kang, JS Heo, SH Choi, SC Kim, TH Hong, DS Yoon, SS Han, SB Choi, JS Kim, CS Lim; (V) Data analysis and interpretation: JS Kang, T Park, S Lee, JY Jang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Jin-Young Jang, MD, PhD. Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, 101 Daehak-ro, Chongno-gu, Seoul 03080, Republic of Korea. Email:

Background: Although several prediction models for the occurrence of postoperative pancreatic fistula (POPF) after pancreatoduodenectomy (PD) exist, all were established using Western cohorts. Large-scale external validation studies in Eastern cohorts that consider demographic variables including lower body mass index (BMI) are scarce. The purpose of this study was to externally validate POPF prediction models using nationwide large-scale Korean cohorts.

Methods: Nine tertiary university hospitals in the Republic of Korea participated. Patients’ preoperative characteristics, intraoperative factors, and pathologic findings were evaluated. POPF grades were determined according to the 2016 International Study Group on Pancreatic Surgery definition. Three POPF risk models (Callery, Roberts, and Mungroop) were selected for external validation.

Results: A total of 1,898 PD patients were enrolled. A non-pancreatic disease diagnosis [hazard ratio (HR), 1.856; 95% confidence interval (CI), 1.223–2.817; P=0.004], higher preoperative BMI (HR, 1.069; 95% CI, 1.019–1.121; P=0.006), and soft pancreatic texture (HR, 1.859; 95% CI, 1.264–2.735; P=0.002) were independent risk factors for clinically relevant POPF (CR-POPF). The area under the receiver operating characteristic curve (AUC) values were 0.61, 0.64, and 0.63 for the Callery, Roberts, and Mungroop models, respectively; all were lower than the values published in the corresponding external validation studies.

Conclusions: Western POPF prediction models performed less well when applied to Korean cohorts. Thus, a large-scale Eastern-specific and externally validated POPF prediction model is needed.

Keywords: Pancreatic fistula; pancreatoduodenectomy (PD); predictive score

Submitted Sep 29, 2018. Accepted for publication Dec 27, 2018.

doi: 10.21037/hbsn.2019.03.17


Postoperative pancreatic fistula (POPF) remains a potentially lethal complication and is associated with longer hospital stays, higher costs, and increased surgery-related mortality (1-4). The definition of POPF was unified and clarified by the International Study Group for Pancreatic Surgery (ISGPS) in 2005 (5) and revised in 2016 (6).

A prediction model is needed to develop individualized programs for the postoperative management of POPF. Such models can also decrease unnecessary postoperative interventions and hospital costs in low-risk patients, enable appropriate evaluations and treatments, and decrease life-threatening events and mortality in high-risk patients (7). Callery et al. reported that pancreatic texture, pathologic diagnosis, main pancreatic duct (MPD) diameter, and intraoperative estimated blood loss (EBL) were associated with the development of POPF after pancreatoduodenectomy (PD) and subsequently established the Callery score for POPF prediction based on these four factors (2). The Callery score has been widely used and has demonstrated reliable validity in some studies (8-10). Simpler risk prediction models for POPF were recently developed and reported in the United Kingdom (UK) and the Netherlands (11,12). The UK group proposed body mass index (BMI) and MPD diameter as POPF risk factors, while the Netherlands group reported pancreatic texture, BMI, and MPD diameter. These two models performed moderately well in external validation studies (11,13).
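As an illustration of how a four-factor score of this kind is assembled, the sketch below adds up points for texture, pathology, duct diameter, and blood loss. The point values are those generally attributed to the Callery score and are not restated in this article, so they should be checked against reference (2) before any use. The function name `callery_score`, its parameter names, and the requirement that duct diameter be given in whole millimetres are assumptions made for this example.

```python
def callery_score(soft_texture: bool, high_risk_pathology: bool,
                  duct_mm: int, blood_loss_ml: float) -> int:
    """Return a 0-10 fistula risk score from the four factors named above.

    Point values are an assumption taken from the commonly cited form of
    the Callery score; verify against reference (2).
    """
    score = 0
    # Soft gland texture: 2 points; firm: 0 points.
    score += 2 if soft_texture else 0
    # Pathology other than pancreatic adenocarcinoma or pancreatitis: 1 point.
    score += 1 if high_risk_pathology else 0
    # Duct diameter in whole mm: >=5 -> 0, 4 -> 1, 3 -> 2, 2 -> 3, <=1 -> 4.
    score += min(4, max(0, 5 - duct_mm))
    # Estimated blood loss: <=400 mL -> 0, 401-700 -> 1, 701-1000 -> 2, >1000 -> 3.
    if blood_loss_ml > 1000:
        score += 3
    elif blood_loss_ml > 700:
        score += 2
    elif blood_loss_ml > 400:
        score += 1
    return score


# Hypothetical patient: soft gland, ampullary tumour, 3 mm duct, 500 mL blood loss.
print(callery_score(True, True, 3, 500))
```

The additive form makes clear why intraoperative variables such as EBL cannot be known before surgery, which is one motivation for the simpler models described above.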

However, these POPF prediction models were established using Western cohorts. Baseline patient characteristics, surgical techniques, and postoperative management differ between Eastern and Western countries, and large-scale external validation studies of Eastern cohorts are scarce. This study aimed to externally validate several POPF prediction models using large-scale nationwide Korean cohorts.



This retrospective cohort study examined data from a prospectively collected medical database. Nine tertiary university hospitals in the Republic of Korea participated in this study: Seoul National University Hospital, Samsung Medical Center, Asan Medical Center, Catholic Medical Center, Gangnam Severance Hospital, National Cancer Center in Korea, Korea University Guro Hospital, Hallym University Sacred Heart Hospital, and Seoul Metropolitan Government Seoul National University Boramae Hospital. The surgeons in each of these nine hospitals performed a minimum of 20 cases of PD annually. Patients who underwent PD or pylorus-preserving pancreatoduodenectomy (PPPD) for periampullary disease were enrolled. Patients were excluded if they did not undergo pancreatico-enteric anastomosis, underwent combined resection of other organs, had a history of abdominal surgery before PD, or had insufficient medical data to assess POPF.

This study was approved by our hospital’s institutional review board (C-1806-129-954).

Data collection and definition of POPF

Preoperative patient characteristics were investigated, including age, sex, BMI, preoperative co-morbidities, preoperative laboratory data, and MPD diameter on a cross-sectional view of a preoperative computed tomography (CT) image. Intraoperative factors included operation type, combined vascular resection, pancreatico-enteric anastomosis site (pancreaticojejunostomy or pancreaticogastrostomy), pancreatico-enteric anastomosis type (invagination or duct-to-mucosa), operation time, intraoperative EBL, and pancreatic texture. Postoperative data included pathologic diagnosis, drain amylase concentration on postoperative day 3, and complications with a Clavien-Dindo classification > grade II. Pancreatic cancer and pancreatitis were categorized as pancreatic disease, whereas other pancreatic lesions, such as benign pancreatic cystic neoplasms and neuroendocrine tumors, were grouped with the other periampullary diseases as non-pancreatic disease (2).

POPF was determined based on the 2016 ISGPS guideline, while clinically relevant POPF (CR-POPF) was defined as grade B or C (6).

Statistical analysis

Nominal data were compared using χ2 tests, while continuous variables were examined using Student’s t-test. Only variables that were statistically significant on univariate analysis were included in the multivariate analysis. The discriminatory performance of each model was assessed using the area under the receiver operating characteristic curve (AUC). All statistical analyses were performed using R version 3.3.0 (R Foundation, Vienna, Austria), and two-sided P values <0.05 were considered statistically significant.
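To make the AUC concrete, the following minimal sketch computes it as the proportion of (event, non-event) patient pairs in which the patient with CR-POPF has the higher risk score, with ties counted as one half. This is the standard rank-based interpretation of the AUC, not the authors' R code, and the function name `auc` and the sample data are hypothetical.

```python
def auc(scores, outcomes):
    """Rank-based AUC: probability that a randomly chosen patient with the
    event has a higher score than a randomly chosen patient without it.
    Ties contribute 0.5. `outcomes` holds 1 (event) or 0 (no event)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores and CR-POPF outcomes for six patients.
example_scores = [7, 5, 3, 8, 2, 5]
example_outcomes = [1, 0, 0, 1, 0, 1]
print(round(auc(example_scores, example_outcomes), 3))
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every patient with the event above every patient without it.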


Patient demographics

Between 2007 and 2014, a total of 1,898 patients from the nine tertiary pancreaticobiliary centers in Korea were enrolled in this study (Table 1). The mean patient age was 62.6 years; 1,116 patients (58.8%) were male; and 669 (35.2%) were diagnosed with pancreatic ductal adenocarcinoma, 407 (21.4%) with extrahepatic common bile duct cancer, 365 (19.2%) with ampulla of Vater cancer, 39 (2.1%) with duodenal cancer, and 396 (20.9%) with other periampullary diseases (Table 1). The mean patient BMI was 23.0. Of the cohort, 1,767 patients (93.1%) underwent pancreaticojejunostomy anastomosis and 1,630 (85.9%) underwent duct-to-mucosa anastomosis. A soft pancreatic texture was seen in 57.5% of cases. Overall, POPF occurred in 752 patients (39.6%), while CR-POPF (grade B or C) occurred in 275 (14.5%).

Table 1 Demographic and pathologic findings of modeling cohorts in four studies

Predictive factors for CR-POPF after PD

In the univariate analysis, male sex, non-pancreatic disease diagnosis, preoperative BMI, preoperative diabetes mellitus (DM), MPD diameter on preoperative CT images, operation type, absence of vessel resection, duct-to-mucosa anastomosis, operation time, intraoperative blood loss, and soft pancreatic texture were associated with CR-POPF (Table 2). In the multivariate analysis, a non-pancreatic disease diagnosis [hazard ratio (HR), 1.856; 95% confidence interval (CI), 1.223–2.817; P=0.004], higher preoperative BMI (HR, 1.069; 95% CI, 1.019–1.121; P=0.006), and soft pancreatic texture (HR, 1.859; 95% CI, 1.264–2.735; P=0.002) were the independent risk factors for CR-POPF after PD.
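The ratio estimates and intervals above can be related to regression coefficients with a short calculation. The sketch below assumes Wald-type intervals computed on the log scale, that is, exp(β ± 1.96 × SE), which the article does not state explicitly. Under that assumption it recovers the standard error of each log ratio from the reported interval and then rebuilds the interval from the point estimate. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def ratio_with_ci(beta, se, z=Z_95):
    """Exponentiate a log-scale coefficient and its Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Recover the log-scale standard error from a reported interval,
    assuming the interval is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Estimates reported in the multivariate analysis: (ratio, lower, upper).
reported = {
    "non-pancreatic disease": (1.856, 1.223, 2.817),
    "preoperative BMI": (1.069, 1.019, 1.121),
    "soft pancreatic texture": (1.859, 1.264, 2.735),
}

for name, (ratio, lower, upper) in reported.items():
    se = se_from_ci(lower, upper)
    est, ci_low, ci_high = ratio_with_ci(math.log(ratio), se)
    print(f"{name}: {est:.3f} ({ci_low:.3f} to {ci_high:.3f}), "
          f"SE of log ratio = {se:.3f}")
```

The rebuilt intervals agree with the reported ones to within rounding, which is consistent with the log-scale assumption. Note also that the BMI estimate is per unit of BMI, so its effect compounds multiplicatively over larger BMI differences.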

Table 2 Predictive factors for postoperative pancreatic fistula in univariate and multivariate analysis

Comparisons of discrimination ability and relationship between score severity and actual occurrence rates of POPF

Of the published POPF prediction models, we selected those with proven validity in clinical circumstances in both internal and large-scale external validation studies. The models of Callery et al., Roberts et al., and Mungroop et al. were selected and investigated (2,11,12). Table 3 summarizes these three scoring systems and compares their AUC values. In the published validation studies, the three Western models showed moderate discrimination ability, with AUC values above 0.7. Figure 1 shows the receiver operating characteristic (ROC) curves of the Korean cohort for the three scoring systems (Figure 1A, Callery model; Figure 1B, Roberts model; Figure 1C, Mungroop model). The AUC values calculated from these ROC curves were between 0.61 and 0.64, indicating that the models performed less well when applied to the Korean cohort.

Table 3 Comparisons of the scoring system of postoperative pancreatic fistula after pancreatoduodenectomy
Figure 1 The receiver operating characteristic curves of the Korean cohorts based on the three models. (A) Callery model (2); (B) Roberts model (11); (C) Mungroop model (12).

Figure 2 shows the relationship between score severity and actual POPF occurrence rates in the Korean cohort for each scoring system. In each model, the higher the score, the more frequent the occurrence of POPF. However, the actual CR-POPF rate in the high-risk group was lower in this Korean cohort than in the reported validation studies. For the Callery score, the actual rate of CR-POPF in the high-risk group (Callery score 7–10) was 20.3% in the Korean cohort (Figure 2A) versus 28.6% in one external validation study (9) and 42.3% in another study (8). For the Mungroop score, the actual CR-POPF rate was 19.3% in the Korean cohort (Figure 2C) versus 31.0% in an external validation study (11).

Figure 2 Relationship between score severity and actual POPF rate of Korean cohorts. (A) Callery model; (B) Roberts model; (C) Mungroop model. POPF, postoperative pancreatic fistula; CR-POPF, clinically-relevant POPF.
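The comparison of score severity with observed CR-POPF rates can be sketched as a simple grouping step. In the example below, only the high-risk band (Callery score 7–10) is taken from this article; the negligible, low, and intermediate bands are an assumption based on the zones commonly used with the Callery score. The function names and the patient records are hypothetical.

```python
from collections import defaultdict


def callery_zone(score):
    """Map a 0-10 score to a risk zone. Only the 7-10 high-risk band is
    stated in the article; the other cut-offs are assumed."""
    if score == 0:
        return "negligible"
    if score <= 2:
        return "low"
    if score <= 6:
        return "intermediate"
    return "high"


def observed_rates(records, zone_of):
    """records: iterable of (score, event) pairs, event being 1 or 0.
    Returns the observed event rate within each risk zone."""
    counts = defaultdict(lambda: [0, 0])  # zone -> [events, patients]
    for score, event in records:
        zone = zone_of(score)
        counts[zone][0] += int(event)
        counts[zone][1] += 1
    return {zone: events / n for zone, (events, n) in counts.items()}


# Hypothetical (score, CR-POPF) records for nine patients.
records = [(0, 0), (1, 0), (2, 1), (4, 0), (5, 1),
           (7, 1), (8, 0), (9, 0), (10, 0)]
print(observed_rates(records, callery_zone))
```

A well-calibrated score would show observed rates that rise across zones and that are close to the rates reported for the same zones in the cohorts used to develop it.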


Predicting POPF is important because it enables the development of individualized treatment plans for affected patients. Numerous risk factors for POPF have been identified, and several POPF prediction models have subsequently been developed (2,11,12,14-16). However, most have not been externally validated. Furthermore, there have been few large-scale external validation studies of POPF in Eastern cohorts. The present study identified three large-scale validated models and investigated their applicability to a large-scale Korean cohort.

AUC values, a discriminatory index for prediction models, range from 0.5 (no discrimination ability) to 1.0 (perfect discrimination) (17,18). One study categorized the discriminatory ability of AUC values of 0.5–0.7 as poor, 0.7–0.9 as reasonable, and 0.9–1.0 as very good (19). Table 3 summarizes the outcomes of internal and external validation studies for the Callery, Roberts, and Mungroop scores. An internal validation revealed an AUC of 0.94 for the Callery score (2). However, the predictive performance found in an internal validation may be overestimated because the prediction model was itself established by multivariate logistic regression analysis of the same modeling cohort. Miller et al. performed an external validation using the data of 594 PD patients who underwent pancreaticojejunostomy anastomosis and reported an AUC of 0.72 (9). Grendar et al. reported an AUC of 0.71 (8). For the Roberts and Mungroop scores, the AUCs in the external validation studies were 0.77 and 0.78, respectively (Table 3) (11,13). These values indicate that the three models demonstrated reasonable performance for predicting POPF or CR-POPF when applied to Western cohorts.
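The AUC categories cited from reference (19) can be written as a small lookup. In this sketch the handling of the exact boundaries (0.7 counted as reasonable, 0.9 as very good) is an assumption, because the cited ranges share their end points, and the function name is hypothetical. The values passed in are those quoted in this article.

```python
def discrimination_category(auc_value):
    """Classify an AUC using the 0.5-0.7 / 0.7-0.9 / 0.9-1.0 bands.
    Boundary values are assigned to the higher band (an assumption)."""
    if not 0.5 <= auc_value <= 1.0:
        raise ValueError("expected an AUC between 0.5 and 1.0")
    if auc_value < 0.7:
        return "poor"
    if auc_value < 0.9:
        return "reasonable"
    return "very good"


# AUC values quoted in this article.
quoted = {
    "Callery, internal validation": 0.94,
    "Callery, Miller et al.": 0.72,
    "Callery, Grendar et al.": 0.71,
    "Roberts, external validation": 0.77,
    "Mungroop, external validation": 0.78,
    "Callery, Korean cohort": 0.61,
    "Roberts, Korean cohort": 0.64,
    "Mungroop, Korean cohort": 0.63,
}
for label, value in quoted.items():
    print(f"{label}: {value:.2f} -> {discrimination_category(value)}")
```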

Miller et al. suggested that the variations in predictability among the validation studies might have been caused by differences in patient characteristics, POPF rates in the high-risk groups, and perioperative management (9). The present study performed external validation of the three models using the data of 1,898 consecutive Korean PD patients from nine tertiary hospitals. The incidence of POPF in this Korean cohort increased as score severity increased (Figure 2). However, the AUC values were 0.61–0.64, meaning that these Western POPF scoring models had poor discriminatory ability when applied to our Korean cohort (Table 3). This might have been due to differences in demographics such as BMI and the proportion of patients with non-pancreatic diseases (Table 1). In addition, in the high-risk group of our cohort, the actual CR-POPF rates were <30% for Callery scoring (Figure 2A) and approximately 20% for Mungroop scoring (Figure 2C). Therefore, these Western models must be revised to increase their applicability to Korean cohorts.

Pancreatic texture and pathologic diagnosis were the risk factors proposed by Callery et al. (2). Preoperative BMI was recently included in two prediction models (11,12). EBL was also proposed as a risk factor by Callery et al., but two recent external validation studies refuted this claim (8,10). Furthermore, with the advent of minimally invasive surgery, reported intraoperative blood loss volumes have become lower than those in studies from the period when Callery scoring was developed (20-22). In the present study, variables including a non-pancreatic disease diagnosis (HR, 1.856; 95% CI, 1.223–2.817; P=0.004), higher preoperative BMI (HR, 1.069; 95% CI, 1.019–1.121; P=0.006), and soft pancreatic texture (HR, 1.859; 95% CI, 1.264–2.735; P=0.002) were the independent risk factors for CR-POPF in the multivariate analysis (Table 2), and one study suggested exactly the same risk factors (23). Because patient characteristics, surgical skills, and perioperative management methods differ among countries, the statistically significant factors and hazard ratios could be different. Therefore, it is necessary to establish new models that reflect the factors and hazard ratios of each cohort. To account for such differences, pathologic diagnosis, preoperative BMI, and pancreatic texture should be included in the new model.

All three previous scoring systems included MPD size as one of the risk factors (2,11,12). In the present study, however, MPD size was not statistically associated with CR-POPF (HR, 0.936; 95% CI, 0.855–1.025; P=0.153; Table 2). This was probably because the anastomotic method (duct-to-mucosa or invagination) and the pancreatic duct stenting method were not unified across the nine institutions. In addition, different surgeons measured MPD size on the preoperative CT images. To overcome this limitation, a prospective study design and a unified method of evaluating MPD size are needed.

The present study has some limitations. First, surgical techniques and perioperative management methods were not unified among the nine hospitals. Thus, surgeon preference was a potential confounding factor. Second, data missing from the surgical and pathologic reports could not be controlled for in this retrospective study. To ensure high-quality collaborative studies, centralization of the electronic medical database and regular monitoring are needed. Despite these limitations, this was the largest external validation study of Western POPF prediction models in an Eastern cohort.

In conclusion, Western POPF prediction models performed less well when applied to Korean cohorts. Risk factors for POPF in this Eastern cohort were a higher BMI, soft pancreatic texture, and non-pancreatic disease diagnosis. Thus, the development of a large-scale externally validated Eastern-specific POPF prediction model using Eastern cohorts is needed.


Funding: This study was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI16C2037), and by the Collaborative Genome Program for Fostering New Post-Genome Industry of the National Research Foundation funded by the Ministry of Science and ICT (NRF-2017M3C9A5031591).


Conflicts of Interest: The authors have no conflicts of interest to declare.

Ethical Statement: This study was approved by the institutional review board of Seoul National University Hospital (C-1806-129-954).


  1. Ahmad SA, Edwards MJ, Sutton JM, et al. Factors influencing readmission after pancreaticoduodenectomy: a multi-institutional study of 1302 patients. Ann Surg 2012;256:529-37.
  2. Callery MP, Pratt WB, Kent TS, et al. A prospectively validated clinical risk score accurately predicts pancreatic fistula after pancreatoduodenectomy. J Am Coll Surg 2013;216:1-14.
  3. Frymerman AS, Schuld J, Ziehen P, et al. Impact of postoperative pancreatic fistula on surgical outcome—the need for a classification-driven risk management. J Gastrointest Surg 2010;14:711-8.
  4. McMillan MT, Vollmer CM, Asbun HJ, et al. The characterization and prediction of ISGPF grade C fistulas following pancreatoduodenectomy. J Gastrointest Surg 2016;20:262-76.
  5. Bassi C, Dervenis C, Butturini G, et al. Postoperative pancreatic fistula: an international study group (ISGPF) definition. Surgery 2005;138:8-13.
  6. Bassi C, Marchegiani G, Dervenis C, et al. The 2016 update of the International Study Group (ISGPS) definition and grading of postoperative pancreatic fistula: 11 years after. Surgery 2017;161:584-91.
  7. Callery MP, Pratt WB, Vollmer CM. Prevention and management of pancreatic fistula. J Gastrointest Surg 2009;13:163-73.
  8. Grendar J, Jutric Z, Leal JN, et al. Validation of Fistula Risk Score calculator in diverse North American HPB practices. HPB 2017;19:508-14.
  9. Miller BC, Christein JD, Behrman SW, et al. A multi-institutional external validation of the fistula risk score for pancreatoduodenectomy. J Gastrointest Surg 2014;18:172-79; discussion 179-80.
  10. Shubert CR, Wagie AE, Farnell MB, et al. Clinical risk score to predict pancreatic fistula after pancreatoduodenectomy: independent external validation for open and laparoscopic approaches. J Am Coll Surg 2015;221:689-98.
  11. Roberts KJ, Hodson J, Mehrzad H, et al. A preoperative predictive score of pancreatic fistula following pancreatoduodenectomy. HPB 2014;16:620-8.
  12. Mungroop TH, van Rijssen LB, van Klaveren D, et al. Alternative Fistula Risk Score for Pancreatoduodenectomy (a-FRS): Design and International External Validation. Ann Surg 2019;269:937-43.
  13. Roberts KJ, Sutcliffe RP, Marudanayagam R, et al. Scoring system to predict pancreatic fistula after pancreaticoduodenectomy: a UK multicenter study. Ann Surg 2015;261:1191-7.
  14. Chen JY, Feng J, Wang XQ, et al. Risk scoring system and predictor for clinically relevant pancreatic fistula after pancreaticoduodenectomy. World J Gastroenterol 2015;21:5926-33.
  15. Kim JY, Park JS, Kim JK, et al. A model for predicting pancreatic leakage after pancreaticoduodenectomy based on the international study group of pancreatic surgery classification. Korean J Hepatobiliary Pancreat Surg 2013;17:166-70.
  16. Yamamoto Y, Sakamoto Y, Nara S, et al. A preoperative predictive scoring system for postoperative pancreatic fistula after pancreaticoduodenectomy. World J Surg 2011;35:2747-55.
  17. Kumar R, Indrayan A. Receiver operating characteristic (ROC) curve for medical researchers. Indian Pediatr 2011;48:277-87.
  18. Swets JA. Measuring the accuracy of diagnostic systems. Science 1988;240:1285-93.
  19. Pearce J, Ferrier S. Evaluating the predictive performance of habitat models developed using logistic regression. Ecol Modell 2000;133:225-45.
  20. Palanivelu C, Senthilnathan P, Sabnis SC, et al. Randomized clinical trial of laparoscopic versus open pancreatoduodenectomy for periampullary tumours. Br J Surg 2017;104:1443-50.
  21. Peng L, Lin S, Li Y, et al. Systematic review and meta-analysis of robotic versus open pancreaticoduodenectomy. Surg Endosc 2017;31:3085-97.
  22. Stauffer JA, Coppola A, Villacreses D, et al. Laparoscopic versus open pancreaticoduodenectomy for pancreatic adenocarcinoma: long-term results at a single institution. Surg Endosc 2017;31:2233-41.
  23. Jang JY, Chang YR, Kim SW, et al. Randomized multicentre trial comparing external and internal pancreatic stenting during pancreaticoduodenectomy. Br J Surg 2016;103:668-75.
Cite this article as: Kang JS, Park T, Han Y, Lee S, Kim JR, Kim H, Kwon W, Kim SW, Heo JS, Choi SH, Choi DW, Kim SC, Hong TH, Yoon DS, Park JS, Park SJ, Han SS, Choi SB, Kim JS, Lim CS, Jang JY. Clinical validation of scoring systems of postoperative pancreatic fistula after pancreatoduodenectomy: applicability to Eastern cohorts? Hepatobiliary Surg Nutr 2019;8(3):211-218. doi: 10.21037/hbsn.2019.03.17