Predicting diabetic kidney disease for type 2 diabetes mellitus by machine learning in the real world: a multicenter retrospective study

Authors:"Liu, Xiao Zhu; Duan, Minjie; Huang, Hao Dong; Zhang, Yang; Xiang, Tian Yu; Niu, Wu Ceng; Zhou, Bei; Wang, Hao Lin; Zhang, Ting Ting"

Author addresses:"[Liu, Xiao Zhu; Zhou, Bei] Chongqing Med Univ, Dept Cardiol, Affiliated Hosp 2, Chongqing, Peoples R China; [Liu, Xiao Zhu; Duan, Minjie; Huang, Hao Dong; Zhang, Yang] Chongqing Med Univ, Med Data Sci Acad, Chongqing, Peoples R China; [Duan, Minjie; Huang, Hao Dong; Zhang, Yang; Wang, Hao Lin] Chongqing Med Univ, Coll Med Informat, Chongqing, Peoples R China; [Xiang, Tian Yu] Chongqing Med Univ, Univ Town Hosp, Informat Ctr, Chongqing, Peoples R China; [Niu, Wu Ceng] Handan First Hosp, Dept Nucl Med, Handan, Hebei, Peoples R China; [Zhang, Ting Ting] Fifth Med Ctr Chinese Peoples Liberat Army PLA Hos, Dept Endocrinol, Beijing, Peoples R China"

Corresponding authors:"Wang, HL (corresponding author), Chongqing Med Univ, Coll Med Informat, Chongqing, Peoples R China; Zhang, TT (corresponding author), Fifth Med Ctr Chinese Peoples Liberat Army PLA Hos, Dept Endocrinol, Beijing, Peoples R China."

Source:FRONTIERS IN ENDOCRINOLOGY

ESI subject category:CLINICAL MEDICINE

WOS accession number:WOS:001027438000001

JCR quartile:Q2

Impact factor:3.9

Year:2023

Volume:14

Issue:

Start page:

End page:

Document type:Article

Keywords:type 2 diabetes mellitus; diabetic kidney disease; machine learning; prediction; CatBoost model

Abstract:"Objective: Diabetic kidney disease (DKD) has been reported as a main microvascular complication of diabetes mellitus. Although renal biopsy can distinguish DKD from non-diabetic kidney disease (NDKD), no gold standard has been validated to assess the development of DKD. This study aimed to build an auxiliary diagnosis model for type 2 diabetic kidney disease (T2DKD) based on machine learning algorithms. Methods: Clinical data on 3624 individuals with type 2 diabetes mellitus (T2DM) were gathered from January 1, 2019 to December 31, 2019 using a multicenter retrospective database. The data were randomly divided into a training set and a validation set at a ratio of 8:2. To identify critical clinical variables, the least absolute shrinkage and selection operator (LASSO) was employed. Fifteen machine learning models were built to support the diagnosis of T2DKD, and the optimal model was selected according to the area under the receiver operating characteristic curve (AUC) and accuracy. The model was then tuned using Bayesian optimization. The Shapley Additive Explanations (SHAP) approach was used to interpret the predictions. Results: DKD was diagnosed in 1856 (51.2%) of the 3624 individuals in the final cohort. As revealed by the SHAP findings, the Categorical Boosting (CatBoost) model achieved the best performance in predicting the risk of T2DKD, with an AUC of 0.86 based on the top 38 features. Guided by the SHAP findings, a simplified CatBoost model built on the top 12 features achieved an AUC of 0.84. The features of this simplified model were systolic blood pressure (SBP), creatinine (CREA), length of stay (LOS), thrombin time (TT), age, prothrombin time (PT), platelet large cell ratio (P-LCR), albumin (ALB), glucose (GLU), fibrinogen (FIB-C), red blood cell distribution width-standard deviation (RDW-SD), and hemoglobin A1C (HbA1C).
Conclusion: A machine learning-based model for predicting the risk of developing T2DKD was built, and its effectiveness was verified. The CatBoost model can contribute to the diagnosis of T2DKD. Clinicians could gain more insight into the outcomes if the machine learning (ML) model is made interpretable."
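
The evaluation steps named in the abstract (a random 8:2 split, then choosing among candidate models by AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_8_2`, `auc`, and `pick_best` are hypothetical, the scores are synthetic toy values rather than study data, and selection here uses AUC only, whereas the study also considered accuracy. LASSO selection, Bayesian optimization, CatBoost, and SHAP are outside the scope of this sketch.

```python
import random


def split_8_2(records, seed=0):
    """Shuffle a copy of the records and split it 80/20 (training/validation)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def pick_best(candidates, labels):
    """Return the name of the candidate whose scores give the highest AUC."""
    return max(candidates, key=lambda name: auc(labels, candidates[name]))


# Toy illustration only: synthetic identifiers and scores, not study data.
records = list(range(10))
train, valid = split_8_2(records)

labels = [0, 0, 1, 1]                  # hypothetical validation labels
candidates = {
    "model_a": [0.1, 0.4, 0.35, 0.8],  # hypothetical predicted risks
    "model_b": [0.2, 0.3, 0.6, 0.9],
}
best = pick_best(candidates, labels)
```

In practice, a library routine such as scikit-learn's `roc_auc_score` would likely replace the hand-written `auc`; the pairwise form is used here only to make the definition explicit.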

Funding agency:

Funding text: