"CrossFuse-XGBoost: accurate prediction of the maximum recommended daily dose through multi-feature fusion, cross-validation screening and extreme gradient boosting"

Author full names: "Li, Qiang; He, Yu; Pan, Jianbo"

Author addresses: "[Li, Qiang; He, Yu; Pan, Jianbo] Chongqing Med Univ, Minist Educ, Inst Life Sci, Basic Med Res & Innovat Ctr Novel Target & Therape, Chongqing, Peoples R China; [Pan, Jianbo] Chongqing Med Univ, Basic Med Res & Innovat Ctr Novel Target & Therape, Minist Educ, Inst Life Sci, 1 Yixueyuan Rd, Chongqing 400016, Peoples R China"

Corresponding author: "Pan, JB (corresponding author), Chongqing Med Univ, Basic Med Res & Innovat Ctr Novel Target & Therape, Minist Educ, Inst Life Sci, 1 Yixueyuan Rd, Chongqing 400016, Peoples R China."

Source: BRIEFINGS IN BIOINFORMATICS

ESI subject category: COMPUTER SCIENCE

WOS accession number: WOS:001173375300032

JCR quartile: Q1

Impact factor: 9.5

Year: 2024

Volume: 25

Issue: 1

Start page:

End page:

Document type: Article

Keywords: CrossFuse-XGBoost; multi-feature fusion; cross-validation screening; maximum recommended daily dose

Abstract: "In the drug development process, approximately 30% of failures are attributed to drug safety issues. In particular, the first-in-human (FIH) trial of a new drug represents one of the highest safety risks, and initial dose selection is crucial for ensuring safety in clinical trials. With traditional dose estimation methods, which extrapolate data from animals to humans, catastrophic events have occurred during Phase I clinical trials due to interspecies differences in compound sensitivity and unknown molecular mechanisms. To address this issue, this study proposes a CrossFuse-extreme gradient boosting (XGBoost) method that can directly predict the maximum recommended daily dose of a compound based on existing human research data, providing a reference for FIH dose selection. This method not only integrates multiple features, including molecular representations, physicochemical properties and compound-protein interactions, but also improves feature selection based on cross-validation. The results demonstrate that the CrossFuse-XGBoost method not only improves prediction accuracy compared to that of existing local weighted methods [k-nearest neighbor (k-NN) and variable k-NN (v-NN)] but also solves the low prediction coverage issue of v-NN, achieving full coverage of the external validation set and enabling more reliable predictions. Furthermore, this study offers a high level of interpretability by identifying the importance of different features in model construction. The 241 features with the most significant impact on the maximum recommended daily dose were selected, providing references for optimizing the structure of new compounds and guiding experimental research. The datasets and source code are freely available at https://github.com/cqmu-lq/CrossFuse-XGBoost."

Funding agencies: National Natural Science Foundation of China [82104063]; Top-notch Talent Cultivation Program for Graduate Students of Chongqing Medical University [BJRC202110]

Funding text: National Natural Science Foundation of China (82104063) and the Top-notch Talent Cultivation Program for Graduate Students of Chongqing Medical University (BJRC202110)