Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy

Author full names: "Jin, Yudi; Lan, Ailin; Dai, Yuran; Jiang, Linshan; Liu, Shengchun"

Author addresses: "[Jin, Yudi; Lan, Ailin; Dai, Yuran; Jiang, Linshan; Liu, Shengchun] Chongqing Med Univ, Affiliated Hosp 1, Dept Breast & Thyroid Surg, Chongqing 400016, Peoples R China; [Jin, Yudi] Chongqing Univ Canc Hosp, Dept Pathol, Chongqing Key Lab Intelligent Oncol Breast Canc iC, Chongqing 400030, Peoples R China"

Corresponding author: "Liu, SC (corresponding author), Chongqing Med Univ, Affiliated Hosp 1, Dept Breast & Thyroid Surg, Chongqing 400016, Peoples R China."

Source: EUROPEAN JOURNAL OF MEDICAL RESEARCH

ESI subject category: CLINICAL MEDICINE

WOS accession number: WOS:001073188100001

JCR quartile: Q2

Impact factor: 2.8

Year: 2023

Volume: 28

Issue: 1

Start page: 

End page: 

Document type: Article

Keywords: Breast cancer; Machine learning; Random forest; Logistic regression; Event

Abstract: "Background: Breast cancer (BC) is the most common malignant tumor worldwide. Timely detection of tumor progression after treatment could improve patient survival. This study aimed to develop machine learning models to predict events in BC patients after treatment, where an event was defined as (1) the first local, regional, or distant tumor relapse; (2) a diagnosis of a secondary malignant tumor; or (3) death from any cause. Methods: Patients whose response to neoadjuvant chemotherapy (NAC) was stable disease (SD) or progressive disease (PD) were selected. Clinicopathological features and survival data were recorded at 1 year and 5 years, respectively. Patients were randomly divided into a training set and a test set at a ratio of 8:2. A random forest (RF) model and a logistic regression model were built for both the 1-year cohort and the 5-year cohort, and the performance of the two models was compared. The models were validated using data from the Surveillance, Epidemiology, and End Results (SEER) database. Results: A total of 315 patients were included. In the 1-year cohort, 197 patients were assigned to the training set and 87 to the test set. The specificity, sensitivity, and AUC were 0.800, 0.833, and 0.810 for the RF model and 0.520, 0.833, and 0.653 for the logistic regression model. In the 5-year cohort, 132 patients were assigned to the training set and 33 to the test set. The specificity, sensitivity, and AUC were 0.882, 0.750, and 0.829 for the RF model and 0.882, 0.688, and 0.752 for the logistic regression model. In the external validation set, the specificity, sensitivity, and AUC were 0.765, 0.812, and 0.779 for the RF model and 0.833, 0.376, and 0.619 for the logistic regression model. Conclusion: The RF model performs well in predicting events among BC patients with SD or PD after NAC. It may benefit BC patients by assisting in the detection of tumor recurrence."

Funding agency: Not applicable.

Funding text: Not applicable.
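
Note on the metrics in the abstract: the abstract reports specificity, sensitivity, and AUC for models evaluated after an 8:2 random split. The following is a minimal Python sketch of how such quantities can be computed, not the authors' code. The function names (split_80_20, sensitivity_specificity, auc), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of them come from the paper.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of `items` and split it 8:2 into (train, test).

    Hypothetical helper; the paper only states that the split was random.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
train, test = split_80_20(range(10))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This may be why the abstract can report the same sensitivity for two models that differ in AUC.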