Isolation forest-voting fusion-multioutput: A stroke risk classification method based on the multidimensional output of abnormal sample detection

Authors (full names): He, Hai; Yang, Haibo; Mercaldo, Francesco; Santone, Antonella; Huang, Pan

Author addresses: [He, Hai] Chongqing City Management Coll, Sch Big Data & Informat Ind, Chongqing 401331, Peoples R China; [Yang, Haibo] Chongqing Med Univ, Informat Ctr, Chongqing 400016, Peoples R China; [Mercaldo, Francesco; Santone, Antonella] Univ Molise, Dept Med & Hlth Sci Vincenzo Tiberio, I-86100 Campobasso, Italy; [Huang, Pan] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China; [Yang, Haibo; Mercaldo, Francesco] 1 Med Sch Rd, Chongqing, Peoples R China

Corresponding authors: Yang, HB; Mercaldo, F (corresponding author), 1 Med Sch Rd, Chongqing, Peoples R China.

Source: COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE

ESI subject category: COMPUTER SCIENCE

WOS accession number: WOS:001250661600001

JCR quartile: Q1

Impact factor: 4.9

Year: 2024

Volume: 253

Issue:

Start page:

End page:

Document type: Article

Keywords: Abnormal samples; Stroke; Screening data; Multidimensional output; Risk factor

Abstract: Background and Objective: Stroke has become a major disease threatening people's health worldwide, characterized by high incidence, high fatality, and a high recurrence rate. Current stroke screening based on electronic medical records suffers from poor recognition accuracy and insufficient recognition of stroke risk levels. These problems arise from systematic errors in medical equipment and from the characteristics of the collectors during electronic medical record collection. Errors can also result from misreporting or underreporting by collection personnel and from the strong subjectivity of the evaluation indicators. Methods: This paper proposes an isolation forest-voting fusion-multioutput algorithm model. First, the collected screening data undergo numerical processing and normalization. The composite feature score index proposed in this paper is then used to analyze the importance of risk factors. Next, the isolation forest algorithm detects abnormal samples, and the voting fusion algorithm proposed in this paper performs decision fusion for prediction and classification. The model outputs multidimensional results (risk factor importance score, abnormal sample label, risk level classification, and stroke prediction) that doctors and medical staff can use as auxiliary decision information. Results: The proposed isolation forest-voting fusion-multioutput algorithm classifies samples into five categories: zero risk, low risk, high risk, ischemic stroke (TIA), and hemorrhagic stroke (HE). The average accuracy of stroke prediction reached 79.59%.
Conclusions: The proposed isolation forest-voting fusion-multioutput algorithm model can not only accurately identify the various stroke risk levels and predict stroke but can also output multidimensional auxiliary decision-making information to help medical staff make decisions, thereby greatly improving screening efficiency.
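
Illustrative sketch (not from the paper): The pipeline described in the Methods portion of the abstract (normalization, isolation-forest detection of abnormal samples, then decision fusion by voting) can be sketched roughly as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' implementation: the abstract does not specify the composite feature score index or the proposed voting fusion rule, so the sketch uses a textbook isolation forest and plain majority voting in their place. All function names and parameters (`min_max_normalize`, `anomaly_scores`, `majority_vote`, `n_trees`, `sample_size`, `seed`) are hypothetical.

```python
import math
import random
from collections import Counter


def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns become 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def _c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def _build(rows, depth, limit, rng):
    """Grow one isolation tree using random features and split values."""
    if depth >= limit or len(rows) <= 1:
        return ("leaf", len(rows))
    q = rng.randrange(len(rows[0]))
    lo = min(r[q] for r in rows)
    hi = max(r[q] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    p = rng.uniform(lo, hi)
    left = [r for r in rows if r[q] < p]
    right = [r for r in rows if r[q] >= p]
    return ("node", q, p,
            _build(left, depth + 1, limit, rng),
            _build(right, depth + 1, limit, rng))


def _path(x, tree, depth=0):
    """Path length of x in a tree, adjusted for unsplit leaves."""
    if tree[0] == "leaf":
        return depth + _c(tree[1])
    _, q, p, left, right = tree
    return _path(x, left if x[q] < p else right, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score per row in (0, 1]; higher means more anomalous."""
    if len(rows) < 2:
        raise ValueError("need at least two samples")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    limit = math.ceil(math.log2(psi))
    trees = [_build(rng.sample(rows, psi), 0, limit, rng)
             for _ in range(n_trees)]
    norm = _c(psi)
    return [
        2.0 ** (-sum(_path(x, t) for t in trees) / (n_trees * norm))
        for x in rows
    ]


def majority_vote(labels):
    """Hard-voting fusion: the most frequent label among base classifiers."""
    return Counter(labels).most_common(1)[0][0]
```

In such a sketch, samples whose anomaly score exceeds a chosen threshold would receive the abnormal sample label, and `majority_vote` would fuse the per-sample predictions of several base classifiers into one of the five categories. Both the threshold and the choice of base classifiers are left open here because the abstract does not state them.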

Funding agency: Science and Technology Research Program of Chongqing Municipal Education Commission [KJQN202203301]

Funding text: This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJQN202203301).