Comparison between medical knowledge based and computer automated feature selection for detection of coronary artery disease using imbalanced data

Conference: BIBE 2018 - International Conference on Biological Information and Biomedical Engineering
6-8 June 2018, Shanghai, China

Proceedings: BIBE 2018

Pages: 4, Language: English, Type: PDF

Li, Han; Wang, Xinpei; Li, Yang; Liu, Changchun (School of Control Science and Engineering, Shandong University, Jinan, China)
Qin, Caijie (Institute of Information Engineering, Sanming University, Sanming, China)

This paper investigates the differences between medical-knowledge-based and computer-automated feature selection on an imbalanced dataset of coronary artery disease (CAD). Sub-data sets 1 and 2 were generated by undersampling the CAD records. Decision tree (DT), neural network (NN), generalized linear model (GLM), logistic regression (LR) and Naïve Bayes (NB) classifiers were applied to the dataset using all attributes. The results showed that specificity increased after resampling, while accuracy, F1-score and sensitivity decreased. Furthermore, information gain (IG), a genetic algorithm and a medical-knowledge-based approach were each used to select features on sub-data set 1, and the selected features were evaluated on sub-data set 2. Overall, the medical-knowledge-based approach outperformed the other two feature selection methods on sub-data set 2.
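
The abstract does not describe the resampling procedure or the evaluation code in detail. As a minimal sketch, assuming random undersampling of the majority class down to the size of the minority class and a binary label (1 = CAD, 0 = non-CAD), the resampling step and the four reported metrics could be computed as follows. The function names `undersample` and `binary_metrics`, the label encoding and the toy data are illustrative assumptions, not taken from the paper.

```python
import random


def undersample(records, labels, majority_label, seed=0):
    """Randomly drop majority-class records until both classes have equal size.

    Hypothetical helper: the paper's exact undersampling scheme is not given.
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    # Keep every minority record and an equally sized random majority subset.
    kept = rng.sample(majority, min(len(majority), len(minority))) + minority
    kept.sort()  # preserve the original record order
    return [records[i] for i in kept], [labels[i] for i in kept]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F1-score from a confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the CAD class
    specificity = ratio(tn, tn + fp)  # recall on the non-CAD class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy data (assumed): 8 CAD records and 2 non-CAD records.
toy_records = list(range(10))
toy_labels = [1] * 8 + [0] * 2
balanced_records, balanced_labels = undersample(toy_records, toy_labels, majority_label=1)
print(len(balanced_labels), balanced_labels.count(1), balanced_labels.count(0))

# Toy predictions (assumed) to show how the four metrics are derived.
print(binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0]))
```

Because sensitivity and specificity are computed per class, a classifier trained on the balanced subset can gain specificity (fewer non-CAD records labelled as CAD) while losing sensitivity, which is the direction of change the abstract reports.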