Hybrid binary arithmetic optimization algorithm with simulated annealing for feature selection in high-dimensional biomedical data


Pashaei E., Pashaei E.

Journal of Supercomputing, vol.78, no.13, pp.15598-15637, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 78 Issue: 13
  • Publication Date: 2022
  • DOI Number: 10.1007/s11227-022-04507-2
  • Journal Name: Journal of Supercomputing
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
  • Page Numbers: pp.15598-15637
  • Keywords: Arithmetic optimization algorithm, Cancer classification, Feature selection, Gene selection, Optimization
  • Istanbul Gelisim University Affiliated: Yes

Abstract

© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Gene expression data play a significant role in the development of effective cancer diagnosis and prognosis techniques. However, the data contain many redundant, noisy, and irrelevant genes (features), which reduce the predictive accuracy of diagnosis and increase the computational burden. To overcome these challenges, this article puts forward a new hybrid filter/wrapper gene selection method called mRMR-BAOAC-SA. The method uses Minimum Redundancy Maximum Relevance (mRMR) as a first-stage filter to pick top-ranked genes. Then, Simulated Annealing (SA) and a crossover operator are introduced into the Binary Arithmetic Optimization Algorithm (BAOA) to form a novel hybrid wrapper feature selection method that aims to discover the smallest set of informative genes for classification. BAOAC-SA is an enhanced version of BAOA in which SA and crossover help the algorithm escape local optima and enhance its global search capabilities. The proposed method was evaluated on 10 well-known microarray datasets, and its results were compared with those of current state-of-the-art gene selection methods. The experimental results show that the proposed approach outperforms the existing methods in terms of both classification accuracy and the number of selected genes.
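The abstract does not state the update equations, fitness function, or parameter settings of BAOAC-SA, so the following is only a minimal Python sketch of the two generic ingredients it names: a simulated annealing acceptance rule and a crossover operator acting on binary feature masks. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the one-bit-flip neighbourhood, the uniform crossover, the geometric cooling schedule, and in particular `toy_fitness`, which stands in for a real wrapper fitness (a classifier's error combined with subset size) by using a known set of "informative" feature indices.

```python
import math
import random


def toy_fitness(mask, informative, alpha=0.9):
    """Hypothetical stand-in for a wrapper fitness (lower is better).

    A real wrapper would train a classifier on the selected genes; here the
    share of known informative features that were missed acts as a proxy for
    classification error. alpha weights that proxy against subset size.
    """
    selected = {i for i, bit in enumerate(mask) if bit}
    if not selected:
        return 1.0  # an empty subset gets the worst possible score
    missed = len(informative - selected) / len(informative)
    size = len(selected) / len(mask)
    return alpha * missed + (1 - alpha) * size


def flip_one_bit(mask, rng):
    """Neighbour move (assumed): toggle one randomly chosen feature."""
    neighbour = list(mask)
    i = rng.randrange(len(neighbour))
    neighbour[i] = 1 - neighbour[i]
    return neighbour


def uniform_crossover(a, b, rng):
    """Uniform crossover (assumed variant): each bit comes from a or b."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def simulated_annealing(start, fitness, rng, t0=1.0, cooling=0.95, steps=500):
    """Standard SA loop: always accept improvements, and accept a worse
    candidate with probability exp(-delta / t), which shrinks as t cools."""
    current, current_fit = list(start), fitness(start)
    best, best_fit = list(current), current_fit
    t = t0
    for _ in range(steps):
        candidate = flip_one_bit(current, rng)
        candidate_fit = fitness(candidate)
        delta = candidate_fit - current_fit
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_fit = candidate, candidate_fit
            if current_fit < best_fit:
                best, best_fit = list(current), current_fit
        t = max(t * cooling, 1e-12)  # geometric cooling, kept above zero
    return best, best_fit


# Toy run: 20 features, of which indices 2, 5 and 11 are "informative".
rng = random.Random(0)
n_features = 20
informative = {2, 5, 11}


def fitness(mask):
    return toy_fitness(mask, informative)


start = [rng.randint(0, 1) for _ in range(n_features)]
best, best_fit = simulated_annealing(start, fitness, rng)
child = uniform_crossover(start, best, rng)
```

In a hybrid of the kind the abstract describes, a local search such as `simulated_annealing` would typically refine candidate solutions produced by the population-based optimizer, and a crossover step would recombine promising masks. How and where the paper applies each step is not given here, so the wiring above should be read as one plausible arrangement rather than the published one.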