Comparison of Classification Success Rates of Different Machine Learning Algorithms in the Diagnosis of Breast Cancer


Ozcan I., Aydin H., Cetinkaya A.

Asian Pacific Journal of Cancer Prevention, vol.23, no.10, pp.3287-3297, 2022 (Scopus)

  • Publication Type: Article / Full Article
  • Volume: 23 Issue: 10
  • Publication Date: 2022
  • DOI: 10.31557/apjcp.2022.23.10.3287
  • Journal Name: Asian Pacific Journal of Cancer Prevention
  • Journal Indexes: Scopus, CAB Abstracts, EMBASE, MEDLINE, Veterinary Science Database, Directory of Open Access Journals
  • Page Numbers: pp.3287-3297
  • Keywords: Breast cancer, Classification, Data management, Information systems, Machine learning
  • Istanbul Gelisim University Affiliated: Yes

Abstract

Objective: To identify which machine learning (ML) algorithms are the most successful in predicting and diagnosing breast cancer, as measured by accuracy. Methods: The “College of Wisconsin Breast Cancer Dataset”, which consists of 569 samples and 30 features, was classified using the Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), Logistic Regression (LR), Multilayer Perceptron (MLP), Linear Discriminant Analysis (LDA), XGBoost (XGB), AdaBoost (ABC) and Gradient Boosting (GBC) algorithms. The dataset was preprocessed before classification. Sensitivity, accuracy, and definiteness metrics were used to measure the success of the methods. Results: Among the ML algorithms used in the study, GBC was the most successful at classifying tumors, with an accuracy of 99.12%. XGB was the least successful, with an accuracy of 88.10%. In addition, the general accuracy rates of the 11 ML algorithms were found to vary between 88% and 95%. Conclusion: The results obtained from the ML classifiers used in the study demonstrate the efficiency of the GBC algorithm in classifying tumors. The success rates obtained from the 11 ML algorithms may also be valuable for predicting other cancer types.
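
The evaluation metrics named in the abstract can be illustrated with a minimal sketch that derives them from the counts of a binary confusion matrix. This is not the authors' code. The function name `binary_metrics` and the example labels are hypothetical, and "definiteness" is assumed here to correspond to specificity (the true negative rate), which the abstract does not confirm.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Assumption: 1 marks the positive class (e.g. malignant), 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of actual negatives that were rejected (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because a classifier can reach high accuracy while still missing many positive cases.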