Machine Learning to Develop Credit Card Customer Churn Prediction


Creative Commons License

AL-Najjar D., Al-Rousan N., AL-Najjar H.

Journal of Theoretical and Applied Electronic Commerce Research, vol.17, no.4, pp.1529-1542, 2022 (SSCI)

  • Publication Type: Article
  • Volume: 17 Issue: 4
  • Publication Date: 2022
  • Doi Number: 10.3390/jtaer17040077
  • Journal Name: Journal of Theoretical and Applied Electronic Commerce Research
  • Journal Indexes: Social Sciences Citation Index (SSCI), Scopus, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Computer & Applied Sciences, INSPEC, Metadex, Directory of Open Access Journals, DIALNET, Civil Engineering Abstracts
  • Page Numbers: pp.1529-1542
  • Keywords: customer churn, feature selection, machine learning, prediction model, two-step clustering
  • Istanbul Gelisim University Affiliated: Yes

Abstract

© 2022 by the authors. The credit card customer churn rate is the percentage of a bank’s customers who stop using that bank’s services. A model that predicts each customer’s expected status can therefore give banks an early alert, allowing them to change the service for that customer or to offer new services. This paper aims to develop a credit card customer churn prediction model by using a feature-selection method and five machine learning models. Three approaches were used to select the independent variables: selecting all independent variables, two-step clustering with k-nearest neighbor, and feature selection. Five machine learning prediction models were used: the Bayesian network, the C5 tree, the chi-square automatic interaction detection (CHAID) tree, the classification and regression (CR) tree, and a neural network. The analysis showed that all the machine learning models could predict credit card customer churn. In addition, the results showed that the C5 tree machine learning model performed the best in comparison with the three developed models. The results indicated that the top three variables needed in the development of the C5 tree customer churn prediction model were the total transaction count, the total revolving balance on the credit card, and the change in the transaction count. Finally, the results revealed that merging the multi-categorical variables into one variable improved the performance of the prediction models.
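As a rough illustration of two ideas in the abstract, the sketch below computes a churn rate as a percentage and fits a single-threshold rule on transaction count, the variable the abstract reports as most important. It is not the authors' C5 tree or their data: the one-split "stump" is a simplified stand-in for a tree learner, and the function names (churn_rate, best_stump) and the toy values are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def churn_rate(churned: List[int]) -> float:
    """Percentage of customers flagged as churned (1 = churned, 0 = retained)."""
    return 100.0 * sum(churned) / len(churned)


def best_stump(values: List[float], churned: List[int]) -> Tuple[float, float]:
    """Find the threshold t for which the rule 'churn if value <= t'
    classifies the most customers correctly. Returns (t, accuracy)."""
    best_t, best_acc = values[0], 0.0
    for t in sorted(set(values)):
        correct = sum((v <= t) == bool(y) for v, y in zip(values, churned))
        acc = correct / len(values)
        if acc > best_acc:  # keep the first (lowest) threshold on ties
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical toy data: total transaction count per customer and churn flag.
total_trans_ct = [20, 35, 40, 42, 60, 65, 70, 80, 90, 100]
churned = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

rate = churn_rate(churned)
threshold, accuracy = best_stump(total_trans_ct, churned)
print(f"churn rate: {rate:.1f}%")
print(f"rule: churn if total_trans_ct <= {threshold} (accuracy {accuracy:.2f})")
```

A full tree learner repeats this kind of split search recursively over many variables and typically uses an impurity measure rather than raw accuracy; the stump only shows why a low transaction count can act as an early churn signal.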