MATHEMATICS AND STATISTICS (ALHAMBRA), vol. 11, no. 4, pp. 693-702, 2023 (Scopus)
The Spearman rho nonparametric correlation
coefficient is widely used to measure the strength and
degree of association between two variables. However,
the Spearman correlation coefficient is sensitive to
outliers, which can distort the estimate and lead to
inaccurate conclusions. A robust approach is therefore used
to construct a model that is highly resistant to data
contamination. The robustness of an estimator is measured
by its breakdown point, the smallest fraction of outliers
in a sample that can cause the estimator to break down
entirely. To overcome this problem, the aim of this study is
twofold. First, a robust Spearman correlation coefficient
model based on the MM-estimator, called the MM-Spearman
correlation coefficient, is proposed. Second, the
performance of the proposed model is tested using
Monte Carlo simulation and
contaminated air pollution data in Kuala Terengganu,
Terengganu, Malaysia. The data were contaminated
with 10% to 50% outliers. The performance of the
MM-Spearman correlation coefficient was evaluated
using statistical measures such as standard error, mean
squared error, root mean squared error and bias. The
MM-Spearman correlation coefficient model outperformed the
classical model, producing significantly smaller standard
error, mean squared error, and root mean squared error
values. The robustness of the model was evaluated using
the breakdown point, which measures the smallest fraction
of outliers in sample data that can cause the estimator
to break down entirely. The hybrid MM-Spearman
correlation coefficient model demonstrated high robustness
and efficiently handled data contamination up to 50%.
A limitation of the study is that the model can handle
data contamination only up to a maximum of 50%.
Despite this limitation, the proposed model provides
accurate and efficient results, enabling management
authorities to make sound decisions without being affected
by contaminated data. The MM-Spearman correlation
coefficient model provides a valuable tool for researchers
and decision-makers, allowing them to analyze data with a
high degree of accuracy and robustness, even in the
presence of outliers.
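
The evaluation procedure described above (Monte Carlo simulation under 10% to 50% contamination, scored by bias, standard error, mean squared error and root mean squared error) can be illustrated with a minimal sketch. It covers only the classical Spearman coefficient and does not implement the MM-Spearman model, whose construction is not given here. The sample size, number of replications, linear data-generating model and outlier mechanism are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random
import statistics


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Classical Spearman rho: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def true_spearman(slope):
    """Population Spearman rho for bivariate normal data y = slope*x + noise."""
    pearson = slope / math.sqrt(slope ** 2 + 1)
    return 6 / math.pi * math.asin(pearson / 2)


def simulate(contamination, n=50, reps=200, slope=1.0, seed=0):
    """Return Spearman estimates from `reps` samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [slope * xi + rng.gauss(0, 1) for xi in x]
        # Assumed outlier mechanism: replace a fraction of the points with
        # values far from the bulk that contradict the positive trend.
        for i in range(int(contamination * n)):
            x[i] = rng.gauss(10, 1)
            y[i] = rng.gauss(-10, 1)
        estimates.append(spearman_rho(x, y))
    return estimates


def summarize(estimates, truth):
    """Bias, standard error, MSE and RMSE of the estimates around `truth`."""
    mse = statistics.fmean((e - truth) ** 2 for e in estimates)
    return {
        "bias": statistics.fmean(estimates) - truth,
        "se": statistics.stdev(estimates),
        "mse": mse,
        "rmse": math.sqrt(mse),
    }


TRUE_RHO = true_spearman(1.0)

if __name__ == "__main__":
    for level in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
        stats = summarize(simulate(level), TRUE_RHO)
        print(f"{level:.0%}", {k: round(v, 4) for k, v in stats.items()})
```

Running the script prints one row of metrics per contamination level, so the drift of the classical estimate away from the clean-data value can be read off directly. A robust alternative such as the MM-Spearman coefficient could be scored in the same way by replacing spearman_rho inside simulate.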