Abstract
Credit card fraud is a serious criminal offence that
causes significant harm to both individuals and financial
institutions. For this reason, it is crucial for financial institutions
to identify and stop fraudulent activity. However, fraud prevention
and detection are often costly, labor-intensive, and time-consuming
procedures. This work presents an extensive experimental
study of methods for handling the imbalanced classification
problem that arises in fraud detection. Using a labeled credit card fraud
dataset, standard machine learning techniques for fraud detection
were evaluated, their weaknesses were identified, and the results
were reported. The experiments analyze how well the Support
Vector Machine (SVM), Gaussian Naïve Bayes (GNB), Decision
Trees (DT), Adaptive Boosting Regression (ABR), and Logistic
Regression (LR) perform on highly skewed credit card fraud data.
The skewed data are rebalanced with an oversampling technique. The
results show that the SVM, ABR, LR, GNB, and DT classifiers achieve an
Overall Accuracy (OA) of 0.9995, 0.9992, 0.9995, 0.9789, and 0.9993,
respectively. Comparative analysis shows that Logistic Regression
performs better than the other methods based on OA, precision,
recall, F1-score, and kappa score.
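
The following is a minimal sketch, not the authors' implementation, of why OA alone can mislead on skewed data and why precision, recall, F1-score, and kappa are reported alongside it. The abstract does not name the oversampling technique, so random oversampling is assumed here for illustration. The toy data and the helper names `random_oversample` and `binary_metrics` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are balanced.

    Assumption: the abstract does not name its oversampling technique; random
    oversampling is used here purely for illustration.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    majority_size = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [x for x, t in zip(X, y) if t == label]
        for _ in range(majority_size - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Return OA, precision, recall, F1 and Cohen's kappa for a binary task."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = n - tp - fp - fn
    oa = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Chance agreement expected from the marginal label frequencies.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    return {"oa": oa, "precision": precision, "recall": recall, "f1": f1, "kappa": kappa}


# Hypothetical toy data: 998 legitimate (0) and 2 fraudulent (1) transactions.
y_true = [0] * 998 + [1] * 2
X = [[float(i)] for i in range(len(y_true))]

# A baseline that labels every transaction as legitimate.
baseline = binary_metrics(y_true, [0] * len(y_true))

# Oversampling would be applied to the training split only; here we just
# check that the classes end up balanced.
X_bal, y_bal = random_oversample(X, y_true)
balanced_counts = Counter(y_bal)
```

On this toy set the always-legitimate baseline reaches a high OA while its recall and kappa are both zero, which is why the comparison above does not rest on OA alone.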