Annual Research Symposium

Privacy-Preserving Synthetic Data Generation for Federated Learning in Imbalanced Credit Card Fraud Detection: A Comparative Analysis of SMOTE vs. GAN Approaches

Description

Credit card fraud detection faces two critical challenges: extreme class imbalance, with fraud making up less than 0.2% of transactions in typical datasets, and strict privacy regulations that prevent financial institutions from sharing sensitive transaction data for collaborative model training. This research addresses both challenges by investigating privacy-preserving synthetic data generation in federated learning settings. The study compares two methods: Federated-SMOTE, which uses secure cross-bank nearest-neighbor discovery to generate synthetic fraud cases through interpolation under a strong privacy guarantee (ε=0.3), and Federated-GAN, which uses generative adversarial networks with differential privacy to synthesize realistic fraud patterns under a more relaxed privacy constraint (ε=0.7). Using the European Credit Card dataset partitioned across five banks with a non-IID distribution, both approaches were evaluated through federated learning with FedAvg aggregation over 10 communication rounds. Experimental results demonstrate that both privacy-preserving methods significantly outperform baseline approaches without data balancing, achieving a 17-21% improvement in PR-AUC. Federated-GAN achieves the best overall precision-recall balance (PR-AUC 0.827), while Federated-SMOTE provides the highest fraud detection rate (recall 0.777) under a privacy budget roughly 2.3× tighter (ε=0.3 versus ε=0.7). This research contributes a practitioner-ready comparative framework that financial institutions can use to select a privacy-preserving synthetic data generation method based on their specific regulatory constraints and performance priorities, enabling collaborative fraud detection without exposing sensitive customer transaction data.
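
For readers unfamiliar with the two building blocks named in the description, the following is a minimal sketch of SMOTE-style interpolation and FedAvg aggregation. It is not the authors' implementation: it runs in a single process on plain Python lists, and it includes neither the secure cross-bank neighbor discovery nor any differential privacy mechanism. The function names (`smote`, `fedavg`, and the helpers) and the toy values are assumptions made for illustration only.

```python
import random


def nearest_neighbors(sample, pool, k):
    """Return the k points in pool closest to sample, excluding sample itself."""
    others = [p for p in pool if p is not sample]
    # Squared Euclidean distance is enough for ranking neighbors.
    return sorted(
        others,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(sample, p)),
    )[:k]


def smote_interpolate(sample, neighbor, rng):
    """Create one synthetic point on the segment between sample and neighbor."""
    gap = rng.random()  # interpolation factor in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority-class points (non-private, illustrative)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        sample = rng.choice(minority)
        neighbor = rng.choice(nearest_neighbors(sample, minority, k))
        synthetic.append(smote_interpolate(sample, neighbor, rng))
    return synthetic


def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical fraud samples with two features each.
    fraud = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
    print(smote(fraud, n_new=4))

    # Two hypothetical banks holding 1 and 3 local samples.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Because each synthetic point is a convex combination of two real fraud samples, it always lies between them in feature space. In a federated, privacy-preserving variant, the neighbor search and the shared model updates are the steps that would need protection.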

Publication Date

2026

Recommended Citation

Bhutta, Muhammad and Mehmood, Abid, "Privacy-Preserving Synthetic Data Generation for Federated Learning in Imbalanced Credit Card Fraud Detection: A Comparative Analysis of SMOTE vs. GAN Approaches" (2026). Annual Research Symposium. 85.
https://scholar.dsu.edu/research-symposium/85
