
“Explainable and Fair Credit Default Prediction Across Multiple Financial Datasets Using Machine Learning”

Project Statement: This research develops an explainable and fairness-aware machine learning framework to predict credit default across multiple financial datasets. The goal is to achieve high predictive accuracy while ensuring transparent decision-making, reduced demographic bias, and reliable risk insights for responsible financial assessment.

Problem Statement

Financial institutions depend on credit risk models to detect potential defaulters. However, most existing systems either focus only on accuracy or behave like black boxes, making their decisions difficult to interpret and sometimes unfair. Traditional statistical models lack the ability to capture complex financial patterns, while modern AI models often raise concerns around transparency, trust, and demographic bias. This project proposes a reliable, interpretable, and fairness-aware prediction framework that performs consistently across multiple datasets and supports responsible, explainable credit decision-making.

Literature Review

Earlier studies used Logistic Regression for interpretability and Random Forest/XGBoost for higher accuracy. Recent work introduces Explainable AI (SHAP) to understand model decisions and fairness techniques to detect demographic bias. However, most research focuses on only one aspect — either accuracy, explainability, or fairness — and usually on a single dataset.

Research Gap

Existing credit prediction studies rarely combine performance, interpretability, and fairness in one unified system. Most models are trained and tested on a single dataset, limiting real-world reliability. This work bridges that gap by integrating multi-dataset validation, SHAP explainability, fairness analysis, and bias mitigation into a single comprehensive framework for responsible AI-driven credit scoring.

System Methodology

Dataset

Three real-world financial datasets were used for robust evaluation:

• Taiwan Credit Card (UCI) – 30,000 customers with billing and repayment behavior
• German Credit (UCI) – 1,000 socio-economic borrower profiles
• Give Me Some Credit (Kaggle) – ~150,000 large-scale financial records

Feature engineering was applied to create payment scores, utilization ratios, and financial behavior indicators. The data was then processed using scaling, stratified splitting, and class-imbalance handling (SMOTE plus class weights).
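The preprocessing pipeline described above can be sketched as follows. This is a minimal illustration on synthetic data, not the project's actual code: the feature values, the 80/20 split ratio, and the ~20% default rate are assumptions, and SMOTE oversampling (from the separate `imbalanced-learn` package) is stood in for by class weights so the sketch needs only scikit-learn and NumPy.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils.class_weight import compute_class_weight

# Synthetic stand-in for an engineered credit dataset:
# columns could be payment scores, utilization ratios, behavior indicators
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.2).astype(int)  # ~20% defaulters (minority class)

# Stratified split keeps the default rate equal in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Scale features to zero mean / unit variance, fitting on train data only
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Balanced class weights counteract the imbalance; SMOTE oversampling
# from imbalanced-learn could be applied here instead (or in addition)
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_train)
print(dict(zip([0, 1], weights)))
```

The class weights up-weight the minority (defaulter) class so that misclassifying a defaulter costs the model more during training.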

Model / Architecture

Multiple models were trained and compared:

• Logistic Regression – interpretable baseline
• Random Forest – ensemble learning model
• XGBoost – highest-performing model

Evaluation used Accuracy, Precision, Recall, F1-score, ROC-AUC, and 5-fold cross-validation. To enhance reliability:

• Threshold tuning improved detection of high-risk borrowers
• SHAP provided explainable insights into model decisions
• Fairness-aware reweighting reduced demographic bias


Results & Analysis

Best ROC-AUC achieved: 0.86

XGBoost consistently delivered the best performance across all datasets, achieving ROC-AUC scores of 0.77 (Taiwan), 0.80 (German), and 0.86 (Give Me Some Credit). Threshold optimization improved sensitivity toward high-risk borrowers, enabling stronger identification of potential defaulters. Using SHAP explainability, the model identified major financial risk drivers such as:

• Repayment delays
• Credit utilization ratio
• Past-due behavior

Fairness-aware training reduced demographic disparity while maintaining strong predictive accuracy, making the system more transparent and responsible.
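Two of the reliability steps above can be illustrated with plain NumPy: threshold tuning, which trades precision for recall on high-risk borrowers by lowering the decision cutoff below the default 0.5, and demographic parity difference, a simple fairness check on the gap in positive-prediction rates between groups. The probabilities, group labels, and the 0.35 cutoff below are all synthetic assumptions for illustration, not the study's actual values.

```python
import numpy as np

rng = np.random.default_rng(1)
y_true = (rng.random(5000) < 0.2).astype(int)            # ~20% defaulters
# Synthetic model scores: defaulters tend to score higher
p = np.clip(rng.normal(0.3 + 0.3 * y_true, 0.15), 0, 1)
group = rng.integers(0, 2, size=5000)                    # binary demographic attribute

def recall_at(threshold):
    """Fraction of true defaulters flagged at a given probability cutoff."""
    pred = (p >= threshold).astype(int)
    return (pred & y_true).sum() / y_true.sum()

# Lowering the cutoff below the default 0.5 flags more true defaulters
print("recall @0.50:", round(recall_at(0.50), 3))
print("recall @0.35:", round(recall_at(0.35), 3))

# Demographic parity difference: gap in positive-prediction rate across groups
pred = (p >= 0.35).astype(int)
dpd = abs(pred[group == 0].mean() - pred[group == 1].mean())
print("demographic parity difference:", round(dpd, 3))
```

In practice the cutoff would be chosen on a validation set (e.g. maximizing F1 or recall at an acceptable false-positive rate), and the parity gap would be monitored before and after fairness-aware reweighting.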

Academic Credits

Project Guide

Dr. Onkar Singh

Team Member 1

Divyanshi Singh

23FE10CSE00743