Paris-Dauphine finance and machine learning project

Credit Risk Modelling

Loan approval scoring pipeline from experimentation to batch inference

This project turns a notebook-driven loan approval classifier into a reproducible end-to-end workflow. It combines model comparison, tuned XGBoost training, structured configuration, and a business layer that converts probabilities into usable risk scores.

Accuracy: 94.97%

F1 score: 94.89%

ROC-AUC: 0.99

Problem

Loan approval decisions need more than raw model accuracy: they need reproducible training, reliable inference outputs, and business-friendly risk interpretation.

Approach

Built a modular Python codebase with dedicated training and inference entrypoints, feature engineering pipelines, serialized artifacts, YAML configuration, validation notebooks, and Docker support for portable execution.
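As an illustration of the structured-configuration idea, here is a minimal sketch of a typed training config. It is not the repository's actual schema: the class name `TrainConfig`, the helper `load_train_config`, every field name, every default, and both file paths are hypothetical. The dictionary stands in for the mapping that a YAML parser such as PyYAML's `yaml.safe_load` would return, so the sketch runs with the standard library alone.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical training settings; field names and defaults are assumptions."""
    data_path: str
    model_path: str
    n_estimators: int = 300
    max_depth: int = 6
    learning_rate: float = 0.1
    random_state: int = 42


def load_train_config(raw: dict) -> TrainConfig:
    """Build a TrainConfig from a parsed mapping, rejecting unknown keys."""
    known = {f.name for f in fields(TrainConfig)}
    unknown = set(raw) - known
    if unknown:
        # Fail early on typos instead of silently ignoring a setting.
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return TrainConfig(**raw)


# Stand-in for the mapping a YAML loader would return (paths are placeholders).
raw_config = {
    "data_path": "data/loans.csv",
    "model_path": "artifacts/model.json",
    "max_depth": 4,
}
config = load_train_config(raw_config)
print(config)
```

Freezing the dataclass and rejecting unknown keys is one way to keep training and inference runs tied to the same explicit settings, which is the kind of reproducibility the project aims for.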

Results

Reached 94.97% accuracy, 94.89% F1, and 0.99 ROC-AUC. The business layer adds RiskScore_ML, a score from 0 to 100, and risk levels from A to E for decision support.
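The mapping from probability to score and risk level could look like the sketch below. The project's real formula and cut points are not stated here, so this rests on three assumptions: the model outputs an approval probability, a higher score means higher risk, and the five levels are equal 20-point bands. The names `risk_score`, `risk_level`, `BAND_EDGES`, and `BAND_LABELS` are hypothetical.

```python
from bisect import bisect_right

# Assumed cut points: A = 0-19, B = 20-39, C = 40-59, D = 60-79, E = 80-100.
BAND_EDGES = [20, 40, 60, 80]
BAND_LABELS = ["A", "B", "C", "D", "E"]


def risk_score(p_approval: float) -> int:
    """Map an approval probability to a 0-100 score (assumption: higher = riskier)."""
    if not 0.0 <= p_approval <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return round(100 * (1.0 - p_approval))


def risk_level(score: int) -> str:
    """Map a 0-100 score to a letter band using the assumed cut points."""
    if not 0 <= score <= 100:
        raise ValueError("score must be in [0, 100]")
    # bisect_right counts how many cut points are <= score, which is the band index.
    return BAND_LABELS[bisect_right(BAND_EDGES, score)]


for p in (0.97, 0.50, 0.10):
    score = risk_score(p)
    print(p, score, risk_level(score))
```

Keeping the cut points in a single list means the bands can be recalibrated, for example to match observed default rates, without touching the scoring code.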

What is in the repository

Compared four classifiers before locking the best-performing XGBoost model.

Separated feature engineering, training, and inference into reusable pipeline modules.

Packaged the workflow with configuration files and Docker for reproducibility.

Translated model outputs into business-facing risk bands instead of raw probabilities only.

Role and scope

End-to-end ML pipeline design, model selection, packaging, and business translation

Project context

Paris-Dauphine finance and machine learning project

Main stack

Python, XGBoost, scikit-learn, Pandas, Docker, PyYAML