By Mike Hay | Open Analytics
In the evolving world of credit risk modelling, choosing between traditional statistical methods and modern machine learning techniques isn’t always straightforward.
This study compares logistic regression and LightGBM (a leading gradient boosting algorithm) for predicting 12-month default probabilities in non-bank asset finance data. It explores each approach’s performance, transparency, and practical considerations, from model building and hyperparameter tuning to interpretability and overfitting risk.
Key takeaways:
- The machine learning model (LightGBM) outperformed logistic regression by ~4 Gini points, but an overfit model risks faster performance deterioration.
- Logistic regression remains more transparent and easier to explain to stakeholders and regulators.
- Both methods have trade-offs in ease of use and long-term maintenance.
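For readers less familiar with the metric, here is a minimal sketch of how a gap in "Gini points" could be measured. It assumes the common credit-scoring convention that Gini = 2 × AUC − 1 and that one Gini point equals 0.01 on that scale; the paper's exact calculation may differ. The function names and the toy labels and scores are hypothetical, made up for illustration, and are not results from the study.

```python
def auc(y_true, scores):
    """Area under the ROC curve by pairwise comparison (ties count as half)."""
    defaults = [s for y, s in zip(y_true, scores) if y == 1]
    non_defaults = [s for y, s in zip(y_true, scores) if y == 0]
    if not defaults or not non_defaults:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for d in defaults:
        for n in non_defaults:
            if d > n:
                wins += 1.0   # defaulter correctly ranked as riskier
            elif d == n:
                wins += 0.5   # tie: half credit
    return wins / (len(defaults) * len(non_defaults))


def gini(y_true, scores):
    """Gini coefficient, assuming the convention Gini = 2 * AUC - 1."""
    return 2.0 * auc(y_true, scores) - 1.0


def gini_point_gap(y_true, scores_a, scores_b):
    """Difference in Gini between model A and model B, in points (x100)."""
    return 100.0 * (gini(y_true, scores_a) - gini(y_true, scores_b))


if __name__ == "__main__":
    # Hypothetical 12-month default flags (1 = defaulted) and predicted PDs.
    y = [0, 0, 0, 1, 0, 1, 0, 1]
    pd_boosting = [0.02, 0.05, 0.10, 0.40, 0.08, 0.35, 0.03, 0.09]  # made up
    pd_logistic = [0.03, 0.06, 0.12, 0.30, 0.11, 0.10, 0.04, 0.09]  # made up
    print(f"Gap: {gini_point_gap(y, pd_boosting, pd_logistic):.1f} Gini points")
```

The pairwise loop is quadratic in sample size, which is fine for illustration; on a real portfolio a rank-based AUC routine would be the more practical choice. To check for the overfitting risk mentioned above, the same gap would be measured on an out-of-time sample rather than on the training data.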
The full paper offers detailed results, visual comparisons, and recommendations for credit risk professionals deciding between interpretability and predictive power.
📄 Read the full analysis → [Download PDF]