Strategy for finding an effective machine learning method based on the example of credit scoring
(Pp. 132-138)

Isaev Denis V., postgraduate student
Financial University under the Government of the Russian Federation
Abstract:
Many companies need effective strategies for predicting target events. The aim of this work is to develop a prediction method based on machine learning that addresses the problem of choosing the most effective algorithm. The search for such an algorithm is carried out on data about customers of a commercial bank who have been issued a loan, where the target event is a credit default. Credit scoring is a popular research subject, so its problems and features are familiar to many researchers. In addition to basic machine learning models (naive Bayes classifier, logistic regression, discriminant analysis, nearest neighbor method, support vector machine, and decision trees), the article analyzes algorithms that take first place in competitions, namely ensembles of decision trees and neural networks.
To build a model that generalizes well, it is necessary to choose the input predictors that are most significant for the target event; in this article, these are data describing a potential borrower. Therefore, before the classification models are trained, the following methods for selecting explanatory features are compared: statistical methods, iterative methods, selection based on a gradient boosting model, and the genetic algorithm, which has recently been gaining popularity. The results showed that, for credit scoring on the data set under consideration, the best feature selection method is selection based on the gain ratio, and the most effective classifiers are ensembles of decision trees: random forest and gradient boosting. The practical contribution of the study is the proposed strategy for finding the most effective binary classification model. The scientific novelty lies in the developed approach of sequentially evaluating predictor selection methods and classifiers using several accuracy metrics.
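The abstract names selection by gain ratio as the best-performing feature selection method. A minimal sketch of such a ranking is shown below, assuming the standard C4.5 definition (information gain divided by the split information of the feature). The toy borrower data, the column names, and the helper names `entropy`, `gain_ratio`, and `rank_features` are hypothetical illustrations, not the data or code from the study.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, target):
    """Information gain of `feature` about `target`, divided by the feature's entropy."""
    total = len(target)
    # Group target labels by feature value to compute H(target | feature).
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(target) - conditional
    split_info = entropy(feature)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, target):
    """Return (name, gain ratio) pairs sorted from most to least informative."""
    scores = {name: gain_ratio(col, target) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = default, 0 = repaid.
default = [1, 1, 1, 1, 0, 0, 0, 0]
borrowers = {
    "has_overdue_history": ["yes", "yes", "yes", "yes", "no", "no", "no", "no"],
    "employment": ["temp", "temp", "temp", "perm", "temp", "perm", "perm", "perm"],
    "region": ["A", "B", "A", "B", "A", "B", "A", "B"],
}

ranking = rank_features(borrowers, default)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a strategy like the one described, the top-ranked predictors would then be passed to each candidate classifier, and the classifiers compared on several accuracy metrics. Continuous predictors would need to be binned before this discrete formulation applies.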
How to Cite:
Isaev D.V. (2020). Strategy for finding an effective machine learning method based on the example of credit scoring. Economic Problems and Legal Practice, 6, 132-138.
Keywords:
credit scoring, machine learning, feature selection, random forest, ensemble of models.

