A data driven model for credit scoring of loan applicants within a crowdfunding scenario in a P2P lending platform in Iran

Document Type: Research Paper


School of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran


Crowdfunding is a fundraising tool for soliciting many small amounts of capital from a large number of potential investors. Peer-to-peer (P2P) lending is one of the main types of crowdfunding, in which lenders and borrowers interact directly through an online platform. By eliminating intermediaries and thereby reducing operating expenses, P2P platforms can create a win-win situation for both borrowers and lenders. However, the absence of intermediaries such as banks increases the risk of loan repayment fraud. To avoid such losses, credit scoring methods help lenders decide on a specific loan by assessing its credit risk. Although data-driven approaches have increasingly been used to enhance credit scoring in financial domains, there is a lack of research assessing the usability of these approaches in P2P crowdfunding scenarios. This paper proposes a novel data-driven credit scoring model for a P2P lending platform in Iran. Using data from that platform, five different tree-based classifiers were developed, among which Random Forest achieved the best accuracy (97.80%). Lenders on this platform are businesses, each with a different risk tolerance threshold. A default probability was therefore computed for each loan request to help lenders make decisions based on their own risk tolerance. The results demonstrate how data analytics approaches can support decision making on P2P lending platforms.
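The decision rule described in the abstract, comparing a loan request's default probability with each lender's risk tolerance threshold, can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the default probability is approximated as the fraction of trees in an ensemble that vote "default", and the function names, lender names, votes, and thresholds are all hypothetical.

```python
from typing import Dict, List


def default_probability(tree_votes: List[int]) -> float:
    """Fraction of trees voting 'default' (1) for one loan request.

    Assumption: a simple vote share stands in for the model's
    probability estimate; the paper may compute it differently.
    """
    if not tree_votes:
        raise ValueError("need at least one tree vote")
    return sum(tree_votes) / len(tree_votes)


def willing_lenders(p_default: float, tolerances: Dict[str, float]) -> List[str]:
    """Names of lenders whose tolerance threshold is at least p_default."""
    return sorted(name for name, limit in tolerances.items() if p_default <= limit)


# Hypothetical example: 10 trees, 2 of which predict default.
votes = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
# Hypothetical per-lender risk tolerance thresholds.
tolerances = {"lender_a": 0.10, "lender_b": 0.25, "lender_c": 0.50}

p = default_probability(votes)
print(p, willing_lenders(p, tolerances))  # 0.2 ['lender_b', 'lender_c']
```

In this sketch the same loan request is acceptable to some lenders and not to others, which is the point of reporting a probability rather than a single accept/reject label.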

