Artificial intelligence and bank credit analysis: A review

Abstract
This article teases out the ramifications of artificial intelligence (AI) use in the credit analysis process by banks and other financing institutions. The unique features of AI models, coupled with the expansion of computing power, make new sources of information (big data) available for creditworthiness assessments. Combined, the use of AI and big data can capture weak signals, whether in the form of interactions or non-linearities between explanatory variables, that appear to yield prediction improvements over conventional measures of creditworthiness. At the macroeconomic level, this translates into positive estimates for economic growth. At the micro level, the use of AI in credit analysis improves financial inclusion and access to credit for traditionally underserved borrowers. However, AI-based credit analysis processes raise enduring concerns due to potential biases and ethical, legal, and regulatory problems. These limits call for the establishment of a new generation of financial regulation introducing the certification of AI algorithms and of data used by banks.


ABOUT THE AUTHOR
Hicham Sadok is Full Professor of Business and Economics at Mohammed V University in Rabat. He is the author of peer-reviewed articles and books. He is interested, among other themes, in the roles induced by technology in the economy and management. Fadi Al Sakka is a professor at Hamdan Bin Mohammed Smart University. He works on the impact of strategic components, such as artificial intelligence programs, on the development of people, as well as on various business functions and processes. Mohammed El Hadi El Maknouzi is professor of Business law at the University of Abu Dhabi. He is the author of several books and articles on law and economics. El Maknouzi has a marked interest in the study of the changes induced by Artificial Intelligence on the contemporary structure of the future Law.

PUBLIC INTEREST STATEMENT
Artificial intelligence (AI) is now essential for the bank of tomorrow. It is closely linked to changes in technology and consumption patterns. For the banking sector, it is a powerful tool for analysing the creditworthiness of credit applicants and anticipating customer needs. This type of system can also make the bank fairer and more responsible. Moreover, the advantages offered by AI are not limited to the economic and financial realm. Indeed, this technology can help banks to actively contribute to establishing economic and social inclusion in the heart of their business.

Introduction
Algorithms help guide decisions in many areas, such as medical diagnostics, predictive justice, facial recognition, fraud detection, job search, and access to higher education (ACPR, 2018). The world of finance, too, hasn't been left untouched by the data science revolution sparked by artificial intelligence (AI). AI has come under public scrutiny, as part of a new mythology of the digital evoking simultaneously hope and fear. For instance, AI is credited with the prospect of a revival of consumption, of cross-sector growth in productivity, and of improved risk management, to name a few. At the same time, it is connected to fears of technological job substitution, to the prospect of skills retraining, to a widening digital divide and, more broadly, to a drift towards transhumanism, i.e. the transformation of humanity through technologically enhanced capacities (Bostrom, 2017).
Historical hindsight shows that AI is not the first technological "disruption" to have affected the banking industry: here it suffices to consider the rise of automatic teller machines and online banking. In the 1990s, Bill Gates said "banking is necessary, banks are not". 1 This view exemplifies a widespread tendency to call into question the operating model of traditional banks whenever new technologies appear. While the fundamental forms of operation and decision-making within banks have not been dramatically revolutionised, the question around bank operations is nonetheless legitimate in the light of the emergence of AI. This is especially so since the macro-financial conditions resulting from the 2020 crisis have put banks' profitability under stress due, among other factors, to near-zero interest rates.
In this article, we explore further this question around the impact of AI on bank operations, focusing specifically on the core tasks of credit risk analysis and assessment. Traditionally, these are based on qualitative variables (business expertise, customer relations) complemented by statistical risk models that account for different dimensions of credit risk. For the purpose of this article, we focus specifically on the development of AI-based rating models for predicting credit default risk, in order to grant or refuse loans. While credit scoring might seem like a technical subject, it conditions the allocation of credit amongst economic agents. For example: which households will be able to own property, which companies will be able to finance their investment programmes, which companies will have to file for bankruptcy and, among these, how many will be liquidated. In view of these repercussions, it is easier to see how credit-scoring models have implications on financial stability, on the regulatory capital to be held by banks, on financial inclusion, employment, and economic growth.
AI disrupts the banking business because it makes it possible to use more, and different, kinds of data that can yield better credit risk predictions. In particular, AI is capable of manipulating "big data" collected as a result of consumer behaviour, the digitisation of customer relations, and information made available through such sources as social networks. Big data makes it possible to base credit score predictions on a broader range of variables than those traditionally included in the classic statistical models (typically, payment history and income). This raises the follow-up question, whether AI-based predictions using big data could make credit available to individuals or companies that were previously considered ineligible, following classical statistical modelling reliant on traditional databases.
To answer such questions, Section 1 reviews the literature on AI and its potential impact on the economy. Section 2 explores how AI might act as a catalyst for changing traditional models of credit and risk analysis. This discussion is expanded in Section 3, by looking more closely at the contribution of big data in complementing the efficacy of AI technologies. Section 4 moves on to the socio-economic impact of AI use in credit analysis. Before providing a summary in the conclusion, Section 5 fields a discussion of the limits of the approach under consideration.

Literature review on the role of AI in the economy
There is widespread agreement that AI is the third major technological revolution in economic history, after the onset of industrial production in the 19th century and of computer science in the 20th (Baldwin, 2019). While AI is acknowledged to be a driving force of change in contemporary business and society, the models and algorithms on which it is based suffer from a lack of public confidence. They are sometimes considered "black boxes" and their reliability is deemed limited due to a lack of robustness. This notwithstanding, advances in AI technology based on principles of "collective intelligence" (Servan Schreiber, 2018) are beginning to change this view.
Historically, Agrawal et al. (2016) consider that the concept of AI dates back to 1950, when Alan Turing (1950) proposed the following test: if a person chatting with multiple parties is unable to discern which one is a computer, the computer passes the test for artificial intelligence. According to O'Regan (2013), the purpose of AI was then described by Marvin Lee Minsky as the development of computer programs capable of taking on tasks performed unsatisfactorily by human beings, because of the demand they place on high-level mental processes (e.g., perceptual learning, memory, critical reasoning). Following this definition, AI has been implemented, in practice, through composite solutions built from software bricks or algorithms that process big data. The notion of "big data" obviously refers to data collection on a scale that is several orders of magnitude greater than would otherwise be possible through traditional databases. However, "big data" also implies a qualitatively wider range of information, including: structured and unstructured knowledge, language, perceptions, meaning, recognition of objects, and geographic information, among others. Combined, the foregoing features of big data make conventional information processing technologies appear outdated. This is where AI has a role to play in helping manipulate big data, leading to improved decision-making. AI has developed along two different strands, with different degrees of technological maturity. The first is "symbolic AI", where a computer is programmed by a system expert, so that it can manipulate knowledge (this leaves the system expert in control). The second strand is "machine learning" (ML). 2 This covers advanced statistical models especially noted for the adoption of neural networks (Le Cun, 1987). These are statistical models that are programmed to mimic the functioning of neural networks, in the sense that they are capable of learning through iterative data processing.
The enhanced computing power of modern-day hardware makes it realistic to run such processes on big data. The drawback of ML is that learning algorithms cannot account for what they have learned, which limits their acceptability. Alongside these two strands, there is a third one developing, which combines symbolic AI, ML, and natural language. This third strand develops the capability to integrate knowledge from various sources and also tries to implement explanation and transparency (Pearl & Mackenzie, 2018).
Automation that relies on AI is able to expand productivity beyond the direct coding capacities of computer scientists, because it joins the capability to learn from previous iterations with the power to process wide learning samples. Hence, AI can introduce greater efficiency in almost all functions of business organisations. For instance, it can transform human resource management processes by improving decisions on the attraction, retention, and skills development of employees. This can be achieved, for instance, through predictive metrics and analytical algorithms to scrutinise employee skills, and identify the most suitable candidates for each function (Durai et al., 2019). Moreover, AI makes it possible to query the mass of data available in human resources information systems (HRIS) using qualitative and quantitative algorithms. This possibility affords an additional decision support tool for reviewing organisational business processes (Brockbank et al., 2018). Bughin et al. (2018) thus predict that AI-based technologies could afford a 20 to 40% improvement margin in work efficiency, across industrial economies. In turn, this optimisation of costs promises to deliver a boost to annual economic growth in the range of 0.5% in industrial countries, a percentage that could rise to 1.5% if the deployment of AI were accompanied by innovations improving well-being at work. On similar grounds, audit firm Accenture (2017) projects that AI could double the growth rates of twelve major Western economies by 2035 on grounds of "new relations between seller and customer and between man and machine", and improve work efficiency by almost 40% in some countries. PWC (2017) estimates that the specific contribution of AI to global GDP between 2018 and 2030 stands at $15,700 billion, an increase of 14%. This growth can be ascribed mainly to productivity gains (55%) and to a recovery in consumption (45%). Bughin et al. (2018) also project that 90% of jobs will be transformed as a result of AI. While only 1% could be fully automated, AI could still take on a third of the tasks involved in around 60% of jobs. Repetitive tasks, like interaction with customers, routine operations, certain administrative support functions, and accounting would be threatened by technological substitution. Instead, the roles of manager and technician (especially digital technician) should prosper. In another study, Boston Consulting Group (2018) estimates that 32% of banks in China have already integrated AI in their daily operations, compared to 22% in the United States and 20% in Germany and France.
Despite these projections around the disruptive contributions of AI to the economy, a significant increase in labour productivity based on AI has yet to be recorded, thereby confirming the Solow paradox, whereby investments in improved computing capability do not yield matching returns. According to Gantz and Michaels (2015), the sectors grouped under the term "robotics" (within which AI is predominant) appear to account for only a 0.4% yearly increase in GDP between 1993 and 2007, across the seventeen leading industrial countries. Brynjolfsson and McAfee (2014) add that while AI would not, in the short or medium term, account for significant productivity gains (save for a few specific activities), it would nevertheless lead to vast changes in the world of employment. This prediction has been broadly confirmed by Furman and Seamans (2018) and the OECD (2018). According to Baldwin (2019), AI could accelerate the offshoring of jobs and intensify the dematerialisation and disintermediation of production and trade processes. It might also shorten value creation chains and decision-making circuits within organisations and their ecosystems, and thereby encourage the emergence of new work processes, in principle more agile and less costly, in areas ranging from data collection to decision-making.

The impact of AI on credit analysis procedures
A significant area in which AI makes it possible to improve banking operations is the management of risk, by strengthening credit scoring, portfolio management, fraud detection, the optimisation of debt collection strategies, the rapid detection and interpretation of signals from weak borrowers, and the construction of economic models, among others. Some AI applications can also help securely keep and analyse the large flows of data that banks are required to collect by law for managing customer relationships. Historically, credit scoring was one of the first applications of AI to the banking sector, specifically through the use of ML. Leloup (2017) suggests that the introduction of AI to the banking sector has the potential to reconfigure the commercial relations between banks and their stakeholders according to principles of objectivity and trust. The same commentator goes as far as envisioning a "second digital revolution" based on "ethical AI", a point that will be explored further in the Discussion section, as one of the frontiers where AI is in need of fine-tuning.
Leaving aside for a moment the question of AI's predictive performance compared to traditional statistical risk analysis models, AI-based methods for the analysis of banking risks have another undeniable advantage over the usual parametric scoring approaches (Bedi et al., 2020). Namely, they allow significant productivity gains, particularly in the following areas: data pre-processing, data managing, and modelling in service of decision-making (Athey, 2019; Athey & Imbens, 2019; Charpentier et al., 2018; Mullainathan & Spiess, 2017; Varian, 2014). To understand this point fully, it is necessary to consider the heavy workload traditionally required of a statistician in charge of building a credit rating model in a bank's risk department.
The first stage of his or her work consists in treating the body of available data in different ways. Among these, first come checks for missing or aberrant data, which require the establishment of detection, imputation, and exclusion procedures. Subsequent steps include grouping the data under categories of discrete explanatory variables and the discretisation of continuous variables. For each qualitative variable, modalities are determined so as to reduce the number of classes and to maximise the discriminating power of each variable. This involves, on the one hand, capturing potential non-linear effects and, on the other hand, reducing the influence of extreme values or uncorrected outliers. The number of classes and the discretisation thresholds are arrived at through iterative algorithms. These are built with the objective of maximising a measure of association between the target variable and the explanatory variables, such as Cramér's V or a chi-square test statistic. The second step consists of analysing the correlations between predictors, in order to verify that the variables are not too correlated with each other. Based on these correlations, the expert then decides to remove certain redundant variables according to a principle of parsimony. The third step involves selection of the explanatory variables for the score model. Within the framework of a given score model (for example, a logistic regression), one needs to select, from among the restated variables, those most likely to predict a default. Depending on the number of variables available, this selection can either be carried out manually or through set procedures, e.g., stepwise. Selection using set procedures is often complemented by business expertise and a more detailed analysis of the model (e.g., consideration of marginal effects and odds ratios).
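To make the first stage more concrete, the following minimal sketch compares two candidate discretisations of a continuous variable by their Cramér's V with the default flag. It is an illustration under stated assumptions, not a procedure taken from the literature reviewed here: the income figures, the default flags, the cut-off values, and the helper names `cramers_v` and `discretise` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two categorical sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))      # observed counts per (row, column) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def discretise(values, thresholds):
    """Map each continuous value to the index of the bin it falls into."""
    return [sum(v > t for t in thresholds) for v in values]


# Hypothetical toy data: annual income (in thousands) and a default flag.
income = [12, 15, 18, 22, 25, 31, 38, 45, 52, 60]
default = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Two candidate discretisations of the same variable (assumed cut-offs).
v_one_cut = cramers_v(discretise(income, [24]), default)
v_two_cuts = cramers_v(discretise(income, [20, 40]), default)
print(round(v_one_cut, 3), round(v_two_cuts, 3))
```

On this toy sample the three-class discretisation shows the stronger association with default. An iterative procedure of the kind described above would repeat this comparison over many candidate thresholds and keep the best one.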
Compared to this setup, AI algorithms parse available data to determine the optimal functional form of the model, within the meaning of a certain criterion (Chaisuwan & Chumuang, 2019). This makes the step of selecting explanatory variables for the score model obsolete. For instance, the use of a classification tree (or of algorithms based on trees, such as random forests) makes the work of discretising continuous variables and earlier grouping methods redundant. This is because AI-based techniques independently determine the optimal discretisations and groupings of modalities. Against this setup, the analysis of correlations between predictors becomes less critical, in the sense that most AI algorithms can accommodate strongly correlated predictors. These productivity gains in the risk modelling process due to AI are now evident in the banking sector. Grennepois et al. (2018) point out that the predictive performance of AI algorithms is generally robust to the non-imputation of missing values, the presence of strong correlations between certain explanatory variables, the non-grouping of categories of discrete variables, and the non-discretisation of continuous variables. This robustness therefore makes it possible to limit the pre-processing treatment of data.
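The way a classification tree arrives at its own discretisation can be sketched as follows. The example searches for the single threshold on a continuous variable that minimises the weighted Gini impurity of the two resulting groups, which is the basic splitting step of a tree. It is a simplified illustration rather than the implementation of any particular package, and the data and the function names `gini` and `best_split` are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best single binary split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # start from the impurity without any split
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical toy data: annual income (in thousands) and a default flag.
income = [12, 15, 18, 22, 25, 31, 38, 45, 52, 60]
default = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

threshold, impurity = best_split(income, default)
print(threshold, round(impurity, 3))
```

A full tree applies the same search recursively within each group and across all available variables, which is why no manual choice of classes or thresholds is required beforehand.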
Beyond productivity gains, limiting data pre-processing also reduces modelling bias. This is because, ultimately, AI lets raw data express itself. The use of AI thus allows greater automation in the credit granting process, including in the construction and review of risk models. Using data on mortgage processing times in the United States, Fuster et al. (2018b) show that financial actors that systematically use AI ("FinTechs") process loan applications around 20% faster than other lenders, without noticeable deterioration in the quality of file selection.
The productivity and predictive performance gains just described are all the more accessible in light of a real democratisation of algorithm use. This has been greatly simplified thanks to the development of streamlined and efficient procedures. Let us consider, for instance, the major software applications in the area of credit risk modelling: SAS, R, and Python. Each of them comes with built-in procedures, packages, or environments for implementing the main AI techniques. Focusing on these AI techniques, Fuster et al. (2018b) have shown how the companies being granted credit by FinTech institutions possess the same characteristics as companies to which traditional banks have refused credit. This result suggests that FinTechs contribute to the financial inclusion of small borrowers, although it still remains difficult to ascribe this result specifically to AI or to the use of big data.

The role of big data in AI-based credit analysis
When discussing the contribution of AI to credit analysis, it is difficult to discriminate between the gains from algorithm use and those from the availability of big data. Let us consider the case of a loan. Here, explanatory variables generally include the nature of the loan, the characteristics of the borrower (age, income, marital status), and his or her banking history. A typical example of a rating based on these variables would be the FICO 3 score, which is widely used in the US financial industry to assess the creditworthiness of retail clients. This score factors in such variables as payment history, outstanding debts, length of credit history, and the recent opening of new accounts, among others. Conversely, big data is drawn from a much more varied range of sources, either through the digitisation of customer relations (digital fingerprint data) or by leveraging new forms of customer information, such as social network activity. It is not uncommon for big data to aggregate very disparate sources of information, even with no apparent link to the creditworthiness of clients.
These types of data can be used either by traditional financial actors (banks) or by FinTechs. Depending on the type of financial institution one considers, it is likely that big data undergo different kinds of treatment. FinTechs and similar credit institutions (e.g., lending platforms, online banks, neobanks, and certain merchant sites) use big data directly to construct scores for internal use. On those scores depend such decisions as: the extension of credit, the conditions of financing, and risk control of the loan portfolio. A different use of big data is that undertaken by consulting firms that build credit risk scores for sale to lending institutions. This outsourcing of the collection and analysis of big data is therefore similar to the outsourcing of traditional scores, such as FICO in the USA. At the same time, depending on the nature of the data collected, it raises specific questions in terms of liability and regulatory compliance. Some FinTechs offering credit scores based on big data promise to integrate data on social network activity by the lender company and its managers, as well as data relating to the browsing mode (e.g., IP address, device used, browsing behaviour) of online loan applications. For example, the start-up NeoFinance uses data relating to the quality of the job held by the loan applicant and of his or her professional relations on the LinkedIn network. FinTech Lenddo aims to develop financial inclusion in developing countries, by mobilising non-traditional data to provide both a credit rating (Lenddo score) and a form of identity verification (Lenddo verification). Lenddo's strategy is clearly to bypass the need for an official credit score (like FICO or credit bureau 4 ), in order to allow as many people as possible to access credit.
Their rating mobilises different sources of information: customer activity on social networks (e.g., Facebook, LinkedIn, Twitter), connections with people at risk, navigation data from smartphones or computers by the loan applicant. On a related note, the ZAML (Zest Automated Machine Learning) technology implemented by FinTech Zest Finance is very illustrative. It builds a score from very disparate sources of data such as digital fingerprints, the number of times the client has moved, and the intellectual level measured by the vocabulary used in writing and by typing error detection, among others (Jagtiani & Lemieux, 2019).
Yet another use of big data is undertaken by commercial actors, for better assessment of the risk incurred with their stakeholders (Bussmann et al., 2021). Berg et al. (2019) note the case of a large e-commerce company based in Germany, which allows its customers to pay for their purchases only upon receipt of the goods, within a period of fourteen days. Every transaction is therefore construed as a short-term consumer loan, which assumes that the company is able to assess accurately the creditworthiness of its customers. To do this, it relies on the digital fingerprint data left by customers' browsing activity in the run-up to an online purchase. The considerations offered here sketch a picture of possible combinations between big data and AI-based analysis techniques in the credit analysis process. They do not yet address the wider socio-economic impacts (beyond the financial institutions using them directly) of AI use in connection to creditworthiness assessments. Wei et al. (2016) put forth a theoretical assessment of the impact that AI-based models (using big data generated through social media activity) might have on the quality of credit scores. They conclude that they might backfire, by inducing strategic changes in the patterns of activity on social media platforms by potential borrowers. For instance, they might restrain their connections on social networks or favour contacts with members of socio-professional categories that are less exposed to job loss, such as civil servants. This means that AI-based scoring models aimed at improving predictive results might in the long run induce behavioural changes aimed at performing well according to the chosen indicators. That is, the use of social networks might initially improve the accuracy of credit scores by processing a larger body of information, obtained by mobilising diverse and in-depth sources of information on the life and behaviour of potential borrowers. 
At the same time, the learning effect of loan applicants vis-à-vis the automated process, adverse selection of customers, and a trend towards the increasing fragmentation of social networks (and information) all conspire to reduce the initial predictive gains from AI use, compared to traditional methods.

The socio-economic impact of AI use in credit analysis
Empirically, the contribution of AI to improving financial decision-making remains the subject of much debate. FinTechs like Lenddo and Big Data Scoring predictably stand by their use of AI algorithms to treat masses of data, which is precisely their core business. However, others, like Zest Finance, have rejected the use of AI to process social media data, questioning both its usefulness and its legitimacy. A study by Berg et al. (2019) observed that taking into account a customer's digital fingerprint considerably improved the predictive performance of scoring models, for an e-commerce company that extends credit to consumers until the delivery of goods. Similar results were obtained by Frost et al. (2019) in their analysis of Argentinian platform Mercado Libre, which offers loans to small businesses. The authors show that AI-based credit scoring techniques outperform credit bureau ratings in predicting loss rates, especially for riskier companies. More recently, Óskarsdóttir et al. (2019) have modelled the odds of default on credit card debt, by using detailed mobile phone statements to reconstruct the social connections of cardholders. The anonymised data used by the authors includes 90 million telephone numbers, 2 million bank customers, and incorporates a range of socio-demographic and banking information. The social connections of individual cardholders are modelled through around 200 statistics to capture their ties to other clients who have experienced late payments or other banking incidents. The authors show that taking into account certain features of the cardholders' network of telephone contacts makes it possible to increase the precision of credit and solvency analysis predictions, compared to predictions reliant solely on traditional socio-demographic or banking data.
Incidentally, this conclusion opens up interesting prospects, especially for developing countries, the population of which is often underserved by bank accounts, but where cell phone use is widespread.
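To give a sense of what a network statistic of this kind might look like, the sketch below computes, for each person in a set of call records, the share of distinct telephone contacts who have experienced a late payment. This is only one illustrative statistic built on hypothetical data. It is not claimed to be among the roughly 200 statistics used by the authors, and the function name `contact_late_share` is an assumption.

```python
from collections import defaultdict


def contact_late_share(calls, late_payers):
    """For each person, share of distinct phone contacts with a late payment."""
    contacts = defaultdict(set)
    for caller, callee in calls:
        # Treat a call as an undirected tie between the two parties.
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return {
        person: len(peers & late_payers) / len(peers)
        for person, peers in contacts.items()
    }


# Hypothetical anonymised call records (caller, callee) and late payers.
calls = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("A", "B")]
late_payers = {"B", "D"}

shares = contact_late_share(calls, late_payers)
print(shares)
```

A statistic like this would then enter the scoring model as one additional explanatory variable alongside the traditional socio-demographic and banking data.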
Bazarbash (2019) draws attention to the fact that traditional banks might often refrain from assessing the creditworthiness of small borrowers, on the assumption that low repayment expectations and potentially high loan risk would not even cover the costs of assessment. In this respect, the use of AI and its ability to process a wider range of data enables institutions like FinTechs to enter a terrain that has thus far remained unexplored. Namely: to try and assess the creditworthiness of riskier smaller borrowers by frequently issuing small loans and monitoring repayment behaviour. In this respect, AI, coupled with big data use, can improve access to credit for the financially excluded and small businesses that cannot post financial guarantees. This reasoning helps account for the results by Schweitzer and Barkley (2017). After analysing a large database of loans taken out by small American companies in 2015, they have noted how companies receiving financing through FinTech platforms fell in the same qualitative group of borrowers that would typically be refused credit by traditional banks.
In a similar way, Jagtiani and Lemieux (2019) show that alternative sources of data mobilised by Lending Club's AI algorithms allow clients, who would have been traditionally placed in the riskiest categories, to now be regarded as "worthwhile" risks and thereby access credit on better terms. About 8% of borrowers rated A (best) by Lending Club's scoring method had FICO scores below 680 (poor or fair), and 28% of borrowers rated B had FICO scores in the same range. Similarly, Berg et al. (2019) also found that the digital fingerprint data used by AI gives an informational basis for accepting borrowers with a fragile profile, who would otherwise not have been accepted solely on the basis of credit scores from traditional databases.
A related, but equally important, finding is described by Bartlett et al. (2019). Namely, they demonstrate how AI can contribute to reducing ethnic discrimination, by studying the American mortgage market. They use different databases for their study, including in connection with the Home Mortgage Disclosure Act, which covers nearly 90% of mortgage loans in the United States over the period 2009-2015. Their results show that traditional lenders charge Latin American and African American borrowers more by 7.9 and 3.6 basis points, respectively, all other things being equal. Overall, this represents an additional interest burden of almost $765 million per year. Discriminatory outcomes do not disappear with FinTechs, but they are attenuated by around 40% compared to traditional banks, according to the same study. Bartlett et al. (2019) also observe that traditional lenders reject requests from Latin Americans and African Americans about 6% more often than they reject requests from clients who are not drawn from these minorities. At the macroeconomic level, over the period 2009-2015, this represents 0.74 to 1.3 million Latin American and African American clients whose loans could have been accepted, had there not been any discrimination. Zest Finance claims that if its AI credit and risk rating tools were applied across the United States, this would reduce the gap in mortgage approval rates by 70% between white and Hispanic borrowers and by 40% with African American borrowers, allowing more than 172,000 additional people each year to become homeowners (Fuscaldo, 2019).
In view of the foregoing, it seems clear that the application of AI to credit analysis can act as a powerful factor of social inclusion. At the same time, it is a source of concern that new sorts of mistakes-different to the ones of humans-might enter the credit analysis process (Houdé, 2019).

Discussion: Limitations in the application of AI to credit analysis
AI algorithms for credit analysis can detect fine nuances, if enough data is available to train the most relevant model possible. However, this flexibility comes at a cost: that of opacity. Indeed, for some AI methods, it is difficult, if not impossible, to know which variables, and in what proportions, the algorithms end up selecting as a basis for their predictions. These algorithms therefore work as "black boxes" that associate predictions on the target variable with a set of predictors, without disclosing the origin and the proportions of these predictions. This is particularly true for aggregate methods like bagging or boosting, which otherwise yield strong predictive performance. This opacity raises serious ethical and legal concerns. It is also troubling from the standpoint of financial regulation, since these models are used to guide decisions affecting the lives of individuals or companies, most notably the granting of credit.
Reliance on big data is another source of tension between banks using AI and regulators. On the one hand, there is the banks' desire to measure risk more accurately. On the other, there is a need to protect customers' personal data. Thus, rules on prudential banking ratios like those prescribed by Basel III/IFRS9 seem to come into conflict with, for example, the General Data Protection Regulation (GDPR), the reference normative text in Europe on the protection of personal data. Regulation governing the assessment of credit risks does not usually prohibit reliance on the big data sources currently in use. This is partly thanks to data anonymisation techniques, often prototyped in the medical field, which make it possible to share personal data with third parties in a perfectly secure manner. Nonetheless, when risk analysis and assessment models are outsourced to an external entity, the bank and its responsible officer are exposed to liability in the event of a breach of confidentiality protocols. Outsourcing of risk assessment also raises the related question of ensuring that data transferred by the bank to the risk assessor is not retained after the development of the risk assessment model has been completed (Lam & Hsiao, 2019).
One of the principal challenges posed by reliance on AI is the question of ascribing liability for harm, when this is caused by the operation of a self-directing system (Soulez, 2018). We might not be too far from an era in which financial and investment decisions are controlled not by humans, but by smart machines featuring cognitive and decision-making processes (De Vauplane, 2015). This transformation will have a definite impact on legal rules. In particular, it might mark a shift from bank liability for human mistakes to a new type of bank liability for mistakes committed by automated systems reliant on AI technology (Costello et al., 2020).
Alongside legal issues such as these, there are separate ethical questions arising from the use of big data, in particular data originating from social network activity. Consider the following example: a customer has his or her access to credit downgraded because, all other things being equal, he or she has bad payers as contacts on social networks. This sort of outcome is not, per se, legally blameworthy, but it clearly poses an ethical problem. What considerations might help guide decisions around the social acceptability of these uses of social network data? Óskarsdottir et al. (2019) have tried to suggest a solution inspired by manufacturing dashboards. When assigning scores to be displayed on dashboards, it is common to discretise any continuous explanatory variables by assigning a score to different segments, according to their contribution to the detection of a production defect. An ethical use of this principle in the context of credit scoring would be to assign a score of zero to data segments that would disadvantage borrowers, while assigning positive weights to segments that might facilitate their access to credit. The application of an ethical penalty to variables drawn from big data would certainly lead to a deterioration in AI's predictive performance, but would better guarantee its social acceptability.
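The scoring principle described above can be illustrated with a minimal sketch. It is not the method of Óskarsdottir et al. (2019): the variable (share of social network contacts with payment incidents), the segment boundaries, and the point values are all hypothetical, chosen only to show how a segment that would lower a borrower's score is neutralised to zero while favourable segments keep their positive points.

```python
from bisect import bisect_right

# Hypothetical discretisation of one continuous big-data variable, e.g. the
# share of a customer's social network contacts with payment incidents.
# Boundaries and points are illustrative assumptions, not estimated values.
BIN_EDGES = [0.10, 0.30]      # segments: < 0.10, [0.10, 0.30), >= 0.30
RAW_POINTS = [15, 0, -20]     # points a purely predictive scorecard might assign


def segment_index(value, edges):
    """Return the index of the segment that `value` falls into."""
    return bisect_right(edges, value)


def raw_score(value):
    """Points from the unconstrained scorecard (may penalise the borrower)."""
    return RAW_POINTS[segment_index(value, BIN_EDGES)]


def ethical_score(value):
    """Points after the ethical penalty: unfavourable segments count as zero."""
    return max(0, raw_score(value))


if __name__ == "__main__":
    for share in (0.05, 0.20, 0.50):
        print(share, raw_score(share), ethical_score(share))
```

Under these assumptions, the variable can only help a borrower and never hurt one, which is precisely why some predictive power is given up: the information carried by the negative segment is discarded.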
It is submitted that the ultimate criterion for resolving ethical ambiguity about the use of big data by AI might be that of customer acceptability. Beyond legal or moral issues, the crux of the matter is knowing whether customers would consent to having some of their personal data used as part of their loan applications. In the case of mortgages, for example, it is known that, apart from the business cycle and the unemployment rate, one of the most important predictors of default is divorce. As a consequence, any variable predicting divorce will also be a good predictor of default. If a bank were to use a creditworthiness score based on the AI analysis of extramarital dating sites, would customers be willing to accept this type of approach in order to obtain more favourable loan terms?
Another major risk of AI and big data is the emergence of bias or unfair treatment (ACPR, 2018). Applied to the field of credit scoring, the question is whether AI algorithms can come to penalise certain populations, or even exclude them entirely from accessing credit. For instance, AI algorithms might select as predictors certain variables that would be considered discriminatory, such as gender, ethnicity, or sexual or political orientation, among others. Even carefully checking source data beforehand does not rule out that AI could develop unfairly biased scoring models. By contrast, a human model builder in a financial institution would not take the moral, legal, or reputational risk of a scoring model that included discriminatory variables, even if the data were available and explanatory.
Moreover, the absence of discriminatory variables in the source data does not completely guarantee the absence of bias in AI-based scoring models. In fact, biases can creep in more subtly, in indirect ways, i.e. through other variables that give rise to what is called "proxy discrimination" (Prince & Schwarcz, 2019). This term describes those situations where discrimination results from the interaction or triangulation of several variables, which do not in themselves appear to be discriminatory. For example, an AI algorithm can combine several acceptable variables, such as income and type of housing, to predict the place of residence implicitly, and use that information to discriminate against customers residing in sensitive districts. The risk is all the greater considering that databases are rich in predictors, and AI algorithms have many ways of combining those to identify interactions between a large number of variables. For instance, by analysing a database of mortgages in the United States, Fuster et al. (2018a) have shown how, through the shift from logistic regression scoring to an AI-based approach, black and Hispanic borrowers lost out compared to white borrowers.
In sum, there remain problematic legal and ethical issues. On top of these, AI-based technology also requires a degree of buy-in from customers. In credit risk management, this, too, remains a credible obstacle to the widespread adoption of AI.
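The proxy discrimination mechanism discussed in this section, where income and type of housing jointly reveal the place of residence, can be made concrete with a small sketch. The counts below are synthetic and hand-made purely for illustration; they are not drawn from any of the cited studies, and the category names are assumptions. The sketch compares how well each variable alone, and both together, indicate residence in a sensitive district.

```python
from collections import defaultdict

# Synthetic, hand-made counts (NOT real data):
# (income_band, housing_type, lives_in_sensitive_district, number_of_customers)
ROWS = [
    ("low", "social_rental", True, 40),
    ("low", "social_rental", False, 5),
    ("low", "owner", True, 5),
    ("low", "owner", False, 15),
    ("high", "social_rental", True, 3),
    ("high", "social_rental", False, 7),
    ("high", "owner", True, 2),
    ("high", "owner", False, 23),
]


def sensitive_share(rows, key):
    """Share of customers in a sensitive district, grouped by key(income, housing)."""
    totals = defaultdict(int)
    sensitive = defaultdict(int)
    for income, housing, in_sensitive, count in rows:
        group = key(income, housing)
        totals[group] += count
        if in_sensitive:
            sensitive[group] += count
    return {group: sensitive[group] / totals[group] for group in totals}


# Each variable alone, then the interaction of the two.
by_income = sensitive_share(ROWS, lambda income, housing: income)
by_housing = sensitive_share(ROWS, lambda income, housing: housing)
by_both = sensitive_share(ROWS, lambda income, housing: (income, housing))

if __name__ == "__main__":
    print(by_income["low"], by_housing["social_rental"])
    print(by_both[("low", "social_rental")])
```

In this toy population, neither variable mentions the district, yet their combination identifies sensitive-district residents more sharply than either does alone. A flexible model that is free to exploit such interactions could therefore reproduce a geographic penalty without ever being given the address.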

Conclusion
Traditional approaches to credit analysis in banking combine a variety of data pre-processing and parametric statistical approaches that offer reliable performance, e.g., logistic regression scoring. If the mass of data is held constant, AI algorithms only offer marginal performance gains, alongside some productivity gains because of their operating methods. However, with the introduction of AI, the data basis does not remain unchanged. This is because AI techniques make it possible to mobilise new sources of information, known as big data, which could not have been integrated into traditional credit risk management models, due to their size. These new sources of information mobilised by AI make it feasible to capture weak signals, whether in the form of interactions or non-linearities, which seem to improve the assessment of customer creditworthiness, even if the reason is not always known. More fundamentally, these aggregate predictive gains sometimes translate at the microeconomic level into individual gains, for instance, by improving financial inclusion and access to credit for the most vulnerable borrowers. At the same time, these new sources of data can give rise to many biases that raise ethical, legal, and regulatory questions, even without banks noticing. These emerging opportunities, and their attendant risks, call for the implementation of a new generation of financial regulation reforming the legal rules on bank liability, and introducing forms of certification for AI algorithms and for data used by banks.