How Counterfactual Fairness Modelling in Algorithms Can Promote Ethical Decision-Making

Abstract Organizational decision-makers often need to make difficult decisions. An increasingly popular way to improve those decisions is to use information and recommendations provided by data-driven algorithms (i.e., AI advisors). Advice is especially important when decisions involve conflicts of interest, such as ethical dilemmas. A defining characteristic of ethical decision-making is that it often involves a thought process of exploring and imagining what would, could, and should happen under alternative conditions (i.e., what-if scenarios). Such imaginative "counterfactual thinking," however, is not explored by AI advisors unless they are pre-programmed to do so. Drawing on Fairness Theory, we identify key counterfactual scenarios that programmers can incorporate in the code of AI advisors to improve fairness perceptions. We conducted an experimental study to test our predictions, and the results showed that explanations that include counterfactual scenarios were perceived as fairer by recipients. Taken together, we believe that counterfactual modelling will improve ethical decision-making by actively modelling what-if scenarios valued by recipients. We further discuss benefits of counterfactual modelling, such as inspiring decision-makers to engage in counterfactual thinking within their own decision-making process.


Introduction
Organizational decision-makers, such as leaders, often have to make difficult decisions. An increasingly popular way to make better decisions is to use information and recommendations provided by data-driven algorithms, i.e., artificial intelligence (AI) advisors (Chong et al., 2022). Organizations are using such technological advances (e.g., in machine learning, natural language processing, and other forms of artificial intelligence) to provide relevant and instant recommendations to decision-makers (Logg et al., 2019). In some cases, such as Uber, algorithms are even taking on supervisory roles and engaging in algorithmic management (Jarrahi et al., 2021; Lee et al., 2015; Möhlmann & Henfridsson, 2019; Nagar & Malone, 2011; Schildt, 2017). AI advisors are also used by big consultancy firms (e.g., McKinsey) as strategic advisors for investments (Schrage, 2017; Mertens, 2023). With the popularity of ChatGPT, the use of AI as an advisor might become even more frequent.
Advice is particularly useful when organizational decision-makers have to make judgment calls, which is mostly the case when conflicts of interest are involved (i.e., ethical dilemma situations). Ethical decision-making entails making choices that are aligned with moral norms and standards while treating all parties involved fairly (Franklin & Guerber, 2020; Trevino, 1986). To ensure fairness and uphold moral standards, decision-makers can employ a strategy of exploring and imagining counterfactual scenarios when making judgment calls. Such counterfactual scenarios involve asking questions such as "What would happen if I do this?", "What should happen?", and "Could I make such a decision?" (Folger & Cropanzano, 2001). For example, a hiring committee might consider whether to hire a particular candidate who did not graduate from a prestigious school, and then evaluate whether the prestige of a school would make a difference. Such questions help to detect potential biases that decision-makers may have. Indeed, empirical research has shown that decision-makers who explored counterfactual alternatives were less likely to make biased decisions and were more receptive to information that challenged their thinking (Galinsky & Moskowitz, 2000; Kray et al., 2006; Kray & Galinsky, 2003).
Interestingly, various counterfactual modelling techniques have recently seen an upsurge in AI-driven systems (Guidotti, 2022). These contemporary counterfactual AI models are used to make AI predictions more interpretable. Specifically, the algorithms identify the nearest counterfactual, which refers to the minimal changes required in input features that would alter the AI recommendation (Wachter et al., 2018). For example, consider the scenario where an individual applies for a loan at a bank, and an AI advisor recommends rejecting the application. A possible nearest counterfactual output might be: "If your annual income was $10,000 higher, we would have approved your application." This counterfactual scenario can help an applicant understand why an AI model rejected their loan application, and which factors or attributes were important to the AI model (Yao et al., 2022).
The nearest counterfactual is determined by the minimum change in input features to alter the outcome, without explicitly considering the content of those changes. This approach is agnostic about the nature of the characteristics of the nearest counterfactual. The nearest counterfactual could be "If your annual income was $100 higher, your loan application would have been approved," "If you were three years younger, your loan application would have been approved," or "If you lived three miles closer to the city centre, your loan application would have been approved." While these counterfactual scenarios provide insight into how an AI advisor arrived at a particular recommendation, these explanations may not always be perceived as equally fair by recipients.
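To make this search concrete, the following is a minimal sketch of nearest-counterfactual search in the spirit of Wachter et al. (2018). The toy model, feature names, step sizes, and plausibility bounds are all invented for illustration; production systems use proper distance metrics and optimization rather than this greedy scan.

```python
# A minimal, illustrative nearest-counterfactual search. Everything here
# (model weights, features, bounds) is a toy assumption, not a real system.
import numpy as np

def toy_loan_model(x):
    """Hypothetical approval rule: a weighted score must pass a threshold."""
    income, age, distance = x
    score = income / 1_000 - 0.5 * age - 2.0 * distance
    return int(score >= 20.0)

def nearest_counterfactual(x, model, steps, bounds, max_iter=1_000):
    """Greedily perturb one feature at a time; keep the change that flips
    the prediction in the fewest unit steps (a crude proxy for 'minimal
    change'; real systems use a proper distance over all features)."""
    original = model(x)
    best = None
    for i, step in enumerate(steps):
        candidate = np.array(x, dtype=float)
        for _ in range(max_iter):
            candidate[i] += step
            lo, hi = bounds[i]
            if not (lo <= candidate[i] <= hi):
                break  # perturbation left the plausible range for this feature
            if model(candidate) != original:
                n_steps = round(abs(candidate[i] - x[i]) / abs(step))
                if best is None or n_steps < best[0]:
                    best = (n_steps, i, candidate.copy())
                break
    return best

applicant = np.array([38_000.0, 45.0, 6.0])   # income, age, miles from city centre
steps = [1_000.0, -1.0, -1.0]                 # unit and direction of change per feature
bounds = [(0, 200_000), (18, 70), (0, 50)]    # plausible ranges per feature
n_steps, feature, cf = nearest_counterfactual(applicant, toy_loan_model, steps, bounds)
print(f"Flip the decision by moving feature {feature} to {cf[feature]:,.0f}")
```

Note that the search is indifferent to *which* feature ends up in the counterfactual: an income change and an age change compete only on step count, which is exactly the content-agnosticism discussed above.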
Solely modelling the nearest counterfactual ignores the fact that recipients themselves also engage in their own counterfactual reasoning to assess the fairness and ethicality of the decision-making process. We believe that this is where current algorithm development and modelling are limited. That is, when it comes down to determining the fairness of employing algorithms, there is limited attention to the subjective fairness perceptions of the recipient of the decision that the algorithm made (see De Cremer, 2020).
Fairness Theory (Folger & Cropanzano, 2001) argues that recipients focus on different counterfactuals to arrive at fairness perceptions. This theory states that recipients tend to explore three counterfactual thoughts: would a different decision by the decision-maker result in a better outcome, could the decision-maker have made a different decision, and should the decision-maker have made a different decision. Depending on how these counterfactual scenarios are answered, recipients will perceive the decision-maker to be more versus less fair and ethical. Therefore, when decision-makers (e.g., leaders) explore these counterfactuals before deciding, they will be better able to explain their decisions to recipients (e.g., employees), which will in turn promote perceived fairness. In fact, research has shown that appropriate explanations (e.g., those that include counterfactuals) help to ensure that employees accept a decision, even if they receive a poor outcome, because they believe that the decision-making process itself is fair (Shaw et al., 2003). However, despite the importance of these three types of counterfactual considerations in recipients' fairness perceptions of a decision, AI advisors do not incorporate them into their algorithmic advice-giving.
In this paper we draw on Fairness Theory (Folger & Cropanzano, 2001) to introduce new ways in which programmers can incorporate counterfactual modelling into the development of AI advisor systems. We argue that when AI advisors also explore these counterfactual scenarios (i.e., would, could, and should), they can provide more useful and open-minded advice, and as such help decision-makers to better explain their decisions. Recipients will then react more positively (i.e., see the decision as fair and legitimate) even when they receive poor outcomes (Shaw et al., 2003).
Furthermore, more sophisticated AI advice that includes these different counterfactual scenarios has several other advantages. First, AI advisors will be more effective in improving ethical decision-making by actively engaging with counterfactual scenarios. This will allow decision-makers to better understand how an AI advisor made its decisions. Consequently, decision-makers will be better able to explain their decisions and how they used the AI advice, leading to increased fairness perceptions. Second, by actively considering counterfactual scenarios, AI advisors facilitate open-mindedness and reduce biases in the decision-making process, as some of these biases will be detected more quickly. Third, counterfactual-oriented AI advisors can also motivate and inspire decision-makers to adopt a counterfactual thinking mindset themselves for decisions where no algorithms are involved. Previous research has indeed revealed that a counterfactual thinking mindset can be activated by a prior context and transferred to a subsequent context (Gollwitzer et al., 1990).

Counterfactual modelling in AI advisor systems
Currently, counterfactual modelling in AI advisor systems is geared at making how the algorithm came to a decision interpretable (Asher et al., 2022; Murdoch et al., 2019). This is an important step to arrive at more explainable AI and thus increase fairness perceptions. Indeed, AI recommendations are often considered a black box, where one does not know how exactly an AI model arrived at a particular recommendation (Murdoch et al., 2019), but counterfactual modelling can make this prediction process less vague and more transparent. In particular, as we noted earlier, AI recommendation models do so by looking for the nearest counterfactual, which involves making the smallest changes to input features that alter the model's prediction (Guidotti, 2022; Wachter et al., 2018).
There are different ways to model nearest counterfactuals (e.g., white-box optimization, heuristic search strategies, instance-based strategies, and decision trees; Guidotti, 2022). Some authors offer criticism and recommendations about the assumptions programmers make when programming counterfactual models, such as the way gender and ethnicity are defined, noting that these assumptions influence model performance (Kasirzadeh & Smart, 2021). These are valid concerns, but the basic principle of how a counterfactual model is used remains the same regardless of the specific analytical techniques and considerations involved. The central focus of all these models is identifying small changes in the input that result in a change in prediction.
To illustrate how this idea of the nearest counterfactual works, we will now provide a detailed scenario: Imagine an AI advisor that screens job applicants based on their CV, personality tests, and intelligence tests, and predicts that a particular candidate should not be hired. Below are two examples of the nearest counterfactuals for different candidates (Wachter et al., 2018).
Candidate 1: If you had a master's degree, you would have been selected for the next round in the selection process.
Candidate 2: Your verbal reasoning score was 60; if your verbal reasoning score was 20 points higher, you would have been selected for the next round in the selection process.
These nearest counterfactuals provide explanations for individual decisions. They are case-specific, and the features of the counterfactual explanation can differ for each instance. The nearest counterfactual scenario could very well refer to a negligible change, as in the example below.
Candidate 3: Your verbal reasoning score was 79; if your verbal reasoning score was 1 point higher, you would have been selected for the next round in the selection process.

This example highlights that not all nearest counterfactuals provide the same quality of explanation. In this third example, the counterfactual provides insights into how the AI advisor came to a prediction, but it may clearly not be viewed by the applicant as an adequate explanation for not getting the job. While the strategy and principle stay the same in the case of Candidates 2 and 3, the psychological impact differs when applicants experience near misses. Candidate 3 scored 79 and only needed one point more, while Candidate 2 missed by a more substantial amount and needed 20 points more. The nature of these nearest counterfactuals is important to consider because especially in situations that involve near misses, people care more about fairness and react more negatively when they deem the outcome or procedure unfair (De Cremer & van Dijk, 2011).
Despite this, machine learning scholars seem to assume that any counterfactual explanation will satisfy users (see e.g., Yao et al., 2022), as long as the counterfactual remains valid and actionable (Kasirzadeh & Smart, 2021). The algorithm that identifies the nearest counterfactual does not consider a range of alternative situations but only which counterfactual situation requires the least amount of change (a change of 1 [Candidate 3] or 20 [Candidate 2]) to obtain a different outcome (e.g., meeting the threshold of 80 and being selected). It first looks at the change in outcome (e.g., will you meet the threshold and be selected), and then identifies the "most optimal" counterfactual situation that led to this change (i.e., how much change is needed to obtain this outcome).
This approach, however, fails to acknowledge that fairness concerns extend beyond data transparency. When individuals evaluate decisions and their resulting outcomes, they rely on their own perspectives and attempt to make sense of them. Specifically, they will assess whether the decision and resulting outcome were fair or not from their own viewpoint. To do so, recipients first evaluate certain counterfactual scenarios (i.e., what-if scenarios) and then assess the impact of these counterfactual scenarios (Folger & Cropanzano, 2001). For example, the candidate will question whether they would have been better off if they had been selected for the position, whether the hiring manager could have selected them for the position, and whether the hiring manager should have hired them for the position. As we will explain below, these counterfactual thoughts will determine how the candidate will feel and respond to not being hired in terms of judging the use of algorithmic decision-making as fair or not.
For the above reasons, we argue that AI advisors should not only focus on explaining how an AI model arrives at predictions (i.e., interpretability), but also incorporate the perspective of the recipients of the decision and how they reason, to ensure that the decision is seen as fair (De Cremer, 2020). As such, it is necessary to also focus on the expectations and information needs of the stakeholders of AI-assisted decisions. To achieve this, in the present paper we draw on insights from psychology and argue that it is possible to further improve counterfactual modelling by incorporating key insights from the human psychology of causal reasoning. We will do so by relying on Fairness Theory (Folger & Cropanzano, 2001).

Fairness Theory: Which counterfactuals do recipients explore?
When decision-makers are faced with an ethical dilemma, recipients will inevitably receive an outcome that they perceive as negative. According to Fairness Theory (Folger & Cropanzano, 2001), when negative events occur, recipients engage in counterfactual thinking to understand what happened. Fairness Theory thus provides insights into what kind of counterfactuals AI advisors can model. The counterfactual thoughts recipients explore focus on three aspects: (1) would they feel better if the decision-maker had made a different decision, (2) could the decision-maker have made a different decision, and (3) should the decision-maker have made a different decision (Folger & Cropanzano, 2001). These thoughts determine how recipients will eventually feel about and react to decisions. For example, when employees believe their supervisor could and should have acted differently, they are likely to blame their supervisor for the poor outcome and feel angry towards their supervisor. In contrast, if the outcome is negative but the employee believes that the supervisor could not have made a different decision, they are less likely to feel angry towards their supervisor (Weiner, 1985). Below, we discuss in more detail what thought and judgment processes are elicited as a function of the type of counterfactual the recipient engages in (i.e., would, could, should).
First, when a decision has been made, recipients explore whether they "would" be better off if the decision-maker had made another decision. This question comes down to estimating the causal negative impact of the decision. An employee who is denied a promotion might ask whether they would have been better off if they had received the promotion. Or, a person who was denied a loan might wonder whether they would have been better off if they were granted the loan. If the counterfactual scenario is seen as more desirable, then the actual scenario, and the resulting outcome emerging from the decision, is seen as harmful relative to the counterfactual scenario (Folger & Cropanzano, 2001; Nicklin et al., 2011).
The could-counterfactual thought concerns addressing the question whether the decision-maker (a) is believed to have had any influence over the outcome and (b) could have prevented the detrimental outcome (Folger & Cropanzano, 2001; Nicklin et al., 2011). This counterfactual thus assesses whether the decision-maker had any discretionary control over the outcome. For example, an employee who was not promoted might feel less angry towards a supervisor deciding on the promotion when it is organizational policy to always promote the most senior member in the team. The employee might still feel that the outcome is unfair but might not consider their direct supervisor responsible.
Finally, should-counterfactual thinking assesses whether the decision-maker upheld moral norms and obligations. A decision is perceived as unfair if recipients believe that the standards to which one is expected to adhere are violated (Folger & Cropanzano, 2001). For example, consider a company with a policy that prioritizes the promotion of minority members. If the only minority member in a team is not promoted while a majority member is, the minority member will question what should have occurred according to the organizational policy. In this scenario, it is evident that the minority member will perceive the decision as unfair, as they believe the supervisor violated the established norms and policies of the organization.
The way a should-counterfactual is evaluated depends on the recipients' primary concerns (Folger & Cropanzano, 2001). Recipients tend to focus on two aspects: the outcome of a decision, and the process by which the decision was made. Interestingly, individuals differ in terms of which aspect they prioritize. Some individuals place greater emphasis on the outcome, while others prioritize the process. For example, a utilitarian is concerned about the outcome (Schminke et al., 1997). A 'right' decision is one that achieves the best outcome for the greatest number of people (Hume, 1969). Therefore, to determine whether a decision is justified, utilitarians will assess whether the decision maximized the welfare of the relevant stakeholders. Others, in contrast, focus more on the process (Folger & Cropanzano, 2001) and expect that an ethical decision is one that follows important moral principles (Kant, 1959). These principles, of course, are influenced, at least partly, by the recipient's own ethical standards and perspectives. As such, for these individuals an ethical decision is one that they consider procedurally fair and bias-minimizing (Colquitt, 2001).
There is empirical support for the underlying mechanisms of Fairness Theory. Spencer and Rupp (2009) provide experimental evidence that people engage in counterfactual thinking after experiencing a negative event, and that these counterfactual thoughts determine their emotional responses. Further, Nicklin et al. (2011) found across three studies that counterfactual thinking predicted fairness perceptions, particularly when outcomes were bad. These studies show that when recipients believe that something is not fair, they base this perception on a counterfactual analysis of decision events (Colquitt & Zipay, 2015).
In sum, examining counterfactual thoughts enables decision-makers to gain a deeper understanding of the diverse perspectives held by recipients. This helps them to effectively explain and justify their decisions, resulting in reduced negative reactions from recipients (Shaw et al., 2003). Of course, exploring and adequately answering each of these counterfactual scenarios is not trivial, but in the present paper we develop the argument that state-of-the-art AI techniques serving as decision-making advisors can help in this process.

Counterfactual modelling and fairness in action
An important assumption of our present paper is that AI advisors should be developed to simulate the counterfactual scenarios that decision recipients are likely to consider. By doing so, we adopt a human-centred approach to AI (De Cremer et al., 2022) in which the (human) recipients of a decision are placed more centrally in how AI models as advisors can be used within organizations. In particular, we propose that inference-based counterfactual modelling can address two key types of counterfactuals: the would-counterfactual, which assesses the impact of a decision on the recipient, and the outcome-focused should-counterfactual, which examines whether negative impacts on stakeholders are minimized. Additionally, AI-enabled decision support systems can assist decision-makers in understanding the could-counterfactual by mapping and summarizing the legal and organizational constraints they face in specific scenarios, providing clarity on what actions are permissible. Lastly, by actively assessing model performance across demographics and detecting biases, the AI model can address the process-based should-counterfactual.

Modelling the would-counterfactual
An example of an inference-based question about the causal effects of a decision is: "Would the recipient be better off if I made a different decision?" To model this specific type of counterfactual, programmers can apply traditional Neyman-Rubin causal modelling (Rubin, 1974; Sekhon, 2009). In fact, counterfactual reasoning has a long tradition in statistical causal inference modelling (Lewis, 1973; Pearl, 2009; Rubin, 1974). Indeed, statisticians concerned with causal inference have put counterfactual questions central when estimating causal effects. Traditional examples of such counterfactual questions are: "Would this patient have survived if he had not received any treatment?" or "Would this employee perform better if he followed this job training?" In counterfactual modelling, the causal effect is determined by assessing the disparity between the actual outcome resulting from a certain action and the counterfactual outcome that would have arisen if that action had not been taken (Morgan & Winship, 2014; Pearl, 2009).
Experiments are often used for causal inference (Morgan & Winship, 2014). In an experiment, participants are randomly assigned to a treatment group or a control group. Because the allocation is random, the two groups are expected to be similar on average, and any difference between the two groups can be attributed to the treatment effect. The control condition is a means to estimate the counterfactual scenario: what would have occurred if participants from the treatment condition had not received the treatment. While experiments are often recommended for obtaining strong causal evidence, conducting and implementing experiments in organizational settings may not always be feasible and can involve substantial costs (Morgan & Winship, 2014).
Machine Learning (ML) models, however, offer some solutions (Bottou et al., 2013; Buesing et al., 2019; Johansson et al., 2016). Supervised ML models can draw on large data sets and advanced computing to estimate the counterfactual outcome. Specifically, ML models can create synthetic control conditions (i.e., a counterfactual scenario absent a particular cause) that can be used to estimate causal effects (Ben-Michael et al., 2021). By creating a synthetic control condition, it can be compared to the actual situation, and as such, a causal effect can be estimated. These models have been used, for example, to estimate the causal effect of the Covid-19 lockdown in Wuhan on air pollution (Cole et al., 2020). In this study, the synthetic control involved what Wuhan's air pollution would have been if the Covid-19 lockdown had not happened. By comparing this counterfactual situation to the actual situation (i.e., Wuhan's air pollution during the lockdown), these researchers were able to estimate the effect of the Covid-19 lockdown on air pollution in Wuhan: the lockdown caused a 63% reduction in nitrogen dioxide (NO2) concentrations. Precisely because ML methods are able to estimate causal effects by drawing on large data sets, causal effects can be estimated even in situations where experiments are not straightforward.
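The following is a hedged sketch of this logic on synthetic data: a supervised model trained on untreated cases predicts what a treated case "would" have looked like without the treatment, and the gap between actual and predicted outcomes estimates the causal effect. All data here are simulated assumptions, and scikit-learn is assumed to be available; real synthetic-control analyses are considerably more careful.

```python
# A toy "would-counterfactual" estimate: predict the no-treatment outcome
# for treated units from a model fit on untreated units (a T-learner-style
# control arm). Data and effect size are simulated placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2_000
X = rng.normal(size=(n, 3))                   # pre-treatment covariates
treated = rng.random(n) < 0.4                 # who received the "treatment"
true_effect = 2.0
y = X @ np.array([1.0, -0.5, 0.3]) + true_effect * treated + rng.normal(size=n)

# Fit an outcome model on untreated units only, then use it to predict the
# counterfactual (no-treatment) outcome for the treated units.
control_model = RandomForestRegressor(n_estimators=200, random_state=0)
control_model.fit(X[~treated], y[~treated])
y_counterfactual = control_model.predict(X[treated])

# Actual minus counterfactual outcome estimates the effect on the treated.
att = (y[treated] - y_counterfactual).mean()
print(f"Estimated effect on the treated: {att:.2f} (true effect: {true_effect})")
```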
What makes this method particularly useful is that once the ML model has estimated the causal effect, AI advisors are in a position to inform decision-makers about the potential impact of certain decisions. Of course, knowing which outcomes are relevant to model also requires that business-level knowledge is part of the data sets that are worked with (De Cremer & De Schutter, 2021). To achieve that kind of integration, it is thus important that the tech experts developing the AI advisor algorithms collaborate with the domain experts in HR and business (De Cremer, 2020). For example, when advising leaders on promotion decisions, the AI advisor preferably also needs to look at how promoting a specific employee might affect unit-level performance, employee engagement, and customer retention to arrive at a higher-quality level of advice.
Evaluating the would-counterfactual is relevant for decision-makers because recipients do poorly when trying to predict potential outcomes that do not benefit them. People tend to be unrealistically optimistic about future life events (Weinstein, 1980). An employee who is not being considered for a promotion may underestimate how difficult the new function might actually be for them. Or, a loan applicant who was denied a loan may not realize that if they were given the loan, they might have been systematically increasing their debt because they cannot make the monthly payments. To be accurate, it is not that people completely dismiss potential negative consequences, but research clearly shows that they tend to believe that those consequences are much less likely to happen to them compared to others (Weinstein & Klein, 1996). By actively modelling potential consequences in line with the would-counterfactual, AI advisors will thus be in a better position to provide decision-makers with useful information to explain their decisions in ways that recipients may not consider sufficiently.

Modelling the could-counterfactual
The could-counterfactual is about the recipient's assessment of whether the decision-maker (e.g., the leader) could have made a different decision. Especially in situations that involve conflicts of interest, it is important for decision-makers to understand which rules they need to adhere to, and whether they have discretion to influence the outcome. Obviously, this requires expertise in organizational rules and policies, as well as the intricacies and context of each specific decision. Knowing which rules and which expectations apply in a particular situation can be difficult to estimate, especially when rules are constantly changing and becoming more complex (Katz et al., 2020).
One way to model the could-counterfactual is to draw on AI-enabled decision support systems. As AI can quickly process and learn from big data sets, AI-enabled decision support systems can be of great support in retrieving, summarizing, and informing decision-makers with relevant legal information (Prakken, 2016). An additional advantage of the ability to scan big datasets is that the AI advising system can also integrate the legal information with organizational and international business rules and policies. As laws, regulations, and exceptions become more complex, it is very helpful for decision-makers to understand which decision options are in line with current rules and policies and which ones are not. As such, AI advising systems create an advantage by integrating rules and procedures from different sources and cases, enabling them to summarize and communicate to decision-makers which rules and regulations apply. Such supportive evidence-based advice can then help decision-makers to better understand whether they have discretion in a given situation and therefore better estimate whether they could make decisions in different ways.
This kind of AI-based advice is already being used by lawyers who use supervised and unsupervised ML algorithms to read and interpret legal documentation (Metsker et al., 2021). An example of such an intelligent legal system is 'Eunomos', which helps users understand the meaning of legal texts and how they relate to organizational norms (Boella et al., 2019). Similarly, CLIEL (Commercial Law Information Extraction based on Layout) is a system capable of extracting key information from legal documents (García-Constantino et al., 2017).
In situations involving complex ethical dilemmas, decision-makers may often lack awareness of their level of discretionary control. We posit that by providing decision-makers with an AI advisor that offers a coherent synthesis of legal information and relevant case studies, their understanding of permissible actions can be significantly enhanced. The AI advisor can actively explain to decision-makers in accessible terms why certain options are unviable, thereby facilitating their ability to effectively justify their decisions to recipients.
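To give a feel for the retrieval step behind such a could-counterfactual advisor, the following is a deliberately simple sketch: given a decision scenario, it surfaces the policies that most constrain the decision-maker's discretion. The policy texts and scenario are invented for illustration, and the TF-IDF matching stands in for the far richer retrieval and reasoning that systems like Eunomos perform.

```python
# A minimal sketch of policy retrieval for a could-counterfactual advisor.
# Policy texts and the scenario are hypothetical; real legal-AI systems use
# much more sophisticated representations than TF-IDF similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

policies = [
    "Promotions must prioritize the most senior team member unless performance is below target.",
    "Loan approvals require the applicant's income to meet the legal minimum for the loan program.",
    "Hiring decisions may not take age, gender, or ethnicity into account.",
]

scenario = "An employee asks why a less senior colleague was promoted instead of them."

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(policies + [scenario])
similarity = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

# Present the most relevant constraints to the decision-maker, best match first.
for idx in similarity.argsort()[::-1]:
    print(f"{similarity[idx]:.2f}  {policies[idx]}")
```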

Modelling the should-counterfactual
The should-counterfactual is about whether the decision upheld moral and organizational norms. We argue that these should-counterfactuals can be modelled by focusing on two main aspects: outcome fairness and procedural fairness. In this sense, AI advice, when it comes down to should-counterfactuals, can help decision-makers ensure that their decisions are perceived as fair by being clearer and more transparent on both the outcomes received and the procedures used.
First, outcome fairness is about ensuring that the decision led to the most optimal distribution of outcomes across various stakeholders. That is, are outcomes distributed within the collective in such a way that its members perceive their own outcome as fair? To model collective outcome fairness, programmers can extend the inference-based counterfactual modelling that we discussed earlier to estimate the impact of the decision on several stakeholders (i.e., not only one stakeholder, who is usually the recipient). Programmers will need to estimate these causal effects. In doing so, managers and programmers will need to collaborate to decide which metrics are relevant to capture relevant outcomes (e.g., predicted employee performance, customer satisfaction; see also our discussion of the would-counterfactual).
In addition, decision-makers will also need to assign a weight to the outcome that each stakeholder receives. Decision-makers can decide whether the importance of each stakeholder is weighted by the size of that particular stakeholder group, or whether each type of stakeholder receives the same weight. Management scholars propose several methods to account for these complexities. For example, Hall et al. (2015) propose a Social Return on Investment measure that permits managers to incorporate stakeholders' concerns and the value generated by firms. While no perfect measure exists, actively trying to answer and model outcome fairness motivates organizations to reflect on the impact of their decisions on stakeholders, and on how this will be perceived and in turn influence, for example, the motivation and performance of their workforce.
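A small sketch makes the weighting question concrete: the same predicted impacts can yield a positive or a negative "collective outcome" score depending on whether stakeholder groups are weighted by size or treated equally. The numbers below are hypothetical model predictions, not real estimates.

```python
# Illustrative only: the choice of weighting scheme can flip the conclusion.
groups = {
    # group: (predicted change in outcome, number of people affected)
    "team members": (+0.30, 12),
    "passed-over internal candidates": (-0.60, 3),
    "customers": (+0.05, 400),
}

def collective_score(groups, weight_by_size):
    """Weighted mean impact across stakeholder groups."""
    total, norm = 0.0, 0.0
    for impact, size in groups.values():
        w = size if weight_by_size else 1.0
        total += w * impact
        norm += w
    return total / norm

print(f"Weighted by group size: {collective_score(groups, True):+.3f}")
print(f"Equal weight per group: {collective_score(groups, False):+.3f}")
```

Here the size-weighted score is positive (the large customer group dominates), while the equal-weight score is negative (the harm to the passed-over candidates dominates), which is exactly the kind of trade-off decision-makers and programmers need to settle together.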
To summarize, AI advisory systems can help decision-makers explain that a certain decision maximized the benefits of the collective, by providing information on how a certain decision is likely to affect all members of the collective and how this decision affects the collective goal. The AI advisor is able to mathematically estimate how fair decisions are by looking at how they affect key collective outcomes. Recent research demonstrates that people actually prefer AI advice that computationally looked at the outcomes of decisions (Longoni & Cian, 2022).
Second, the fairness of the process is the other aspect people may focus on when trying to assess whether the decision-maker "should" have made a different decision. By being able to explain the "how" of the decision-making process, decision-makers can assure that a decision is perceived as more consistent, accurate, and bias-free: all procedural dimensions that make recipients see decisions as more fair (Colquitt, 2001).
When it comes down to promoting the use of accurate information and suppressing biased decision-making, decision-makers will have to acquire a good understanding of how the AI advisor performs, in general but also within each subpopulation (Buolamwini & Gebru, 2018; De Cremer & De Schutter, 2021). While the overall accuracy of a model might be good, it might perform much more poorly in some subgroups, especially those that are less well represented in the training dataset. If this is the case, then (supervised) ML algorithms are known to assign too little weight to minority groups (even treating them more like error terms that need to be reduced) and as such arrive at biased decisions (e.g., De Cremer & De Schutter, 2021).
Even when the performance for two subgroups is similar, the types of mistakes the models make can be very different. A misclassification can either be a false positive (e.g., accepting an incompetent candidate) or a false negative (e.g., rejecting a competent candidate). While the overall accuracy can be the same for subpopulations, the false negative and false positive rates can be very different (Hardt et al., 2016; Wexler et al., 2020). This can especially be the case for subgroups where some cases are rare. Imagine that only 1% of candidates in a minority group are competent candidates. When the model is naïve and rejects every member of that minority group, the model will be correct 99% of the time. However, while the overall accuracy is high, there is a 100% false negative rate for that particular subpopulation. When decision-makers learn about how the model performs for each subpopulation on an ongoing basis, they have more information about the presence of biases in the recommendation process. Decision-makers are then able to actively search for and detect systematic differences in performance and error rates among groups.
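The audit described above is straightforward to sketch in code: compute accuracy and error rates overall and per subgroup, and the 1%-competent example reappears as near-perfect accuracy alongside a 100% false negative rate for the minority group. The data below are synthetic and the "model" is the deliberately naive one from the example.

```python
# A subgroup audit sketch: overall accuracy can look excellent while one
# group's false-negative rate is 100%. All data are simulated.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
minority = rng.random(n) < 0.10
# Base rates: 1% of the minority group is competent, 30% of the majority.
competent = np.where(minority, rng.random(n) < 0.01, rng.random(n) < 0.30)

# A naive model that rejects every minority applicant and is otherwise decent
# (5% random errors on the majority).
predicted_hire = np.where(minority, False, competent ^ (rng.random(n) < 0.05))

def rates(y_true, y_pred):
    acc = (y_true == y_pred).mean()
    fnr = (y_true & ~y_pred).sum() / max(y_true.sum(), 1)      # missed competent
    fpr = (~y_true & y_pred).sum() / max((~y_true).sum(), 1)   # wrongly accepted
    return acc, fnr, fpr

for label, mask in [("overall", np.ones(n, bool)), ("minority", minority), ("majority", ~minority)]:
    acc, fnr, fpr = rates(competent[mask], predicted_hire[mask])
    print(f"{label:8s} accuracy={acc:.3f} FNR={fnr:.3f} FPR={fpr:.3f}")
```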
Employees might also wonder whether the same decision would have been made depending on demographics (e.g., gender, race, age). They might come up with their own relevant social comparison group (Folger & Cropanzano, 2001). Modelling this in an AI advisor is relatively simple, as it would just require changing features in the input and then seeing if the outcome is different. For example, an employee concerned about a potential age bias might wonder whether the recommendation would be different if they were a few years younger; by showing that the recommendation would be the same, decision-makers can put some of these concerns at ease. The AI advisor can indicate what the recommendation would be given a difference in input features. Recently, Google developed a "What-If" tool that allows users to choose several fairness strategies (Wexler et al., 2020). The What-If tool is an open-source application that allows practitioners to assess what would have happened in a hypothetical scenario. The main advantage of such tools is that they allow decision-makers to think about and explore counterfactual scenarios they are currently interested in. By showing that the same recommendation would have been made regardless of gender or race, decision-makers can more convincingly explain to recipients that their decision minimized biases.
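The core of such a demographic "flip test" is only a few lines: rerun the same applicant through the model with a protected attribute changed and check whether the recommendation moves. The model below is a hypothetical stand-in; an actual audit would call the deployed model instead.

```python
# A minimal flip-test sketch: probe the model with only the protected
# attribute changed. The scoring rule here is a hypothetical placeholder.
def hypothetical_model(applicant):
    """Toy rule: recommend if the test score passes 80 (ignores age)."""
    return applicant["score"] >= 80

applicant = {"score": 78, "age": 52}
baseline = hypothetical_model(applicant)
for new_age in (49, 45, 30):
    probe = {**applicant, "age": new_age}     # same applicant, different age
    same = hypothetical_model(probe) == baseline
    print(f"age {new_age}: recommendation {'unchanged' if same else 'CHANGED'}")
```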
Aside from assessing whether the prediction or recommendation would be different across subpopulations, the AI advisor should also assess whether the nearest counterfactual explanations differ depending on these demographics. For example, an employee might read a counterfactual explanation stating that if she had scored 10 points higher on a test, she would have been recommended for the position. To assess the fairness of this explanation, she might wonder whether she would also need to score 10 points higher if she were a man. Process fairness concerns are not only about biases in the outcome, but also about biases in the explanation.
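Auditing the explanations themselves can follow the same pattern as auditing the predictions: compute the change each rejected person's nearest counterfactual demands and compare it across groups. The sketch below uses an illustrative score threshold and simulated applicants (which, by construction, show no disparity); a real audit would use the deployed model's counterfactuals.

```python
# A sketch of explanation-level auditing: does the score increase demanded
# by the nearest counterfactual differ systematically by gender?
# Threshold and data are illustrative assumptions.
import numpy as np

THRESHOLD = 80
rng = np.random.default_rng(2)
scores = rng.normal(70, 8, size=500)
gender = rng.choice(["woman", "man"], size=500)

rejected = scores < THRESHOLD
recourse = THRESHOLD - scores[rejected]   # points needed per rejected applicant
for g in ("woman", "man"):
    mask = gender[rejected] == g
    print(f"{g}: mean points needed = {recourse[mask].mean():.1f}")
```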

Case study: Loan rejections
The central idea of our paper is that decision-makers can enhance perceptions of fairness by considering not only the nearest counterfactual of the algorithmic classifier, but also the counterfactuals proposed by Fairness Theory: could, would, and should. We further contend that it might not always be necessary to consider all three of these counterfactuals, since some may be more challenging to model than others. To illustrate the potential of including counterfactuals proposed by Fairness Theory, we conducted a vignette experiment (Aguinis & Bradley, 2014). A vignette experiment presents participants with a hypothetical scenario that mimics real-world situations and can lead to useful insights by systematically changing key elements across different scenarios and observing the effect of these changes on participants' responses.
Specifically, we focus on loan rejections. This is a common context in which the relevance of the nearest counterfactual explanation is illustrated (see e.g., Pawelczyk et al., 2022). In this scenario, a loan credit officer is tasked with assessing the creditworthiness of loan applicants, and the officer uses an algorithm to inform their decisions. The algorithm might reject applicants and will then identify the nearest counterfactual explanation: the minimum set of changes that would have made a rejected applicant eligible for a loan.
To evaluate the effectiveness of including additional counterfactuals, we developed scenarios that varied along two dimensions: the magnitude of the nearest counterfactual (i.e., small vs. large: earning £1,000 more vs. earning £10,000 more) and the inclusion of a could- and should-counterfactual. The could-counterfactual involved explaining that there is a specific legal minimum requirement, and the should-counterfactual involved explaining that the model performs equally regardless of gender, age, and ethnicity. This is a conservative test of our model because we look at the added benefit of our recommendations when there is already a default nearest counterfactual explanation. After participants read the scenario, we measured their perceptions of fairness and their intention to engage in positive word of mouth, to further understand the downstream effects of fairness perceptions.

Sample and procedure
We recruited 240 working adults living in the UK on Prolific, an online platform designed for participant recruitment for scientific purposes (Palan & Schitter, 2018). We excluded seven participants who failed to correctly answer an attention check (i.e., "What was the name of the company in the text?"). Thus, our final sample consisted of 233 participants. Participants were on average 39.81 (SD = 11.17) years old. In our sample, 63.10% identified as women, 36.10% as men, and 0.90% preferred not to indicate their gender. Most participants in our sample (31.30%) had an annual income between £20,000 and £39,000, 28.30% had an annual income between £40,000 and £59,000, 22.30% had an annual income between £60,000 and £99,999, 12.00% had an annual income of less than £20,000, and 6.00% had an annual income of more than £100,000. We randomly assigned participants to one of four conditions in a 2 (small vs. large nearest counterfactual) × 2 (absence vs. presence of could- and should-counterfactual) between-subjects design.
Participants read that they had applied for a loan at ClearRiver Finance, a loan company that finances individual projects. They learned that they had applied several weeks ago and had just received a response back from ClearRiver Finance. The text read:

Dear applicant,

Thank you for submitting your loan application to ClearRiver Finance. We appreciate the time and effort you put into the process.
After a thorough review of your application, we regret to inform you that we cannot approve your loan request at this time. Our decision was based on a careful consideration of your financial situation and credit history by an expert employee and an AI advisor.
While we understand that this news may be disappointing, we want to take a moment to explain our decision.
The AI system indicated that if your gross annual income were increased by £10,000 [vs. £1,000], it would have classified you as eligible for the loan.
Thank you again for considering ClearRiver Finance for your borrowing needs.

Loan Officer
ClearRiver Finance
Half of these texts also included the counterfactual explanations proposed by Fairness Theory. A feasible and likely counterfactual explanation that can be modelled in our case study is a could-counterfactual that refers to a minimum legal threshold, and a should-counterfactual that informs participants about how the model would perform if they had a different gender, age, or ethnicity. These are two counterfactuals that are relatively easy to incorporate in loan acceptance or rejection letters. In the response letters that included the could- and should-counterfactual, we included the following paragraphs:

Unfortunately, your current income does not meet the minimum legal requirements for this loan program, and therefore we are unable to approve your loan application at this time.
I want to assure you that our decision-making process was thorough and unbiased. We utilized an AI model to assess your application, and we ran multiple robustness checks to ensure that our predictions were accurate and fair. These checks included testing the outcome if you had a different age, gender, or ethnicity. Our results showed that these demographics did not affect the outcome of the AI model.
We understand that circumstances can change, and we encourage you to apply again in the future if your financial situation improves.
After participants read this text, we asked them to give the name of the financial company. This served as an instrumental attention check. Then we measured participants' fairness perceptions and the extent to which they were satisfied as a consumer and would talk positively to others about this organization. We added these questions to further understand the effects of adding counterfactual explanations beyond fairness perceptions. Both measures were answered on a response scale ranging from 1 (strongly disagree) to 5 (strongly agree).

Fairness
We used four items to capture the overall fairness perceptions of the decision-making process. The items were: "the treatment I received from the loan officer was fair," "I could count on the AI advisor to be fair," "the recommendation given by the AI advisor was fair," and "Using the AI in the decision-making process has introduced bias." Cronbach's alpha was .88.

Positive word-of-mouth
We measured participants' willingness to engage in positive word of mouth using the three-item scale from Xie et al. (2019). Items include: "I intend to say positive things about this company to friends, relatives and other people," "I intend to recommend my friends, relatives and other people considering work for this company," and "I intend to speak well of the company to friends, relatives and other people." Cronbach's alpha was .95.
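For completeness, the following is a minimal sketch of how a Cronbach's alpha like those reported above can be computed from raw item responses (libraries such as pingouin also provide this directly). The responses below are random placeholders, not the study data; note that a negatively worded item would typically be reverse-coded before this computation.

```python
# Cronbach's alpha from an (n_respondents, n_items) response matrix.
# The simulated responses are placeholders, not the study's data.
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(3)
latent = rng.normal(size=(233, 1))            # a shared latent attitude
responses = np.clip(np.rint(3 + latent + rng.normal(scale=0.7, size=(233, 4))), 1, 5)
print(f"alpha = {cronbach_alpha(responses):.2f}")
```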

Results
A two-way ANOVA on perceived fairness did not reveal a significant effect of the effect size of the nearest counterfactual, F(1, 229) = 1.69, p = .195, η² = .007, showing that a nearest counterfactual explanation of £10,000 was seen as equally fair as one that only required a change of £1,000. Results did reveal, however, a significant effect of the could- and should-counterfactual explanation, F(1, 229) = 5.57, p = .019, η² = .024. When the could- and should-counterfactuals were included on top of the nearest counterfactual explanation, participants perceived the rejection to be fairer (M = 3.50, SD = 0.93) than when these explanations were not included (M = 3.22, SD = 0.87). There was no significant interaction between the effect size of the nearest counterfactual and the presence (vs. absence) of the could- and should-counterfactual explanation, F(1, 229) = 0.38, p = .539, η² = .002. This shows that the increase in fairness occurs for both versions of the nearest counterfactual.
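For readers who want to see the shape of this analysis, the following is a sketch of the 2 × 2 ANOVA using statsmodels. Because we cannot embed the raw data here, the dataframe is simulated with the reported condition means and an assumed error spread, and the column names (fairness, magnitude, extra_cf) are our own labels for the study's variables.

```python
# A sketch of the reported 2 x 2 between-subjects ANOVA (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(4)
n = 233
df = pd.DataFrame({
    "magnitude": rng.choice(["small", "large"], size=n),   # £1,000 vs £10,000
    "extra_cf": rng.choice(["absent", "present"], size=n), # could/should included?
})
# Means anchored to the reported values (3.22 without, 3.50 with the extra
# counterfactuals); the residual SD of 0.9 is an assumption.
df["fairness"] = 3.22 + 0.28 * (df["extra_cf"] == "present") + rng.normal(0, 0.9, n)

model = ols("fairness ~ C(magnitude) * C(extra_cf)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))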
To further understand the downstream effects of the counterfactuals, we examined whether there was an indirect effect on positive word of mouth via fairness perceptions. To test this, we ran a mediation model using the PROCESS macro developed by Hayes in SPSS. Results showed a significant indirect effect (estimate = 0.14, 95% CI = [0.024, 0.259]). Including counterfactual explanations increased fairness perceptions, which subsequently led to an increase in willingness to talk positively about this company to others.
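The logic behind this PROCESS test can be sketched as a percentile-bootstrap indirect effect: path a (condition → fairness) times path b (fairness → word of mouth, controlling for condition), with a confidence interval from resampling. The data below are simulated stand-ins with roughly the reported effect sizes, not the study data.

```python
# A percentile-bootstrap indirect effect (the idea behind PROCESS Model 4).
# x: condition, m: fairness (mediator), y: positive word of mouth.
import numpy as np

rng = np.random.default_rng(5)
n = 233
x = rng.integers(0, 2, n).astype(float)        # extra counterfactuals absent/present
m = 3.22 + 0.28 * x + rng.normal(0, 0.9, n)    # fairness perceptions
y = 1.0 + 0.5 * m + rng.normal(0, 0.8, n)      # word of mouth

def indirect(x, m, y):
    a = np.polyfit(x, m, 1)[0]                 # path a: condition -> fairness
    design = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design, y, rcond=None)[0][2]  # path b, controlling for x
    return a * b

boot = []
for _ in range(2_000):
    idx = rng.integers(0, n, n)                # resample participants with replacement
    boot.append(indirect(x[idx], m[idx], y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect = {indirect(x, m, y):.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```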

Discussion
The present findings add to the research on counterfactual explanations in AI modelling. We show that, above and beyond the nearest counterfactual, modelling counterfactuals informed by Fairness Theory can increase participants' experienced fairness. By including these counterfactual explanations, decision-makers are seen as fairer. Moreover, this increase in fairness even led to a stronger commitment to talk positively about the organization that implements the AI in the decision-making, showing that increased fairness perceptions can serve as a competitive advantage. This effect remained the same regardless of the size of the nearest counterfactual. Interestingly, in our case study, the effect size of the nearest counterfactual did not have any effect on fairness perceptions. One potential reason could be that a monetary nearest counterfactual in the context of loan rejections implicitly implies a could-counterfactual about a minimum requirement for approving a loan.

Benefits of counterfactual explanations
We argue that AI advice that includes counterfactual scenarios has several advantages. First, counterfactual AI advice increases transparency and helps decision-makers explain their decision-making process to recipients. When recipients encounter negative events, they will try to make sense of these events by engaging in counterfactual thoughts. By putting the perspective of the recipients central (i.e., would-, could-, and should-counterfactuals), the AI advisor gives decision-makers useful information to address recipients' concerns. The would-counterfactual can help decision-makers understand the impact of decision options on a particular recipient. The could-counterfactual helps decision-makers understand which decisions are feasible and which ones are not. Should-counterfactual scenarios help decision-makers decide which decision they should make.
A potential avenue for future research is that AI advisors can actively promote a counterfactual mindset in decision-makers, who can then be motivated to apply that mindset in other situations. Indeed, psychologists have found that a counterfactual mindset can be activated by one context and then transferred and applied in another context (Gollwitzer et al., 1990). When decision-makers read about counterfactual scenarios, the AI advisor triggers cognitions that relate to counterfactual thoughts, and these thoughts become more salient in subsequent tasks. Even simple text scenarios can trigger counterfactual thinking. For example, Galinsky and Moskowitz (2000) found that counterfactual thinking can be activated when people read about an event (e.g., a woman winning a trip to Hawaii) and a counterfactual situation (e.g., if the woman had not changed seats, she would not have won the trip). Interestingly, participants primed with a counterfactual scenario performed better on subsequent tasks. Counterfactual thinking primed by simple scenarios leads to better problem solving (e.g., more creative ideas) and less biased thinking, i.e., less susceptibility to confirmation bias and greater openness to information (Galinsky et al., 2000; Galinsky & Moskowitz, 2000; Markman et al., 2007; Sassenberg et al., 2022). Therefore, receiving counterfactual AI advice is likely to encourage decision-makers to think counterfactually in subsequent situations as well. Thus, counterfactual AI advice may systematically nudge decision-makers to adopt a critical counterfactual mindset in situations where they do not receive AI advice.
In all, it stands to reason that introducing different counterfactual scenarios will not only make decision-makers more effective in taking decisions and explaining those decisions, but as a result also influence the recipients of those decisions to perceive the decision-making itself as more fair and ethical. From this perspective, AI advisory systems that take into account different counterfactual scenarios are a great example of how AI in organizations can augment the abilities of decision-makers and make them more effective in doing their job and in encouraging employees to stay motivated and perform. This is an important consequence, as with the introduction of AI in organizations, the discussion of whether algorithms will replace employees in their job execution (i.e., the replacement perspective) or make employees better at their job (i.e., the augmentation perspective) is high on the agenda when analysing the kind of value AI can provide to organizations and their different stakeholders (De Cremer, 2020; De Cremer & Kasparov, 2021; Duan et al., 2019; Raisch & Krakowski, 2021; Xu et al., 2023).

The limitations of counterfactual AI advice
The complexity of constructing counterfactual AI models varies considerably across the different types of counterfactuals. Some counterfactual models are relatively easy to develop, for example the process-based should-counterfactual, which simply involves changing demographic variables as input features. Other counterfactuals are considerably more complex to model, for example the outcome-based should-counterfactual, which focuses on optimizing collective outcomes based on specific decisions. Developing such a model is more complex and requires more resources because programmers and managers need to quantify collective value.
Developing high-quality AI advice models by integrating business and legal knowledge into large datasets presents numerous practical challenges. Data scientists depend on domain experts to precisely delineate business problems and acquire well-labeled data, while also gaining a clear understanding of the model requirements (Park et al., 2021). Unfortunately, in practice, these collaborations often lack structure and are ad hoc (Park et al., 2021). Therefore, to overcome these difficulties, effective interdisciplinary collaboration, robust communication, and iterative refinement are essential. These factors ultimately contribute to the development of AI models capable of delivering accurate and pertinent advice for relevant business problems.
From a practical perspective, it might be relevant to initially focus on those counterfactuals that are easier to model. We do not contend that advisory systems must consistently model all three counterfactual scenarios. As an illustration, in our loan application case study, we concentrated on two counterfactuals that were comparatively straightforward to incorporate within that particular context. Remarkably, when these counterfactuals were integrated into the explanations, participants perceived the decision to be more fair.
Of course, there might also be some limitations to using advice and recommendations generated by AI systems. AI advice is useful, but it should complement, not replace, human decisions (De Cremer & Kasparov, 2021). The fact that AI advice can be of high quality and can effectively augment decisions does not mean that decision-makers should no longer make any decisions without relying on AI advice. In fact, AI advice, especially advice based on supervised ML algorithms, still relies on historical training data, which might contain hidden biases or might not always be adequate to inform future outcomes. Organizational decision-makers, such as leaders, however, do not only make decisions in reactive ways where they rely on past experiences. In a volatile, ever-changing, and complex environment, organizational decision-makers (e.g., leaders) have to proactively anticipate and be creative in making judgement calls on what needs to be done (Bennett & Lemoine, 2014). AI advice is limited in making these proactive judgement calls (De Cremer, 2022). Organizational decision-making thus requires creative thinking and problem-solving skills, and a critical mindset that is able to distinguish between good and bad advice.
Although the "augmentation" perspective is the most constructive one, as it advocates a collaborative effort between AI and human workers in organizations, we do wish to note that an increased use of AI advisory systems could lead organizational decision-makers to automate their own decision-making process and other jobs in the organization. This might be particularly the case for decision-makers who score higher on moral disengagement, the tendency to deactivate moral self-regulatory processes (Bandura, 1986; Moore, 2008). Individuals develop moral standards throughout their lives, but those who score higher on moral disengagement are more likely to disengage from these internal standards. One way to disengage from internal moral standards is to displace responsibility so that one's own responsibility is minimized (Moore, 2008). Thus, when decision-makers who are more prone to morally disengage receive high-quality AI advice, they might not see themselves as fully responsible for the decisions they make, and might not use the advice in a critical manner.
All of the above thus demonstrates that an effective use of AI advisory systems in organizations does not only require advanced and well-developed AI systems, but also ethical work environments, where decision-makers have developed their abilities and moral awareness to recognize ethical dilemmas and subsequently take moral responsibility for the decisions they make. Organizations should thus make sure that decision-makers stay in control when relying on AI recommendations and preserve their own autonomous moral compass (De Cremer & McGuire, 2022). One way to do so is to promote ethical leadership by actively discussing ethical issues and setting high ethical standards among organizational members. In this way, decision-makers are more likely to be aware of ethical issues and less prone to moral disengagement (Moore et al., 2019; Ogunfowora et al., 2022).