Human/Machine(-Learning) Interactions, Human Agency and the International Humanitarian Law Proportionality Standard

ABSTRACT Developments in machine learning prompt questions about algorithmic decision-support systems (DSS) in warfare. This article explores how the use of these technologies impacts practices of legal reasoning in military targeting. International Humanitarian Law (IHL) requires assessment of the proportionality of attacks, namely whether the expected incidental harm to civilians and civilian objects is excessive compared to the anticipated military advantage. Situating human agency in this practice of legal reasoning, this article considers whether the interaction between commanders (and the teams that support them) and algorithmic DSS for proportionality assessments alters this practice and displaces the exercise of human agency. As DSS that purport to provide recommendations on proportionality generate output in a manner substantively different from the legal reasoning required for proportionality assessments, these systems are not fit for purpose. Moreover, legal reasoning may be shaped by DSS that provide intelligence information, owing to the limited reliability, biases and opacity characteristic of machine learning.


Introduction
The past year has seen generative artificial intelligence (AI), such as OpenAI's ChatGPT, spark headlines and renew public interest in how AI can mediate our lives. In less than a year since the public release of ChatGPT on 30 November 2022, reports indicate the United States Department of Defense (US DoD) is experimenting with large language models from various Big Tech companies with a view to using "AI-enabled data in decision-making, sensors and ultimately firepower" in military operations (Manson 2023). Indeed, the US DoD has already launched "Task Force Lima" to pursue innovations in generative AI for military applications (US DoD 2023). These developments highlight that military AI is no future prophecy, but a reality unfolding in real time.
To date, debates on military AI have tended to revolve around the lawfulness and ethics of the use of "autonomous weapons systems" (AWS), often understood as a weaponised platform capable of selecting and engaging targets without human intervention (US DoD 2012, 21). Nevertheless, there are also other software-based applications of data-driven AI that have broad scope for integration into military decision-making processes. Unlike AWS, these decision-support systems (DSS) would not autonomously engage in the use of force, but could support military personnel in decision making, including for targeting. As these software-based applications are less likely than AWS to be subjected to new international regulation, yet face far fewer obstacles to deployment in the short term, they warrant closer scrutiny.
As the legal framework primarily governing the conduct of parties to an armed conflict, International Humanitarian Law (IHL) applies broadly to the use of military AI in warfare (Lewis, Modirzadeh, and Blum 2016, 64-76). IHL is a vocabulary through which the international community appraises armed conflict, as well as a mechanism to regulate warfare and mitigate harm to victims. In the conduct of hostilities, IHL primarily regulates targeting through the interconnected norms on distinction, precautions and proportionality. First, the principle of distinction requires that civilians and civilian objects must always be distinguished from military objectives and must never be the direct object of attacks (First Additional Protocol to the Geneva Conventions (API) arts 48, 51(2)), also prohibiting indiscriminate attacks (API art 51(4), (5)). Next, the precautionary principle requires that "constant care shall be taken to spare the civilian population, civilians and civilian objects", with all feasible precautions to do so being taken in attacks (API art 57). Finally, proportionality requires the balancing of military necessity and humanitarian considerations, stipulating that attacks are prohibited if the expected incidental harm to civilians and damage to civilian objects is excessive with respect to the anticipated military advantage (API art 51(5)(b)).
Rather than looking at whether the use of military AI conforms with these IHL obligations as such, this article instead reframes the issue by looking at the use of algorithmic DSS to support commanders in making legal determinations under IHL. This raises the question of whether the interaction between military practitioners and machine learning models, hereinafter referred to as "human/machine(-learning) interaction", places commanders in a position to conduct legal assessments. This article analyses this issue by considering whether DSS leveraging data-driven AI are fit for purpose, and by exploring potential risks for the exercise of human agency in the interpretation and application of legal norms as a result of human/machine(-learning) interactions. More specifically, it considers the implications of the use of data-driven AI to support assessments of proportionality under IHL in targeting. This article begins by considering the emergence of machine learning in the military domain (section 2). Taking a practice-oriented approach, it then outlines a human-centred perspective on military AI that situates the exercise of human agency in practices of legal reasoning in which IHL is implemented in the conduct of hostilities (section 3). Next, it considers the treatment of proportionality in debates on military AI and conceptualises proportionality as a standard that requires a certain quality of legal decision-making by military commanders and the teams that support them, including military legal advisors (section 4). The final section then highlights how algorithmic DSS could be used to support proportionality assessments by providing either recommendations or actionable intelligence information, exploring the implications these sorts of DSS have for the exercise of human agency inherent in proportionality assessments (section 5). It argues that Recommendation DSS may not be fit for purpose in light of the contextual, qualitative and value-laden legal judgment required by the proportionality standard under IHL (section 5.1). Finally, it highlights the risk that Information DSS may shape proportionality decisions in a manner inconsistent with IHL due to the limited reliability, biases and opacity characteristic of machine learning (section 5.2).

Algorithmic decision-support in the military domain
Contemporary developments in data-driven methods of AI, such as machine learning, create broad scope for the integration of algorithmic DSS into military operations. Data-driven AI has recently made great leaps due to improvements in computing power and data availability (Marcus 2018, 2). These algorithmic models have been lauded as achieving "super-human" prediction and pattern recognition capabilities (Campolo and Crawford 2020). At a general level, through an iterative process of statistical optimisation, data-driven algorithms generate an evolving model derived from groupings of correlations induced from distributions in training data (see e.g. Burrell 2016, 5; McQuillan 2018, 1-2; Pasquinelli and Joler 2021, 1271). Put simply, machine learning techniques process large amounts of data to gradually "learn rules" about how to solve particular problems based on patterns and relationships in that data, which can be generalised to solve the same problem when new situations arise.1 The predictive power of data-driven AI is already being harnessed across many areas of life, including to support decision making. In the military domain, it has the potential to be leveraged to support decision making across many tasks, including targeting, intelligence-gathering, detention, humanitarian operations, and logistics (Lewis, Modirzadeh, and Blum 2016, 2). For some militaries, this is already reality. Big Tech company Palantir, which won major military contracts with the US and UK in 2022 (Dastin 2022; Palantir 2022), developed its "Gotham" system as an "AI-ready operating system that improves and accelerates decisions for operators across roles and all domains" (Palantir 2023). Gotham can "store, query, and visualize extremely large data sets, allowing analysts to discover patterns and relationships" (Munn 2017), demonstrating the power of AI-enabled data mining for gleaning information from vast data sets to support military decision-making. Likewise, the US "AN/GSQ-272 SENTINEL"
system aims to produce and disseminate Intelligence, Surveillance and Reconnaissance information to support a number of military functions, including "intelligence preparation of the battlespace, predictive battlespace awareness, indications and warning, analysis of enemy courses of action, targeting and weapon selection, mission planning, and execution of air combat missions" (US Air Force 2014). Other militaries are also pursuing the integration of AI-enabled DSS into military arsenals, including China's StarSee Real-Time Combat Intelligence Guidance System (Fedasiuk, Melot, and Murphy 2021, 25) and the Russian RB-109A Bylina EW system (Konaev 2021, 67), as well as recent reports of the use of AI for targeting operations by the Israel Defense Forces in March 2023 (Mimran and Weinstein 2023). Moreover, as mentioned, the US has launched Task Force Lima to "assess, synchronize, and employ generative AI capabilities across the DoD" (US DoD 2023).

1 Various different underlying models and training approaches (e.g. supervised learning, unsupervised learning and reinforcement learning) fall under the broad heading of "machine learning", with each having different ways of generating output to solve problems and different requirements in terms of data for training or, in the case of reinforcement learning, a training environment. This paper considers machine learning at a general level, with the key commonality between techniques being that models are induced from patterns/correlations in training data (or a training environment), rather than being explicitly programmed.
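The distinction drawn above between rules that are explicitly programmed and rules that are "learned" from patterns in data can be illustrated with a deliberately minimal sketch. Everything here is invented for illustration (toy one-dimensional data and a crude threshold "model"); real military DSS involve vastly more complex models, but the underlying contrast is the same: the classification rule below is induced from the labelled examples rather than written by a programmer.

```python
# Illustrative sketch only: a toy "learned rule" induced from labelled
# examples, in contrast to an explicitly programmed rule. All data and
# names here are invented for illustration.

def train_threshold_classifier(examples):
    """Learn a numeric cutoff separating two classes from labelled data.

    examples: list of (feature_value, label) pairs, label in {0, 1}.
    Returns the midpoint between the highest 0-labelled and lowest
    1-labelled feature values: a "rule" derived from the data itself,
    never stated by the programmer.
    """
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    return (max(zeros) + min(ones)) / 2

def predict(threshold, feature_value):
    """Apply the learned rule to a new, unseen case."""
    return 1 if feature_value >= threshold else 0

# Training data: the pattern (larger value -> class 1) is present only
# in the data; the program contains no explicit statement of that rule.
training_data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1)]
threshold = train_threshold_classifier(training_data)

print(threshold)                 # 5.0, the midpoint between 3.0 and 7.0
print(predict(threshold, 4.2))   # 0
print(predict(threshold, 6.5))   # 1
```

The same generalisation step that makes this useful, i.e. applying data-derived correlations to new cases, is what later sections identify as problematic when the "new case" demands contextual legal judgement.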
All of these systems point to the potential of data-driven AI to be used to provide actionable intelligence information to support military decision-making, as well as in some cases providing concrete recommendations or proposals on different decisions and courses of action. As such, this article will use these functions as illustrations of how algorithmic DSS can be used to support military decision-making by either (1) providing specific recommendations on decisions ("Recommendation DSS"); or (2) providing relevant information for decisions ("Information DSS"). These two categories of algorithmic DSS will be used to explore how human/machine(-learning) interactions impact military legal decision-making on proportionality assessments.

Practices of legal reasoning in IHL and human agency
Viewing human agency in light of the practices of international law offers a new perspective for exploring the impact of human/machine(-learning) interaction when algorithmic DSS are used in military legal decision-making. Calls for a "human-centred" approach have been prevalent in debates on military applications of AI (see e.g. ICRC 2020), often being framed in terms of the need to maintain (meaningful) human control, involvement or judgment (see e.g. UN-CCW 2021). Despite being frequently relied upon, there remains a lack of clarity on whether and how these notions relate to the existing framework of IHL and how to operationalise them (Boutin and Woodcock 2023), reflective of a broader impasse amongst States in the Group of Governmental Experts (GGE) debates on Lethal AWS about whether the existing legal framework is sufficient or new law is needed (Lewis 2022, 492).
Building on the turn to practice, international law can be conceptualised as sets of social practices, rather than a mere body of rules (see e.g. Aalberts and Venzke 2017; Brunnée and Toope 2018; Lamp 2018; Stappert 2020). Defined as "socially meaningful patterns of action" (Adler and Pouliot 2011, 4), practices are a lens for viewing all human activity and meaning making. As practices assume the existence of interacting human agents who participate in and enact them, they create scope to consider human agency.
Human agency is an "essentially contested concept" with no stable meaning across the various fields and disciplines in which it is used (Hildebrandt 2011, 5-6). Here, human agency is understood to reflect human involvement in sets of practices. The nature of human involvement or participation in practices, i.e. how human agency is exercised, may vary as a matter of degree according to the practice at hand and subject to the attendant possibilities for and limitations on action. This conception departs from rationalist perspectives on agency that seek to explain what causes human behaviour, drawing on attributes such as intention, reason, motivation, or free will (see e.g. Anscombe 1957; Schlosser 2019). This represents a shift from understanding why humans act to tracing how they act and enact practices.
Understanding international (humanitarian) law as being enacted in practice thus opens space to understand how human agency is exercised within certain (military) contexts and provides a new entry point for examining the impact of emergent AI-enabled technologies. As with all law, a key dimension of the practice of international law is actors interpreting and applying norms to factual scenarios to make legal claims, in other words, legal reasoning.2 It is in this sense that international law plays out through rhetorical processes where legal claims about the application of rules to specific situations are accepted or rejected based on standards for "what counts as a valid legal argument" produced within this practice (Aalberts and Venzke 2017, 307). This legal reasoning not only occurs through the adjudicatory function of law, but also as part of its function of guiding conduct in advance. As such, there are various actors beyond professional advocates and judges, such as military commanders and legal advisors, that are involved in practices of legal reasoning to assess the lawfulness of past and future conduct. How international law is enacted in practices of legal reasoning, and thus how human agency is exercised in this context, will depend upon the legal framework and the specific norms being implemented.
Situating human agency in the practice of international law from the perspective of the legal decision maker demonstrates that, though we may consider the individuals who apply the law, this is not an exclusively individual phenomenon. Rather, it is also a social one, as it takes place in the context of sets of broader social practices. The exercise of human agency can therefore be viewed as having an "intersubjective" dimension, a notion often used in constructivism to capture how understanding and knowledge are shared through social interactions and practices (Adler 2005, 3, 19-20, 94ff). Discussions around human judgment or control relating to military AI tend to take an individual-centric perspective, yet this need not be the case to maintain a human-centred approach. As has been aptly pointed out in debates on AWS, military targeting practices and the implementation of IHL are distributed and diffused throughout military command and control structures (Cummings 2019, 24; Ekelhof 2019b, 347). It therefore seems appropriate to consider human agency from an intersubjective perspective that better captures how the implementation of IHL is part of interactional social practices, rather than considering the involvement, judgment or control of individuals in isolation.

The IHL proportionality standard and military AI
Proportionality is a central IHL norm in debates on military AI. It has various manifestations under IHL that serve to ensure militaries duly account for any incidental harm to civilians and civilian objects during attacks. Proportionality constitutes both a broad principle that reflects the need to balance the inherent tension between military necessity and humanitarian considerations and, in a narrower sense, what is often referred to as a "rule" that limits the harm to civilians and damage to civilian objects that armed forces can cause during otherwise lawful attacks (Sloane 2015, 308; Van den Boogaard 2019, 63). This limitation on incidental harm to civilians is regarded as customary international law applicable in international armed conflict (between States) and non-international armed conflict (between non-State armed groups (NSAGs) or States and NSAGs) (Henckaerts and Doswald-Beck 2005, Customary IHL Rule 14). Proportionality is also enshrined in treaty law applicable to international armed conflict. API prohibits "indiscriminate attacks" where the expected incidental harm is excessive compared to the anticipated military advantage (Art 51(5)(b)), as well as requiring that commanders take feasible precautions to refrain from and halt these kinds of disproportionate attacks (Arts 57(2)(a)(iii) and 57(2)(b)). These provisions demonstrate the interplay between proportionality and the IHL principles of distinction and precautions respectively, as a manifestation of the prohibition on targeting civilians and the duty of commanders to take feasible precautionary measures in order to avoid, or minimise, harm to civilians.
Rather than a rule, the test of proportionality in the narrow sense arguably exhibits a "standard-like" form (Cohen 2008; Lieblich and Benvenisti 2016, 252-253). Legal "rules" can be distinguished from "standards" as they tend to be a more specific form of legal regulation that can be objectively applied (Kennedy 1976, 1687-1688; Pound 1933, 482-486; Sullivan 1992, 57), for instance a speed limit for vehicles. In contrast, standards exhibit a greater level of indeterminacy and tend to require discretionary judgments depending on the attendant circumstances. This captures the nature of proportionality under IHL, which requires an assessment of the potential excessiveness of expected incidental harm compared to the anticipated military advantage. Compliance with proportionality must be determined on a prospective ex ante basis, considering the information reasonably available to the commander leading up to the attack, rather than considering the results of an attack with the benefit of hindsight (Cohen and Zlotogorski 2021, 88-89; Dinstein 2013, 76). This test is considered indeterminate and open-textured as it requires the balancing of competing and "incommensurate values and interests" (Cohen and Zlotogorski 2021, 59; Henderson and Reece 2018, 837), evidencing its "standard-like" character. It is the quality of discretionary judgment required by the proportionality standard, further elaborated in section 5.1, that demonstrates how human agency is situated in this particular legal reasoning process.
Whilst there is no agreement on what exactly constitutes "military advantage" in proportionality assessments, it is considered to entail "all sorts of tactical gains and military considerations" (Dinstein 2013, 75) that are "concrete and direct", in the sense of being "identifiable, clearly discernible and 'relatively closel[y]' related to the attack" (Schmitt 2020, 155). Incidental harm encompasses death and severe physical and mental harm to civilians (Geiß 2012, 77-79), damage to civilian objects, including the civilian dimension of dual-use objects, as well as foreseeable reverberating effects (Henderson and Reece 2018, 839). Though excessiveness is not defined, it has been interpreted as reflecting a "significant imbalance" between incidental harm and military advantage (Humanitarian Policy and Conflict Research Manual Commentary 2013, 98). Given the complexity of balancing these different elements, the determination of proportionality under IHL is not amenable to quantification or a strict mathematical formula (Cohen and Zlotogorski 2021, 59; Dinstein 2004, 122; Sloane 2015, 322-323).
As with other IHL conduct of hostilities norms, proportionality is a contentious issue in debates on military AI. It is contested whether "war algorithms will be capable of formulating and implementing certain IHL … evaluative decisions and value judgments, such as … [t]he assessment of 'excessiveness' of expected incidental harm in relation to anticipated military advantage" (Lewis, Modirzadeh, and Blum 2017; Van den Boogaard 2015, 267). On one hand, some commentators argue that it is unlikely AWS can conduct proportionality analyses, as elements of this assessment, such as military advantage and excessiveness, are not clearly defined but depend upon context and the subjective assessment of the commander (Asaro 2012, 701-702; Van den Boogaard 2015, 261, 267). On the other hand, some suggest that with technological advances it may be possible to develop criteria for proportionality that could be preprogrammed into AWS (Sassòli 2014, 331), such as conservative or adjustable levels of collateral damage (Schmitt and Thurnher 2013, 255-256; Zając 2023).
It is crucial to note that the obligation to implement IHL and conduct proportionality assessments remains with the commander. This underpins arguments that the use of AWS will be permissible so long as commanders are sufficiently aware of how these systems operate in situations where civilians are at risk (McFarland 2020, 123-124) and that commanders may conduct proportionality assessments in advance of deploying AWS within limited parameters (though the requisite temporal connection to the attack is unsettled) (Bruun, Bo, and Goussac 2023, 6-7). This view is reflected in one of the few GGE AWS working papers that deals with proportionality in any depth, the "Draft Articles on Autonomous Weapons Systems" submitted jointly by Australia, Canada, Japan, Poland, the Republic of Korea, the United Kingdom, and the United States in 2023. These draft articles state that the duty to "ensure effective implementation of the principle of proportionality in attacks involving the use of autonomous weapon systems" requires consideration of, inter alia, the presence of civilians in the area and during the period AWS are intended to operate, and the performance of AWS in targeting (2023, art 4). Despite this example, there appears to be general uncertainty on precisely how the IHL proportionality standard applies to military AI, as evidenced by the 2022 UK Statement indicating that an area for further clarification includes how "varying levels of autonomy affect … the assessment of proportionality" (United Kingdom 2023, 7). Moreover, as is generally the case with the GGE AWS debates, this issue has also not been taken up in the context of algorithmic DSS.
Notwithstanding this lack of clarity on military AI and proportionality, a central question underpinning debates on military AI is whether, when and how human judgment or control must be exercised to properly implement IHL. Some commentators suggest that IHL precludes the delegation of targeting to autonomous systems as it is directed towards human addressees and entails qualitative judgments (Asaro 2012; Roff and Moyes 2016). In contrast, others suggest that these technologies are a means to exercise judgment and control and can thus facilitate compliance with IHL (McFarland and Galliott 2021, 51). Jensen further highlights a lack of consensus amongst States on whether there is a "legal requirement for human decision making in selecting and engaging targets" (2020, 53). As such, he argues that, in light of the obligation under Article 36 API to review the legality of weapons, which can address issues of accountability and predictability, there is no legal basis for prohibiting or limiting the development or use of weapons employing machine learning. Notwithstanding the crucial importance of Article 36 review procedures for military applications of AI (Klonowska 2022), rather than considering an abstract and unsettled requirement of human judgment in the use of force under IHL, analysis of practices of legal reasoning with IHL can open new avenues for assessing the impact of military AI on the exercise of human agency. Indeed, references by States at the GGE to the qualitative, contextual decisions required by IHL (e.g. Argentina et al. 2022, para 10; United Kingdom 2023, Annex A) prompt closer analysis of whether discretionary judgment is a necessary dimension of how specific IHL norms are reasoned with and implemented.
There is a tendency in current debates to frame questions around military applications of AI and IHL in technological, rather than legal, terms (Bruun, Bo, and Goussac 2023; Kurosaki 2020, 415). This is evident in calls for more systematic research comparing the ability of human combatants to comply with IHL norms with that of AWS (Trabucco and Heller 2022; see also Kurosaki 2020). However, this framing of whether military AI can "comply better" with IHL than humans erroneously equates the output of algorithms with human judgment, obscures that commanders always remain responsible for the implementation of IHL and treats IHL as an exclusively effects-based regime. Though IHL is undoubtedly concerned with limiting the effects of hostilities, including incidental effects on civilians relevant for the proportionality assessment, one means of achieving this is through the use of standard-like norms that require a certain quality of decision making. As put by Kennedy, "[t]he wise use of broad standards, rather than clear rules, encourages, but also requires, a different kind of professional judgment by those on and off the battlefield evaluating the use of force" (2006, 116). It is therefore necessary to closely consider how specific human/machine(-learning) interactions impact the exercise of human agency manifest in the IHL proportionality standard. This is not to deny the need to engage in rigorous study and testing when it comes to military AI, or to disregard the failings of IHL to prevent significant harm to civilians in contemporary armed conflict, but rather to question the assumption that data-driven AI really is the best solution to all problems, including for the implementation of IHL.

Algorithmic DSS for proportionality assessments
The broad scope of potential applications for data-driven AI suggests that algorithmic DSS could be integrated into various parts of the targeting process, including to support proportionality assessments. Boothby describes targeting as a process encompassing the planning and execution of attacks involving, inter alia, consideration of potential targets, accumulating relevant information to make military and legal determinations, selecting means and methods, and carrying out attacks (2012, 4). Under IHL, proportionality assessments must be made throughout and as a final step in the planning phase, after military objectives have been distinguished from civilians and civilian objects and all feasible precautions have been taken to avoid or minimise collateral damage (Gisel 2016, 65), as well as in the execution phase, where attacks must be cancelled or suspended if it becomes apparent that the attack will be disproportionate (Henderson 2009, 235). Whilst military commanders are ordinarily responsible for conducting proportionality assessments (Van den Boogaard 2019, 208), Henderson suggests that in the execution phase this obligation is owed by anyone in "effective control over the attack" (2009, 235).
Hypothetically, algorithmic DSS could be integrated throughout the targeting process to assist commanders in conducting proportionality assessments. Indeed, computer programmes applying "Collateral Damage Estimation Methodologies" are already used to provide predictions about the levels of collateral damage expected from a potential attack (Gisel 2016, 65; Van den Boogaard 2019, 217). Though collateral damage estimation is not a legal requirement in itself, it nonetheless contributes to proportionality assessments (Ekelhof 2019a, 205). It can be envisaged that collateral damage estimation, as well as the other proportionality elements of military advantage and relative excessiveness, could be further augmented through the integration of machine learning. Van den Boogaard suggests the output of collateral damage estimation software must be interpreted through "sound human judgment" to mitigate the limited availability of actionable intelligence information and the generalised nature of predictions that do not account for the prevailing context (2019, 217). The integration of algorithmic DSS into targeting processes to support proportionality assessments equally raises the question of how the sound human judgment necessary for proportionality assessments more broadly will be impacted.
As mentioned, algorithmic DSS could be envisaged to produce probabilistic output that either provides concrete recommendations on courses of action or decisions, or actionable intelligence information, to support decision making. In the context of proportionality assessments, Recommendation DSS could hypothetically provide predictions of whether a potential attack is proportionate, i.e. whether or not the expected incidental harm to civilians is excessive compared to the anticipated military advantage.3 Moreover, Information DSS could provide information to enhance the situational awareness necessary to conduct proportionality assessments. Here, machine learning models could be used to synthesise, evaluate and present relevant and actionable intelligence information. Crucially, both Recommendation DSS and Information DSS could be used to support decision making by humans and would not necessarily be linked to a physical platform like an AWS.4

Fitness for purpose of algorithmic recommendations on proportionality
The algorithmic generation of recommendations on the proportionality of a proposed attack risks compromising human agency in the implementation of IHL by displacing discretionary legal reasoning. Examining how the proportionality standard is interpreted and applied in practice, and thus how human agency is exercised in the implementation of this IHL standard, reveals a certain quality of legal decision-making that is contextual, qualitative, and value-laden. Data-driven DSS would not be fit for the purpose of providing recommendations on proportionality, as these systems produce output in a manner that substantively diverges from the nature of the legal reasoning necessary for proportionality assessments. As such, any attempt by commanders to rely on such a Recommendation DSS would jeopardise the exercise of human agency by altering the nature of proportionality assessments.

Contextuality and case-by-case decision-making
Recommendation DSS that predict the likelihood of an attack being proportionate cannot generate output that can be relied upon to support the contextual, case-by-case legal reasoning required for proportionality assessments. Determination of whether expected incidental harm to civilians is excessive compared to the military advantage anticipated is highly contextual and cannot be made in the abstract (Corn 2018, 771-772; Dinstein 2013, 76). In contrast, the output of machine learning algorithms is decontextualised and reflects a "technological rendering of the world as a statistical data relationship" (Schwarz 2021, 60). Machine learning ultimately attempts to glean insights on individuals not based on their individual case, but from trends inferred from the group reflected in the algorithm's training data (Malik 2020, 17). Crucially, probabilistic recommendations on the excessiveness element of proportionality assessments produced by algorithmic DSS would not consider instances of proportionality on a truly case-by-case basis, instead generating predictions based on how well input data fits the correlations recognised from the model's training data. The attendant circumstances are not judged on their own terms, such as whether civilians are directly participating in hostilities and are therefore not counted within the incidental harm element of proportionality assessments, or whether a strike on a military base in proximity to vital infrastructure is advantageous enough relative to the expected incidental harm to civilians, but always with reference to the training data.
Whilst data-driven predictions are based on correlation and not causation (Marcus 2018, 12-13; Tsamados et al. 2022, 217), contextual grounding allows humans to determine what patterns and factors are causally relevant. This is key for legal reasoning about proportionality, where commanders are routinely required to make determinations about the relevance of certain contextual considerations. A further risk with decontextualised algorithmic predictions is that irrelevant considerations may ultimately form the basis of DSS recommendations. It is not simply a matter of adding more data, as determining proportionality requires a contextual understanding of the armed conflict and the military institution, with all of the attendant objectives, processes, strategies and limitations that go along with this. The inability of algorithmic DSS to produce contextualised predictions thus suggests these systems are not fit for purpose when it comes to supporting commanders by providing recommendations on proportionality.

The socially-situated, reasonable military commander
Algorithmic DSS are not fit to assist commanders by providing recommendations on proportionality, as data-driven output does not adequately account for the qualitative nature of proportionality assessments. The determination of excessiveness in proportionality assessments requires the comparison of incommensurate variables (incidental harm to civilians and military advantage) for which there is no quantifiable yardstick (Cohen and Zlotogorski 2021, 59). In comparison, machine learning models process data through inherently quantitative methods of statistical analysis. As such, the "multifaceted, variable, and context-dependent" nature of social considerations analysed through qualitative approaches is not measurable in terms of the probabilistic analysis undertaken by machine learning (Malik 2020, 8). Proportionality cannot be determined by "crunching numbers" to compare the number of casualties and degree of harm on either side (Dinstein 2013, 75). Excessiveness is therefore not observable based on correlations matched with training data.
Proportionality assessments therefore require qualitative discretionary judgments. Reference is frequently made to the subjective dimensions of the proportionality assessment due to its complexity and the need to use common sense and good faith, requiring an assessment made in "a diligent and honest manner by a competent commander at the appropriate level of command" (Gardam 2004, 105-106; see also Wright 2012, 839). Henderson interprets these claims of subjectivity as referring to the qualitative character of the proportionality assessment, rather than suggesting that the lawful determination will be fully left to the commander's subjective discretion (2009, 222). Indeed, as put by Kalshoven and Zegveld, "the decision is not entirely left to the subjective judgment of the attacker: decisive is whether a normally alert attacker who is reasonably well informed and who, moreover, makes reasonable use of the available information could have expected the excessive damage among the civilian population" (2011, 115). This evinces a qualified objective standard, according to which the proportionality assessment must be made and judged from the perspective of the "reasonable military commander" (Henderson and Reece 2018, 840; ICTY Report 2000, para 50). The reasonable military commander standard is understood to reflect the "experience, training and understanding of military operations" ordinarily possessed by a military commander (Henderson and Reece 2018, 845). Nonetheless, there may ultimately still be difficulties in achieving any real sense of objectivity, as decision-makers may hold different views on the value of human life, military interests and humanitarian considerations (Schmitt 1999, 151; Wright 2012, 840). What is "reasonable" for a commander is not objective as such, with proportionality delimiting a spectrum of reasonable conduct upon which reasonable (and reasonably well informed) minds might disagree. Whilst "[s]ubjective judgment is inevitable … [it must] be influenced
by a reasonable and responsible understanding of what will be generally acceptable and understood in humanitarian terms" (Haines 2014, 286). This underscores why proportionality is best thought of as an intersubjective standard, as the military context, experience, training and culture in which military commanders using IHL are socialised will go a long way to determining what is considered reasonable in certain circumstances. It is through the ongoing practice of legal decision-making in targeting, encompassing proportionality assessments, that the bounds of reasonableness gain meaning and can provide a useful standard to guide conduct when commanders seek to implement IHL. Reasonableness is therefore not a matter of computation or of creating artificial boundaries of reasonable conduct, but of being situated in the context of armed conflict, comprehending its social impacts and consequences, and making decisions in light of the intersubjectively shared experience and knowledge within military institutions.

Value-laden determinations on human life
The algorithmic reproduction of proportionality assessments by Recommendation DSS has limited utility, as it fails to account for the moral dimension of this IHL standard. It is often stated that proportionality necessarily involves value-laden judgements by military commanders (Gisel 2016, 52; Mégret 2016, 772). This does not mean that proportionality assessments are to be determined on the basis of morality, but rather that the legal elements of proportionality assessments are inextricably tied up with the moral consideration of valuing human life. Whilst proportionality assessments may not entail moral decision-making alone or require that a commander subscribe to any particular ethical framework, "the 'moral' roots of the proportionality principle provide valuable insight into the scope and limits of the legal test" (Watkin 2005, 26).
Interpreting the proportionality standard in light of the Martens Clause further reflects its value-laden quality. First codified in the Hague Convention of 1899 and reflected in a number of modern IHL instruments, the Martens Clause stipulates that in situations not covered by existing treaty law, civilians and combatants are subject to the protection of "the principles of international law derived from established custom, from the principles of humanity and from the dictates of public conscience" (CCW, Preamble; see also API Art 1(2)). The scope of this Clause was disputed in earlier debates on nuclear weapons. Whilst the International Court of Justice stated that the Clause "proved to be an effective means of addressing the rapid evolution of military technology" (Nuclear Weapons Advisory Opinion 1996, para 78), it was nonetheless unwilling to accept the illegality of nuclear weapons based on the Martens Clause alone (Boothby 2012, 351; Meron 2006, 27-28). However, the Martens Clause can serve as an aid to interpreting other IHL norms (Cassese 2000, 208). The Clause highlights "the willingness of the international community to introduce considerations of morality into international law" (Amoroso 2020, 163; Weatherall 2015, 80). It is largely on this basis that the desire to preserve human agency in the lethal use of force has been raised in debates on AWS (Human Rights Watch 2012, 35-33; ICRC 2020, 474). When interpreting the proportionality standard, the principles of humanity that underpin IHL and derive from the Martens Clause place clear emphasis on the moral concern of valuing human life.
The algorithmic replication of proportionality assessments can only ever represent an abstraction of the value-laden legal decision-making required by this IHL standard. Unlike computational artefacts, humans have the capacity to value human life on the basis of their own sentience and mortality (see e.g. Véliz 2021). The moral judgments required by IHL cannot be determined based on theoretical reasoning independent from human experience (Kalpouzos 2020, 310). As put by Heyns, technical artefacts "lack morality and mortality" and thus cannot grasp "the importance of life and the implications of taking it" (Heyns 2013, para 94). The US National Security Commission on Artificial Intelligence also stated that "[t]he moral reasoning involved in this calculus … remains the responsibility of a human commander" (2021, 92). As such, algorithmic DSS would not be fit for purpose for making recommendations on proportionality, as these systems have no way of duly accounting for the value of human life necessary to determine the potential excessiveness of incidental harm to civilians compared to military advantage.

Information DSS and the algorithmic shaping of legal reasoning
Moving now from systems that provide recommendations to Information DSS that provide actionable intelligence information for proportionality assessments, questions arise as to whether the resulting human/machine(-learning) interactions frame legal reasoning. This section explores how the exercise of agency by legal decision-makers is subject to the inherent limitations of data-driven algorithms in terms of lack of reliability, bias, and explainability, in ways that may be inconsistent with the proper implementation of IHL. The desire to mitigate these characteristics, amongst others, is apparent in various AI ethics principles (see e.g. Jobin, Ienca, and Vayena 2019), as well as in their treatment as relevant technical issues for military applications of AI (ICRC 2020, 473-475). Some in the GGE AWS debates consider this mitigation necessary for meaningful human control and/or IHL compliance (see e.g. Chile and Mexico 2022, 3, 5-6; UN-CCW 2023, paras 21, 26-27; Palestine 2023, para 5). However, as highlighted by Brunn, Bo, and Goussac, "IHL does not expressly lay down the required levels of reliability or predictability of weapon systems (e.g. a maximum 'fail rate')", with clarification needed on the requirements of knowledge and foreseeability in debates on military AI (2023, 17).
This section uses the concepts of reliability, bias and explainability to explore how the selection of information by algorithmic DSS, and the resulting human/machine(-learning) interactions, are shaped by the logic of the system, considering in particular the implications for proportionality assessments. As put by Schwarz, "[t]he more complex and speedy the digital back-end, the more limited are the human capacities, the greater the scope for mediation and the greater the capacity for the technology to direct the human's practices and focus", prioritising "a techno-logic geared toward speed, optimization, and efficient decision-making" (2021, 56). The use of algorithmic DSS within complex environments can therefore hamper users' autonomy by shaping their choices (Tsamados et al. 2022, 222-223). It is argued that risks surrounding reliability, bias and explainability in data-driven AI require attention and mediation to avoid the displacement of human agency. Caution should be exercised to avoid the proliferation of a decontextualised, algorithmic worldview that diminishes space for the quality of reasoning required by IHL standards like proportionality. Notwithstanding the conclusions reached in the previous section, these risks may also be present for all kinds of algorithmic DSS that leverage machine learning models, whether to provide recommendations, information, or otherwise.

Algorithmic reliability in the fog of war
Whilst data-driven AI is often lauded as holding the potential to enhance situational awareness, these systems are subject to the limitations of machine learning algorithms in terms of consistent, reliable functioning. Machine learning may perform well in training when test data is drawn from the same overall set as training data, but may have limited accuracy where there is little data available or where test data differs from training data in significant ways (Marcus 2018, 15). Highly accurate predictions may only materialise under conditions identical to the data the algorithm was trained on, an issue referred to in data science as "overfitting" (Dietterich 1995). Moreover, the accurate performance of machine learning is limited when these systems must generalise beyond the variables reflected in the data they are trained on, as these models do not extrapolate outside of the correlations identified in training data (Marcus 2018, 5-6, 16). Insights from algorithms can also be incomplete, relying on assumptions that guide the collection of training data and on factors that are quantifiable (Tsamados et al. 2022, 218).
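The generalisation problem described above can be illustrated with a deliberately minimal sketch: a nearest-neighbour "model" that memorises hypothetical training pairs and classifies any input purely by similarity to them. All data, labels and values here are invented for illustration; real systems are vastly more complex, but the structural point, that output is always an extrapolation from training correlations, holds.

```python
# Illustrative sketch (entirely hypothetical data): a 1-nearest-neighbour
# "model" that memorises training examples and classifies new inputs by
# similarity alone, with no notion of context or novelty.
def train(examples):
    return list(examples)  # "learning" as memorised (input, label) pairs

def predict(model, x):
    # return the label of the closest training example
    return min(model, key=lambda ex: abs(ex[0] - x))[1]

# training data drawn from a narrow set of observed conditions
model = train([(1.0, "civilian_vehicle"), (1.2, "civilian_vehicle"),
               (8.0, "military_vehicle"), (8.3, "military_vehicle")])

print(predict(model, 1.1))    # in-distribution input: plausible output
print(predict(model, 100.0))  # novel, out-of-distribution input: the model
                              # still answers, extrapolating from training
                              # correlations rather than flagging novelty
```

The sketch shows why a "largely stable world" is presumed: the model has no mechanism for recognising that an input falls far outside anything it was trained on.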
These characteristics make the reliability of algorithmic DSS dubious in the fast-paced, dynamic, complex and unpredictable situations arising in military operations. Limits on the availability of data representing the diversity of battlefield contexts, the inability of machine learning to account for novel circumstances beyond correlations drawn from training data, and the exclusion of qualitative considerations (as discussed above) all constrain the reliability and accuracy of algorithmic output in this context. Machine learning "presumes a largely stable world" (Marcus 2018, 13; Sculley et al. 2014, 5), which is out of step with the reality of armed conflict and risks diffusing a limited or inaccurate vision of reality to those using Information DSS to support legal decision-making. Armed conflict is inherently changeable and gives rise to extraordinary situations, which may not be captured in the correlations that form the basis of algorithmic predictions. Accurate proportionality assessments require that these nuances form part of commanders' contextual legal reasoning. Moreover, as these algorithms are extremely "data hungry" (Marcus 2018, 6), it may not be feasible for militaries to commit the huge amounts of time and the human and financial resources needed to collect and label training data sets reflecting all possible scenarios in armed conflict.
The limitations of machine learning also make these systems vulnerable to adversarial attacks and spoofing. Targeted attacks can leverage imperceptible, deliberate perturbations to inputs to bring about misclassifications (Szegedy et al. 2013). This can cause striking errors, such as the classification of a 3D-printed turtle as a rifle (Athalye et al. 2017), with potentially serious consequences for the identification of civilians necessary to conduct proportionality assessments.
Finally, if inaccurate algorithmic output resulting from these limitations is relied upon by human decision-makers, leading to unforeseen harm to civilians, this could result in a systematisation of errors, which may not necessarily be treated as violations of IHL (Brunn, Bo, and Goussac 2023, 19-20; Pacholska 2023). If incidental consequences that arise from the known limitations of machine learning technologies automatically fall outside the protection of IHL because they are considered system errors or mistakes by human operators relying on these systems, this poses serious questions about the ability of commanders to effectively implement IHL when relying upon algorithmic DSS.

Bias in human/machine(-learning) interactions
The biases inherent in human/machine(-learning) interactions create a distorted vision of reality in terms of both the algorithmic output of an Information DSS and the user's response to this output. The limitations of data-driven algorithms are not only a matter of the technical features of the system, but are also interlaced with human bias (Pasquinelli and Joler 2021, 1265). Biases in machine learning generate "outputs representing a distorted, incomplete or misleading reality", and may manifest in the data, design or outcome of a system (Malgieri and Comandé 2017, 248). Pasquinelli and Joler highlight three types of bias in machine learning models: (1) historical bias (human bias that can be reflected in the output of algorithms); (2) data-set bias (introduced through the data labelling process, generating a "distorted view of the world, misrepresenting social diversities and exacerbating social hierarchies"); and (3) algorithmic or statistical bias (further amplifying historical and data-set biases through information compression, where algorithms process data in the most economic manner possible) (2021, 1265). The naturalisation and systematisation of biases through data-driven algorithms has the potential to diffuse a limited worldview that can channel decision-making and risks shaping legal decisions, such as proportionality assessments. The use of information selected on the basis of a distorted or incomplete view of the battlespace could have potentially grave consequences when informing legal assessments that entail determinations with a bearing on incidental harm to civilians.
Moreover, human cognitive biases may compromise the reliability of human decision-making when AI-enabled DSS are employed (Cummings 2019). This includes over-trust in the system and a consequent failure to scrutinise its output, seeking only confirmatory information, and human deskilling for tasks that become automated (Mosier and Skitka 2019). As such, cognitive biases may mean the need to scrutinise these systems and their output is not even apparent to the commander. This is compounded by Noll's suggestion that if AI-supported decision-making is considered to increase objectivity and these systems are self-learning, this problematises the ability of humans to "stand their judgmental ground against the AI system" (2019, 88).

Algorithmic opacity, explainability and legal substantiation
Lack of explainability in machine learning-enabled DSS presents obstacles to commanders substantiating proportionality assessments when relying upon algorithmically generated information. Data-driven algorithms use opaque logic to produce predictions without the possibility of explanation, commonly referred to as the "black box" problem (Pasquale 2015). The opaque character of machine learning models flows from several factors, including the complexity, speed and scale at which these systems operate, as well as the fact that the system's decision logic is not apparent and changes as the system "learns", resulting in output that does not "naturally accord with human semantic explanations" (Burrell 2016, 10ff).
The IHL proportionality standard creates the obligation to maximise situational awareness and to collect and consider all information reasonably available, which will ultimately frame the proportionality assessment (Corn 2018, 772). However, the opaque character of Information DSS that flag information relevant to proportionality assessments may challenge the ability of commanders using these systems to adequately scrutinise the information produced by the system in order to determine its relevance and how much weight to give it in their proportionality assessments. The limits of explainability create the further risk that an explanation of why an algorithmic DSS generated specific predictions may fall outside the bounds of information that is "reasonably available", creating tension with the emphasis the proportionality standard places on the reasonable collection of information to make informed legal decisions. Moreover, the opacity of data-driven algorithms may also prevent commanders from predicting in advance how these systems will function. This is not to say that commanders require complete technical explanations of algorithmic DSS. However, the ability of commanders to properly implement the IHL proportionality standard requires scrutiny of the considerations that inform the proportionality assessment, which may be hindered by the opacity of machine learning. At a minimum, a baseline understanding of how the system operates, an explanation of why a system produced specific output, and a justification of how this output was used are required. In addition to this ex ante comprehension of the system and its outcomes, in order to determine the legality of proportionality assessments ex post, militaries and (international) courts and tribunals alike will also require demonstrable justifications as to why a particular attack was considered proportionate, in light of corollary duties to investigate and prosecute alleged breaches of IHL.
By way of example, consider an Information DSS used to provide information regarding civilian presence in a particular area. This information can be used to assist proportionality assessments by giving an indication of the extent of civilian presence (and thus expected incidental harm), as well as whether any individuals present are combatants or civilians directly participating in hostilities (and are therefore excluded from the proportionality analysis). In recent years, certain targeting practices have been highly criticised for using "signatures" of suspicious behaviour as a way of identifying individuals as targets, for instance men of a certain age carrying arms in a particular area, with no other links to the conduct of hostilities (Benson 2014, 34-36). Whilst the legality of these practices under IHL is in any event dubious, the use of algorithmic DSS to provide information on civilian presence compounds the problem by directing legal reasoning in multiple ways. The prominence of historical training data in the decision logic of machine learning points to the likelihood that these sorts of erroneous considerations (or proxies thereof) could inadvertently entrench biases and inform the output and reliability of algorithmic DSS. This problem is exacerbated by the opacity of data-driven algorithms, which prevents understanding of exactly what factors led a system to attribute certain probabilities of civilian presence to inform legal judgments. This example also illustrates that the "mere" selection of information is not neutral, but is itself a contextual and social judgment, which, as discussed in the previous section, falls beyond the scope of how machine learning functions.
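How such signature-based correlations might be entrenched by a data-driven system can be sketched as follows (the records, features and labels are entirely fabricated for illustration): a predictor that takes the majority label among matching historical records simply reproduces past labelling practice, and its proxy features reveal nothing about actual direct participation in hostilities.

```python
# Fabricated "historical" records: (military_age_male, carries_arms, label).
# The labels reflect past targeting practice, not ground truth about status.
historical_records = [
    (True, True, "combatant"),
    (True, True, "combatant"),
    (True, True, "combatant"),
    (True, False, "civilian"),
    (False, False, "civilian"),
]

def predict(age_male, carries_arms):
    # majority label among historical records sharing the same proxy features
    matches = [label for a, c, label in historical_records
               if a == age_male and c == carries_arms]
    return max(set(matches), key=matches.count)

# A farmer carrying a weapon for self-protection matches the historical
# "signature" and is labelled accordingly, whatever the actual context.
print(predict(True, True))
```

The sketch also makes the opacity point concrete: a user sees only the output label, not that the determination rested entirely on two proxy features and a biased historical record.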

Concluding remarks
Cutting through the polar extremes of hype and fearmongering in debates on military AI requires close consideration of the integration of specific applications of military AI and the human/machine(-learning) interactions that come about as a result. This article attempts a modest contribution to this effort by exploring how human/machine(-learning) interactions impact the practices of legal reasoning in proportionality assessments under IHL. It argued that the challenges that DSS leveraging data-driven algorithms pose to legal reasoning risk displacing the exercise of human agency situated within these practices. This means that algorithmic systems that attempt to replicate the proportionality weighing exercise are not fit for purpose, failing to capture the necessarily contextual, qualitative and value-laden nature of proportionality assessments. Moreover, the limits of data-driven algorithms that generate actionable intelligence information, in terms of reliability, bias and explainability, risk shaping legal decision-making and displacing human agency in ways incompatible with IHL. Awareness of these limitations, and caution in relying on such systems, are necessary to actively mitigate the risks of data-driven AI and preserve the exercise of human agency in legal reasoning with IHL.
The emphasis on practices of legal reasoning is an invitation to view international law differently, when so often it has been treated as a checklist for the legality of military technologies, including emergent applications of AI. It is hoped that this article shows that consideration of how international law is engaged with and enacted provides new avenues to examine the ways human/machine(-learning) interactions have the potential to undermine the exercise of human agency and alter legal reasoning as we know it. This takes seriously the call for human beings to "take responsibility for the rules governing our violence" and to "act as custodians, interpreters, and appliers of the law" when it comes to the use of military AI (Kalpouzos 2020, 310). This is not to deny that discretionary legal reasoning by humans may be flawed, or that there are serious questions around the proper implementation of IHL and the protections for human life this framework seeks to provide. Nevertheless, the limitations and vulnerabilities of the current state of the art in machine learning, and the ways in which these can impact legal reasoning when algorithmic DSS are relied upon, suggest that AI may not always be the best solution to every problem.