Predicting the hydrolytic breakdown rates of organophosphorus chemical warfare agent simulants using association constants derived from hydrogen bonded complex formation events

ABSTRACT Organophosphorus (OP) chemical warfare agents (CWAs) represent an ongoing global threat, through either purposeful environmental release or the need to dispose of historic stockpiles. This presents a need for the development of novel decontamination technologies. Due to the toxic nature and legal limitations placed on OP CWAs, the use of appropriate OP simulants that mimic the reactivity but not the toxicity of the agents themselves is vital to decontamination studies. Herein, we show that association constants derived from non-specific hydrogen bonded complexation events may be used as parameters within models to predict simulant reactivity. We also discuss the limitations that should be placed on such data. GRAPHICAL ABSTRACT


Introduction
For almost a century, organophosphorus (OP) compounds have been used globally as pesticides and chemical warfare agents (CWAs) [1]. † However, due to the inherent toxicity of these compounds combined with limited effective treatments [2,3], the accidental or purposeful release of these agents represents a continued global risk, as demonstrated by recent events in Germany (2020) [4], UK (2018) [5] and Malaysia (2017) [6], as well as events in Syria (2013) [7] and Japan (1995) [8]. The ongoing development of novel decontamination and remediation technologies to limit the harmful effects of these OP compounds if released therefore remains imperative [2,9].
In general, OP simulants represent the only viable option when developing novel approaches both to combat OP CWA release and for subsequent remediation due to the highly toxic nature of, and legal restrictions placed upon, the live agents themselves [10,11]. However, it is not feasible for any single simulant to simultaneously mimic all the chemical properties of an OP CWA without also inheriting undesirable molecular traits. Careful consideration of the appropriate properties of any simulant chosen to aid in the development of novel OP CWA detection, decontamination, or remediation methodologies [12][13][14][15] is therefore required.
Within the field of supramolecular chemistry, the use of CWA simulants to develop such technologies is well documented [16,17]. For example, Cragg and co-workers have highlighted the importance of simulant choice through their work comparing the coordination of OP CWAs and simulants binding within a cyclodextrin cavity [13][14][15]. These findings are further supported through the work of Barlaz and co-workers [12]. In addition, work by Gale and co-workers [18] have shown that simple tripodal compounds containing amino acids can not only bind to OP CWA agents but also enhance the rate of their hydrolysis. In further work, this research group also demonstrated the use of responsive supramolecular organogels for not only sensing the presence of OP CWAs but also CWA encapsulation and decontamination [19][20][21]. Furthermore, Ward and co-workers have demonstrated that supramolecular cages can be used for catalysing the hydrolysis of organophosphates upon binding within the cage cavity and on the cages exterior surface [22,23]. Similarly, Dennison and co-workers [24] have utilised a trivalent Europium complex to successfully bind a variety of OP structural analogues, to determine whether these could lead to the identification of next-generation VX simulants for supramolecular and spectroscopic purposes. Finally, work by Borguet, Rosi, Johnson and co-workers [25] has shown that DFT modelling can be used to predict ligand design for the uptake into metal-organic frameworks (MOFs) of the OP CWA simulant dimethyl methylphosphonate (DMMP).
Predictive models that utilise the derivation of potential simulant structure-desired physical property relationships (more generally referred to as structure activity relationships -SARs) can be used by researchers to aid the effective identification of appropriate OP CWA simulants. Specifically, they permit the screening of any potential target molecules without first requiring their synthesis, accelerating the development process [11,26,27], to identify the most appropriate simulant for the work to be undertaken. This approach, used by Snurr and Mendonca [28], drew upon density functional theory (DFT) studies on OP hydrolysis to develop a quantitative structure activity relationship (QSAR) model that supports the identification of appropriate OP CWA simulants for decontamination purposes.
Within our own work, we have demonstrated how simple, low-level computational modelling may be used to predict hydrogen bond mediated aggregation events [29,30], the resultant antimicrobial activity [31], and most recently to identify appropriate OP CWA simulants for the development of supramolecular detection technologies, through the calculation of association constants, which may be predicted using computationally derived parameters [11]. Herein, we build on this prior work, exploring the use of low-level computational modelling and supramolecular host:guest association constants to predict the reactivity of potential OP simulants towards the presence of NaOH, a base typically used within standard decontamination solutions [32,33], using a data set produced from the library of OP simulants 1-22, Figure 1. It is hoped that our preliminary work in this area will aid other researchers towards the identification of appropriate OP simulants for the development of novel decontamination technologies.

Assessing simulant reactivity
Several OP CWA decontamination solutions/systems are commercially available, these include the US army developed skin decontamination M258 kit [43], a two-phase system that contains two packets to be used sequentially. Packet 1 contains a towelette impregnated with phenol (10 wt %), ethanol (72 wt %), NaOH (5 wt %), ammonia (≈1 wt %), and water (12 wt %), whilst Packet 2 contains a towelette impregnated with chloramine-B and a sealed glass ampoule filled with zinc chloride [32,43]. DS2, a second decontamination solution, consists of diethylenetriamine (70 wt %), ethylene glycol monomethyl ether (28 wt %), and NaOH (2 wt %) [32,33]. Super tropical bleach (calcium hypochlorite (93 wt %) and NaOH (7 wt %)), when supplied as a solid or slurry with water is also considered useful for OP CWA decontamination [33,44]. With the partial exception of super-tropical bleach, all of these decontamination systems/solutions have in common the presence of an alcohol, NaOH, and H 2 O. Here, the hydroxyl ion acts as a nucleophile, which attacks the electrophilic phosphorus centre, common to all members of this class of OP CWAs, neutralising/reducing the toxic effects of these agents [45][46][47]. When considering the library of potential simulants 1-22, the electrophilic centre analogous to that of the OP CWA is either the analogous P = O phosphorus or O = S = O sulphur atom.
The reactivity of potential OP CWA simulants 1-22 was evaluated by reaction of these compounds (see Scheme 1) in the presence of five equivalents of NaOH, with respect to the simulant, at 291 K in a methanol-d 4 :D 2 O 70:30 solution. These experimental conditions were a result of multiple optimisation studies in which [1]H NMR spectroscopy was used to monitor the relative proportions of reactant and simulant present with respect to time; the results of these studies have been summarised in Table 1. Due to the range of reaction rates observed for compounds 1-22 under these conditions, it was not possible to calculate experimental hydrolysis rate constants for all the simulants tested. Thus, the % simulant breakdown after 6 hours has been used as a comparative simulant reactivity value to create this dataset. This approach enabled us to include values generated from these studies for all the potential simulants. In addition, where an experimental rate constant could be calculated (1-4, 7, 12-16, 18, 21-22) all reactions were found to follow first order kinetics apart from that of compound 18 which was found to obey second order kinetics with respect to the simulant, at present the reason for this has not been determined. The specific reasons preventing rate order assignment are detailed within Table 1, and further supported by data supplied within the supplementary materials. †

Predicting simulant reactivity
A series of 1:1 host:guest association constants, determined by [1]H NMR in a CD 3 CN solution at 298 K, have previously been reported for compounds 1-22 with 23, a low molecular weight, non-specific, hydrogen bond donating receptor; these binding events have also been evaluated in terms of the series of computationally derived parameters listed in Table 2 [11]. We have previously reported the ability to use exhaustive parameter searches to rapidly identify predictive models for both the development of novel antimicrobials [31] and organophosphorus simulant binding [11]. Here, we used R data analysis software [48] to determine both direct and inverse predictive models for the percentage breakdown of potential chemical warfare agent simulants, building upon our previous work.
Organocatalysis by (thio)ureas is generally considered to arise from enhanced reactivity/electrophilicity of the substrate induced by hydrogen bonding with the catalyst, and catalyst efficacy is typically correlated with hydrogen bonding complexation strength of the substrate with the (thio)urea moiety [49][50][51][52][53]. We therefore hypothesised that 1:1 association constants, derived using a non-specific hydrogen bond donating receptor (23), could be used as a predictor of reaction centre electrophilicity when the principle hydrogen bond accepting group is directly joined to the reactive centre. Thus, we further hypothesise these association constants could be used in the prediction of reaction rates.
To test this hypothesis, we performed exhaustive parameter searches including association constants to determine the role of this parameter in predicting the reactivity of potential simulants towards NaOH, under the experimental conditions described. † All predictive models were determined using Equations 1 and 2, to allow for the determination of both two and three parameter combinations with a direct and inverse linear relationship. Five datasets were produced taking into account the limitations of the experiments conducted: (1) all data generated for all parameters and potential simulants listed in Figure 1 and Table 2; (2) exclusion of association constant (P 1 ) from the complete data set (1); (3) exclusion of simulant data for those compounds, which exhibit association constant values less than 10 M −1 from the P 1 data set, as these values are known to fall outside of experimental limitations; (4) exclusion of percentage breakdown data from P 0 , where potential simulant breakdown was found to be <10% over 6 hours, as these values were thought to fall outside of experimental limitations due to inherent integration errors; (5) and the exclusion of both association constant (P 1 ) < 10 M −1 and percentage breakdown values (P 0 ) < 10%.
This approach allowed us to ascertain the importance of an experimental association constant, generated from a non-specific host:guest binding study, in the prediction of simulant reactivity. The models produced using these exhaustive search methods for datasets 1-5 were then ranked according to how well they fit the data in terms of their R 2 values. From this ranking, we then selected the 10 best fitting models for Equation 1 and Equation 2. This resulted in the production of 20 models of best fit for each dataset 1-5, giving 100 models in total. From this list, we then extracted the parameters that feature in these models and took those as being the most relevant ones for the prediction of the simulant properties. A summary of these data is shown in Figure 2.
Comparing all five data sets, parameters P 2 , P 3 , P 4 , P 7 , P 8 , P 9 , P 10 , P 11 , P 14 , P 16 , P 17 , P 18 , P 19 and P 20 (see Table 2) occur ≤10 times within the 100 models. It is therefore unlikely that these parameters are significant for the construction of a predictive OP simulant breakdown model. These 100 models also indicate that P 13 (LUMO Energy) should be considered a significant parameter with a total of 64 Table 2. List of parameters (P) used to produce potential simulant predictive models. P 1 -P 20 are previously published data sets [11]. The experimental methods used to calculate P 0 -P 20 are detailed within the supplementary information. †.  occurrences overall; this is not surprising as this orbital is directly involved in the reaction process. Gratifyingly, the parameter with the second highest number of occurrences is P 1 (the association constant), with a total of 35 occurrences over the 80 models, where this parameter was included within the initial datasets (Figure 2(a,c,d,e)). Furthermore, the occurrences of this parameter were limited to models that were produced by data sets 3 and 5 ( Figure 2(c,e)). Importantly, it is these two datasets that consider the experimental limitations of association constant determination -in effect, the good fit arises after excluding the noise. In addition, as highlighted in Figure 2(d), where the experimental limitations are considered for the % simulant breakdown, but not for the association constant parameters, we see the occurrence of P 1 drop to zero, supporting the importance of including such limitations in these datasets. After screening, the 20 models with the highest R 2 values used data set 5 (Figure 2(e)) which takes into account the experimental limitations placed on both the reactivity and association constant data. Within these models, the association constant parameter (P 1 ) occurs 21 times. We therefore believe that this result confirms that experimental association constants for hydrogen bonded complex formation can be utilised in the construction of predictive reactivity models for the identification of OP simulants. When comparing the top 20 models, defined by R 2 analysis, through the fitting of datasets 1-5 to Equations (1) and (2), the models produced from dataset 5 reported the highest R 2 values. Of these top 20 predictive models, those produced using Figure 2. The frequency of occurrences in the top 10 models, produced by an exhaustive search of fits, ranked by R 2 analysis for each parameter to: i) Equation (1) (blue); and ii) Equation (2) (red), in a) dataset 1 (includes 440 data points in total), b) dataset 2 (includes 418 data points in total), c) dataset 3 (includes 240 data points in total), d) dataset 4 (includes 260 data points in total) and e) dataset 5 (includes 140 data points in total). Equation (2) reported higher R 2 values than from Equation (1). The models reporting the three highest R 2 values (R [2] = 0.9596, 0.9359 and 0.9204 respectively) have been identified as current lead predictive models and are shown in Figure 3. These models include i) Figure 3(a) -P 1 (association constant), P 4 (molecular volume) and P 8 (% polar surface area); ii) Figure 3(b) -P 1 (association constant), P 4 (molecular volume) and P 13 (LUMO); iii) Figure 3(c) -P 1 (association constant), P 11 (SAF) and P 13 (LUMO).

Physical property considerations
In addition to breakdown rate, there may be other physical properties required of a simulant to make its use practical in exercises such as live testing scenarios. The choice of simulant is therefore often a compromise between competing demands, for example the simulant might have to be readily available in advance of live experiment and thus require medium/long-term storage. Considering possible practical requirements, we explored several physical parameters for the simulants detailed herein: storage stability, viscosity, hydrolytic stability, density, and surface tension. The key values are summarised in Table 3; the remainder can be found within the ESI. †

Conclusion
We have identified three predictive models for potential OP CWA simulant breakdown in the presence of five equivalents of NaOH, in a methanol-d 4 :D 2 O 70:30 solution at 291 K. It is envisioned that the development of such predictive methodologies will enable the production of next-generation OP CWA decontamination technologies that rely on nucleophilic breakdown reaction mechanisms specifically. Additionally, we have shown the importance of considering experimental limitations on both association constant and reactivity datasets. The breakdown data are supported by a variety of physicochemical characterisation data, which are of use selecting an appropriate OP CWA simulant. It is hoped that the use of structure activity relationships, achieved using similar, predictive methodologies to those detailed here, will enable the increasingly effective identification of appropriate simulants to aid in combatting the threat of OP CWAs.  (1) and (2). The models shown in a-c were generated from dataset 5.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the UK Research and Innovation [MR/T020415/1].

Supplemenatry material
Supplemental data for this article can be accessed here.