A single method to analyse residues from five different classes of prohibited pharmacologically active substances in milk

ABSTRACT In the European Union, the use of veterinary drugs belonging to the A6 group is prohibited in food-producing animals according to Commission Regulation (EU) No. 2010/37. The aim of this study was to improve the analytical control strategy by developing a single method to analyse residues of prohibited pharmacologically active substances in milk. For this, a single method was developed to analyse 16 prohibited pharmacologically active substances belonging to five different substance classes at required or recommended levels: nitroimidazoles at 3 μg kg−1, nitrofurans at 0.5 μg kg−1, chloramphenicol at 0.1 μg kg−1, dapsone at 5 μg kg−1 and chlorpromazine at 1 μg kg−1. Milk sample preparation started with an acid hydrolysis combined with a derivatisation. These steps were followed by a clean-up consisting of a dispersive solid-phase extraction and a liquid–liquid extraction. Finally, the sample extracts were analysed by liquid chromatography combined with tandem mass spectrometry, operating alternately in the positive and negative mode. The method was fully validated according to Commission Decision 2002/657/EC for bovine milk and additionally validated for caprine milk. The validation proved that the method is highly effective to detect and confirm all 16 substances in bovine and caprine milk and, additionally to quantify 15 of these substances in bovine milk and 13 of these substances in caprine milk. This study resulted in a new multi-class method to detect, quantify and confirm the identity of 16 prohibited pharmacologically active substances belonging to five different substance classes in two types of milk.


Introduction
The European Union has established regulations to control and enforce veterinary drug use in food-producing animals (e.g. milk-producing bovine and caprine animals) (European Commission 2017) because of potential public health risks (Baynes et al. 2016). These risks include direct exposure to toxic veterinary drug residues and related metabolites and the chance of developing antimicrobial resistance in bacteria. Strict legislation regarding the application of veterinary drugs and its enforcement is established in the European Union: Regulation (EU) 2017/625 (European Commission 2017) and Commission Regulation (EU) No. 2010/37 (European Commission 2010). Commission Regulation (EU) No. 2010/37 (European Commission 2010) includes, in addition to maximum residue limits for registered veterinary drugs, a table with prohibited veterinary drugs, classified as A6 group in Council Directive 96/23/EC Annex 1 (European Commission 1996). In this paper these substances are referred to as prohibited pharmacologically active substances (PPAS). The PPAS group evaluated in this study includes nitroimidazoles, nitrofurans, chloramphenicol, dapsone and chlorpromazine.
Three other relevant PPAS are chloramphenicol (CAP), dapsone (DAP) and chlorpromazine (CPZ). CAP is a broad-spectrum antibiotic and has been prohibited as a result of concerns about its genotoxicity, embryotoxicity and fetotoxicity, carcinogenicity and possible contribution to aplastic anaemia (JECFA 2004; Baynes et al. 2016). DAP is under debate because of its carcinogenic and genotoxic properties (EFSA 2005; EMA 2012). The lack of information about reproductive toxicity and teratogenicity has made DAP a PPAS since 1996 (EMA 1996b). CPZ belongs to the class of tranquillizers. Prohibition of CPZ is based on the advice of the Joint FAO/WHO Expert Committee on Food Additives (JECFA) because of, among other things, the lack of relevant toxicological data (JECFA 1991; EMA 1996a).
The European Union Reference Laboratories (EURLs) responsible for the analysis of PPAS have created guidelines for the analysis of most of these substances in food, including a recommended detection level for all kinds of food products, including milk (CRL 2007). The recommended detection level for the three prohibited NIZs including the corresponding metabolites is 3 µg kg−1 and for DAP 5 µg kg−1. For NFs and CAP the guidelines refer to the regulation where a legal level is established, namely the minimum required performance limit (MRPL).
The analysis of PPAS requires highly sensitive, selective and accurate methods. Many papers have been published on the analysis of a single class of PPAS residues (or in combination with regulated substances) in milk (e.g. NIZs (Thompson et al. 2009; Gui et al. 2011; Tölgyesi et al. 2012; Hernández-Mesa et al. 2014, 2015; Mitrowska et al. 2014; Tuzimski and Rejczak 2017; Wang et al. 2019), NFs (Ryan et al. 1975; Galeano Díaz et al. 1997; Chu and Lopez 2007; Rodziewicz 2008; Alkan et al. 2016; Śniegocki et al. 2018), CAP (Allen 1985; Sørensen et al. 2003; Ferguson et al. 2005; Nicolich et al. 2006; Mohamed et al. 2007; Rezende et al. 2012; Berlina et al. 2013), DAP (Suhren and Heeschen 1993; Van Rhijn et al. 2002; Hadjigeorgiou et al. 2009; Kaklamanos and Theodoridis 2012; Varenina et al. 2016) and CPZ (Ohkubo et al. 1993)), of which most use a chromatographic technique coupled with mass spectrometric detection. To analyse all desired PPAS, five different single-class methods are needed. Over the last twenty years, methods have been published for various combinations of PPAS residues (in some cases together with regulated substances) in milk, e.g. NIZs combined with DAP (Ortelli et al. 2009), NIZs with CAP (Cronly et al. 2010; Wang et al. 2016), NIZs with CPZ and CAP (Zhan et al. 2012) and NIZs with CPZ, CAP and DAP (Kibechu and Sichilongo 2012; Robert et al. 2013; Amelin et al. 2018; Jadhav et al. 2019).
However, multi-class methods that analyse NFs together with other substance classes are scarce, not only for milk but for all kinds of matrices (Perez et al. 2002; Xia et al. 2008; Shen et al. 2013; Kaufmann et al. 2015; Shendy et al. 2016; Zhang et al. 2017; Aldeek et al. 2018; Chen et al. 2020). For NF analysis, a hydrolysis is required to include protein-bound residues and a derivatisation is applied to stabilise the metabolites and to enhance the mass spectrometric signal (Molognoni et al. 2021). However, only one validated multi-class method for milk that uses this hydrolysis and derivatisation has been published (Kaufmann et al. 2015), and it combines only the NF metabolites and CAP.
To our knowledge, no fully validated method covering PPAS from five different substance classes in milk using ultra-high-performance liquid chromatography-tandem mass spectrometry (LC-MS/MS) has previously been reported. This study describes a new multi-class method and its validation for the detection, quantification and confirmation of identity of the PPAS in milk. This new approach allows far more cost-effective surveillance of the PPAS in milk by using one multi-class method instead of multiple single-class methods.
Stock solutions were prepared at 100 mg L−1 in methanol for most standards and internal standards. However, DAP and DAP-d8 were prepared at 1000 mg L−1 in methanol and CPZ and CPZ-d6 were prepared at 100 mg L−1 and 1000 mg L−1, respectively, in ethanol. From the stock solutions, a mix solution was prepared at 3 mg L−1 NIZs, 0.5 mg L−1 NFs (DNSH was added separately), 0.1 mg L−1 CAP, 5 mg L−1 DAP, and 1 mg L−1 CPZ in Milli-Q water. This mix solution and an individual solution of 0.5 mg L−1 DNSH were diluted 20-fold in Milli-Q water to achieve the final standard solution (5-250 µg L−1). In addition, a mix solution of internal standards was prepared at 3 mg L−1 NIZs, 1 mg L−1 NFs, 0.3 mg L−1 CAP-d5, 5 mg L−1 DAP-d8, and 1 mg L−1 CPZ-d6, followed by a 20-fold dilution in Milli-Q water to a final concentration of 15-250 µg L−1.
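The quoted working ranges follow directly from the 20-fold dilution of the mix solutions. The short sketch below reproduces that arithmetic; the variable and function names are hypothetical, while the concentrations and the dilution factor are those stated in the text.

```python
# Mix-solution concentrations in mg/L, as stated in the text.
standard_mix_mg_per_l = {
    "NIZs": 3.0, "NFs": 0.5, "CAP": 0.1, "DAP": 5.0, "CPZ": 1.0,
}
internal_standard_mix_mg_per_l = {
    "NIZs": 3.0, "NFs": 1.0, "CAP-d5": 0.3, "DAP-d8": 5.0, "CPZ-d6": 1.0,
}

DILUTION_FACTOR = 20  # both mix solutions are diluted 20-fold in Milli-Q water


def dilute_to_ug_per_l(mix_mg_per_l, factor=DILUTION_FACTOR):
    """Convert mg/L to µg/L and apply the dilution factor."""
    return {
        name: round(conc * 1000.0 / factor, 6)
        for name, conc in mix_mg_per_l.items()
    }


standard_ug_per_l = dilute_to_ug_per_l(standard_mix_mg_per_l)
internal_standard_ug_per_l = dilute_to_ug_per_l(internal_standard_mix_mg_per_l)

# Lowest and highest concentration in each working solution (µg/L).
print(min(standard_ug_per_l.values()), max(standard_ug_per_l.values()))
print(min(internal_standard_ug_per_l.values()),
      max(internal_standard_ug_per_l.values()))
```

The lowest and highest values in each dictionary correspond to the 5-250 µg L−1 and 15-250 µg L−1 ranges given for the standard and internal standard solutions.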

Optimised sample preparation
The final optimised sample preparation procedure is based on the method published by Mulder et al. (2005). The procedure is as follows: 2.0 ± 0.05 g of homogenised raw milk was weighed into a 12 mL polypropylene tube and 80 µL of internal standard solution (15-250 µg L−1) was added. Subsequently, 5 mL hydrochloric acid solution and 50 µL 2-nitrobenzaldehyde solution were added. The sample solution was shaken head-over-head (Heidolph REAX-2, Schwabach, Germany) overnight at 37°C to hydrolyse protein-bound NF metabolites and to derivatise the metabolites into their nitrophenyl (NP)-derivatives. After cooling of the sample solution to room temperature, 500 µL of trisodium phosphate solution and at least 300 µL of sodium hydroxide solution were added to adjust the pH to 7.0 ± 0.5. Next, the sample solution was diluted with 10 mL Milli-Q water and transferred into a 50 mL polypropylene tube containing the AOAC dispersive SPE kit. The tube was shaken head-over-head for 5 minutes and afterwards the sample solution was centrifuged for 10 minutes at 3000 g (MSE Falcon 6/300, Heathfield, UK). After centrifugation, the sample solution was transferred into a clean 50 mL polypropylene tube. Additionally, a liquid-liquid extraction was performed by adding 8 mL of ethyl acetate to the sample solution, which was subsequently shaken head-over-head for 20 minutes. After centrifugation (3000 g, 10 min), the ethyl acetate upper layer was transferred into a 12 mL glass tube. The liquid-liquid extraction procedure was repeated, and the ethyl acetate layers were combined. The ethyl acetate was evaporated under nitrogen at 40°C (TurboVap LV Evaporator Zymark, Hopkinton, MA, USA), and the residue was dissolved in 200 µL reconstitution solvent (0.1% (v/v) formic acid in 20% (v/v) acetonitrile). Finally, the sample extract was centrifuged (3000 g, 10 min) and transferred into a vial for analysis by LC-MS/MS.

Method development
During the method development, several sample preparation steps were studied and optimised. The individual sample preparation steps were first evaluated aiming for high absolute recoveries. The absolute recovery is the relative response of a sample spiked before sample preparation compared to a sample spiked after sample preparation. The absolute recovery could not be established for the NFs because no derivatised NF marker metabolite standards were available for spiking the samples after sample preparation. Additionally, the optimisation was evaluated based on the signal-to-noise ratio of the observed peaks in the chromatogram, for all substances including the NFs. The signal-to-noise ratio was automatically determined by the software during processing of the data.
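As an illustration of the definition above, the absolute recovery can be written as a simple ratio of two responses. This is a minimal sketch; the function name and the peak areas are hypothetical and are not taken from the study data.

```python
def absolute_recovery(response_spiked_before, response_spiked_after):
    """Absolute recovery in percent.

    response_spiked_before: response of a sample spiked before sample preparation
    response_spiked_after:  response of a sample spiked after sample preparation
    """
    if response_spiked_after <= 0:
        raise ValueError("the post-preparation spike must give a positive response")
    return 100.0 * response_spiked_before / response_spiked_after


# Hypothetical peak areas, for illustration only.
example_recovery = absolute_recovery(57_000.0, 100_000.0)
print(f"absolute recovery: {example_recovery:.1f}%")
```

Because the denominator is a post-preparation spike in the same matrix, the ratio reflects losses during sample preparation rather than matrix effects in the ion source.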
In the first experiment, three different clean-up procedures were selected (Figure 1) based on in-house experience and expectations from the literature. Procedure A is in principle based on a method for prohibited veterinary drugs in urine, as published by León et al. (2012). In the procedure, the enzymatic hydrolysis as applied by León et al. was omitted since insignificant glucuronidation of CAP was expected in milk (Nouws et al. 1986). On the other hand, an acidic hydrolysis and derivatisation step were added for the NFs. These steps were followed by the capture of the water content by MgSO4, included in the extraction salt package (EN 15662). In procedure B, the same procedure was applied, but the water content was not captured by MgSO4 but separated from the acetonitrile by using 2 g NaCl. In both procedures A and B, the acetonitrile layer was cleaned with an AOAC dispersive SPE kit and concentrated. Procedure C is described in the section optimised sample preparation above.
A second experiment was performed to optimise procedure C regarding the individual constituents of the AOAC dispersive SPE kit. In this experiment, the sample preparation was performed as described in the optimised sample preparation section; only the use of the AOAC dispersive SPE kit was adjusted. The SPE kit was replaced by the individual constituents (PSA, C18 and MgSO4), all possible combinations of these constituents, or was completely excluded. The absolute recoveries and signal-to-noise ratios were used to determine the influence of all the individual constituents and their interactions. Statistical evaluation of the absolute recoveries was carried out using a factorial design as described by Andries and De Vries (2007).
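Testing every combination of three constituents, including none of them, amounts to a two-level full factorial design with eight runs. The sketch below enumerates those runs and estimates main and interaction effects as differences of means. It is a generic illustration, not the evaluation procedure of Andries and De Vries (2007); the function names and the recovery values are hypothetical.

```python
from itertools import product
from statistics import mean

FACTORS = ("PSA", "C18", "MgSO4")


def full_factorial(factors=FACTORS):
    """All present/absent combinations of the factors (2**3 = 8 runs)."""
    return [
        dict(zip(factors, levels))
        for levels in product((False, True), repeat=len(factors))
    ]


def effect(runs, responses, *names):
    """Main effect (one name) or interaction effect (several names).

    Each run gets sign +1 or -1 from the product of its factor levels;
    the effect is mean(response at +1) minus mean(response at -1).
    """
    plus, minus = [], []
    for run, response in zip(runs, responses):
        sign = 1
        for name in names:
            sign *= 1 if run[name] else -1
        (plus if sign > 0 else minus).append(response)
    return mean(plus) - mean(minus)


runs = full_factorial()
# Hypothetical absolute recoveries (%) in the order produced by full_factorial().
recoveries = [60, 58, 20, 21, 62, 59, 30, 50]

c18_effect = effect(runs, recoveries, "C18")
psa_c18_interaction = effect(runs, recoveries, "PSA", "C18")
print(c18_effect, psa_c18_interaction)
```

With these made-up numbers the C18 main effect is negative and the PSA × C18 interaction is positive, which is the kind of pattern a factorial evaluation is designed to reveal.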
In experiment three, the sample preparation was performed as described in the optimised sample preparation section, and the ethyl acetate liquid-liquid extraction was optimised. Four different variations were tested. After the described dispersive SPE clean-up, the substances were extracted using a single or double liquid-liquid extraction procedure with 8 or 6 mL ethyl acetate.

LC-MS/MS analysis
After sample preparation, the LC-MS/MS analysis was carried out on an Acquity UPLC (Waters, Milford, MA, USA) or a Nexera UHPLC (Shimadzu, Kyoto, Japan). An Acquity UPLC BEH C18 analytical column (Waters, Milford, MA, USA) of 100 × 2.1 mm with a particle size of 1.7 µm was placed in a column oven at 35°C. A 10 µL aliquot of the sample extract was injected onto the LC column. The substances were chromatographically separated by gradient elution using a flow rate of 0.4 mL min−1. The gradient started at 100% mobile phase A and linearly increased to 30% mobile phase B in 4 minutes. The next 2 minutes were isocratic, followed by an increase of the percentage mobile phase B to 70% in 1.5 min and to 100% in the next 1.5 min, with a final hold of 1.0 min and a re-equilibration time of 2 min at 100% A.
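The gradient described above can be written as a table of breakpoints with linear interpolation between them. The sketch below is illustrative only: the breakpoint times are derived by summing the stated segment durations, and the names GRADIENT and percent_b are hypothetical. The 2 min re-equilibration at 100% A that follows the program is not modelled.

```python
# (time in min, % mobile phase B) breakpoints derived from the text:
# 0-4 min ramp to 30% B, 2 min isocratic, 1.5 min to 70% B,
# 1.5 min to 100% B, 1.0 min hold.
GRADIENT = [
    (0.0, 0.0),
    (4.0, 30.0),
    (6.0, 30.0),
    (7.5, 70.0),
    (9.0, 100.0),
    (10.0, 100.0),
]


def percent_b(t_min, program=GRADIENT):
    """Percentage of mobile phase B at time t_min by linear interpolation."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for a sorted program")


for t in (0.0, 2.0, 5.0, 6.75, 9.5):
    print(f"{t:5.2f} min: {percent_b(t):5.1f}% B")
```

Including the re-equilibration, one injection cycle under these assumptions takes 12 minutes.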
After the chromatographic separation, the substances were introduced directly into a Q-Trap 6500+, Q-Trap 6500 or Q-Trap 5500 mass spectrometer (Sciex, Framingham, MA, USA). The mass spectrometer was operated in polarity switching mode (i.e., alternately in positive and negative electrospray ionisation mode). The operating parameters were curtain gas flow 40 psi (N2), nebulising gas flow 50 psi (N2), heater gas flow 50 psi (N2), source temperature 400°C and ion spray voltage (-) 4500 V. The precursor ions were fragmented to product ions using collision-induced dissociation (N2). The scheduled Multiple Reaction Monitoring (MRM) transitions (60 s window in positive and 120 s window in negative mode) are presented in Table 1. Data were processed using the Multiquant software V2.1.1 (Sciex, Framingham, MA, USA), resulting in a response for each transition. The response was corrected using the corresponding isotopically labelled internal standards. Since no labelled internal standards were available for MNZ-OH and TNZ, HMMNI-d3 was used for these two substances. The response factors were calculated by dividing the area of the most abundant product ion of the substance by the area of the internal standard.

Method validation
The method was fully validated for bovine milk, according to the criteria described in Commission Decision 2002/657/EC (European Commission 2002). The validation was carried out on three days by two different technicians, on three different LC-MS/MS systems, and included bovine milk samples (n = 24) sampled from milk tanks. The milk samples were collected by the Dutch Food and Consumer Product Safety Authority on different, randomly selected farms at various time points during the year. Ideally, a validation is carried out using incurred certified reference materials. However, these materials are not available for this specific application; therefore, blank samples were spiked as an alternative.
The spike level, or validation level, of the NIZs was 3 µg kg−1 and DAP was validated at 5 µg kg−1, as recommended for milk by the EURLs (CRL 2007). Recently, the reference points for action (RPAs) for NFs and CAP were revised by Commission Regulation (EU) 2019/1871 (European Commission 2019), namely to 0.5 µg kg−1 for NFs and 0.15 µg kg−1 for CAP. Since the validation occurred before the establishment of these new RPAs, the validation was carried out at 0.5 µg kg−1 for NFs and 0.1 µg kg−1 for CAP. As a result, for CAP, the highest spike level (1.5 times the validation level) corresponds to the revised RPA (European Commission 2019). Finally, 1 µg kg−1 was used as the validation level of CPZ because no RPA or recommended concentration exists.
During the validation, blank samples were spiked at three different levels: 0.5, 1.0 and 1.5 times the validation level. Note that Commission Decision 2002/657/EC (European Commission 2002) prescribes 1.0, 1.5 and 2.0 times the MRPL for prohibited substances, but because lower levels are more relevant for enforcement, the spike levels were lowered. In addition, a detection capability as low as possible is preferred for PPAS because of their zero-tolerance policy.
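The resulting spike concentrations per substance class can be tabulated with a few lines of code. This is an illustrative sketch: the dictionary and variable names are hypothetical, while the validation levels and multipliers are those given in the text.

```python
# Validation levels in µg/kg, as stated in the text.
VALIDATION_LEVELS_UG_PER_KG = {
    "nitroimidazoles": 3.0,
    "nitrofurans": 0.5,
    "chloramphenicol": 0.1,
    "dapsone": 5.0,
    "chlorpromazine": 1.0,
}
SPIKE_FACTORS = (0.5, 1.0, 1.5)  # multiples of the validation level

spike_levels = {
    substance_class: [round(factor * level, 3) for factor in SPIKE_FACTORS]
    for substance_class, level in VALIDATION_LEVELS_UG_PER_KG.items()
}

for substance_class, levels in spike_levels.items():
    print(f"{substance_class:16s} {levels} µg/kg")
```

For chloramphenicol the highest level is 0.15 µg kg−1, which is why the 1.5-fold spike coincides with the revised RPA.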
The following validation parameters are related to a quantitative confirmatory method and were determined: selectivity, linearity, trueness, repeatability (RSD r), repeatability including matrix variation (RSD r*), within-laboratory reproducibility (RSD RL), decision limit (CCα), detection capability (CCβ), confirmation of the identity, stability and robustness.

Selectivity
The selectivity was determined using 21 blank bovine milk samples analysed with the addition of only the internal standards. Selectivity was assessed by evaluating the signal of the blank materials for interferences at the retention times corresponding to the PPAS.

Linearity
On three different validation days, a matrix matched calibration line was prepared by adding standard solution of the PPAS (5-250 µg L−1) to aliquots of a blank bovine milk sample at 0, 0.25, 0.5, 1.0, 1.5 and 2.0 times the validation level. The internal standard corrected response, i.e. the response factor (as explained in the section LC-MS/MS analysis), was plotted against the added concentration of the substance. Least squares linear regression was used to create a matrix matched calibration line. The linearity of this line was accepted if the coefficient of correlation was at least 0.990.
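The calibration step can be sketched as an ordinary least squares fit of response factor against added concentration, followed by the acceptance check on the correlation coefficient. The code below is a minimal illustration; the function names and the response factors are hypothetical, and only the calibration multipliers, the nitroimidazole validation level and the 0.990 criterion come from the text.

```python
from math import sqrt


def response_factor(analyte_area, internal_standard_area):
    """Area of the most abundant product ion divided by the internal standard area."""
    return analyte_area / internal_standard_area


def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept, plus correlation r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)


VALIDATION_LEVEL = 3.0  # µg/kg, nitroimidazoles
added = [m * VALIDATION_LEVEL for m in (0, 0.25, 0.5, 1.0, 1.5, 2.0)]
# Hypothetical response factors for the six calibration points.
factors = [0.01, 0.26, 0.50, 1.02, 1.49, 2.01]

slope, intercept, r = linear_fit(added, factors)
linearity_accepted = r >= 0.990


def back_calculate(rf):
    """Concentration (µg/kg) corresponding to a measured response factor."""
    return (rf - intercept) / slope


print(f"r = {r:.4f}, accepted = {linearity_accepted}")
```

Sample concentrations are then obtained by passing the measured response factor of a sample to back_calculate.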

Trueness, repeatability and within-laboratory reproducibility
Trueness and repeatability, including matrix variation (RSD r*) and within-laboratory reproducibility (RSD RL), were determined using seven different blank bovine milk samples (other samples than the three used for the matrix matched calibration lines) on each individual day. These samples were spiked at 0.5, 1.0 and 1.5 times the validation level, prepared and analysed. In addition, the true repeatability (RSD r) was determined as described in Commission Decision 2002/657/EC (European Commission 2002), using seven aliquots of a single blank bovine milk sample. These aliquots were spiked at validation level, prepared and analysed.
The whole procedure was repeated on two more days, to obtain 21 results for each validation level with different milk samples and 21 results of the same milk sample. Concentrations were calculated using the matrix matched calibration line prepared and analysed under the same circumstances. The average measured concentration of the spiked samples was divided by the theoretical added concentration, resulting in the trueness. The RSD r*, RSD RL and RSD r were calculated using analysis of variance (ANOVA). According to Commission Decision 2002/657/EC (European Commission 2002), the trueness of the method should comply with the established criteria for a quantitative analysis. The criteria depend on the validation levels: for levels between 1 and 10 µg kg−1, a trueness of 70%-110% is accepted and for levels lower than 1 µg kg−1 a trueness of 50%-120% is accepted (European Commission 2002). The relative within-lab reproducibility (RSD RL) is accepted below the value calculated from the Horwitz equation (Horwitz et al. 1980). However, the Horwitz equation is not applicable to the lower concentration range (<120 µg kg−1) (Thompson 2000) and therefore a complementary model was suggested. We adopted these more stringent criteria of Thompson. Based on these criteria, the RSD RL and RSD r are accepted below 22% and 14.7%, respectively. For RSD r* the same criteria were applied as for RSD r in this validation, being a worst-case approach.
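The acceptance criteria above can be expressed compactly in code. The sketch below uses the commonly cited form of the Horwitz equation, RSD (%) = 2^(1 − 0.5 log10 C) with C as a dimensionless mass fraction, and Thompson's constant 22% below a mass fraction of 1.2 × 10−7. Two points are assumptions of this sketch rather than statements from the text: the 14.7% repeatability limit is read as two-thirds of 22%, and a level of exactly 1 µg kg−1 is assigned to the 70%-110% range. All function names are hypothetical.

```python
from math import log10


def horwitz_rsd_percent(mass_fraction):
    """Horwitz reproducibility RSD (%) for a dimensionless mass fraction."""
    return 2 ** (1 - 0.5 * log10(mass_fraction))


def thompson_rsd_percent(mass_fraction):
    """Thompson's modification: constant 22% below 1.2e-7 (120 µg/kg)."""
    if mass_fraction < 1.2e-7:
        return 22.0
    if mass_fraction <= 0.138:
        return horwitz_rsd_percent(mass_fraction)
    return mass_fraction ** -0.5  # high-concentration branch, not relevant here


def trueness_percent(mean_measured, added):
    """Average measured concentration relative to the added concentration."""
    return 100.0 * mean_measured / added


def trueness_limits(level_ug_per_kg):
    """Accepted trueness range (%) as quoted in the text."""
    if level_ug_per_kg < 1.0:
        return (50.0, 120.0)
    if level_ug_per_kg <= 10.0:  # boundary handling is an assumption
        return (70.0, 110.0)
    raise ValueError("levels above 10 µg/kg are not covered in the text")


# Nitroimidazoles at 3 µg/kg correspond to a mass fraction of 3e-9.
rsd_rl_limit = thompson_rsd_percent(3e-9)
rsd_r_limit = round(rsd_rl_limit * 2 / 3, 1)  # assumed two-thirds rule
print(horwitz_rsd_percent(3e-9), rsd_rl_limit, rsd_r_limit)
```

At 3 µg kg−1 the unmodified Horwitz equation would allow a considerably higher RSD than 22%, which illustrates why the Thompson criterion is the more stringent choice at these levels.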

Decision limit (CCα) and detection capability (CCβ)
For a quantitative confirmatory method, the decision limit (CCα) and detection capability (CCβ) have to be established (European Commission 2002). Because zero tolerance applies for all PPAS, the decision limit CCα (α-error of 1%) indicates the lowest level at which substances can be detected and confirmed (European Commission 2002). Results above the CCα should be considered non-compliant. However, for NFs and CAP, a second CCα (α-error of 5%) has to be determined because these substances have a legally established reference point for action (RPA) (European Commission 2019). In this case, the CCα RPA is the limit at and above which it can be concluded with an error probability of α that the actual quantitative result of the sample is above the RPA. However, results below the RPA but above the CCα based on zero tolerance should be reported to the competent authority, as the authority has to retain a record of these findings in case of recurrence.
The CCα and CCβ based on zero tolerance were calculated with the calibration curve procedure according to ISO 11843 (European Commission 2002). According to the EURL guidelines, the CCα of a confirmatory method should be lower than the recommended detection concentration or the minimum required performance limit (MRPL), and thus the RPA (CRL 2007). In this validation, the method is considered applicable for the detection of PPAS if both CCα and CCβ are below the MRPL, RPA or concentration recommended by the EURL.
The CCα RPA and CCβ RPA were calculated according to the procedure described in Commission Decision 2002/657/EC (European Commission 2002) specified for registered veterinary drugs with a maximum residue limit. The procedure includes the reproducibility of spiked samples at the RPA.
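The two calculation routes can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact calculation used in the validation: it follows the general form given in Commission Decision 2002/657/EC, in which 2.33 and 1.64 are the one-sided normal quantiles for α = 1% and for α or β = 5%, and it uses a single within-laboratory reproducibility standard deviation for both steps. The function names and the example standard deviation are hypothetical.

```python
Z_ALPHA_1_PERCENT = 2.33  # one-sided normal quantile, alpha = 1%
Z_5_PERCENT = 1.64        # one-sided normal quantile, alpha or beta = 5%


def cc_zero_tolerance(conc_at_intercept, sd_reproducibility):
    """CCalpha and CCbeta for substances without a permitted limit.

    conc_at_intercept:  concentration corresponding to the y-intercept
    sd_reproducibility: within-laboratory reproducibility SD (same units),
                        used here for both steps as a simplification
    """
    cc_alpha = conc_at_intercept + Z_ALPHA_1_PERCENT * sd_reproducibility
    cc_beta = cc_alpha + Z_5_PERCENT * sd_reproducibility
    return cc_alpha, cc_beta


def cc_with_limit(limit, sd_at_limit):
    """CCalpha and CCbeta relative to an established limit such as the RPA."""
    cc_alpha = limit + Z_5_PERCENT * sd_at_limit
    cc_beta = cc_alpha + Z_5_PERCENT * sd_at_limit
    return cc_alpha, cc_beta


# Chloramphenicol RPA of 0.15 µg/kg with a hypothetical SD of 0.01 µg/kg.
cc_alpha_rpa, cc_beta_rpa = cc_with_limit(0.15, 0.01)
print(f"CCalpha RPA = {cc_alpha_rpa:.4f}, CCbeta RPA = {cc_beta_rpa:.4f} µg/kg")
```

In both routes CCβ lies above CCα by construction, so a method meets the applicability criterion only if its reproducibility at the relevant level is small enough.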
If one of the quantitative validation parameters does not comply for a certain substance, the method is not applicable for quantitative analysis of this specific substance. However, for such a substance a qualitative approach can be used. The detection capability can be determined by the investigation of fortified blank material at and above the decision limit. In this case, the concentration level at which only ≤ 5% false compliant results remain (n ≥ 20) equals the detection capability of the method (European Commission 2002). In other words, at least 20 of the 21 samples spiked at the relevant detection level should show a response with a minimum signal-to-noise ratio of 10.
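The qualitative criterion can be checked by counting the spiked samples that fall below the signal-to-noise threshold. The sketch below is illustrative; the function names and the signal-to-noise values are hypothetical, while the thresholds (signal-to-noise of at least 10, at most 5% false compliant results, at least 20 samples) are those given in the text.

```python
def false_compliant_rate(signal_to_noise, min_sn=10.0):
    """Fraction of spiked samples whose signal-to-noise ratio is below min_sn."""
    misses = sum(1 for sn in signal_to_noise if sn < min_sn)
    return misses / len(signal_to_noise)


def meets_detection_capability(signal_to_noise, min_sn=10.0,
                               max_rate=0.05, min_n=20):
    """True if enough samples were analysed and at most 5% were missed."""
    if len(signal_to_noise) < min_n:
        return False
    return false_compliant_rate(signal_to_noise, min_sn) <= max_rate


# Hypothetical data: 21 spiked samples, one of which falls below S/N = 10.
one_miss = [25.0] * 20 + [6.0]
two_misses = [25.0] * 19 + [6.0, 8.0]
print(meets_detection_capability(one_miss))
print(meets_detection_capability(two_misses))
```

One miss out of 21 gives a false compliant rate of about 4.8%, which passes; two misses give about 9.5%, which does not.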

Confirmation of identity
Another validation parameter relates to the confirmation of the substance identity. The substance identity is confirmed if the relative retention time of that substance deviates by no more than 2.5% from the average relative retention time of the same substance in the matrix matched calibration standards. Furthermore, the relative abundance of both product ions (ion ratio) should not deviate from that in the matrix matched calibration standards by more than the tolerances described in Commission Decision 2002/657/EC (European Commission 2002). Over 95% of the validation samples should comply with these confirmatory criteria, as required for a confirmatory analysis.
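The two confirmation criteria can be combined into a single check. The sketch below is illustrative only: the 2.5% retention time tolerance is taken from the text, whereas the ion ratio tolerances are the maximum permitted relative deviations for LC-MS/MS as we understand them from Commission Decision 2002/657/EC and should be verified against the Decision itself. The function names and example values are hypothetical.

```python
def ion_ratio_tolerance(reference_ratio_percent):
    """Maximum permitted relative deviation (%) of the ion ratio.

    Tolerance bands as understood from Commission Decision 2002/657/EC
    for LC-MS/MS; treat them as an assumption of this sketch.
    """
    if reference_ratio_percent > 50:
        return 20.0
    if reference_ratio_percent > 20:
        return 25.0
    if reference_ratio_percent > 10:
        return 30.0
    return 50.0


def identity_confirmed(rrt_sample, rrt_reference,
                       ratio_sample, ratio_reference,
                       max_rrt_deviation=0.025):
    """Check the relative retention time and ion ratio against the standards."""
    rrt_deviation = abs(rrt_sample - rrt_reference) / rrt_reference
    ratio_deviation = 100.0 * abs(ratio_sample - ratio_reference) / ratio_reference
    return (rrt_deviation <= max_rrt_deviation
            and ratio_deviation <= ion_ratio_tolerance(ratio_reference))


# Hypothetical sample: 1% retention time shift, ion ratio 42% against 40%.
print(identity_confirmed(1.01, 1.00, 42.0, 40.0))
# Hypothetical sample with a 4% retention time shift.
print(identity_confirmed(1.04, 1.00, 42.0, 40.0))
```

The reference values would be the averages obtained from the matrix matched calibration standards analysed in the same series.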

Stability
For quantitative analysis, it is important to study the stability of the substances included. The stability in the stock solution was determined and evaluated following the procedure and criteria proposed by Berendsen et al. (2011). In addition, the stability in the final extract was established by reanalysing seven extracts (of one day only, at 1 times validation level) after one and a half weeks of storage in the freezer. For the extracts to be accepted as stable, the trueness and RSD r* still need to comply with the validation criteria mentioned before, confirming that no significant degradation has occurred.

Robustness
The robustness of the method has become apparent from the method development, whereby minor and major variations have been made and the influences are discussed in the method development section.

Additional validation
An additional validation was performed for caprine milk at validation level, using seven different caprine milk samples. Caprine milk was quantified based on a matrix matched calibration line made from bovine milk because it would be far more efficient in practice if bovine and caprine milk can be run in a single series with a single set of quality control samples. The validation parameters selectivity, trueness, repeatability including matrix variation (RSD r* ) and confirmation of the identity were determined and assessed as described for bovine milk. The within-laboratory reproducibility (RSD RL ) was estimated based on the RSD r* , using the equation RSD RL = 1.6 times RSD r* (van Reeuwijk 1998). Finally, if all validation parameters comply, the CCα is considered to be below the validation level and CCβ will be equal to the validation level because the additional validation included insufficient data points for a calculation of CCα and CCβ.

Method development
Three different experiments were performed to develop a method for the analysis of PPAS in milk. The first experiment studied three different clean-up procedures: A, B and C (Figure 1). In procedure A, we encountered practical issues during the removal of the high water content after derivatisation. The water content was removed by MgSO4 present in two extraction salt packages. However, this resulted in agglomeration of the MgSO4, even with the use of glass beads, which complicated mixing and homogenising of the sample solution. Practical issues were also encountered in procedure B. In this procedure, the water was not removed but separated from the acetonitrile by using NaCl. Unwanted gelation occurred in the acetonitrile layer. The gel formation made the acetonitrile transfer impractical and reduced the effective volume that could be transferred and used in the next steps. No practical issues were encountered during the sample preparation of procedure C, making it the preferred procedure (from the practical perspective) even though this procedure consists of more steps, including a time-consuming ethyl acetate liquid-liquid extraction. Time can be saved for a large sample series by rapidly freezing (approx. 20 minutes) the aqueous part at < −70 °C and immediately decanting the ethyl acetate layer.
The absolute recoveries of the three procedures for all substances except NFs were compared (Figure 2). The NF recoveries could not be established because no derivatised NF metabolite standards were available. The evaluation of the NFs was performed based on the signal-to-noise ratios, but as the signal-to-noise ratio is relevant to obtain detection limits as low as achievable, signal-to-noise ratios for all substances were evaluated (Supplementary information (SI) Table S1). For all substances, absolute recoveries (NFs are not included) of procedure B were lower than for procedures A and C (Figure 2). Absolute recoveries of procedures A and C were of the same order of magnitude. Note that DMZ, MNZ-OH, HMMNI, DNSH and CAP generally showed relatively low signal-to-noise ratios and, therefore, a sufficient recovery is crucial for these substances in particular. Even though absolute recoveries for some substances were below 65%, this was sufficient to detect them at the relevant levels. The absolute recoveries of procedure A are comparable with the results published by Kibechu and Sichilongo (2012), who used a similar AOAC clean-up (only without hydrolysis and derivatisation) in milk. The absolute recoveries in that research vary between 33% and 71% for MNZ, RNZ, DMZ, CAP, DAP and CPZ (Kibechu and Sichilongo 2012). In our method, only for CPZ was the absolute recovery considerably higher in procedure A than in procedure C. Nevertheless, for all substances procedure C showed higher or comparable signal-to-noise ratios compared to procedure A, also for the NFs (SI Table S1). Based on this observation and the limited practicability of procedure A, procedure C was preferred. Using procedure C, detection of all PPAS at low levels was achievable. Therefore, the focus was on further optimisation of procedure C.
In procedure C, an AOAC SPE kit was included in the sample preparation. Using the AOAC SPE kit is not straightforward in an aqueous environment because some substances included in this method might have affinity to the C18 material present in the kit. Nevertheless, the first experiment surprisingly showed that the absolute recoveries and signal-to-noise ratios of the substances were sufficient for this method to detect all substances at relevant levels. To improve the method even further, the influence of PSA, C18 and MgSO4 in the kit was studied. In most combinations of the individual kit constituents, an unwanted gel was formed during the ethyl acetate liquid-liquid extraction. Similarly, a thick gel was formed during the sample preparation without the addition of PSA, C18 or MgSO4. This set-up (a hydrolysis and derivatisation followed by liquid-liquid extraction with ethyl acetate, without the addition of PSA, C18 or MgSO4) is similar to the method published by Kaufmann et al. (2015). Kaufmann et al. also observed an emulsion in the ethyl acetate layer using milk as matrix, but in the published method the liquid-liquid extraction was followed by a further clean-up using SPE, even though the final extracts could still be very cloudy. On the other hand, in the method published by Rodziewicz (2008), sample preparation of milk was also performed with a hydrolysis and derivatisation followed by liquid-liquid extraction with ethyl acetate. The author did not mention any practical issues; this might be because the raw milk was centrifuged and the upper fat layer was removed beforehand (Rodziewicz 2008). Fat removal from raw milk was also performed by Alkan et al. (2016) prior to hydrolysis, derivatisation and liquid-liquid extraction, and again no practical issues were mentioned.
Figure 2. Absolute recoveries of procedure A (white), B (light grey) and C (black) for PPAS at 1 µg kg−1; chloramphenicol at 0.3 µg kg−1. The nitrofurans are not included because no derivatised marker metabolite standards were available.
In our second experiment only four combinations of constituents resulted in a practical sample preparation: the AOAC SPE kit, self-mixed PSA, C18 and MgSO4, the combination of C18 and MgSO4, and only C18. Apparently, the gel formation is reduced by the presence of C18, except for the combination of PSA and C18. No explanation has been found for this observation.
The absolute recoveries of the four combinations without practical implications are presented in Figure 3. In general, the absolute recoveries are lower in the sample preparation which included only C18, as could be expected because most substances have affinity with C18 in an aqueous environment. In particular, it can be expected that the absolute recovery of CPZ is negatively influenced since this substance is the most lipophilic substance included. Indeed, lower absolute recoveries of CPZ were observed for the sample preparation with only C18 (21 ± 0.2%) and also with C18 and MgSO4 (20 ± 2%) compared to absolute recoveries of the sample preparation with the AOAC SPE kit (57 ± 15%) or self-mixed PSA, C18 and MgSO4 (45 ± 2%) (Figure 3). For CPZ, the negative effect of C18 material on the absolute recovery was statistically confirmed (p-value = 0.00002). However, the combination of all three constituents resulted in higher absolute recoveries compared to using only C18 or C18 and MgSO4. Apparently, PSA influences the affinity of CPZ to C18 material, as becomes apparent from the significant interaction of C18 and PSA (p-value = 0.0018). Thus, the combination of PSA, C18 and MgSO4 is needed to prevent gel formation and it results in the overall highest absolute recoveries. In addition, the signal-to-noise ratios as shown in SI Table S2 hardly differ regardless of the constituents used in the preparation. Therefore, the sample preparation with the AOAC SPE kit, or with the cheaper alternative of self-mixed PSA, C18 and MgSO4, is preferred.
In experiment three, the liquid-liquid extraction was optimised. For both the volume and the number of replications, the absolute recovery was determined, as well as the signal-to-noise ratio (Figure 4 and SI Table S3). The absolute recovery is in general higher when using a second extraction and slightly higher using two times 8 mL instead of two times 6 mL ethyl acetate. Remarkably, CPZ shows an unexpected result. For this substance, the single extraction shows higher absolute recoveries and signal-to-noise ratios compared to the duplicated extraction. No explanation has been found for this observation. However, the negative influence of the double extraction on the CPZ response does not outweigh the positive effect for all other PPAS. Therefore, the final optimised method consists of a double liquid-liquid extraction using 8 mL ethyl acetate per extraction.
Figure 3. The average absolute recovery of sample preparation with the AOAC SPE kit (white) or this kit replaced by 400 mg PSA, 400 mg C18 and 1200 mg MgSO4 (light grey), by 400 mg C18 and 1200 mg MgSO4 (dark grey) or by 400 mg C18 (black), for all PPAS at validation level and half the validation level, except the nitrofurans. The error bars indicate ± standard deviation (n = 2).

Validation
The aim of the validation was to assess the quantitative confirmatory aspect of the method. During the validation, blank bovine milk samples were spiked at 0.5, 1.0 and 1.5 times validation level. An example of chromatograms from bovine milk spiked at 1.0 times validation level is presented in SI Figure 1 for all PPAS. Table 2 presents an overview of the validation results for bovine milk. The validation parameters were calculated on the basis of 21 results per validation level (seven samples per validation level on three different days). Only for IPZ, AOZ, AHD and SEM, 20 results were used at 0.5 times validation level because a single sample showed severe chromatographic fluctuations. These fluctuations resulted in substances eluting outside of the detection window, so no data were obtained. For subsequent analyses, the scheduled MRM windows were broadened to overcome this problem.

Selectivity
Selectivity was assessed by investigating the signal of the blank materials for interferences. No interfering signals were observed in the blank samples at the retention times corresponding to the transitions of the substances to be validated. Therefore, the selectivity of the method is considered sufficient.

Linearity
The matrix-matched calibration line was injected at the beginning and at the end of the sample series. The linearity of the line was expressed by the coefficient of correlation. In all cases, the coefficient of correlation complied with the criterion: the values were between 0.992 and 1.000. For this reason, the linearity of all substances is accepted in the range between 0.25 and 2 times the validation level.
Table 2 shows the results of the validation parameters trueness, true repeatability (RSDr), repeatability including matrix variation (RSDr*) and within-laboratory reproducibility (RSDRL). Following the validation criteria in Commission Decision 2002/657/EC (European Commission 2002), a trueness of 70%-110% is accepted for levels between 1 and 10 µg kg−1 and a trueness of 50%-120% for levels lower than 1 µg kg−1. For all substances except CPZ the trueness was between 95% and 107%, and thus the trueness complies with the criteria for all substances other than CPZ. The trueness of CPZ was above 120% for two validation levels (1 µg kg−1 and 1.5 µg kg−1). This criterion is probably exceeded because of the sub-optimal sensitivity of one LC-MS/MS instrument for CPZ. Therefore, the validation did not prove the applicability of the method to quantify CPZ. However, CPZ can be analysed qualitatively using the reported method. Overall, it can be stated that the qualitative screening of PPAS is more important than the quantification because the use of PPAS is prohibited altogether. In case of a suspect sample for CPZ, a quantitative result can be obtained using a validated single-substance method or by applying multiple-level standard addition. The RSDr was between 2.2% and 13.5% for all substances and at all validation levels, thus below the limit of 14.7%. Therefore, the RSDr is accepted. Also, the RSDr* and RSDRL comply with the established criteria for all substances and at all levels. (Kaufmann et al. 2015)
Table 2. The validation results in bovine milk: trueness, repeatability (RSDr), repeatability including matrix variance (RSDr*), within-laboratory reproducibility (RSDRL), decision limit (CCα), detection capability (CCβ) based on zero tolerance, and CCα RPA and CCβ RPA based on the RPA. n is the number of samples used for the calculations and is only presented if lower than 21. The underlined values exceed the established criteria.
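The acceptance checks on trueness and precision described above can be illustrated with a short Python sketch. It is a simplified illustration, not the calculation used in this study: the function names and the example concentrations are hypothetical, the trueness windows and the 14.7% limit are those quoted in the text, and the precision is a plain relative standard deviation over all results at one level rather than the separate RSDr, RSDr* and RSDRL estimates reported in Table 2.

```python
from statistics import mean, stdev


def trueness_window(level_ug_per_kg):
    """Accepted trueness range (%) as quoted in the text from Commission Decision 2002/657/EC."""
    if level_ug_per_kg < 1.0:
        return (50.0, 120.0)
    if level_ug_per_kg <= 10.0:
        return (70.0, 110.0)
    raise ValueError("levels above 10 ug/kg are not covered by this sketch")


def evaluate_level(measured, spiked_level, rsd_limit=14.7):
    """Compare the results at one spiking level with the criteria.

    measured     -- measured concentrations (ug/kg), hypothetical example data
    spiked_level -- nominal spiked concentration (ug/kg)
    rsd_limit    -- precision limit in %, taken from the text (14.7%)
    """
    average = mean(measured)
    trueness = 100.0 * average / spiked_level   # mean found / spiked, in %
    rsd = 100.0 * stdev(measured) / average     # simple overall RSD, in %
    low, high = trueness_window(spiked_level)
    return {
        "trueness": trueness,
        "rsd": rsd,
        "trueness_ok": low <= trueness <= high,
        "rsd_ok": rsd <= rsd_limit,
    }


# Hypothetical results for one substance spiked at 1.0 ug/kg
result = evaluate_level([0.95, 1.00, 1.05, 1.02, 0.98, 1.01, 0.99], 1.0)
print(result)
```

A substance whose mean result lies outside the window, as observed for CPZ at 1 and 1.5 µg kg−1, would return `trueness_ok` as `False` in this sketch.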

Decision limit (CCα) and detection capability (CCβ)
For all PPAS that complied with the quantitative performance criteria, CCα and CCβ were determined based on a zero tolerance. The results are presented in Table 2. CCα and CCβ were below the lowest validation level and thus below the recommended concentrations (CRL 2007) and/or RPAs (European Commission 2019). Therefore, the method is applicable for analysis of all these PPAS in bovine milk. The validation of CPZ showed a deviating trueness at two levels. However, CCβ could be established because the validation showed 21 (100%) non-compliant results at 0.5 times validation level. Therefore, CCβ is equal to or below 0.5 µg kg−1. Consequently, the validation proved that the method is applicable for qualitative analysis of CPZ in bovine milk.
The CCα and CCβ results of NFs and CAP in the reported method are slightly higher compared to the validation results of NFs and CAP in milk published by Kaufmann et al. (Kaufmann et al. 2015). However, that method included only two substance classes (NFs and CAP), whereas the method described in this study includes five different classes. In addition, the CCα and CCβ values described in this study are low enough to detect the PPAS at the required or recommended levels. On the other hand, the CCα and CCβ values presented in this study show a strong improvement compared to the multi-method published by Kibechu et al., which includes NIZ, CAP, DAP and CPZ (NFs not included) and has LOQ values between 6 and 37 μg kg−1 (Kibechu and Sichilongo 2012).
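The reasoning used for CPZ, where CCβ is set at or below a spiking level because all spiked samples were found non-compliant, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the 5% limit on false compliant results is the β error from the definition of CCβ in Commission Decision 2002/657/EC, which is not stated explicitly in this section.

```python
def ccbeta_at_or_below(detected, max_false_compliant=0.05):
    """Return True if CCbeta can be taken as equal to or below the spiking level.

    detected            -- one boolean per spiked sample, True if it was found
                           non-compliant (i.e. the substance was detected)
    max_false_compliant -- accepted share of false compliant results (beta);
                           0.05 is assumed here from the definition of CCbeta
    """
    if not detected:
        raise ValueError("at least one spiked sample is required")
    missed = sum(1 for found in detected if not found)
    return missed / len(detected) <= max_false_compliant


# 21 of 21 samples detected at 0.5 times validation level, as reported for CPZ
print(ccbeta_at_or_below([True] * 21))
```

With 21 samples, a single missed detection (about 4.8%) would still satisfy the assumed 5% limit, whereas two missed detections would not.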

Confirmation of identity
Following Commission Decision 2002/657/EC (European Commission 2002), the identity was confirmed for all substances, in all validation samples. Therefore, the validation proved that the method is applicable to confirm the identity of the PPAS at relevant levels.
It should be noted that confirmation of the identity of SEM does not automatically confirm the use of nitrofurazone at dairy farms, although SEM is the marker metabolite mentioned in the guidelines (CRL 2007). The specificity of SEM as marker metabolite for nitrofurazone has been under debate for years (EFSA 2015; Abernethy 2015; Stadler et al. 2015). SEM has been detected despite no nitrofurazone being used, especially in processed milk products. SEM in dairy products might be a by-product formed during the manufacturing process. So far, SEM has not been detected as a by-product in raw milk samples, which allows us to use SEM as a marker metabolite for nitrofurazone in this method, which is specific for raw milk samples (Stadler et al. 2015).

Stability
The stability was tested for the reference standards in solution and in the final extract. The NIZs, NFs (except DNSH) and DAP proved to be stable in stock solution for at least one year and CAP for at least six months when stored in the fridge. CPZ is stable for at least two years when stored in the freezer and DNSH is stable for at least one month in the freezer.
The stability in extracts was tested by reanalysing seven final extracts after 1.5 weeks storage in the freezer. For all substances, the trueness was between 96% and 108% and the RSD r* between 1.4% and 10.3% and both are acceptable: no relevant degradation is observed. Therefore, the final extracts are stable for at least one and a half weeks stored in the freezer.

Additional validation
The method was additionally validated for caprine milk, and the results are presented in Table 3. The selectivity of the method is considered sufficient because no interfering signals were observed in the blank samples. The trueness of all substances complied with the criteria except for MNZ and TNZ. The RSD r* complies for all substances and the RSD RL complies for all substances, except for MNZ-OH. Therefore, MNZ, TNZ and MNZ-OH cannot be analysed quantitatively, but can be analysed qualitatively using the reported method. The CCα and CCβ could not be calculated, due to the limited number of data points. However, all substances were detected (with signal-to-noise ratio > 10) at validation level and thus it is concluded that CCβ is equal to or below the validation level. Consequently, CCα is in all cases below the validation level. The identity of all substances was confirmed in all cases and therefore the method is suitable for confirmation of the identity of the PPAS.

Validation performances
The reported method for the analysis of PPAS in milk is relatively time-consuming. However, the complete validation showed that all PPAS of five different substance classes could be detected and confirmed in caprine and bovine milk at relevant levels within a single method. In addition, 13 out of 16 PPAS could also be quantified in caprine milk and 15 out of 16 PPAS could be quantified in bovine milk.

Disclosure statement
No potential conflict of interest was reported by the author(s).