Detection of SARS-CoV-2 in wastewater as an early warning in the Metropolitan Area of Buenos Aires City (BAMA)

ABSTRACT Agua y Saneamientos Argentinos S.A. (AySA) provides essential services, covering drinking water production and wastewater treatment, to a population of more than 14.5 million inhabitants in the Metropolitan Area of Buenos Aires (BAMA), Argentina. In response to the declaration of the COVID-19 pandemic by the World Health Organization (WHO), AySA developed a methodology to assess the SARS-CoV-2 viral genetic load in untreated wastewater. This approach aimed to harness the potential of wastewater-based epidemiological surveillance, drawing on international experience. To monitor the viral load in representative sections of the sewage collection system, we used an adapted ultracentrifugation technique to concentrate the samples. RNA extraction and RT-qPCR were then performed to quantify the SARS-CoV-2 Orf1ab gene. The study ran from December 2020 to June 2021, in order to anticipate the second wave of the pandemic. By analyzing data from four large wastewater treatment plants, we identified statistically significant associations between the log10 viral genomic load and the log10 positive cases reported one or two weeks later. Based on these results, we conclude that virus levels in sewage could be a good predictor of clinical cases to be diagnosed in the immediate future.


Introduction
At the onset of 2020, the WHO declared the COVID-19 epidemic a public health emergency of international concern and, on March 11th, characterized it as a pandemic [1]; different strategies for the epidemiological surveillance of the infection then took on greater relevance. Before the first cases of infected patients were registered in the Netherlands, scientists from the KWR Water Research Institute [2] had already detected traces of coronavirus in wastewater. This finding corroborated the understanding that, although SARS-CoV-2 primarily spreads through the air, it can also be excreted in feces and urine by both symptomatic and asymptomatic individuals [3][4][5][6].
Studies and meta-analyses indicate that the prevalence of SARS-CoV-2 in patients' feces ranges from 15.3% to 83.3% [7][8][9]. Furthermore, viral shedding in stool was detected in 48.1% of patients and could persist for 33 days or more after illness onset, even after respiratory specimens had become negative for viral RNA [7,10].
Signals from small RNA regions of about 100 bases, as detected by SARS-CoV-2 PCR methods, are likely to outlast RNA genomes and intact virions [11]. The RNA decay of various non-enveloped viruses in different matrices, such as domestic and hospital wastewater, has also been studied, and some studies have investigated the decay of SARS-CoV-2 in wastewater as a function of temperature, placing the T90 (time required for a 1-log10 reduction) between 8.04 and 27.8 days [12].
Although wastewater-based epidemiology (WBE) is less sensitive than clinical testing and largely dependent on the viral load in patients' feces, this method could be superior to clinical testing in the early stages of COVID-19 infections in highly populated places, especially in populations with high percentages of individuals showing mild symptoms or no symptoms at all [2,12,13].
WBE has a long history of successful application in poliovirus eradication surveillance, hepatitis A/E surveillance, detection of illicit drugs, and antimicrobial resistance monitoring (ARM), and it is broadly used for non-enveloped viruses excreted in fecal matter and urine [14][15][16][17][18][19].
In April 2020, researchers in Spain reported the first detection of SARS-CoV-2 RNA in untreated wastewater samples collected from six wastewater treatment plants in Murcia, the region with the lowest prevalence in the Iberian Peninsula. Environmental surveillance data were compared with COVID-19 cases reported at the municipal level, revealing that community members were shedding SARS-CoV-2 RNA in their feces even before the first cases were reported by local or national authorities in many of the cities where sewage samples were taken. In the Valencia region of Spain, SARS-CoV-2 RNA was detected in sewage samples when reported cases in that region were only incipient. A rapid increase in viral RNA was also evident, anticipating the subsequent increase in the number of reported cases [6,20,21]. Sewage is a source of information that provides data on human health, which could be used as a tool to refine the public health response to a pandemic caused by a pathogen. Sewage surveillance provides an additional tool for assessing the presence and prevalence of infectious diseases when clinical testing capacity is limited. In addition, the incorporation of population-wide data can provide information for modeling schemes [22][23][24][25].
AySA is a publicly managed company in charge of providing drinking water and sanitation services to a densely populated region of Argentina. The work described here was carried out entirely in our Molecular Biology laboratory, set up with the company's own resources and run by professionals belonging to the same company.
With this insight, the WBE strategy could be an effective disease surveillance tool to analyze the spread of the virus within a settled population [6,10]. For that reason, our objectives were to develop a molecular method to detect and quantify the SARS-CoV-2 viral load in wastewater in the BAMA and to implement longitudinal wastewater surveillance as a possible complementary tool for the public health system.
We studied the spread of SARS-CoV-2 infection in the BAMA, a region with a population of approximately 15 million inhabitants [26,27], and designed a simple sampling plan that included the inlets of four treatment plants. Viral load was assessed in our laboratory using an adapted ultracentrifugation technique as the concentration method, followed by RNA extraction and quantitative reverse transcription-polymerase chain reaction (RT-qPCR) targeting the Orf1ab gene of SARS-CoV-2 [20,28].

Sampling sites and sample collection
The present study covers the period between December 2020 and June 2021 in the Metropolitan Area of Buenos Aires (BAMA), Argentina (Figure 1), composed of 14.  [26,27]. For each influent point in these wastewater treatment plants, the retention time was less than 22 h.
To avoid dilution effects, sampling was not scheduled on rainy days.
As Escherichia coli (E. coli) is a known indicator of fecal contamination [29], it was measured to identify the optimal time range for sampling. At each sampling point, 3 liters of grab sample were taken between 7:30 AM and 11:30 AM. Each sample was divided into 1 liter for molecular analysis and 2 liters for additional determinations: biological oxygen demand (BOD5), chemical oxygen demand (COD), total suspended solids (TSS), total hydrocarbons (HC), oil and grease, methylene blue active substances (MBAS), phenolic compounds, and pH in situ (Table 1), which were used to characterize the influent. E. coli was determined by the multiple tube fermentation technique [30] and the results were expressed as most probable number per 100 ml (MPN/100 ml). Samples were transported and preserved at 4°C in our Central laboratory until the concentration step.

Wastewater concentration
The samples were concentrated on the same day of the collection.
Poly aluminum chloride (PAC) was chosen as an alternative to AlCl3, which is widely indicated for the concentration of enteric viruses in wastewater [14,31] but was difficult to import into Argentina during the pandemic period.
The concentration process consisted of an adapted ultracentrifugation method. Briefly, an aliquot of 350 ml was taken from the homogenized sample and placed in a 500 ml centrifuge tube (Corning, ref. 431123), and 3.5 ml of PAC solution (0.9% w/v in aluminum) was added.
The mixture was homogenized by vortexing (Vortexer) and the pH was adjusted to 6.0 with 1 N HCl or 2.5 N NaOH. It was then centrifuged at 1700 × g for 20 minutes at 15°C.
The supernatant was discarded and the pellet was resuspended in 8.75 ml of 3% w/v beef extract solution (Merck) by shaking at 1000 rpm for 1 minute, to elute the viral particles attached to solid matter. In addition, the high protein concentration of the beef extract facilitates the flocculation of viruses during precipitation [5,6,32,33,34].
A second centrifugation was performed at 1900 × g for 30 minutes at 15°C and the supernatant was discarded again. The new pellet was resuspended in 1.75 ml of phosphate-buffered saline (PBS) and homogenized by shaking at 1000 rpm for 1 minute. The concentrate was transferred to nuclease-free tubes and stored at 4°C, and RNA extraction was performed on the same day.
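The volumes given in this protocol imply a nominal volume reduction that can be checked with a few lines of Python. This is only a sketch: the function name is ours, and the result is a purely volumetric factor that assumes no virus is lost in the discarded supernatants, which the process control described later is meant to verify.

```python
def concentration_factor(start_volume_ml: float, end_volume_ml: float) -> float:
    """Nominal volume-reduction factor between two steps (no losses assumed)."""
    if start_volume_ml <= 0 or end_volume_ml <= 0:
        raise ValueError("volumes must be positive")
    return start_volume_ml / end_volume_ml


# Volumes taken from the protocol described in the text.
ALIQUOT_ML = 350.0       # homogenized wastewater aliquot
BEEF_EXTRACT_ML = 8.75   # first resuspension (3% w/v beef extract)
PBS_ML = 1.75            # final resuspension in PBS

first_step = concentration_factor(ALIQUOT_ML, BEEF_EXTRACT_ML)
overall = concentration_factor(ALIQUOT_ML, PBS_ML)
print(f"first step: {first_step:.0f}x, overall: {overall:.0f}x")
```

Under these assumptions, the 350 ml aliquot ends up in 1.75 ml of PBS, a nominal 200-fold reduction in volume.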

RNA extraction
The RNA extraction was carried out with the Viral Nucleic Acid Extraction Kit II (GENEAID) following the manufacturer's instructions, referred to here as the 'Original protocol'. Briefly, 200 μL of the concentrated sample was homogenized and mixed with 400 μL of lysis buffer for 1 minute and incubated at room temperature for 10 minutes, followed by centrifugation for 5 minutes at 10,000 × g. The pellet was discarded and the supernatant was processed according to the manufacturer's instructions; RNA was eluted with 50 μL of ultrapure water and stored at −20°C until processing. The RNA extract was characterized with a microvolume spectrophotometer (Nanodrop®, ThermoFisher) by measuring the A260/A280 ratio.
To counter possible inhibitors present in the sample, 25 μL of 40% polyvinylpyrrolidone (PVP) (Sigma) was added to the concentrate and a heating stage of 10 minutes at 75°C was included prior to the centrifugation stage, after which the manufacturer's instructions were followed. We call this strategy the 'Adapted protocol'. The Nanodrop microvolume spectrophotometer (ThermoFisher) was used to measure A260, A280, and A230. In addition, this equipment quantifies the RNA obtained in the extraction, in ng/ml, according to the Beer-Lambert law: C = A / (ε · b), where C = concentration in ng/ml, A = absorbance, ε = molar absorptivity, b = length of the light path and for RNA.
RNA purity was expressed as the A260/280 index. According to the device instructions, an A260/280 index between 1.8 and 2.2 indicates RNA/DNA of adequate purity. For the A260/230 index, a value below 1.8 indicates co-extracted contaminating substances that could inhibit the subsequent PCR.
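The purity criteria above can be expressed as a small check. The sketch below is illustrative only: the class and function names are ours, the thresholds are the ones quoted in the text (1.8 to 2.2 for A260/280, below 1.8 for A260/230 as a warning sign), and the absorbance readings in the example are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PurityReport:
    a260_a280: float
    a260_a230: float
    adequate_purity: bool       # A260/280 within 1.8 to 2.2
    possible_inhibitors: bool   # A260/230 below 1.8


def assess_purity(a260: float, a280: float, a230: float) -> PurityReport:
    """Compute both purity indexes from raw absorbance readings."""
    if a280 <= 0 or a230 <= 0:
        raise ValueError("A280 and A230 must be positive")
    ratio_280 = a260 / a280
    ratio_230 = a260 / a230
    return PurityReport(
        a260_a280=ratio_280,
        a260_a230=ratio_230,
        adequate_purity=1.8 <= ratio_280 <= 2.2,
        possible_inhibitors=ratio_230 < 1.8,
    )


# Invented readings, for illustration only.
print(assess_purity(a260=1.0, a280=0.5, a230=0.5))
print(assess_purity(a260=1.0, a280=0.8, a230=1.0))
```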

Viral detection and quantification
Viral RNA was detected with the TaqMan 2019 nCoV Assay Kit v1 (Applied Biosystems) in a StepOnePlus instrument (Applied Biosystems). The TaqMan 2019 nCoV Control Kit v1 (Applied Biosystems) was used as a positive control. The negative control was molecular biology grade water (Applied Biosystems).
The reaction volume was 25 μL. Each RNA extract, as well as the positive and negative controls, was processed in duplicate.
The RNA quantification (measured as the number of gene copies/L) was obtained by interpolating the quantification cycle (Ct) on a calibration curve built from 5 duplicate serial dilutions of the TaqMan 2019 nCoV Control Kit v1 (10,000 copies/μL).
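Interpolating a Ct on a calibration curve amounts to fitting Ct against log10(copies) and inverting the fitted line. The sketch below uses only the Python standard library. The dilution series and Ct values are invented for illustration (they are not the study's data), and the efficiency formula 10^(−1/slope) − 1 is the usual qPCR convention rather than something stated in the text.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_standard_curve(copies_per_ul, ct_values):
    """Return slope, intercept and amplification efficiency of the curve."""
    log_copies = [math.log10(c) for c in copies_per_ul]
    slope, intercept = fit_line(log_copies, ct_values)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency


def ct_to_copies_per_ul(ct, slope, intercept):
    """Invert the standard curve to get copies/uL from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: five dilutions, each in duplicate (invented Ct values).
dilutions = [10000, 2000, 400, 80, 16]
copies = [c for c in dilutions for _ in range(2)]
cts = []
for c in dilutions:
    ideal = 40.0 - 3.32 * math.log10(c)
    cts.extend([ideal + 0.05, ideal - 0.05])

slope, intercept, efficiency = fit_standard_curve(copies, cts)
print(f"slope={slope:.2f} intercept={intercept:.2f} efficiency={efficiency:.1%}")
```

Converting the interpolated copies/μL into gene copies/L of wastewater additionally requires the template, elution, and concentrate volumes, which are not all specified here, so that step is left out of the sketch.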

Data analysis
To assess the possible relationship between the quantified SARS-CoV-2 RNA and the positive diagnosed cases, two variables were defined:
a. 'Viral load' (independent variable): defined as the base-10 logarithm of the product of the ORF1ab gene concentration (number of gene copies/L) and the daily flow at the WWTP inlet on the sample collection day (m3/d).
b. 'Diagnosed cases' (dependent variable): defined as the base-10 logarithm of the sum of positive cases diagnosed in a given interval of days [35,36].
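The two variables defined above can be written down directly. The following is a minimal sketch with hypothetical function names. By default it takes the product literally as stated in the text; the optional `unit_factor` (for example 1000 L/m3, to express the load in copies per day) is our assumption and only shifts the variable by a constant, so it would not change a regression's slope or R2.

```python
import math


def viral_load_log10(gene_copies_per_l: float,
                     daily_flow_m3_per_day: float,
                     unit_factor: float = 1.0) -> float:
    """log10 of (ORF1ab concentration x daily inlet flow).

    unit_factor=1.0 reproduces the product as written in the text;
    unit_factor=1000.0 (L per m3) is an optional assumption.
    """
    product = gene_copies_per_l * daily_flow_m3_per_day * unit_factor
    if product <= 0:
        raise ValueError("concentration and flow must be positive")
    return math.log10(product)


def diagnosed_cases_log10(daily_cases) -> float:
    """log10 of the sum of positive cases over an interval of days."""
    total = sum(daily_cases)
    if total <= 0:
        raise ValueError("the interval must contain at least one case")
    return math.log10(total)


# Invented numbers, for illustration only.
print(viral_load_log10(1e5, 1e4))
print(diagnosed_cases_log10([10] * 10))
```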
For each treatment plant (each considered separately), different linear regressions were performed, associating positive clinical cases with viral load, by served region.
Linear regression was expected to be an adequate means to compare different time intervals and to analyze the degree of correlation between the variables [37].
For each WWTP, different intervals of initial and central days were tested. Considering the sample collection day as 'day zero', 'Initial day' was defined as the beginning of the interval, while 'Central day' was defined as the center of said interval.
Regressions were analyzed considering interval start days from day −20 to day +18 and interval durations between 1 and 22 days.
Given that the definition of the said interval had a direct impact on the 'Diagnosed cases' variable, 858 intervals were considered for each WWTP (39 possible start days × 22 possible durations). Taking the sample collection day as 'day zero', these intervals comprised all possible intervals with a start day from day −20 up to day +18 and a duration ranging from 1 day to 22 days of accumulated positive cases. Linear regression models were built in RStudio® for all the datasets to identify possible outliers for each WWTP. Studentized residuals and Cook's distances were analyzed for each model. Once the outliers were identified, both the datasets and the linear regressions were rebuilt.
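The interval scan described here can be sketched as follows. The original analysis was done in RStudio; this Python version is only an illustration, with function names and input layout (a list of sampling days with their log10 viral load, and a dictionary of daily cases) chosen by us. For a simple linear regression R2 equals the squared Pearson correlation, which is what `r_squared` computes. Outlier screening with studentized residuals and Cook's distances is not reproduced.

```python
from __future__ import annotations

import math


def candidate_intervals(first_start=-20, last_start=18, min_days=1, max_days=22):
    """All (initial day, duration) pairs relative to the sampling day."""
    return [(start, days)
            for start in range(first_start, last_start + 1)
            for days in range(min_days, max_days + 1)]


def central_day(start: int, days: int) -> float:
    """Center of an interval that begins at `start` and lasts `days` days."""
    return start + (days - 1) / 2


def r_squared(x, y) -> float:
    """Coefficient of determination of a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def interval_r_squared(samples, cases_by_day, start, days):
    """R2 between log10 viral load and log10 cases summed over one interval.

    samples: list of (sampling_day, log10_viral_load)
    cases_by_day: dict mapping day index -> positive cases on that day
    Returns None when an interval contains no cases (log10 undefined).
    """
    loads, cases = [], []
    for day0, load in samples:
        total = sum(cases_by_day.get(day0 + start + k, 0) for k in range(days))
        if total <= 0:
            return None
        loads.append(load)
        cases.append(math.log10(total))
    return r_squared(loads, cases)


def scan_intervals(samples, cases_by_day):
    """R2 for every candidate interval that can be evaluated."""
    results = {}
    for start, days in candidate_intervals():
        r2 = interval_r_squared(samples, cases_by_day, start, days)
        if r2 is not None and not math.isnan(r2):
            results[(start, days)] = r2
    return results


print(len(candidate_intervals()))
```

The printed count is 858, matching the number of intervals per WWTP given in the text. Sorting the output of `scan_intervals` by R2 would show how the strength of the association depends on the chosen interval.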
The objective of considering this range of intervals was to visualize how the strength of the correlation depends on the interval.
In addition, the qualitative Ct results were mapped to a color scale related to the viral load, ranging from green (low load) to red (high load): green (low load) for Ct between 34.1 and 37, yellow (medium load) for Ct between 29.5 and 34, and red (high load) for Ct < 29.5.
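The color scale can be expressed as a simple lookup. In this sketch the function name is ours, and values that fall outside the stated ranges (Ct above 37, or strictly between 34 and 34.1) are returned as unclassified, since the text does not say how they were handled.

```python
from typing import Optional


def ct_to_color(ct: float) -> Optional[str]:
    """Map a Ct value to the heat-map color given in the text.

    Returns None for Ct values outside the ranges stated in the text.
    """
    if ct < 29.5:
        return "red"      # high viral load
    if 29.5 <= ct <= 34.0:
        return "yellow"   # medium viral load
    if 34.1 <= ct <= 37.0:
        return "green"    # low viral load
    return None           # not covered by the stated ranges


for value in (28.0, 31.2, 35.5, 38.0):
    print(value, ct_to_color(value))
```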

Process control
Since virus losses can occur at different stages of the process, a process control virus, mengovirus (MGV) (MGV standard, CEERAMTOOLS®), was added at the beginning of the process. The degree of recovery was obtained by comparing the MGV recovered from the samples with MGV processed without a matrix (standards), that is, by measuring its extraction efficiency as described in ISO 15216-1:2017 [31]. Validation was carried out using 20 effluent samples collected over the course of two weeks under the same conditions as the samples to be analyzed. Once in the laboratory, 10 μL of MGV was inoculated into each of those samples, which then went through the same concentration and extraction processes. The extracts obtained were also diluted 1/10.
The PCR and an MGV standard curve were performed according to the manufacturer's instructions. The RT cycling conditions were: 45°C for 10 minutes, followed by heating to 95°C for 10 minutes, and 45 amplification cycles of 15 seconds at 95°C followed by 45 seconds at 60°C, with signal acquisition in this last stage. The FAM channel was used for fluorescence reading, using ROX as a passive reference.
The recovery rate of MGV was calculated following the manufacturer's instructions. ISO 15216-1:2017 defines the minimum sample extraction efficiency as equal to or greater than 1% [31].
The log10 of genomic copies (gc) of MGV was obtained by interpolating the quantification cycle (Ct) on a calibration curve built from 5 duplicate serial dilutions of mengovirus.
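As a rough illustration of the acceptance criterion, the extraction efficiency can be computed as the percentage of seeded MGV genomic copies that is recovered and compared with the 1% minimum quoted above. This is a generic sketch, not the manufacturer's calculation (which is not reproduced in the text); the function names and the example values are ours.

```python
ISO_MIN_EFFICIENCY_PERCENT = 1.0  # minimum quoted in the text for ISO 15216-1:2017


def extraction_efficiency_percent(log10_recovered_gc: float,
                                  log10_seeded_gc: float) -> float:
    """Percentage of seeded MGV recovered, from log10 genomic copies."""
    return 100.0 * 10 ** (log10_recovered_gc - log10_seeded_gc)


def meets_minimum(efficiency_percent: float,
                  minimum: float = ISO_MIN_EFFICIENCY_PERCENT) -> bool:
    """True when the extraction efficiency reaches the minimum."""
    return efficiency_percent >= minimum


# Invented values: 10^4 gc recovered out of 10^5 gc seeded.
eff = extraction_efficiency_percent(4.0, 5.0)
print(f"{eff:.1f}% -> valid: {meets_minimum(eff)}")
```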

Results
The identification and quantification of fecal microbial contaminants carried by wastewater are shown in Table 1. Such determinations are essential to estimate the impact of sewage discharges on water bodies and, at the same time, to trace the source of contamination and evaluate treatment processes [38]. These determinations also showed that our samples were highly diluted, and we concluded that one of the causes could be the contribution of rainfall.
Based on the results of the E. coli multiple-tube assay, the sampling window was set between 7:30 a.m. and 11:30 a.m., which showed the highest discharge (Table 2).
The use of PAC as an alternative is supported by the MGV recovery results: the extraction efficiency ranged from 3.3% to 84.8% (Table 3).
Polyvinylpyrrolidone (PVP) was included as an optional step during RNA extraction to bind polyphenols and prevent acid oxidation, and its use is validated by measuring A260/A280 on the Nanodrop [34,39,40,41]. The extracts obtained with the original protocol had indexes below the acceptable quality range, whereas those obtained with the adapted protocol were within the optimum range (Table 4).
The ORF1ab calibration curve showed a linear range from 16 copies/μL to 10,000 copies/μL; an amplification efficiency greater than 90% was taken as valid. The detection limit was 80 copies/μL (Ct value = 33.9121).
For each WWTP, different intervals (initial and central days) were tested and significant correlations were obtained using 12-day intervals: for WWTP1 the coefficient of determination (R2) was 0.505 with a p-value of 3.06E-04; for WWTP2 the R2 was 0.772 with a p-value of 3.51E-07; for WWTP3 the R2 was 0.621 with a p-value of 3.70E-05; and for WWTP4 the R2 was 0.55 with a p-value of 1.81E-04.
A combined linear regression was tested for all plants, taking day −4 as the starting day and an interval of 12 days of accumulated positive cases. The regression gave R2 = 0.8951, p-value < 2.2E-16, RSE = 0.2106, and adjusted R2 = 0.8937 (Figure 2), a satisfactory relationship that could reinforce the idea of implementing WBE in the region as a complementary predictive tool [42].

Conclusions and discussion
E. coli is widely reported to be used worldwide as an indicator of fecal load, so our approach of using it to determine the sampling time slot would be another valid method [29,43].
We chose PAC as the precipitant in our concentration method, and this modification was an important finding at the national level, as it allowed other agencies responsible for monitoring wastewater to detect SARS-CoV-2 in their study regions.
One of the main obstacles to successful PCR-based analysis is the co-purification of inhibitory components, such as polyphenols, polysaccharides, proteins, secondary metabolites, and several acids (humic, fulvic, and tannic) that easily inhibit Taq polymerase. Many of these compounds are not eliminated by commonly used extraction kits, so the improvement in the A260/A280 index obtained with the adapted protocol could be due to the inclusion of PVP in the extraction stage, which would prevent co-extracted phenolic compounds from damaging the RNA, and to the digestion of proteins by proteinase K followed by their removal by centrifugation [39,41].
Our proposed range of 12 total days covers the cumulative cases of both asymptomatic and symptomatic people, all of whom contribute to the detected viral load. Using this range, we calculated for each treatment plant the initial day giving the best correlation between the viral load of the samples and the accumulated positive cases, and found that each plant behaves individually according to its served area, with a different offset relative to the positive cases in the population. For WWTP1, which had the highest predictive power (7 days before), we could anticipate possible disease peaks in the area one week in advance. Since this is the treatment plant that receives the largest amount of wastewater within the concession area, it could also be inferred that this behavior extrapolates to the entire concession area.
With the qualitative results obtained, we plotted heat maps (Figure 3) that show in a simple way the exponential increase in reported cases, with a forecast similar to that obtained with the four WWTPs, which predicted the second wave of COVID-19 (May 2021).
The successful outcome of an integrated molecular detection method, which includes collection, sampling, concentration, and detection, has served as a starting point for the future development of a wastewater monitoring system. Such a system should be expanded throughout the region, and be organized and maintained over time, in order to study a greater number of sampling areas and to track different pathogens of health interest, variants of concern, outbreaks and sudden increases, as well as specific sites or populations, as part of the One Health approach, in which WBE has acquired special relevance [15]. Because this method is easily accessible, it can be applied in low-income regions [11,44]. In this study, the time interval for the 'Diagnosed cases' variable is consistent with the incubation times proposed in a meta-analysis of different studies around the world, with an average of 6.38 days [45].
Wastewater analysis could represent an optimal early warning system as a source of information for the health area [46].The fecal traces of SARS-CoV-2 represented a very useful tool for tracking how and where the disease spread in the population [8,23,25,38].
Wastewater is a source of information that provides data on human health, which could be used as a tool to refine the public health response to a pandemic caused by a pathogen.
Surveillance of wastewater is therefore an additional tool to assess the presence and prevalence of infectious diseases when capacity, both at the country level and in clinical analysis, is limited. Likewise, the incorporation of data from the entire population can provide information for building modeling schemes [2,20,32].
Taking into account the investigations carried out in different countries (in addition to Argentina), we can affirm that the detection of SARS-CoV-2 in wastewater, even when the prevalence of COVID-19 is low, would indicate that the surveillance of wastewater could be a sensitive tool to monitor the circulation of the virus in the population.

Figure 1. Study region map: this map shows the enlarged region where the four wastewater treatment plants (WWTP 1 to WWTP 4) are located, each marked with a delimited area of influence. The plants are located in Argentina, in the province of Buenos Aires (the shaded region in the smaller map); within the province, the square region shows the study area called BAMA.

Figure 2. Global correlation: the correlation between the positive clinical cases and the copies of the ORF1ab gene detected is shown.

Figure 3. a) Heat map for December 2020, b) Heat map for April 2021: the increase in viral load in the different areas of BAMA is shown, evidencing the beginning of the second wave of SARS-CoV-2.

Table 1. Physicochemical characteristics of the influent received at each treatment plant: the results obtained show that a diluted effluent enters each plant.

Table 2. E-colimetry results: the table shows the amount of E. coli bacteria, as most probable number per 100 ml (MPN/100 ml), quantified in each time slot.

Table 3. Evaluation of the degree of recovery: values measured in % of extraction efficiency obtained by extrapolation from the mengovirus (MGV) standard curve.

Table 4. Optimization of the extraction process: the results obtained in Nanodrop® using the original and adapted extraction protocols are shown. The results show that with the adapted protocol the parameters measuring RNA purity increase, as does the amount of RNA obtained.