Application of near infrared spectroscopy in sub-surface monitoring of petroleum contaminants in laboratory-prepared soils

ABSTRACT This study was conducted to determine the suitability of near-infrared (NIR) spectroscopy for subsurface monitoring of crude oil contaminants in different soil types under different moisture conditions. Calibration tests, carried out in both wet and dry soils with crude oil contents of 5–30%, indicate an inverse relationship between spectral reflectance and oil content, with R2 values ≥0.98 for dry soil mixtures. A derived index, termed the “Oil Index”, was generated by calculating a spectral ratio between the main oil peak in the NIR region and a spectrally inactive part of the spectrum within the visible region. It was used in a series of experiments designed to test whether NIR probes could yield useful information about the movement of oil through different soils. Synthetic crude oil was added to a series of soil samples of different particle size distribution and moisture content. Analyses of the 3D distribution of Oil Index values demonstrate that it is possible to estimate and map synthetic crude oil concentration in the subsurface of the soil samples. Results showed that while the Oil Index provided a reasonable estimate of oil concentration in dry samples, this was not the case for wet samples, although the Index was useful in understanding the pattern of movement of the oil contaminant in wet soils. The work indicates that this technique may enhance field investigation of oil contamination, providing an accurate in-field technique.


Introduction
Chemical contamination is a significant environmental issue in many locations where petroleum is produced, stored, or distributed (Wartini, Malone, and Minasny 2017). It may affect the health of vegetation (Buzmakov, Egorova, and Gatina 2019; Mendelssohn et al. 2012) and modify soil characteristics (Townsend, Prince, and Suflita 2003; Wang, Feng, and Zhao 2010; Wang et al. 2013). Understanding the 3D distribution of the oil as it fills available pore spaces in the ground is an important consideration for any land mitigation or remediation.
Surveys to detect, quantify, and monitor oil movement on different substrates (soil and water) are normally done using conventional invasive ground investigation techniques, such as boreholes, sampling, and associated laboratory testing. Although these approaches construct a robust geochemical characterization of contamination, they can be expensive, require safe disposal of used samples, and provide only discontinuous sampling. Surface geophysics is also effective. Electrical Resistivity Tomography (ERT) surveying, either on its own or in combination with other geophysical techniques, is a proven method for oil contamination investigation (Cassiani et al. 2014; Coria et al. 2009; Delgado-Rodríguez et al. 2014; Ioane et al. 1999). The principle used is to delineate contaminated sites based on readings of apparent resistivity from an electrode array and identify anomalies that can be related to the presence of petroleum relative to the background (Brown et al. 2003; Delgado-Rodríguez et al. 2006a, 2006b; Shevnin et al. 2003). Problems with these techniques, however, arise from the often cumbersome equipment setup required and the fact that readings can be affected by many factors, including soil/rock type, clay content, moisture content, salinity, and electrode geometry.
Near-infrared (NIR) spectroscopy has also been used in the investigation of oil contamination. Absorption features in NIR spectra related to crude oil have been reported between wavelengths 1100 nm and 2300 nm, considered indicative of various C-H stretching and saturated CH2 groups (Chakraborty et al. 2015; Douglas et al. 2019; Forrester et al. 2013; Mullins, Mitra-Kirtley, and Zhu 1992). Characterization and discrimination of samples containing different components of crude oil and other derivatives, including alkanes, benzene, polycyclic aromatic hydrocarbons (PAHs), gasoline, diesel and chlorinated hydrocarbons, have been achieved using NIR spectroscopy (Douglas et al. 2019; Forrester et al. 2013; Malley, Hunter, and Webster 1999). Similarly, it has been shown that NIR analyses can be successfully applied to the characterization of crude oil components (Saturates, Aromatics, Resins and Asphaltenes; SARA) (Aske, Kallevik, and Sjöblom 2001; Hannisdal, Hemmingsen, and Sjöblom 2005). Distinctive spectral signatures of oil and various soil organic signatures have enabled qualitative discrimination of Total Petroleum Hydrocarbon (TPH) in contaminated areas (Aske, Kallevik, and Sjöblom 2001; Chakraborty et al. 2010; Malley, Hunter, and Webster 1999; Morgan et al. 2009; Okparanma and Mouazen 2013; Pabón, de Souza Filho, and de Oliveira 2019; Scafutto and Souza Filho 2016). Prediction of phenanthrene (a PAH) in some UK soils has also been demonstrated in laboratory studies of soils with different moisture contents using Partial Least Squares (PLS) regression analysis with full cross-validation (Okparanma and Mouazen 2013b). Similar studies have been conducted on the prediction of PAH contents of oil-contaminated samples using NIR spectroscopy (Douglas et al. 2019). In most of the investigations mentioned above, calibrations were done using data from Gas Chromatography analysis.
Other integrated techniques have been used with a view to achieving more rapid and improved quantification of TPH in contaminated soils (Chakraborty et al. 2015). A combined model produced using XRF + NIR data outperformed the use of NIR spectroscopy alone, although with limited field application. Similar approaches have been demonstrated (for instance, Ghandehari et al. 2008; Klavarioti et al. 2014) in real-time evaluation of oil contaminants, where time-constrained experiments enabled collection of a time-series of data in 2D space. The capillary movement of mineral oil in dry and partially saturated soils and the NIR response of alkanes, aromatics, and chlorinated hydrocarbons were determined. This paper examines how NIR spectroscopy can be used to understand 3D contaminant movement and distribution through dry and wet soil, extending the studies mentioned above. Two sets of experiments were conducted:
• Oil content calibration experiments to establish relationships between spectra and different oil/moisture contents.
• Oil monitoring experiments to establish whether 3D movement of oil through a body of soil (when dry and when wet) could be monitored using the NIR technique.

Soil samples and mixtures
All experiments used laboratory-prepared artificial soils made from commercially available air-dried quartz sand and air-dried kaolin clay. The sand was oven-dried at 104°C for 24 hours and then homogenized by manually breaking up lumps and mixing thoroughly. The kaolin was also dried, but at a lower temperature of 60°C for 48 hours, to preserve the clay lattice structure.
Mixtures of cooled sand and clay were prepared by volume rather than weight due to the large difference in mass between sand and clay. Two mixtures were prepared: SAND 100 (100% sand) and SAND 90 (9:1 sand:clay, i.e. 10% clay). From these mixtures, sub-samples were taken for testing with varying amounts of oil and/or water added.

Crude oil sample
To reduce the health and safety risks associated with handling crude oil, and in line with UK guidelines, a crude oil substitute was used instead of actual crude oil. The oil sample was obtained from Breckland Scientific Supply Ltd., United Kingdom. It is an artificial crude oil with a composition very similar to that of natural crude oils (see Breckland data sheet; Product Code = S3101693-500ML). It is a gray, low-viscosity liquid containing about 60% Saturates and <10% Aromatics (see Figure A1).

Instrumentation
Spectral data were captured using a high-resolution Analytical Spectral Devices (ASD) LabSpec4 spectrometer (Malvern Panalytical Ltd., UK). It has a 3 nm spectral resolution within the vis-NIR region (350-700 nm) and 10 nm within the SWIR (1400-2100 nm). Data acquisition was controlled using Indico Pro software installed on a PC. Prior to scanning, a baseline calibration was done using a white reference Spectralon™ disc, and this was repeated every 30 minutes thereafter. Noise was also reduced, and the signal-to-noise ratio thereby improved, by setting the internal scan cycle to 30 for each reading.

Calibration experiments
Two calibration experiments were carried out. The first considered the spectral response of each soil type to oil content. This involved the progressive addition of crude oil, in 25 mL increments up to 150 mL, to a fixed 500 mL volume of each dry soil mixture (SAND 100 and SAND 90 sub-samples), resulting in a range of oil concentrations from 5% to 30%. At each dosing, the oil-containing soil was thoroughly mixed using a glass rod and the surface flattened before spectral scanning. Readings were taken using an ASD trumpet foreoptic at three separate points on the surface and averaged to obtain a representative spectrum for a given concentration (Table 1).
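The dosing arithmetic above can be checked with a short sketch. It assumes, as the quoted 5% to 30% range implies, that oil content is expressed as cumulative oil volume relative to the fixed 500 mL soil volume rather than to the total mixture volume; the function and constant names are illustrative only.

```python
# Dosing schedule from the text: 25 mL oil increments up to 150 mL,
# added to a fixed 500 mL volume of dry soil.
SOIL_VOLUME_ML = 500
OIL_STEP_ML = 25
OIL_MAX_ML = 150


def oil_concentrations(soil_ml=SOIL_VOLUME_ML, step_ml=OIL_STEP_ML, max_ml=OIL_MAX_ML):
    """Return (cumulative oil volume in mL, oil content in % of soil volume) pairs.

    Assumption: percentage = oil volume / soil volume, which reproduces the
    5-30% range quoted in the text.
    """
    return [(v, 100 * v / soil_ml) for v in range(step_ml, max_ml + 1, step_ml)]


schedule = oil_concentrations()
for oil_ml, pct in schedule:
    print(f"{oil_ml:>4} mL oil -> {pct:.0f}% oil content")
```

The same function applies to the moisture calibration if the 25 mL water increments are passed in with an upper limit of 100 mL, which corresponds to the approximately 5% to 20% range reported for that experiment.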
The second calibration experiment considered the spectral response of changing moisture content in oil-contaminated samples at a fixed oil concentration. For this calibration, 25 mL of water was added progressively to the soil mixtures with 30% oil content. This gave approximately 5% to 20% water saturation relative to the dry calibration. After each addition, the mixtures were homogenized thoroughly using a glass rod before spectral scanning. Three separate measurements were also taken at each moisture content using the ASD trumpet foreoptic probe and averaged to obtain a representative spectrum.

Oil monitoring experiments
Four 'boxed' samples (two SAND 100, two SAND 90) were prepared to investigate oil penetration in 3D. For each, 8000 mL of soil was placed in a 9000 mL capacity rectangular plastic box. A pre-contamination baseline set of measurements was taken by measuring spectra in a grid pattern on the surface of each boxed sample. It was considered a reasonable assumption that this surface measurement in a well-mixed soil sample would be representative of the spectral response at any depth in the sample prior to treatment (Figure A2).
Two sample boxes (one each of SAND 100 and SAND 90) were contaminated in a dry condition. To each sample box, 500 mL of synthetic crude oil was added by slowly pouring it from a measuring cylinder held just above the sample surface in the center of the box. The other two sample boxes were contaminated in a moistened state. Prior to adding 500 mL of oil as above, 250 mL of deionized water was added to this set of boxed samples. This was done using a spray bottle to minimize sample disturbance.
After the addition of oil to all four boxes, they were left undisturbed for 7 days to allow natural infiltration and spatial movement to occur. Throughout this process, all samples were stored in a vertical laminar fume cabinet (Figure A3) to ensure no harmful emissions accumulated.
After 7 days, each sample was subjected to 3D spectral probing. All measurements were again carried out in a fume cabinet for safety. Measurements were captured using the same ASD LabSpec4 with an 18 mm steel chamber probe. This probe enables the capture of spectra through a sealed, optically clear window at the base of the probe. The window is mounted at 45° to the probe body, which enables it to be pushed into the sample. Spectral scanning was carried out on a grid with a 50 mm lateral spacing (Figure A4). Initial testing established that this spacing resulted in little disturbance to adjoining measurement points. At each grid coordinate, the probe was pushed vertically into the soil to depths of −25 mm, −50 mm, −75 mm, and −100 mm, and readings were taken at each depth. After each set of measurements, the probe was thoroughly cleaned to minimize cross-contamination at the next sampling point.
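The probing scheme described above amounts to a regular 3D sampling lattice. The following sketch enumerates such a lattice using the stated 50 mm lateral spacing and the four probing depths; the number of grid nodes in each lateral direction is not given in the text, so the 6 x 3 grid used here is purely hypothetical.

```python
from itertools import product

LATERAL_SPACING_MM = 50            # from the text
DEPTHS_MM = (25, 50, 75, 100)      # probing depths from the text (below surface)


def probe_points(n_x, n_y, spacing_mm=LATERAL_SPACING_MM, depths_mm=DEPTHS_MM):
    """Yield (x_mm, y_mm, depth_mm) for every probe reading.

    n_x and n_y are the numbers of grid nodes along each lateral axis;
    they are hypothetical here because the box dimensions are not stated.
    """
    for ix, iy in product(range(n_x), range(n_y)):
        for depth in depths_mm:
            yield (ix * spacing_mm, iy * spacing_mm, depth)


points = list(probe_points(n_x=6, n_y=3))   # hypothetical 6 x 3 surface grid
print(f"{len(points)} readings, first: {points[0]}, last: {points[-1]}")
```

Grouping the resulting readings by depth gives one set of points per slice, which is the form needed for gridding each slice-map.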
Because the study was a proof of concept for mapping oil contamination in the subsurface and estimating relative concentrations within a set of samples, replicates were not used. However, an in-depth investigation of the real-time monitoring of oil contaminants in different soil types is ongoing.

Data analysis
Data were analyzed using Microsoft Excel and Golden Software Surfer (version 13). To compare the homogeneity of each soil mixture, reflectance values at 1400 nm, which characterize the uncontaminated soil (in accordance with Zhao et al. 2018), were used. Values at this wavelength were converted to grid files in Surfer 13 and surface maps produced. The 1720 nm wavelength was considered a reasonable indicator of the presence of oil in soils. Because the difference in albedo between the dry and wet soil samples was observed to affect the oil peak intensities, it was considered more reliable to compare ratios between different regions of the spectrum. A ratio between the oil-reactive wavelength at 1720 nm and a relatively unreactive part of the spectrum, chosen to be 450 nm, was created. The resultant ratio is termed here the "Oil Index" (O ix ) (equation 1). This was used for baseline measurements, to characterize contaminated vs uncontaminated samples, and to estimate relative oil concentrations.
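As an illustration of the Oil Index calculation, the sketch below assumes that equation 1 takes the form O_ix = R(1720 nm) / R(450 nm), where R is reflectance, as the verbal definition above suggests. The replicate averaging mirrors the three-reading averaging used in the calibration experiments. The spectra are represented as simple wavelength-to-reflectance dictionaries, and the reflectance values are hypothetical, not measured data.

```python
def mean_spectrum(scans):
    """Average replicate scans given as {wavelength_nm: reflectance} dicts."""
    wavelengths = scans[0].keys()
    return {wl: sum(s[wl] for s in scans) / len(scans) for wl in wavelengths}


def oil_index(spectrum, oil_nm=1720, reference_nm=450):
    """Assumed form of the Oil Index: reflectance at the oil band divided by
    reflectance at the spectrally inactive reference band."""
    reference = spectrum[reference_nm]
    if reference <= 0:
        raise ValueError("reference reflectance must be positive")
    return spectrum[oil_nm] / reference


# Hypothetical reflectance values for three replicate scans (not measured data).
scans = [
    {450: 0.0625, 1720: 0.50},
    {450: 0.0625, 1720: 0.52},
    {450: 0.0625, 1720: 0.48},
]
averaged = mean_spectrum(scans)
print(f"Oil Index = {oil_index(averaged):.2f}")
```

Under this form, stronger oil absorption at 1720 nm lowers the numerator, so the index falls as oil content rises, which is consistent with the inverse relationship reported in the calibration plots.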

Spectral signatures
The spectra of oil, water, and soil substrates (uncontaminated and contaminated) are shown in Figure 1. They are distinguished mainly by the spectral geometry between 1300 nm and 2300 nm. The characteristic absorption peak at 1720 nm was indicative of crude oil content. This agrees with Clark et al. (1990) and others who have found good relationships in that region. For instance, 1716 nm (

Calibration plots
Plots of spectra and regression curves of reflectance/Oil Index vs oil content (Figure 2) show inverse relationships in all the dry mixtures, with Coefficient of Determination (R2) values of 0.99. Although Oil Index values decreased with increasing moisture and clay content (compare Figures 2c and 2d), the inverse relationship remains the same for the wet soil mixtures. Oil Index values ranged from approximately 9 at 5% oil to 2.1 at 30% oil content for SAND 100, while values of 7.9 and 1.55 were recorded at 5% and 30% in SAND 90. In the wet calibration experiment, only SAND 100 showed a good correlation between Oil Index and increasing moisture content up to 20%, with an R2 value of 0.99. This indicates good instrument sensitivity at higher fluid content in sand. In the SAND 90 sample, however, a poor relationship was obtained (Figure 2f). This was thought to be due to the dilatancy effect of the clay-rich mixture on the probe sensitivity rather than the spectral response of the contaminant. This phenomenon is being investigated in an ongoing study.
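The kind of calibration described above can be sketched as an ordinary least-squares line of Oil Index against oil content, together with its R2. The authors performed their analysis in Excel, so this is only an equivalent illustration. The two end-point values (9.0 at 5% and 2.1 at 30% for SAND 100) are taken from the text; the intermediate index values are hypothetical placeholders, so the fitted coefficients are not the authors' calibration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


oil_pct = [5, 10, 15, 20, 25, 30]
# End points from the text; the four intermediate values are hypothetical.
oil_index_values = [9.0, 7.6, 6.3, 4.9, 3.5, 2.1]

slope, intercept, r2 = fit_line(oil_pct, oil_index_values)
print(f"Oil Index = {intercept:.2f} + ({slope:.3f}) * oil%   R^2 = {r2:.3f}")
```

A negative slope corresponds to the inverse relationship reported for the dry mixtures; with real calibration readings in place of the placeholders, the same function returns the line equation used later to convert Oil Index values to oil content.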

3D oil monitoring in SAND 100
Sub-surface probing of the contaminated soils was carried out at 25 mm intervals, to a maximum depth of −100 mm, along three profile lines, allowing for the production of slice-maps. Each slice represents a plane of constant depth. The spatial distribution of oil contaminant for SAND 100 is shown in the photograph in Figure 3a, with the 3D feature delineated from spectral data shown in the slice-map directly beneath it. Areas with high oil concentration are characterized by Oil Index values of 6.5 to 8.0, while areas with values of 8.5 and above have low oil content. The limited lateral movement of the oil in SAND 100 was attributed to the high porosity of the sample, in which vertical infiltration was dominant; hence the broad funnel-shaped feature delineated by the white dotted line on the slice-maps.
In the wet mixture, Oil Index values varied with moisture content, from 6.4 at the bottom to 8.9 at the top. This pattern was expected. The higher index values (low oil concentration) in the top layers reflect the movement of the two fluids (oil and water) in a porous sand medium, where they moved quickly and accumulated at the bottom due to the higher porosity and poor oil retention of sand. This corroborates the pattern observed in the dry mixture. However, the spatial variation along individual slices was thought to be due to the relative displacement of oil and water, a scenario that was not observed in the dry SAND 100. As depicted in the slice-maps, greater variability of Oil Index values occurs along each slice compared to dry SAND 100. For instance, at −75 mm depth (third slice), Oil Index values around the central part differ greatly: values from 7.25 to 7.50 and up to 8 were recorded although, visually, the oil appeared to be uniformly distributed.

3D oil monitoring in SAND 90
In dry SAND 90, oil was observed to spread more laterally than in dry SAND 100. Visual observation during the experiment showed that a pool of oil formed around the top-left corner of the box (white oval dotted line) due to a slight dip in the mixture's surface (Figure 4a, top picture). This feature is also delineated on the slice-map. The area with the highest oil concentration (Oil Index <6.0) occurs at the left corner of the map at −75 mm (third slice). This coincides with the area where the oil pool formed during oil dosing. Outside this area, Oil Index values range from 6.5 to 6.7 throughout the contaminated soil mixture. The similarity in index values suggests a more uniform distribution of the oil contaminant across the entire mixture than in SAND 100, where lower values were restricted to the central part and increased outward toward the edges at both ends. Small pockets of low Oil Index (about 6.3) found in discrete areas (e.g. around 150 mm at −25 mm (2nd slice), and around 250 mm at −50 mm (3rd slice)) in SAND 90 were attributed to possible preferential movement and accumulation of oil through minor fissures in the soil.
In wet SAND 90, it was expected that the movement of oil would be influenced by the presence of moisture in the available pore spaces. The Oil Index values were also reduced by the combined influence of clay and moisture. Values range from 4.4 at the bottom to 5.85 at the top (compare with 6.5 to 6.7 in the dry mixture). In addition, variability of Oil Index values was observed on individual slices, although it was minimal compared to wet SAND 100 (Figure 4b). This phenomenon also suggests relative displacement of oil and water during movement. For instance, at −25 mm (top slice), Oil Index values range from 4.90 at the left corner to 5.6 around the central part and 4.95 at the lower right corner. Similarly, at −50 mm (2nd slice) and −75 mm (3rd slice), Oil Index values vary across the slices, and the lowest values (<4.9) at the bottom slice suggest fluid accumulation. This was tested, and the result is presented in the next section.

Relative concentration in soil mixtures
The average concentration of oil in the two dry soil mixtures (SAND 100, SAND 90) after 7 days was compared. Considering the contaminated mixtures, the average Oil Index for SAND 100 was 7.7, which corresponds to approximately 8.5% oil content (calculated from the line equation of the calibration plot). Similarly, the average Oil Index in SAND 90 was 6.6, which corresponds to 8.8% oil. It was anticipated that, since equal volumes of oil were added to the different soil mixtures and kept under the same conditions over time (7 days), they should have similar concentrations. This was found to be the case: dry SAND 100 and SAND 90 are comparable in their average oil concentration (8.5% and 8.8%, respectively) over the same period. The slightly higher value in SAND 90 was not unexpected, as clay-rich soils have better fluid retention capacity and are less susceptible to fluid loss by evaporation than sands. Although this gives an indication of possible soil/oil interaction, no published studies using the NIR technique were found with which to compare the findings of this study.
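Converting an average Oil Index to oil content is an inversion of the calibration line. The sketch below shows that step. Because the authors' fitted line equation is not given in the text, a straight line through the two reported SAND 100 calibration end points (Oil Index of about 9 at 5% oil and 2.1 at 30% oil) is used as a stand-in; it is an assumption and will not reproduce the reported 8.5% exactly.

```python
def line_through(p1, p2):
    """Slope and intercept of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def oil_content_from_index(index, slope, intercept):
    """Invert index = intercept + slope * oil% to recover oil% from an index."""
    if slope == 0:
        raise ValueError("calibration slope must be non-zero")
    return (index - intercept) / slope


# Stand-in calibration: line through the reported SAND 100 end points
# (oil %, Oil Index). This is NOT the authors' fitted regression line.
slope, intercept = line_through((5, 9.0), (30, 2.1))

mean_index_sand100 = 7.7   # average Oil Index reported for dry SAND 100
estimate = oil_content_from_index(mean_index_sand100, slope, intercept)
print(f"Estimated oil content: {estimate:.1f}%")
```

The estimate from this two-point stand-in is of the same order as the value reported in the text; substituting the actual regression coefficients would give the authors' figure.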
In the wet soil mixtures, estimation of average concentration was less straightforward, especially in SAND 90. This was not entirely unexpected, as the Oil Index was slightly modified by clay and water, as reported in the literature (Okparanma and Mouazen 2013a). Since differential movement of oil and water in the soil mixtures could have occurred due to their immiscibility, estimation of oil concentration was done by spectral visualization analysis at selected anomalous points on the slice maps to understand the spatial distribution of oil and possible soil/oil interaction.
The spectral plots adjacent to each slice in Figure 5 represent the different points under consideration. Discrimination of oil from water was based on characteristic peaks at 1720 nm and ~1400 nm, respectively, while relative oil concentration was determined using the intensity of the oil-absorption peaks: higher oil content gives stronger peaks and vice versa. This approach is similar to the work of Pabón and de Souza Filho (2016), where oil concentration was related to the "depth" of absorption peaks. On the top slice (depth = 25 mm), points E1, E2, and E3 all have absorption peaks of oil and water and differ only in their intensities. On the second slice (depth = 50 mm), the spectral geometries of points F1 and F3 (having oil peaks) clearly differ from that of F2 (no oil peak). Similarly, points G1, G2, and G3 (depth = 75 mm) differ in both spectral geometry and oil peak intensities. At the bottom slice (depth = 100 mm), points H1 and H4 show stronger oil absorption peaks than H2 and H3. This indicates the relative displacement of oil and water during their movement from the top to the bottom of the sample.
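One common way to put a number on the "depth" of an absorption peak is a continuum-removed band depth: a straight-line continuum is drawn between two shoulder wavelengths and the reflectance at the band center is compared against it. The text describes a visual comparison of peak intensities, so the sketch below is a swapped-in illustration rather than the authors' procedure. The shoulder wavelengths (1650 nm and 1780 nm) and all reflectance values are hypothetical.

```python
def band_depth(spectrum, center_nm=1720, left_nm=1650, right_nm=1780):
    """Continuum-removed band depth at center_nm.

    spectrum is a {wavelength_nm: reflectance} dict. The continuum is the
    straight line between the two shoulders; depth = 1 - R_center / R_continuum.
    Shoulder positions are hypothetical choices, not taken from the text.
    """
    r_left, r_right = spectrum[left_nm], spectrum[right_nm]
    fraction = (center_nm - left_nm) / (right_nm - left_nm)
    continuum = r_left + fraction * (r_right - r_left)
    return 1 - spectrum[center_nm] / continuum


# Hypothetical spectra for two probe points on the same slice.
strong_oil = {1650: 0.50, 1720: 0.35, 1780: 0.50}
weak_oil = {1650: 0.50, 1720: 0.47, 1780: 0.50}

for name, spec in (("strong", strong_oil), ("weak", weak_oil)):
    print(f"{name} oil signature: band depth = {band_depth(spec):.2f}")
```

A larger band depth at 1720 nm indicates a stronger oil absorption feature, which would rank points such as H1 and H4 above H2 and H3 in relative oil content; the same function evaluated around ~1400 nm with suitable shoulders could be used for the water feature.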

Discussion
The findings in this study corroborate the reported inverse relationship between reflectance and oil concentration (Douglas et al. 2018; Okparanma and Mouazen 2013b; Pabón and de Souza Filho 2016; Scafutto and Souza Filho 2016; Wartini, Malone, and Minasny 2017). Empirical calibration using known oil concentrations also serves as a simple model for the validation and elimination of errors (e.g. Pabón and de Souza Filho 2016). In addition, the Oil Index derived from peak ratios in this study proved useful in estimating oil concentration. This accords with similar approaches in the literature (e.g. Forrester et al. 2013; Hannisdal, Hemmingsen, and Sjöblom 2005; Klavarioti et al. 2014; Okparanma and Mouazen 2013a). In a recent work, Scafutto and de Souza Filho (2019) calculated peak ratios of hydrocarbon signatures in the NIR region and applied them to discriminate different hydrocarbon types in oil-contaminated samples. Similarly, peak ratios have been effectively used in remote sensing to estimate oil concentrations. These include the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI) (Adamu, Tansey, and Ogutu 2015; Lassalle et al. 2019; Mena et al. 2017; Ozigis, Kaduk, and Jarvis 2019) and the Hydrocarbon Index (HI) (Kühn, Oppermann, and Hörig 2010). The Oil Index used in this study also serves the purpose of estimating oil concentration.
In terms of subsurface investigations, 3D analysis of oil-contaminated soils using NIR spectroscopy has been sparsely reported in the literature. This may be attributed to the limitations of current probes and the lack of a standard method of investigation. Hence, the approach in this study provides an empirical method for investigating the spatial distribution of oil contaminants in both dry and wet conditions. As with ERT, the spatio-temporal movement of oil and water in the subsurface was deduced using the Oil Index, and anomalous patterns were identified on slice maps. Furthermore, the estimation of the average concentration in the different soil samples was significant. The slightly higher volume average in SAND 90 (8.8%) relative to SAND 100 (8.5%) agrees with sorption models in which adsorption of oil onto soil particles has been reported to increase with increasing clay content (Nudelman, Rios, and Katusich 2000; Ríos and Nudelman 2005; Wu, He, and Chen 2013). Similarly, the relative displacement of oil and water in wet samples, as described in this study, gives an indication of possible desorption, which has been reported to increase with changing temperature and higher moisture content (Wu et al. 2015). This finding can be further explored and developed to enhance the field application of NIR spectroscopy.

Conclusion
The work aimed to determine whether information useful for understanding contaminant movement can be obtained using NIR spectroscopy, and whether the method developed could be applied as a real-time monitoring protocol in support of effective management plans. The findings show the potential application of NIR spectroscopy for subsurface investigations of oil-contaminated sites in dry and wet conditions. While the Oil Index alone proved useful in estimating oil concentration in dry soils, spectral visualization of discrete points was needed in wet soils. Furthermore, the slice-maps, which captured the spatial distribution of oil contaminants, enabled the estimation of average concentrations and enhanced our understanding of oil movement through different soils. Although there are generic computer models (mainly numerical models) that are used to model the movement of contaminants through the soil profile via pore spaces by complex transport mechanisms (Bear and Cheng 2010), these models are difficult to adapt to local field conditions. Hence, this study attempts to bridge that gap.
One limitation of depth-probing is the slight substrate disturbance during measurement. Nevertheless, the direct and rapid in-situ investigation and the ability to discriminate between different fluids based on their spectral signatures outweigh that limitation. In addition, with improved probe design (i.e., a much smaller diameter and higher resolution), soil disturbance would become insignificant, further enhancing this technique for rapid, in-situ and cost-effective sub-surface investigation in small oil-contaminated fields.