Post-processing noise reduction via all-photon recording in dynamic light scattering

ABSTRACT Dynamic light scattering (DLS) is widely used for the characterization of the size distributions of polymers and nanoparticles in dispersions. The time correlation function of the scattered light intensity can be calculated from the intensity fluctuation with time and converted into the size of the scatterers. It has been difficult to apply DLS to a dispersion containing large pollutant particles, however, because the pollutants moving in the dispersion can give rise to intense noise signals from time to time during data acquisition. In conventional DLS, this type of noise renders the entire measurement useless. Here we report a novel software-based DLS system in which the arrival times of all the scattered photons are recorded using a time-to-digital converter, and the time correlation function is calculated exclusively from the uncontaminated parts of the data in post-processing. We demonstrate the validity of this noise reduction scheme by evaluating the silica nanoparticle size in a dispersion containing a small number of micrometer-sized PMMA particles as a model contaminant.


Introduction
Precise determination of polymer and particle sizes is important in the development of novel materials functioning as catalysts [1][2][3], inks [4,5], biomaterials [6][7][8], and carriers for drug delivery [9][10][11][12]. For nanometer-scale targets, conventional optical microscopy or static light scattering using visible light offers insufficient spatial resolution. Electron microscopy can achieve much higher resolution, but it requires samples to be placed in a high vacuum, which limits its application to liquids and biological targets. Small-angle X-ray and neutron scattering (SAXS and SANS, respectively) [13,14], which are standard techniques for determining the sizes and shapes of nanoscale samples, also have drawbacks. Biological and polymer samples can be easily damaged by the high photon energies in SAXS, and in SANS, deuteration of the solvent is often employed to suppress the incoherent scattering from the hydrogen atoms and to obtain appropriate scattering contrast between the solute and the solvent. Furthermore, both SAXS and SANS typically require large-scale, expensive apparatus.
Dynamic light scattering (DLS) is a unique technique that can determine the size distributions of nanometer-scale objects in dispersions, despite its use of a visible light source [15,16]. In a conventional DLS measurement, the time correlation function is estimated using an autocorrelator from the temporal fluctuation of scattered light intensity due to the Brownian motion of the target nanoparticles in the laser focal volume. DLS has been widely used because it requires no special sample pretreatment and is applicable to in situ measurements, its measurement times are short, its sensitivity is high, and its apparatus is low cost [17][18][19][20][21]. However, DLS is also susceptible to noise generated by larger pollutants such as dust particles, because the scattering intensity from nanoparticles is proportional to the square of the volume of the scatterer [22][23][24]. In the case of monitoring the synthesis of an ink, for example, large aggregates of ink particles, though low in concentration, hindered the precise determination of the actual size distribution [25]. Although such unwanted pollutants may be filtered out of the dispersion in advance, the process of filtering can perturb the original size distribution. Moreover, even in a pre-filtered dispersion, large aggregates may be produced during the measurement. This poses a significant difficulty for the application of DLS to real-time, in situ measurements.
A partial solution to the problem of signal contamination by pollutants has been provided by performing the DLS measurements using a backscattering geometry [26]. The scattered electric field from large particles is suppressed at large scattering angles, due to the destructive interference within the particles. However, the efficiency of the noise suppression offered by this approach strongly depends on both pollutant size and laser wavelength, which leads to incomplete removal of the noise under realistic conditions. A more effective approach is to perform measurements multiple times using an autocorrelator and take the average over 'uncontaminated' signals that are free from significant noise [27][28][29][30][31][32]. For example, Bossert et al. proposed a statistical moment analysis of the photon counting distribution by recording the temporal variation of the photon counting rate on a sub-second timescale [31]. Malm and Corbett proposed a criterion, based on the deviation of the polydispersity index, for judging whether the time correlation function obtained from an autocorrelator-based DLS system is contaminated [32]. Because the measurement run needs to be repeated until a noise-free signal is obtained, however, this scheme cannot be applied to an irreversible time-dependent system that allows only limited measurement time. Moreover, parameters such as the accumulation time required for each measurement run have to be optimized on the real sample via a laborious trial-and-error process.
Recently, software-based DLS systems have been proposed to solve the problems of autocorrelator-based DLS [33][34][35][36][37]. In software-based DLS, the time correlation function is calculated on a computer from photon counting data. Parameters such as the accumulation time and the threshold for the noise reduction can be set after the measurement. This scheme promises much wider application possibilities as well as more robust and flexible noise reduction. For example, a software-based DLS was successfully applied to diffuse correlation spectroscopy to probe blood flow and demonstrated a better signal-to-noise ratio than conventional autocorrelator-based DLS [36]. Though the potential application of software-based DLS to noise reduction has often been mentioned, its experimental demonstration has not yet been reported.
In this article, we propose a novel noise reduction scheme based on a software-based DLS system that records the arrival times of all the scattered photons, using a photodetector combined with a time-to-digital converter. From such datasets we can calculate the time correlation function for a time span of arbitrary length, as short as 0.1 s. Our scheme enables us to remove the large-pollutant noise in post-processing, and the parameters for the noise reduction are set based on the convergence of the time correlation function. Because the parameters can be set after the measurement is over, we can demonstrate the accurate construction of the time correlation function from a contaminated dispersion that contains so much noise signal from pollutants that the existing noise reduction techniques do not work well. We also demonstrate that the particle size obtained from our scheme depends very little on the parameter settings. The methodology employed in our software-based DLS system will offer various choices for data analysis that cannot be performed by autocorrelator-based DLS. This is an important novelty from the viewpoint of data science, which requires a large amount of raw data that can be converted into various descriptors.

Materials
Monodisperse silica nanoparticles, whose nominal radius was 100 nm (803847-1ML, $9.6 \times 10^{12}$ particles/mL, Sigma-Aldrich), were used as model dispersion particles. From TEM measurements (Figure S1, Supporting Information), we estimate the average particle radius to be 101 ± 6 nm (mean ± one standard deviation; sample size s = 148). Poly(methyl methacrylate) (PMMA) microparticles having an average particle radius of ~10 μm (NMB-2020, ENEOS LC COMPANY) were used as model pollutant particles. The silica nanoparticle dispersion was diluted using pure water to obtain $1 \times 10^{4}$ particles/nL (1 nL is a typical irradiated volume viewed by a detector) and used as a reference sample. For the demonstration of noise reduction, the PMMA microparticles were added to the dispersion as pollutants. The particle densities of the mixture were $1 \times 10^{4}$ particles/nL for the silica nanoparticles and $1 \times 10^{-1}$ particles/nL for the PMMA microparticles. Because the microparticles gradually sink to the bottom of the cell due to gravity, the sample was shaken before measurement.

Dynamic light scattering (DLS)
All the DLS measurements were performed at room temperature (23°C). A schematic of the developed DLS apparatus is shown in Figure 1. The vertically polarized output of a He-Ne laser (05LHP991, Pacific Lasertec) at wavelength $\lambda_0$ = 632.8 nm was focused on a quartz cell filled with the sample dispersion. The scattered light was collected at a scattering angle of approximately 90° and focused onto a photon counting module (C11202-050, Hamamatsu Photonics). Details of the calibration of our DLS system are presented in Sect. 2 of the Supporting Information. The intensity of the incident laser was adjusted by a neutral density filter to obtain a count rate of 10-20 kcps. Although the count rate can be increased up to the Mcps level, we reduced it so that the effect of pollutants, which drastically increase the count rate, could be seen more clearly. The electronic signal pulses from the photon counting module were stretched by a homebuilt pulse stretcher circuit, which eliminates the ringing noise and afterpulsing, and transferred to a time-to-digital converter (TDC) constructed with digital modules (NI-9402 & cDAQ-9174, National Instruments). The TDC recorded the arrival time of each detected photon. The temporal resolution of the module was 12.5 ns, i.e. slower than the state-of-the-art autocorrelators by approximately four orders of magnitude [38] but still sufficiently fast to measure the relaxation of the time correlation function of the scattered light intensity, which was typically >10 μs. Each measurement involved the detection of $10^{6}$ scattered photons. All the arrival time information was stored in a text file and analyzed after the measurement.

Data processing
In a DLS measurement, one can estimate the size of nanometer-scale objects in dispersions by obtaining the time correlation function $G^{(2)}(\tau)$ from the number of scattered photons $n(t, \Delta t)$ detected between times $t$ and $t + \Delta t$, as

$$G^{(2)}(\tau) = \langle n(t, \Delta t)\, n(t + \tau, \Delta t) \rangle_{T, \Delta t} = \frac{1}{T} \int_{0}^{T} n(t, \Delta t)\, n(t + \tau, \Delta t)\, dt, \qquad (1)$$

where $\tau$ is the correlation time and $\langle \cdots \rangle_{T, \Delta t}$ indicates time averaging. In actual calculations, the integral in Eq. (1) is substituted by a summation, and normalized by the absolute scattered light intensity, $\langle n(t) \rangle^{2}_{T, \Delta t}$, as described in Sect. 3 of the Supporting Information. We confirm that the normalized time correlation function $g^{(2)}(\tau)$ does not sensitively depend on $\Delta t$, as shown in Figure S2 (Supporting Information). On the one hand, it is desirable to set $\Delta t$ as short as possible in order to obtain the correlation function down to the short time regime, because the correlation function can be defined only for $\tau \geq \Delta t$. On the other hand, too short a $\Delta t$ would lead to a long calculation time, which is proportional to $(\Delta t)^{-2}$. As a compromise, we selected $\Delta t$ = 20 μs in this study. $\tau$ was set to $\tau = m\Delta t$, where $m$ is an integer.
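The summation form of Eq. (1) described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `bin_counts` and `normalized_g2` are hypothetical, and the normalization by the squared mean count is assumed from the description in the text.

```python
def bin_counts(arrival_times, dt, t_start, t_end):
    """Count photons in consecutive bins of width dt covering [t_start, t_end)."""
    n_bins = int(round((t_end - t_start) / dt))
    counts = [0] * n_bins
    for t in arrival_times:
        if t_start <= t < t_end:
            # min() guards against floating-point rounding at the last bin edge
            k = min(int((t - t_start) / dt), n_bins - 1)
            counts[k] += 1
    return counts


def normalized_g2(counts, max_lag):
    """Return [g2(m * dt) for m = 1..max_lag], i.e. <n(t) n(t + tau)> / <n(t)>^2."""
    n_bins = len(counts)
    mean = sum(counts) / n_bins
    g2 = []
    for m in range(1, max_lag + 1):
        pairs = n_bins - m  # number of (t, t + tau) bin pairs available at this lag
        acc = sum(counts[i] * counts[i + m] for i in range(pairs))
        g2.append(acc / pairs / mean ** 2)
    return g2
```

The double loop costs roughly (number of bins) × (number of lags). For a fixed measurement time and a fixed maximum correlation time, both factors grow as $1/\Delta t$, which is consistent with the $(\Delta t)^{-2}$ scaling of the calculation time noted above.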
We now consider the DLS specifically from monodisperse nanoparticle dispersions. In the case of a dispersion without large pollutants, one can approximate that the magnitude of the scattered electric field is always Gaussian [15]. As we discuss below, this approximation is not valid when the sample dispersion contains large pollutants. Under the Gaussian approximation, $g^{(2)}(\tau)$ is expressed as

$$g^{(2)}(\tau) = 1 + \beta \exp\left(-2Dq^{2}\tau\right), \qquad (2)$$

where $(2Dq^{2})^{-1}$ is the relaxation time and $\beta$ is a coherence factor [39]. $D$ is the diffusion constant and $q$ is the momentum transfer, defined as

$$D = \frac{k_{B} T}{6 \pi \eta R_{h}}, \qquad (3)$$

$$q = \frac{4 \pi n_{r}}{\lambda_{0}} \sin\left(\frac{\theta}{2}\right). \qquad (4)$$

$k_{B}$, $T$, $\eta$, $R_{h}$, $n_{r}$, and $\theta$ are the Boltzmann constant, absolute temperature, viscosity of the solvent, hydrodynamic radius, refractive index of the solvent, and scattering angle, respectively. One can obtain the particle size $R_{h}$ by fitting the experimentally obtained time correlation function to Eq. (2), if all the other parameters are known. When the dispersion contains large pollutants that do not show Brownian motion, by contrast, the signal includes occasional but intense light scattering from pollutants, as shown in Figure 2(a). In this case, the Gaussian approximation does not hold, and the $R_{h}$ values cannot be estimated reasonably. In the present work, we take advantage of our all-photon recording scheme using a TDC and circumvent this problem by performing post-processing on the noisy signal. The post-processing procedures are overviewed in Figure 2(b). We first divide the scattered light intensity data into short time spans, as shown in Figure 2(a), each typically 1 s long. Later, we will discuss the analysis in the case where the time span is as short as 0.1 s. We then calculate the time correlation function for each time span. Because Eq. (2) should converge to 1 when $\tau \gg (2Dq^{2})^{-1}$, we accept the time correlation function only if it converges to 1 at long $\tau$, which is given by the criterion

$$\frac{1}{2\Delta\tau} \int_{\tau_{c} - \Delta\tau}^{\tau_{c} + \Delta\tau} g^{(2)}(\tau)\, d\tau < A_{\mathrm{th}}, \qquad (5)$$

and shown schematically in Figure 2(c).
The specific correlation time used to judge the convergence, $\tau_{c}$, is chosen in the 10-100 ms range. This is sufficiently greater than the decay time of the present system, $(2Dq^{2})^{-1}$ = 0.7 ms (Eq. S2 in Supporting Information). The time window width $2\Delta\tau$ in Eq. (5) is fixed to 4 ms (200 data points), which is large enough to suppress the random noise. We set the threshold $A_{\mathrm{th}}$ to be just slightly larger than 1, considering a small but finite deviation originating from the limited measurement time. Finally, we take the average of the time correlation functions from the different time spans that satisfy the criterion, and fit the averaged $g^{(2)}(\tau)$ data to Eq. (2) to obtain the nanoparticle size $R_{h}$ as the fit parameter. The present noise reduction scheme can be applied to the combination of solute particles that follow Brownian motion, with their size typically up to several micrometers in diameter [40], and pollutants that are larger in size and do not follow Brownian motion. The lower limit in the solute size is several nanometers, which is given by the detection limit of our DLS system.

Figure 2. (b) Step-by-step description of the post-processing noise reduction scheme proposed in this article. (c) Schematic of time correlation function $g^{(2)}(\tau)$ evaluation. The function from each time span will be added to (excluded from) the averaging in the next step if it is smaller (larger) than a preset threshold $A_{\mathrm{th}}$ at a sufficiently long correlation time $\tau_{c}$, as shown in the left (right) panel.
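The accept-and-average step described above might be sketched as follows. The code assumes that the convergence criterion is the mean of the correlation function over the window $\tau_{c} \pm \Delta\tau$ compared with $A_{\mathrm{th}}$, and that each time span has already been reduced to a list of correlation values at lags $\tau = \Delta t, 2\Delta t, \ldots$; the function names are hypothetical.

```python
def window_mean(g2, dt, tau_c, half_width):
    """Mean of g2 over tau_c - half_width <= tau <= tau_c + half_width.

    g2[k] is assumed to hold the value at tau = (k + 1) * dt.
    """
    lo = max(int(round((tau_c - half_width) / dt)) - 1, 0)
    hi = min(int(round((tau_c + half_width) / dt)) - 1, len(g2) - 1)
    if lo > hi:
        raise ValueError("convergence window lies outside the computed lags")
    window = g2[lo:hi + 1]
    return sum(window) / len(window)


def average_accepted(span_g2s, dt, tau_c, half_width, a_th):
    """Average the per-span correlation functions whose long-lag mean is below a_th.

    Returns (averaged curve or None, number of accepted spans).
    """
    accepted = [g for g in span_g2s
                if window_mean(g, dt, tau_c, half_width) < a_th]
    if not accepted:
        return None, 0
    n_lags = len(accepted[0])
    averaged = [sum(g[k] for g in accepted) / len(accepted) for k in range(n_lags)]
    return averaged, len(accepted)
```

With the settings quoted in the text (dt = 20e-6, tau_c = 10e-3, half_width = 2e-3), the window covers about 200 lag points. Because the raw arrival times are kept, the same data can be re-run with a different span length, `tau_c`, or `a_th` without repeating the measurement.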

Evaluation of software-based DLS
We first perform the DLS measurement on the silica nanoparticle dispersion without large pollutants as a reference sample. Figure 3 compares the time correlation function of the scattered light intensity measured by our software-based DLS system (black line) with the one measured using a commercial DLS system (red). After optimizing the aperture size in front of the collection lens and detector, a coherence factor $\beta$ of 0.9 was obtained for our system, indicating a high coherence for the present setup. The two time correlation functions overlap almost perfectly after vertically scaling the latter, demonstrating the validity of the newly developed DLS system. The time correlation function obtained from the commercial DLS system gives a silica nanoparticle radius $R_{h}$ of 104 ± 8 nm (s = 100), consistent with the value obtained by TEM (101 ± 6 nm).
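For orientation, the relaxation time implied by the Stokes-Einstein relation and the momentum transfer can be estimated for the measured radius. The sketch below assumes water as the solvent, with a viscosity of about 0.93 mPa·s at 23°C and a refractive index of 1.33; these two values are assumptions and are not taken from the article.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def diffusion_constant(radius_m, temperature_k, viscosity_pa_s):
    """Stokes-Einstein diffusion constant D = k_B T / (6 pi eta R_h), in m^2/s."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def momentum_transfer(wavelength_m, refractive_index, angle_deg):
    """Momentum transfer q = (4 pi n_r / lambda_0) sin(theta / 2), in 1/m."""
    half_angle = math.radians(angle_deg) / 2.0
    return 4.0 * math.pi * refractive_index / wavelength_m * math.sin(half_angle)


def relaxation_time(radius_m, temperature_k, viscosity_pa_s,
                    wavelength_m, refractive_index, angle_deg):
    """Relaxation time (2 D q^2)^-1 of the intensity correlation function, in s."""
    d = diffusion_constant(radius_m, temperature_k, viscosity_pa_s)
    q = momentum_transfer(wavelength_m, refractive_index, angle_deg)
    return 1.0 / (2.0 * d * q * q)


# R_h = 104 nm, 23 degC, He-Ne wavelength, 90 degree scattering angle.
# The viscosity and refractive index of water are assumed values.
tau_relax = relaxation_time(104e-9, 296.15, 0.93e-3, 632.8e-9, 1.33, 90.0)
print(f"relaxation time: {tau_relax * 1e3:.2f} ms")
```

Under these assumptions the result is roughly 0.6 ms, the same order as the decay time of 0.7 ms quoted in the text.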

Implementation of post-processing noise reduction
Next, we tested the new system by using a silica nanoparticle dispersion containing large-size pollutants. A small number of 10-μm-radius PMMA microparticles were added to the silica nanoparticle dispersion as the model pollutant. Using our newly developed DLS system, we obtained the scattered light intensity as a function of time (Figure 4(a)). The signal mostly consists of random fluctuation of the photon count at <100 counts/ms. This resembles the signal from the reference sample without PMMA (Figure S3, Supporting Information) and can therefore be attributed to the scattering by the silica nanoparticles, as schematically illustrated in the top-right panel of Figure 4(a). The signal occasionally shows large spikes with counts an order of magnitude higher, however, for example during the spans t = 20-23 s and 48-52 s. The spikes are absent in the signals from the reference sample without PMMA (Figure S3) and are therefore attributed to scattering by PMMA microparticles passing the laser focal volume, as illustrated in the top-left panel of Figure 4(a). If we calculate $g^{(2)}(\tau)$ using the entire signal including the large spikes in Figure 4(a), the obtained time correlation function does not converge to 1 even at a long correlation time of $\tau$ = 0.1 s (dashed black line, Figure 4(b)), indicating failure of the Gaussian approximation. We note that this time correlation function is expected to be equivalent to that obtained using a commercial, autocorrelator-based DLS system.
We now perform the post-processing noise reduction on the noisy signal in Figure 4(a). As a demonstration, we show in Figure 4(c) the time correlation function calculated for every 1 s after t = 18 s. We see that the time correlation functions for t = 18-19 s and 19-20 s approach 1 at a correlation time $\tau_{c}$ of 10 ms, which is sufficiently long compared with the relaxation time of the nanoparticles, $(2Dq^{2})^{-1}$ ~ 0.7 ms, in our setup. The convergence indicates the validity of the Gaussian approximation for these time spans. For t = 21-22 s and 22-23 s, in contrast, the time correlation function is well above 1 throughout the correlation time of <0.1 s. These time spans contain huge spikes, as shown in Figure 4(a). The spans shaded yellow (gray) in Figure 4(a) represent those with a time correlation function satisfying (not satisfying) our convergence criterion, Eq. (5), with parameters $\tau_{c}$ = 10 ms and $A_{\mathrm{th}}$ = 1.03. Here, we see that in addition to the time spans with obvious spikes, there are also time spans, e.g. t = 38-43 s, whose signals are seemingly free from spikes but which do not satisfy the criterion. The blue solid line in Figure 4(b) represents the average of the time correlation functions for only the yellow-shaded, 'uncontaminated' time spans. Fitting the blue line to Eq. (2) gives the hydrodynamic radius $R_{h}$.
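The final fitting step can be illustrated with a simple log-linear least-squares fit of $\ln(g^{(2)} - 1)$ against $\tau$, whose slope is $-2Dq^{2}$. This is a stand-in for whatever fitting routine was actually used, which the text does not specify; the function names, the synthetic data, and the solvent parameters (water, 0.93 mPa·s, refractive index 1.33) are assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def fit_decay_rate(taus, g2):
    """Fit ln(g2 - 1) = ln(beta) - gamma * tau; return (gamma in 1/s, beta)."""
    xs, ys = [], []
    for tau, g in zip(taus, g2):
        if g > 1.0:  # the logarithm is defined only above the baseline
            xs.append(tau)
            ys.append(math.log(g - 1.0))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    beta = math.exp(mean_y - slope * mean_x)
    return -slope, beta


def hydrodynamic_radius(gamma, temperature_k, viscosity_pa_s,
                        wavelength_m, refractive_index, angle_deg):
    """Invert gamma = 2 D q^2 with D = k_B T / (6 pi eta R_h) to obtain R_h in m."""
    half_angle = math.radians(angle_deg) / 2.0
    q = 4.0 * math.pi * refractive_index / wavelength_m * math.sin(half_angle)
    d = gamma / (2.0 * q * q)
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * d)


# Synthetic, noise-free correlation function with a known decay rate.
taus = [20e-6 * k for k in range(1, 101)]
g2 = [1.0 + 0.9 * math.exp(-1500.0 * t) for t in taus]
gamma, beta = fit_decay_rate(taus, g2)
radius = hydrodynamic_radius(gamma, 296.15, 0.93e-3, 632.8e-9, 1.33, 90.0)
```

On measured data, the fit range would normally be limited to lags where $g^{(2)} - 1$ is well above the noise, since the logarithm amplifies the scatter near the baseline.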
We performed 60 repeated DLS measurements and data processing on the same sample to obtain better statistics. The results are summarized in Figure S4 (Supporting Information). We obtained R h = 111 ± 4 nm as the average of 60 measurements. This value shows good agreement with that obtained for the reference sample without PMMA microparticles, 104 ± 3 nm (s = 6) within the standard deviation. The small discrepancy can be attributed to the imperfect removal of the scattered light from the PMMA microparticles, which can be improved by changing the time span as shown later. We would like to emphasize that our DLS system requires no more than a single measurement even for contaminated samples, which is in contrast to the previous noise reduction schemes [28,31] that require measurement runs to be repeated until an uncontaminated data set is obtained.

Robustness and flexibility of post-processing noise reduction
Thus far we have presented results with fixed values for the parameters $\tau_{c}$ and $A_{\mathrm{th}}$ in Eq. (5). We now examine how the choice of these parameters affects the estimated nanoparticle size. We performed the post-processing noise reduction on the measurement run shown in Figure 4(a) with different parameter combinations. The accepted/rejected time spans are presented in Figure S5. It is apparent that more time spans are accepted for larger $A_{\mathrm{th}}$. Similarly, we performed post-processing on the 60 independent measurements shown in Figure S4. Figure S6 in the Supporting Information compares the time correlation functions obtained after post-processing using different values of $\tau_{c}$ and $A_{\mathrm{th}}$. The statistical average of the nanoparticle radius for each parameter set is listed in Table S1. Although there may be a tendency, within the error, for higher $\tau_{c}$ and $A_{\mathrm{th}}$ values to generate slightly larger $R_{h}$ values, we can conclude that the estimated radius is basically insensitive to the choice of the parameters, which is an important feature for a robust analysis scheme.
The greatest advantage of our DLS system, in comparison to the conventional system based on an autocorrelator, is the flexibility in the reconstruction of the time correlation function. All the parameters can be optimized after the measurement based on the recorded photon counting data. The time span for the data accumulation, for example, should be as short as possible to avoid the noise from contaminants, but sufficiently long to guarantee a reasonable signal-to-noise ratio. Here we analyze representative data with a time span of 0.1 s, which is much shorter than the shortest time span previously employed in the data analysis of DLS with contaminated samples, 1 s [32]. The result is shown in Figures 5 and S7 (Supporting Information). The data are taken from Run 19 of the 60 repeated DLS measurements (Figure S4, Supporting Information), which shows a relatively large $R_{h}$. The fitting analysis of the time correlation function calculated with the time span of 1 s gives $R_{h}$ = 115 nm, which is larger than the actual sample size. In contrast, the time correlation function calculated with the time span of 0.1 s gives $R_{h}$ = 104 nm, in good agreement with the actual sample size. This result demonstrates the applicability of our DLS system to samples with significant scattering from large pollutants, as long as multiple scattering does not affect the time correlation function. We have so far limited our sample to a monodisperse nanoparticle dispersion. The same noise reduction scheme can in principle also be applied to polydisperse dispersion samples that contain more than one type of nanoparticle. In the latter case, the particle size distribution is generated by analyzing the time correlation function obtained via our method, using techniques such as inverse Laplace transformation and the maximum entropy method [16][40][41][42].
One of the novelties of our software-based DLS system lies in the flexibility of the time span setting for the post-processing. In principle, the noise reduction scheme proposed here can also be applied to the time correlation functions obtained by repeating short-time measurements using an autocorrelator-based DLS system. In practice, however, the appropriate time span for the measurements is often unknown, and a lengthy trial-and-error process is then necessary to determine the time span. In this regard, our DLS system based on all-photon recording has a significant advantage when it is applied to systems whose size can irreversibly evolve with time, e.g. polymers and nanoparticles undergoing morphological changes during synthesis and gelation [43][44][45]. Because our scheme offers full control of the parameters for analysis after a single long-time DLS measurement, one does not need to know the time scale of the morphology change in advance, in contrast to the conventional DLS apparatus, which can acquire time correlation functions only at a pre-fixed time interval. In addition, the time spans for the calculation of time correlation functions do not need to be identical throughout the entire measurement time in our scheme. In the case of the heat-induced aggregation of proteins [46], for example, it would be more convenient to use a long time span at the beginning, when the scattered light intensity is low, and a shorter one later, when the aggregation leads to higher scattered light intensity.
We would like to emphasize that the most important point of our methodology is the flexibility of the data analysis using all the photon arrival times. One promising application is the direct evaluation of the distribution of the photon number detected in $\Delta t$, which can be calculated from the recorded photon arrival times. In the case of a noise-free dispersion, this distribution should follow a Poisson distribution. Therefore, we could detect contamination as a deviation of the photon number distribution from the Poisson distribution. This approach may show better noise rejection ability because of the large sample size, although it has never been implemented in existing DLS systems. In this way, our software-based DLS will offer various choices for data analysis that cannot be performed by autocorrelator-based DLS.
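One simple way to quantify a deviation from Poisson statistics, offered here only as an illustration of the idea and not as a method used in this work, is the ratio of the variance to the mean of the counts per bin (the Fano factor), which equals 1 for a Poisson distribution. The sketch below follows the premise stated above that a clean signal gives Poisson counts; the simulated count rate, bin width, and burst size are arbitrary assumptions.

```python
import random


def fano_factor(counts):
    """Variance-to-mean ratio of the counts per bin; 1 for a Poisson distribution."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


def simulate_poisson_counts(rate, dt, n_bins, seed):
    """Bin the arrival times of a constant-rate Poisson process (simulated data)."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    t_end = n_bins * dt
    t = rng.expovariate(rate)  # exponential waiting time to the first photon
    while t < t_end:
        counts[min(int(t / dt), n_bins - 1)] += 1
        t += rng.expovariate(rate)
    return counts


# Clean signal: 5 counts per 1 ms bin on average, 2 s long.
clean = simulate_poisson_counts(rate=5000.0, dt=1e-3, n_bins=2000, seed=1)

# Contaminated signal: a 0.1 s burst of extra counts mimicking a passing pollutant.
contaminated = list(clean)
for k in range(1000, 1100):
    contaminated[k] += 50

print(fano_factor(clean), fano_factor(contaminated))
```

The clean trace gives a ratio close to 1, while the burst pushes it well above 1, so a threshold on this ratio could in principle flag contaminated time spans.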

Conclusions
We proposed a DLS apparatus based on an all-photon recording, post-processing scheme for noise reduction. The developed system requires only a single measurement and uses a time-to-digital converter to record all the photon arrival times, instead of an autocorrelator. We demonstrated the validity of our noise reduction scheme by applying it to evaluate the size of silica nanoparticles in a dispersion containing a small amount of large PMMA microparticles. By rejecting the data in the time spans affected by the intense noise signal from the pollutants in post-processing, we successfully extracted the uncontaminated time correlation function and reasonably estimated the nanoparticle size, in spite of occasional but strong scattering from pollutants. Further, by taking advantage of its flexibility in the reconstruction of the time correlation function, we showed that the developed DLS apparatus has better noise reduction ability than the existing noise reduction techniques used with autocorrelator-based DLS. The developed DLS apparatus can potentially be applied to time-resolved DLS measurements.

Note
Supplemental material is available: TEM observation experimental method and results, calibration of the DLS apparatus, calculation of the time correlation function, scattering data for the reference sample, effect of noise reduction for all the measurement data, parameter dependence of the noise reduction.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by the 'Nanotechnology Platform Project' operated by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan [No. 2020-NISB-073].