Assessing suspended sediment fluxes with acoustic Doppler current profilers: case study from large rivers in Russia

ABSTRACT Surrogate measures are increasingly used to estimate suspended sediment flux, but only a few dedicated computer techniques for data processing have been developed so far. This study demonstrates the capability of acoustic Doppler current profilers (ADCPs) to infer suspended-sand concentrations in river systems and to calculate suspended sediment flux via big data analytics, that is, the analysis and data mining of measurements based on ADCP backscatter intensity. We present code written in the R language (using RStudio software with the open-source tidyverse and plotly packages) that generates tables of suspended load for cells, verticals and whole cross-sections from backscatter values recorded by a 600 kHz Teledyne RDInstruments RioGrande WorkHorse ADCP unit, and that estimates morphometric, suspended sediment concentration (SSC) and velocity characteristics of the flow. The developed tools made it possible to process a large data array of 56,526,480 geo-referenced values of river depth, streamflow velocity, and backscatter intensity for the river cross-sections measured at six case study sites in Russia.


Introduction
Sediments are an integral and dynamic part of aquatic systems and play a major role in the hydrological, geomorphological, and ecological functioning of river basins (Chalov, Golosov, & Tsyplenkov et al., 2017; Collins & Walling, 2007; Kemp, Sear, & Collins et al., 2011; Walling, 2006). They are a mixture of organic and inorganic materials (soil particles, mineral matter, decomposing organic substances, inorganic biogenic material) that can be transported either as suspended load (floating in the water column) or as bed load (carried in the near-bed layer). Most mineral sediments come from bed erosion and from soil and bedrock weathering. Organic sediments are typically detritus and decomposing materials such as algae, and are further attributed to suspended sediments. River sediments carry large quantities of chemicals, including those most challenging for the environment, such as heavy metals, nutrients, and PAH (Horowitz, 1985; Townsend, Uhlmann, & Matthaei, 2008).
Owing to the complex origin of sediments, sediment transport remains one of the most complicated fields of land surface hydrology. Traditional methods estimate suspended-sediment transport rates by deploying instantaneous physical suspended-sediment samplers (e.g. Edwards and Glysson (1999) and Davis (2005)) and, when sampling from a boat, by holding the boat stationary during water-column sampling (Diplas, Kuhnle, & Gray et al., 2008). Studies of the interaction between flow and sediment transport, and quantitative assessments of the hydrodynamic sorting of suspended sediment within the depth column, are mostly based on laboratory datasets (Julien, 2010), whereas recent field data from the Amazon River (García, 2008), the Ganges River (Lupker, France-Lanord, & Lavé et al., 2011) and the Parana River have significantly extended knowledge of suspended sediment behavior over the cross-sections of large rivers. The dimensionless Rouse number (Ro) is widely applied to describe the ratio of upward and downward forces acting on the grains in the fluid:

$$Ro = \frac{\omega_s}{\beta k V_*} \quad (1)$$

where k is von Karman's constant (0.41), ω_s is the settling velocity (m/s) of the sediments, which is a function of grain size, shape and density, V_* is the bed shear velocity, and β is the ratio of the sediment and water momentum diffusion coefficients, generally assumed to be 1. In small and medium-sized rivers, values below the critical Rouse number Ro* = 2.5 indicate that a sediment particle begins to contribute to the suspended load (Lynds, Mohrig, Hajek, & Heller, 2014), whereas higher values indicate that the particle is most likely transported as part of the bed load.
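As an illustration of the Rouse number definition and the critical value Ro* = 2.5 quoted above, the following minimal Python sketch computes Ro and assigns a transport mode. It is not part of the authors' R code; the function names and the input values (settling velocity 0.002 m/s, shear velocity 0.05 m/s) are hypothetical.

```python
VON_KARMAN = 0.41  # von Karman's constant, value quoted in the text


def rouse_number(settling_velocity, shear_velocity, beta=1.0, kappa=VON_KARMAN):
    """Return Ro = w_s / (beta * kappa * V*); both velocities in m/s."""
    if shear_velocity <= 0 or beta <= 0:
        raise ValueError("shear velocity and beta must be positive")
    return settling_velocity / (beta * kappa * shear_velocity)


def transport_mode(ro, critical=2.5):
    """Classify a grain with the critical Rouse number Ro* = 2.5 from the text."""
    return "suspended load" if ro < critical else "bed load"


# Hypothetical inputs, chosen only to exercise the functions.
ro = rouse_number(settling_velocity=0.002, shear_velocity=0.05)
print(round(ro, 3), transport_mode(ro))
```

Keeping β and k as keyword arguments makes it easy to test the sensitivity of Ro to the assumption β = 1.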
The Rouse number governs the theoretical Rouse profile of sediment distribution (Eq. 2), in which S is the suspended sediment concentration (SSC), S_a is the near-bottom suspended sediment concentration, h is the river depth, z is the normalized depth at the point, Ro is the Rouse number, and a is the thickness of the near-bottom layer, taken as a = 2×D50, where D50 is the mean bottom sediment diameter. As a milestone in the history of sediment transport, the Rouse formula (1) has been widely used for decades (Chalov, Moreido, & Sharapova et al., 2020; Graf & Cellino, 2002; Zheng, Li, Feng, & Lu, 2013). The limitations of this theory are nevertheless well known (Julien, 2010): these studies rely on limited empirical data, and validation has been nearly impossible. The empirical data available to establish criteria for each transport mode are very limited (Bouchez, Métivier, & Lupker et al., 2011; Lupker, France-Lanord, & Lavé et al., 2011).
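The Rouse profile can be sketched in Python using its standard textbook form, S(z) = S_a [((h − z)/z) · (a/(h − a))]^Ro. This is a hedged illustration, not the authors' implementation. It assumes z is the dimensional height above the bed in metres, whereas the text describes z as a normalized depth. The reference level a = 0.1 m and all other numbers are hypothetical.

```python
def rouse_profile(z, depth, a, s_a, ro):
    """SSC at height z above the bed, in the same units as s_a.

    Standard Rouse form: S(z) = S_a * [((h - z) / z) * (a / (h - a))] ** Ro,
    valid for a <= z < h.
    """
    if not (0 < a <= z < depth):
        raise ValueError("need 0 < a <= z < depth")
    return s_a * (((depth - z) / z) * (a / (depth - a))) ** ro


# Hypothetical vertical: 10 m deep, reference level a = 0.1 m, S_a = 100 mg/l.
heights = [0.1, 1.0, 5.0, 9.0]
profile = [rouse_profile(z, depth=10.0, a=0.1, s_a=100.0, ro=0.2) for z in heights]
for z, s in zip(heights, profile):
    print(f"z = {z:4.1f} m  SSC = {s:6.1f} mg/l")
```

At z = a the function returns S_a, and for Ro > 0 the concentration decreases towards the surface, which is the behavior the Rouse profile is meant to capture.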
These and other aspects of sediment transport in surface waters have recently been significantly improved by the application of surrogate technologies for measuring sediment concentration, which are based on optical, laser, and acoustic principles (Gray & Gartner, 2009; Pomázi & Baranya, 2020). Acoustic technologies based on commercial acoustic Doppler current profilers (ADCPs) have been recognized as potential tools for quantifying sediment transport in natural streams, using the echo intensity level as a measure of acoustic backscattering strength. The acoustic method assesses the velocity within a unit volume of water by measuring the Doppler shift of the frequency of the ultrasonic signal emitted by the ADCP instrument and reflected from the suspended matter within this volume. Common operating frequencies for ADCPs range between 300 and 3,000 kHz. To measure water discharge, the ADCP unit is mounted on a moving boat or other vessel and transmits acoustic signals into the water column towards the river bottom. The strength of the echoes reflected from small particles of mineral and organic matter is referred to as backscatter intensity. The echoes are subsequently attributed to different depths within the measured range down to the bottom, yielding vertical profiles of backscatter and velocity. Commonly, ADCP units have 1 to 9 ultrasonic emitters operating at various frequencies, which allows for simultaneous quality control of the received signal. Furthermore, ADCP units are additionally equipped with echosounders and GPS receivers, which enable measurements within a local coordinate system (by tracking the river bottom) or a global one. This allows for accurate water velocity measurements, as the boat speed is subtracted from the measured stream flow velocity.
As the boat moves across the river from one bank to the other the vertical profiles are seamlessly combined to form the cross-sectional velocity map which allows for stream flow discharge calculation (Mueller & Wagner, 2009).
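The two measurement principles described above, the Doppler shift and the removal of boat motion, can be illustrated with a short Python sketch. It assumes a sound speed of 1500 m/s and treats the bottom-track velocity as the velocity of the river bed relative to the instrument. The function names and numbers are hypothetical, and real instruments apply further corrections that are not shown here.

```python
def radial_velocity(doppler_shift_hz, transmit_hz, sound_speed=1500.0):
    """Along-beam velocity (m/s) from the Doppler shift: v = c * df / (2 * f0).

    The factor 2 accounts for the two-way travel of the backscattered signal.
    """
    return sound_speed * doppler_shift_hz / (2.0 * transmit_hz)


def water_velocity(relative_velocity, bottom_track_velocity):
    """Water velocity over ground as (east, north) components in m/s.

    Both inputs are measured relative to the moving instrument, so subtracting
    the bottom-track velocity removes the boat motion.
    """
    return tuple(r - b for r, b in zip(relative_velocity, bottom_track_velocity))


# Hypothetical 600 kHz instrument observing a 400 Hz Doppler shift.
v_beam = radial_velocity(doppler_shift_hz=400.0, transmit_hz=600_000.0)

# Boat moving east at 1.5 m/s: the bed appears to move west at 1.5 m/s.
v_water = water_velocity(relative_velocity=(-0.5, 1.0), bottom_track_velocity=(-1.5, 0.0))
print(v_beam, v_water)
```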
An ADCP discharge measurement provides a large amount of backscatter and velocity data with every measured cross-sectional profile and is thus applicable to the analysis of suspended sediment concentrations. The use of down-looking ADCPs to estimate SSC has been investigated by many researchers (Boldt et al., 2012; Boldt, 2015; Dominguez Ruben et al., 2020; Gartner, 2004; Guerrero, Szupiany, & Latosinski, 2013; Latosinski, Szupiany, & García et al., 2014; Moore, Le, Hurther, & Paquier, 2013; Mullison, 2017; Szupiany et al., 2016, 2019; Wall, Nystrom, & Litten, 2006; Wood, Szupiany, Boldt, Straub, & Domanski, 2019). Several software tools have been developed to process ADCP data for estimating SSC, including STA (described in Boldt et al. (2012)), ASET (used in Szupiany et al. (2016)) and commercial software such as Aquavision's ViSea. Nevertheless, the cited software mostly calculates suspended sediment flux and does not allow sediment concentration data within the river cross-section to be processed in combination with morphometric and hydraulic information. At the same time, the programming and design tools now available as packages for modern high-level programming languages such as R and Python provide a reliable basis for developing such a methodology, which can improve the understanding of sediment behavior and related hydrological phenomena.
This paper aims to apply acoustic inversion techniques using commercially available, down-looking acoustic Doppler current profilers (ADCPs) to quantify suspended sediment fluxes in river channels. In particular, we aim (1) to develop integrated, freely available open-source tools for ADCP big data analytics (the analysis and data mining of measurements), (2) to demonstrate their application to understanding the distribution of particulate matter concentration over cross-sections, and (3) taking the accuracy of ADCP data into account, to estimate sediment concentrations and attempt to improve the assessment of the suspended sediment flux and the related partitioning between suspended and bed load. We describe six case studies that enable the big data analytics to be tested over various conditions. The methodology was tested on the largest Arctic rivers of Russia (the Ob, Yenisey, Lena and Kolyma rivers, case studies 1-4), on the Selenga River, which is the largest tributary of Lake Baikal (case study 5), and along the Moskva River and its tributaries, which drain Moscow, the largest megacity of Russia (case study 6).

Data collection and methods
The ADCP methodology was applied to both large and small rivers. The variability of conditions across the case studies allowed the robustness of the proposed methodology to be tested. The collected datasets (freely available from Zenodo; Chalov, Moreido, and Ivanov et al. (2022)) were used here to develop the proposed tools for ADCP application in research on the transport of sediment and particulate chemicals.
Case studies 1-4 (Figure 1(a)) comprise the four largest Arctic Siberian rivers, which were studied under the ArcticFLUX project in 2018-2021 (Vihma, Uotila, & Sandven et al., 2019). The project included continuous ADCP measurements along with sampling of dissolved and particulate organic matter, nutrients, and metals fluxes, based on an unprecedentedly dense inventory of river cross-sections repeated multiple times per year. The measurements were made at a fixed cross-section on each river, located upstream of the influence of the receiving seas (tides, surges), near the cities of Salekhard (Ob River), Igarka (Yenisey River), Zhigansk (Lena River) and Cherskiy (Kolyma River).
Case study 5 is the Selenga River, which originates in Mongolia and contributes about 50% of the total inflow and 82% of the sediment load into Lake Baikal. The study here was based on detailed ADCP measurements over 20 transects (named S1 ... S26, Figure 1b) in the lower 200 km of the river course.
Case study 6 is the Moskva River, a relatively small stream in Central Russia influenced by dam regulation, water transfer projects and very large wastewater loads from Moscow, whose population reaches 15 million (Bityukova & Koldobskaya, 2018; Kirillov, Makhrova, & Nefedova, 2019). The urban sewage from Moscow, which contributes half of the total water flow downstream of the city, is rich in nutrients and organic matter and therefore plays a large part in the organic pollution of the Moskva River (Tereshina, Erina, & Sokolov et al., 2020). A network of 38 points for ADCP measurements and water quality sampling has been operated since 2019 along the Moskva River (named M1, M2, etc.) and 17 tributaries (named T01, T02, etc.) (Table 1, Figure 1).
For each of the case study rivers (Table 1), water discharge Q and suspended load data were collected during discharge measurements with an acoustic Doppler current profiler (ADCP). We used the Teledyne RDInstruments RioGrande WorkHorse ADCP unit with a working frequency of 600 kHz, mounted on a moving boat. This ADCP system is a downward-looking profiler that broadcasts forward, aft, right- and left-lateral acoustic beams, each angled approximately 20° from the vertical (a Janus configuration; Hauer and Lamberti (2017)). The velocities V and backscatter intensities BI are measured at each depth and can be further used to compute the distributions of flow velocity and echo intensity over a cross-section. The collected data are stored in a binary format, which can be natively exported to ASCII format.
For each cross-section, three samples (surface, middle layer and near bottom) were obtained at each of three verticals distributed along the transect, so that nine samples in total were taken during a single ADCP measurement. Water samples were pumped with a filterless submersible 12 V pump from the three layers to account for the vertical distribution of the suspended sediment. Pumping was done at relatively low pressure and speed (approximately 1 liter per minute), which keeps the linear velocity relatively unchanged (isokinetic sampling). For each sample in a depth profile, the boat was repositioned at its original location, and sampling was performed while drifting at the river water velocity. The depth and the position across the channel of each sample were then marked on the ADCP profile, so that each point linked the BI measured by the ADCP with the sampled SSC. We used the raw BI values from the ADCP, and no backscatter correction was applied (see Szupiany et al. (2019) for a discussion of the BI correction procedure).
The water samples were then filtered for suspended material through a pre-weighed 0.45-μm membrane filter to determine the suspended sediment concentration by the gravimetric method. The suspended sediment grain size was measured with a Fritsch Analysette 22 NanoTec laser particle sizer (FRITSCH GmbH, Industriestrasse 8, 55743 Idar-Oberstein, Germany). All grain sizes were classified into three categories: clay (d < 5 μm), silt (d = 5-50 μm), and sand (d > 50 μm). Based on these results, the suspended sediment distribution of each river was classified as single-modal or bi-modal according to the presence of one or two maxima in the distribution of grain size classes. To calibrate the relationship between backscatter and suspended sediment concentration, we used power-law least-squares fitting between the raw backscatter values BI and the measured suspended sediment concentration SSC (see additionally Efimov, Chalov, and Efimova et al. (2019)) for the specific rivers and hydrological seasons. For this purpose, only profiles with a sufficient number of simultaneous gravimetric SSC and ADCP-based BI measurements carried out under constant discharge conditions were considered.
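The power-law calibration described above can be sketched in Python as an ordinary least-squares fit on log-transformed values. This is an illustration under stated assumptions and not the authors' R code. The form SSC = a·BI^b and the synthetic data are hypothetical, and R² is computed in log space, which may differ from the convention used for the reported values.

```python
import math


def fit_power_law(bi, ssc):
    """Fit SSC = a * BI**b by least squares on log-transformed values.

    Returns (a, b, r2), where r2 is the coefficient of determination in log space.
    """
    if len(bi) != len(ssc) or len(bi) < 2:
        raise ValueError("need at least two paired observations")
    x = [math.log(v) for v in bi]
    y = [math.log(v) for v in ssc]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r2


# Synthetic, noise-free example: backscatter in counts, SSC in mg/l.
bi = [60.0, 70.0, 80.0, 90.0, 100.0]
ssc = [1e-4 * v ** 3 for v in bi]
a, b, r2 = fit_power_law(bi, ssc)
print(a, b, r2)
```

With field data the fit would be repeated per river and season, as the text describes, because raw uncorrected BI values are not transferable between sites.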
Since we used the raw uncorrected BI values, we constructed separate formulas for each of the case studies.
Here the low R² is explained by the contrasting conditions of the field measurements used for the relationship.
For the Lena River, the SSC = f(BI) relationship was based on 18 measurements carried out on 08 and 16 June 2019 (n = 18), which led to a relationship with R² = 0.57.
For the Kolyma River, R² was equal to 0.31, based on 27 SSC measurements carried out simultaneously with the ADCP profiling on 26 July, 7 August and 16 August 2019.
For the Selenga River, the measurements were carried out between 27 July and 1 August 2018 (see additionally Chalov, Liu, and Chalov et al. (2018)) on an extensive river reach. Based on three samples per ADCP profile, a fit with n = 29 and R² = 0.42 was obtained.
Finally, for the Moskva River the relationship was based on two measurements of the ADCP profile carried out at the downstream section M38 of the studied transect (Figure 1) (n = 18).
The datasets for the Yenisey River did not provide a sufficient number of gravimetric SSC measurements to capture a significant relationship (n = 9).
The dataset was processed in the R language using RStudio software (Rstudio Team, 2019) with the open-source tidyverse (Wickham, Averick, & Bryan et al., 2019) and plotly (Sievert, 2020) packages. The tidyverse is a state-of-the-art collection of R packages designed for the manipulation, processing, filtering and analysis of tabular data. It was initially applied to eliminate errors and flaws in the dataset of vertical suspended sediment concentration profiles for each measured river cross-section. The plotly package is an open-source graphical package that allows the creation of multiple types of interactive figures (see Supplementary 1). The full code can be accessed from https://sediment.ru/data/R-adcp_v01.zip. It consists of three parts with different functions.
Code 1 is used to convert the backscatter intensity (BI) values to SSC (mg/l), fill in gaps in the original ASCII file, including in the near bottom layer using the Rouse number. In particular, this code provides a calculation of bed load by the L.C. van Rijn model (van Rijn, 2007). The output of the code is a series of tables containing data of suspended load for cells, verticals and the whole cross-section.
Code 2 is used to create a database of morphometric, SSC and velocity characteristics of the flow to find their relationship with the Rouse number. This code creates correlogram graphs for visual analysis and correlation tables of linear regression between SSC, predictors and their simple mathematical transformations.
Code 3 searches for the relationship between the Rouse number and morphometric characteristics, as well as between the Rouse number and SSC using machine learning methods. The output of this code is a series of figures of the distribution of the Rouse number of cells, verticals and whole cross-section within its predictors. Also, this code calculates linear regression coefficients with formulas for the dependence of SSC on morphometric and hydraulic characteristics.
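The kind of output attributed to Code 1, tables of suspended load for cells, verticals and the whole cross-section, can be illustrated with a small Python sketch. It is not the authors' R code. The cell layout (one record per ADCP cell with its vertical identifier, SSC, velocity and cell dimensions) and the field names are assumptions made for illustration.

```python
from collections import defaultdict


def cell_flux_kg_s(ssc_mg_l, velocity_m_s, cell_height_m, cell_width_m):
    """Suspended load through one cell; 1 mg/l = 1 g/m3, so divide by 1000."""
    return ssc_mg_l * velocity_m_s * cell_height_m * cell_width_m / 1000.0


def aggregate(cells):
    """Return (per-cell fluxes, per-vertical totals, cross-section total) in kg/s."""
    per_cell = []
    per_vertical = defaultdict(float)
    for cell in cells:
        flux = cell_flux_kg_s(cell["ssc"], cell["v"], cell["dz"], cell["dw"])
        per_cell.append(flux)
        per_vertical[cell["vertical"]] += flux
    return per_cell, dict(per_vertical), sum(per_cell)


# Hypothetical cells: two in vertical 1 and one in vertical 2.
cells = [
    {"vertical": 1, "ssc": 100.0, "v": 1.0, "dz": 0.5, "dw": 2.0},
    {"vertical": 1, "ssc": 200.0, "v": 0.5, "dz": 0.5, "dw": 2.0},
    {"vertical": 2, "ssc": 50.0, "v": 2.0, "dz": 0.5, "dw": 2.0},
]
per_cell, per_vertical, total = aggregate(cells)
print(per_cell, per_vertical, total)
```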
ADCP datasets were applied to assess the suspended load flux (Figure 2). For each ADCP cross-section, the suspended sediment load Q_R was calculated from the SSC averaged over the cross-section and the ADCP-based water discharge Q. The flux estimate requires additional operations because of a technological limitation of the ADCP, which leaves blank (unmeasured) areas in the near-bottom part of the transect. To account for the SSC distribution in the river cross-section, SSC in the bottom part of the profile was extrapolated for each vertical separately using the Rouse curve and the logarithmic velocity curve of Grishanin (Grishanin, 1972), where V_hi is the mean velocity at depth i, V_surf is the near-surface water velocity, h is the river depth, z is the normalized depth at the point, and I is the water level slope. This estimate, Q_RADCP, was then compared with the estimate of Q_R calculated by traditional point sampling based on 9 samples over the cross-section.
Bed load fluxes were estimated using a simplified formula for bed load transport derived from bed load transport data sets for natural sand-bed rivers (van Rijn, 2007), where ρ_s is the sediment density, ε is an empirical coefficient (taken as 0.015), d50 is the mean size of bed load sediments, and R is the specific gravity. The bed load in each cross-section, Q_G, was calculated from q_G. Partitioning between bed load and suspended load was then assessed as R/(R+G) = Q_R/(Q_R + Q_G) for each transect.
Additionally, for each vertical the Rouse number (Eq. 1) was calculated using interpolation from Eq. 2. Using standard statistical methods, the Rouse number for each vertical and the cross-section averages were compared with factors that affect the distribution of particulate matter in the river cross-section. For example, we used the following morphometric parameters of cross-sections: z, the normalized depth at the point; h_b, the distance from the water surface (m); h, the depth from the bottom (m); dist_norm, the normalized distance from the bank; dist_c, the distance from the bank of the point with the maximum depth; and dist, the distance from the bank (m). The value of dist_norm provides a measure of the exact position of the vertical in the cross-section relative to the maximal depth.
ADCP datasets were further used to describe the vertical distribution of suspended sediments by an improved Rouse law, e.g. following Nie, Sun, & Zhang et al. (2017). These operations resulted in a large data array, e.g. for two of the four Arctic rivers (Ob and Yenisey) consisting of 350,000 values of SSC, velocity, distance from the bank, and depth of the point and of the cross-section. More features were constructed from the initial variables by simple mathematical transformations, such as polynomial feature expansion (degrees = −1, 2, 3) and logarithmic expansion. By using rational feature selection, we found that the most important variables are the normalized depth from the bottom z (Eq. 14) and the normalized distance from the bank (Eq. 15). The data processing flowchart is given in Figure 3.
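The flux and partitioning steps of this section can be sketched in Python as follows. This is a hedged illustration and not the authors' R code. It assumes that Q_R is the product of mean SSC and discharge, that the bed load formula is the simplified van Rijn (2007) expression q_G = ε ρ_s u h (d50/h)^1.2 M_e^1.5 with M_e = (u − u_cr)/sqrt(R g d50), that R is the submerged specific gravity (1.65 for quartz), and that Q_G is q_G multiplied by channel width. The critical velocity u_cr and all numbers are hypothetical inputs.

```python
import math

G = 9.81  # gravitational acceleration, m/s2


def suspended_load_kg_s(ssc_values_mg_l, discharge_m3_s):
    """Q_R = mean SSC * Q; 1 mg/l = 1 g/m3, hence the division by 1000."""
    mean_ssc = sum(ssc_values_mg_l) / len(ssc_values_mg_l)
    return mean_ssc * discharge_m3_s / 1000.0


def van_rijn_bed_load(u, u_cr, h, d50, rho_s=2650.0, submerged_gravity=1.65, eps=0.015):
    """Unit bed load q_G in kg/s per metre of width (assumed van Rijn 2007 form)."""
    if u <= u_cr:
        return 0.0  # no transport below the critical depth-averaged velocity
    mobility = (u - u_cr) / math.sqrt(submerged_gravity * G * d50)
    return eps * rho_s * u * h * (d50 / h) ** 1.2 * mobility ** 1.5


def suspended_fraction(q_r, q_g):
    """Partitioning R/(R+G) = Q_R / (Q_R + Q_G)."""
    return q_r / (q_r + q_g)


# Hypothetical transect: three SSC samples, Q = 2000 m3/s, width 400 m.
q_r = suspended_load_kg_s([50.0, 100.0, 150.0], discharge_m3_s=2000.0)
q_g_unit = van_rijn_bed_load(u=1.0, u_cr=0.4, h=5.0, d50=0.0003)
q_g = q_g_unit * 400.0  # assumed: cross-section bed load = unit rate * width
print(q_r, q_g, suspended_fraction(q_r, q_g))
```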

Results
We processed a dataset of 56,526,480 values of 120 hydraulic and morphometric predictors (SSC, velocity, normalized depth z over verticals, distance from the bank, depth of the point and of the cross-section, and their simple mathematical transformations, such as polynomial feature expansion with degrees −1, 2 and 3, and logarithmic expansion). We analyzed the correspondence between the sediment distribution over verticals and the parameters z and dist_norm (Figures 4 and 5). The suspended sediment distribution depends mainly on the normalized depth z (Rcor = 0.3, p-value < 2.2 × 10^−16, n = 471,054). We obtained a significant relationship by dividing the SSC values by the near-bottom SSC value (at the maximum depth for each ensemble). Figure 5 demonstrates high variations in concentration (relative to the near-bed concentration) related to the specific Rouse number Ro of each vertical, which is the exponent of the power law of vertical sediment distribution (Eq. 16) and controls the increase of SSC from the surface to the bottom. The Rouse number Ro in this case controls the local hydraulic distribution of the sediment flux at each vertical and profile. Relationships between sediment concentration and flow velocity are much weaker (Rcor = −0.05, p-value < 2.2 × 10^−16, n = 471,054) (Table 2). Various ADCP cross-sections demonstrate a decrease of Ro towards dist_norm = 1, which corresponds to the vertical of maximal channel depth. This emphasizes the increase of sediment concentration gradients in the central sections of river profiles.
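The feature construction described above can be illustrated with a short Python sketch. It is not the authors' R code. It assumes that z is the height above the bed divided by the total depth and that dist_norm is the distance from the bank divided by dist_c, which makes dist_norm = 1 at the deepest vertical. The function names are hypothetical. The last line checks that the counts quoted in the text are mutually consistent.

```python
import math

DEGREES = (-1, 2, 3)  # polynomial degrees listed in the text


def normalized_depth(height_above_bed, total_depth):
    """z between 0 at the bed and 1 at the water surface (assumed convention)."""
    return height_above_bed / total_depth


def normalized_distance(dist, dist_c):
    """dist_norm equals 1 at the vertical with the maximum depth (assumed form)."""
    return dist / dist_c


def expand_features(row):
    """Add power and logarithm transforms of every numeric predictor in `row`."""
    out = dict(row)
    for name, value in row.items():
        for degree in DEGREES:
            if degree < 0 and value == 0:
                continue  # x ** -1 is undefined at zero
            out[f"{name}^{degree}"] = value ** degree
        if value > 0:
            out[f"log({name})"] = math.log(value)
    return out


row = {"z": normalized_depth(2.0, 8.0), "dist_norm": normalized_distance(50.0, 200.0)}
features = expand_features(row)
print(sorted(features))

# Consistency check: n = 471,054 points times 120 predictors.
total_values = 471_054 * 120
```

The product 471,054 × 120 equals 56,526,480, which matches the total number of values reported for the processed dataset.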
For selected cross-sections, the total suspended flux was estimated from ADCP measurements using eqs. 3-7 and compared with the flux estimate based on point samples taken during the discharge measurement (9 samples over the cross-section). Owing to the non-uniform distribution of organic and inorganic suspended load concentrations across the fluvial section, the Q_R estimates based on ADCP data differ from those determined from point samples. The difference varies from −3 to −84% (Table 3). Because the ADCP data are homogeneously distributed over the cross-section, these differences may be interpreted as an accuracy improvement of the ADCP-based flux estimate compared with traditional methods.
On average, across the three case studies (Table 3), the application of ADCP processing enhances the accuracy of the sediment flux estimate by up to 51%.

Discussion
The results obtained in this study using big data analytics significantly improve knowledge of specific aspects of sediment transport. Here we discuss these applications in relation to the distribution of particulate matter concentration over cross-sections and to the partitioning between suspended and bed load. This study relies on calibrating backscatter intensity to capture sediment concentration. The calibration methods used in our study assume that the echo intensity is governed by the average sediment concentration (Sakho et al., 2019). This approach leads to relationships between SSC and BI characterized by regional equations (eqs. 3-7). Differences between the relationships are explained by differing particle size distributions. The Lena and Kolyma rivers have a single-mode particle size distribution (the silt fraction dominates and forms over 40% of suspended sediments), whereas the Ob, Yenisey and Moskva rivers are distinguished by a bi-modal particle size distribution (clay and sand fractions are transported). This is in line with theoretical (Latosinski, Szupiany, & García et al., 2014) and empirical evidence indicating that the sand fraction dominates the backscatter measurements at frequencies of 600 and 1200 kHz. For the case study of the Parana River, an acoustic method for estimating suspended-sand concentration that considers various classes of suspended sediments resulted in mean deviations within about 40% of sampled concentrations for all survey locations. From our dataset, we compared linear fits between BI and SSC for the sand fraction (>50 μm). For that, the full suspended sediment concentration samples were reduced to the class above 50 μm in size, which represents the concentration of the sand fraction in the river flow (SSC_>50μm, mg/l). After changing SSC to SSC_>50μm, R² increased from 0.28 to 0.45 for eq. (3) and from 0.59 to 0.67 for eq. (4).
Particle size analyses were not performed for the remaining rivers. This finding confirms the statement that backscatter from the wash-load fraction (typically below 62 μm in size) is negligible compared to backscatter from the sand fraction (Latosinski, Szupiany, & García et al., 2014) and should be considered in further studies.
We also attribute the low R² values of the constructed regional BI-SSC relationships to the fact that we used raw BI values, not corrected for intrinsic and ambient noise. These findings generally confirm previous research showing that a substantial correlation exists between corrected backscatter and SSC, while raw backscatter intensity does not reasonably predict SSC. Further research is recommended to proceed with corrected BI and to estimate the error values for each of the individual ADCP units used in the study.
Another source of uncertainty beyond the proposed methodology is related to the field methods of water sampling. Significant errors might arise from spatial discrepancies between the water samples required for calibration and the ADCP measurements. Additionally, an ADCP profile shows an instantaneous flux distribution, which might vary significantly over short time intervals (seconds and minutes) due to vortices and coherent flow structures (Buffin-Bélanger, Roy, & Kirkbride, 2000; Ferguson & Church, 2004). An important development of the approach presented here is related to the theory of the vertical distribution of suspended sediment concentrations. The results obtained here significantly extend the large-river datasets available for this theory.
Novel semi-empirical equations were found for the case-study rivers. For example, for the Ob and Yenisey rivers the Rouse number is in the range of 0 to 0.2 (Figure 6), evaluated as the average of the Rouse number (Eq. 1) for each cross-section (15 cross-sections in total), while the Rouse number is unique for each ensemble. For the Ob and Yenisey we tested the dependency for low values of SSC, from 56.8 to 88.2 mg/l, with a mean particle diameter of 0.01-0.03 mm (Chalov & Efimov, 2021) and depths from 4 m to 50 m. Assuming the depth a to be the depth of the last cell measured by the ADCP (next to the non-measured near-bottom layer), we fitted the vertical distribution curve of sediment concentration by adjusting the Ro values. This yields a significant relationship (Rcor = 0.86, p-value < 2.9·10^−5, n = 15). The dependence of Ro on h_max is explained by better mixing of suspended load in the midstream compared to the area near the banks. This phenomenon is likely related to the formation of secondary flow velocity cells and the activation of new sources of sediment. It can also be explained by the relatively homogeneous distribution of near-bottom sediment concentrations within the cross-section, which then requires smaller gradients of sediment concentration at greater depths. This equation is in line with recent theoretical (Julien, 2010; Lane & Kalinske, 1941) and empirical (Zheng, Li, & Feng, 2012) developments, e.g. the fractional advection-dispersion equation (FADE) model, which was developed to describe anomalous diffusion of sediment (Chen, Sun, & Zhang, 2013).
Figure 6. The observed values of Ro and the partitioning of sediment transport R/(R+G) under various Rouse numbers and h/d50. Small rivers, h < 0.5 m: data from Guy, Simons, and Richardson (1966) and Julien (2010) and measurements at the Moskva River conducted within this study. Theoretical lines are from the Einstein integral (Shah-Fairbank & Julien, 2015).
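The step of fitting the vertical concentration curve by adjusting Ro, described in this section, can be sketched in Python as a least-squares fit through the origin in log space. It is an illustration under stated assumptions and not the authors' R code. It uses the standard Rouse form with z as height above the bed and takes a and S_a from the lowest measured cell. The synthetic profile is hypothetical.

```python
import math


def fit_rouse_number(heights, ssc, depth, a, s_a):
    """Estimate Ro from a measured vertical by linear regression through the origin.

    Uses ln(S / S_a) = Ro * ln(((h - z) / z) * (a / (h - a))) for a < z < h.
    """
    xs, ys = [], []
    for z, s in zip(heights, ssc):
        if not (a < z < depth) or s <= 0:
            continue  # skip points outside the valid range of the profile
        xs.append(math.log(((depth - z) / z) * (a / (depth - a))))
        ys.append(math.log(s / s_a))
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("no usable points between a and the water surface")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


# Synthetic vertical generated with Ro = 0.15: depth 20 m, a = 0.5 m, S_a = 80 mg/l.
true_ro, depth, a, s_a = 0.15, 20.0, 0.5, 80.0
heights = [1.0, 5.0, 10.0, 15.0, 19.0]
ssc = [s_a * (((depth - z) / z) * (a / (depth - a))) ** true_ro for z in heights]
fitted_ro = fit_rouse_number(heights, ssc, depth, a, s_a)
print(fitted_ro)
```

Fitting in log space keeps the problem linear in Ro, so no iterative optimizer is needed for a single vertical.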
The ADCP estimates provide reliable and easily obtained data for calculating suspended sediment load. Simultaneous flow and sediment transport measurements improve the procedure for calculating riverine flux, which for large rivers is usually based on irregular measurements (Liu, Wang, & Wang et al., 2021; Mu, Zhang, & Chen et al., 2019). The estimates of the suspended flux presented in Table 1 and their discrepancy from estimates obtained with traditional methods are explained mainly by the low spatial resolution of traditional sediment sampling methods, and generally correspond to previously estimated differences. Given that the suspended sediment characteristics of the rivers described in this study are similar to those of many other sand-bed rivers, especially large river systems throughout the world (Chalov, Liu, & Chalov et al., 2018; Latrubesse, 2008), the presented analysis advances efforts to provide more accurate and higher spatial resolution data for fluvial suspended-sediment studies.
To test the relationship for the observed Ro numbers, we plotted our results (Figure 6) as the relationship between the ratio of suspended to total load and the Rouse number, together with the CSU laboratory data for mixed load from Guy, Simons, and Richardson (1966) and data for small rivers within the Moscow area collected in this study. For that purpose, bed load fluxes (q_G) were estimated alongside the ADCP suspended load fluxes using equations (11)-(13).
Partitioning between bed load and suspended load was then assessed as R/(R+G) = Q_R/(Q_R + Q_G) for each transect. The lines shown are those from the Einstein integrals of Guo and Julien as obtained by Shah-Fairbank (Shah-Fairbank & Julien, 2015) for h/d50 = 100 and 100,000. It can be clearly seen that the conditions measured in the Selenga River are generally close to the fitted line for rivers with h/d50 = 100,000 (Figure 6). At the same time, this shows that one can obtain an extremely large variability in sediment concentration, and thus in the R/(R+G) ratio, in deep sand-bed rivers when the Rouse number is fairly large (Ro > 0.5). The case study of the Selenga River is quite instructive in this regard, as it reflects conditions of an abrupt shift from a mixed and bed load dominated channel (Ro > 2) to a suspended sediment dominated channel (Ro < 2). The average Rouse number presented in Figure 6 can vary significantly between single verticals, e.g. for the measurement on the Ob River made on 01/07/19 the average Rouse number of 0.2 corresponds to changes from 0.01 in the midstream up to 0.8 in the near-bank area. On large rivers (with depths over 10 m and h/d50 > 100,000) suspended load dominates (Q_R/(Q_R + Q_G) = 0.5-0.75) under a broad range of hydraulic conditions (Ro = 0.001-2). Here we plotted our measurements at the Ob and Yenisey rivers and similar studies done at the Ganga River (Lupker, France-Lanord, & Lavé et al., 2011). These results indicate for the first time that the partitioning between bed and suspended load on small and large rivers is similar but subject to scale effects. Further studies based on ADCP datasets will provide novel knowledge on the hydrodynamic partitioning of the total sediment flux, which is one of the least understood questions in river hydrology (Chalov, Moreido, & Sharapova et al., 2020; Turowski, Rickenmann, & Dadson, 2010).
Provided that the accuracy of the ADCP application is taken into account, the big data processing approach presented here will lead to a substantial enhancement of suspended sediment data (Lehotský, Rusnák, Kidová, & Dudžák, 2018; Pomázi & Baranya, 2020) and contribute novel concepts to the understanding of the behavior and estimation of suspended sediment fluxes within river profiles.

Conclusions
ADCP big data analytics provides a fundamental shift in suspended sediment studies as a cost-effective and accurate technology. The main conclusions of the study are as follows:
(1) We have developed an integrated, freely available open-source tool consisting of 3 R-language codes for ADCP data processing, which enables sediment concentration assessment within cross-sections of large rivers.
(2) ADCP big data provide a novel approach for studies of flow hydraulics and sediment distribution on large rivers. A novel semi-empirical equation for sediment concentration was parameterized for sandy (average diameter 0.01-0.03 mm) large rivers (average depths varying from 4 to 50 m).
(3) The results derived in this study are used to improve the accuracy of fluvial sediment transport estimates, which are required to improve the fundamental understanding of land-ocean coupling. Acoustic technologies show potential for estimating suspended load transport both quickly and accurately using big data on sediment concentrations. Another important output of our study is the understanding that specific methodological requirements should be fulfilled when processing ADCP data: e.g. raw backscatter values are not recommended, because they do not reasonably predict SSC.
It is shown that the suspended load and the sediment concentration profiles are well explained by a Rouse-type equation, which allows us to estimate the instantaneous hydrodynamic partitioning of the sediment flux. The sediment flux estimates obtained with this approach appear to be realistic, and the ADCP datasets provide novel knowledge on the partitioning of suspended and bed load.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
The study is done under implementation of Russian Scientific Foundation support (project 21-17-00181). Additionally, field studies at the Selenga River were supported by Russian Fund for Basic Research (project 18-05-60219), field studies at the Lena River -by Russian Scientific Foundation support (project 21-17-00181), field studies at the Moscow River catchment -within Scientific Foundation project 19-77-30004. The analytical experiments were supported by the Ministry of Science and Higher Education of Russian Federation under the Agreement 075-15-2021-574. Also, this paper has been supported by the Kazan Federal University Strategic Academic Leadership Program.