Assessment of physical condition and anthropogenic disturbance of streams of the northcentral United States

Abstract The recent introduction of large geospatial databases and virtual measurement devices for streams of the United States has the potential to greatly improve stream classification systems as well as answer fundamental questions about river morphology. The physical condition of over 800 streams of the adjoining Upper Midwest and Temperate Plains ecoregions of the northcentral US was analyzed using principal components analysis of 10 selected site variables. Delineation was along three axes, with the first axis corresponding to differences in base flow, temperature, and soil permeability; the second corresponding to stream gradient, depth to bedrock and water table, and composite topographic index; and the third corresponding to stream sinuosity. Separation of streams into the two ecoregions was distinct, and primarily along axis 1. Adding a secondary matrix of 10 anthropogenic and geographic predictor variables produced a similar ecoregional separation, with latitude, percent of non-native plants, and overall intact habitat corresponding with axis 1. Natural and anthropogenic differences in streams of these two ecoregions appear inextricably linked, a situation probably common throughout the developed world.


Introduction
Streams and rivers provide innumerable ecological services, such as organic carbon processing, downstream transport of sediment and other materials, and exchanges of O₂, CO₂, and CH₄ with the atmosphere (Cole and Caraco 2001; Raymond and Cole 2001; Allan 2004; Thorp et al. 2006; Bastviken et al. 2011). Although streams and rivers cover only ~0.5% of the Earth's land surface, their actual impact on natural systems is undoubtedly greater than their land area suggests, due to hyporheic connections and flood events (Downing et al. 2012). Thus, understanding the natural and anthropogenic environmental variables affecting riverine systems is of critical importance.
The recent development of large geospatial data products, such as the National Hydrography Dataset Plus (NHDPlus) and Stream Catchment (StreamCat) databases, both administered by the US Environmental Protection Agency (USEPA) (Hill et al. 2016), has greatly increased accessibility to information about streams and their surrounding watersheds. Further, the development of tools such as Google Earth allows for making virtual measurements, such as stream width, without visiting a specific site. Recent papers by McManamay et al. (2018) and McManamay and DeRolph (2019) were some of the first to utilize the USEPA databases to assess stream conditions. In the latter paper, stream conditions for the conterminous US were inventoried based on natural factors such as size, gradient, and temperature; in the former, an additional layer of anthropogenic disturbance was added to streams of the eastern US. More recently, Isaak et al. (2020) utilized variables from the USEPA databases and elsewhere to classify thermal regimes of rivers of the western US. To date, the rivers of the central US have received less attention.
The purpose of this paper was to survey natural and anthropogenic variables of streams in the central US, specifically the Temperate Plains and Upper Midwest ecoregions (Herlihy et al. 2008; Omernik and Griffith 2014; Herlihy et al. 2020) (Figure 1). The former region is composed of ~900,000 km² of tallgrass prairies and hardwood forest, with most habitats converted to pasture or row crops. The latter region comprises ~400,000 km² and remains mostly covered by either hardwood or coniferous forests. Previous stream classification studies within portions of the study area were of smaller scale (n 100 stream sites) and included only site-specific variables (Wang et al. 1998; Goldstein et al. 2002). The study area has some overlap with the eastern US of McManamay et al.'s (2018) study, mostly within IN, MI, and OH, but is otherwise an area that has not been studied using the geospatial datasets mentioned above. Further, potentially important variables not addressed in McManamay et al. (2018), such as stream sinuosity, were included in this study. The specific objectives were to delineate overall patterns in physical conditions of stream sites in the two ecoregions, and to relate these patterns to geographic and anthropogenic variables.

Materials and methods
Stream sites were chosen to maximize spatial distribution within the two ecoregions. To access the NHDPlus and StreamCat databases, the WATERSKMZ kml file (https://www.epa.gov/waterdata) was downloaded into Google Earth (GE). This interface allowed measuring streams along the transect using GE, while also having USEPA data about the same stream sites readily accessible. Once a general area was selected using GE, the nearest watershed was identified using NHDPlus Surfacewater Features. A line transect was placed across the streams of this watershed, at an angle that maximized stream intersections, and all of the sites intersected by the transect were selected for data acquisition.
Ten variables were used to classify the 819 selected stream sites (Table 1). Stream gradient (variable #1) was estimated using GE by determining the elevation at the site and the elevation of the stream 1-5 km above the site. The difference between the two values was divided by the measured length to calculate percent gradient. Stream sinuosity (2) was estimated by measuring, in GE, the straight-line distance from a point 1-5 km above the site to the site itself. Dividing the actual channel distance between these two points by the straight-line distance yielded stream sinuosity. Mean summer stream temperature (3) for all years available (2008, 2009, 2013, 2014), base flow as a percentage of total stream flow (4), the percentage of organic matter in the surrounding soil (5), soil permeability (6), mean distance to bedrock (7), mean distance to water table (8), composite topographic index (CTI) (9), and total runoff value (10), all at the local (Hydrologic Unit Code-12) catchment scale, were determined using the StreamCat database.
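The two GE-derived variables reduce to simple ratios, which can be sketched as follows. This is an illustrative sketch only: the function names and the example readings are hypothetical, and it assumes that the "measured length" used for gradient is the channel distance between the two elevation points.

```python
def percent_gradient(elev_upstream_m, elev_site_m, channel_length_m):
    """Elevation drop divided by the measured length, expressed as a percentage."""
    if channel_length_m <= 0:
        raise ValueError("channel length must be positive")
    return (elev_upstream_m - elev_site_m) / channel_length_m * 100.0


def sinuosity(channel_length_m, straight_line_m):
    """Actual channel distance divided by the straight-line distance."""
    if straight_line_m <= 0:
        raise ValueError("straight-line distance must be positive")
    return channel_length_m / straight_line_m


# Hypothetical readings for one site (illustrative values, not study data).
gradient = percent_gradient(elev_upstream_m=312.0, elev_site_m=300.0,
                            channel_length_m=3000.0)
sinu = sinuosity(channel_length_m=3000.0, straight_line_m=2000.0)
print(f"gradient = {gradient:.2f}%, sinuosity = {sinu:.2f}")
```

A perfectly straight reach has a sinuosity of 1, so values can only be 1 or greater when both distances are measured between the same two points.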
Ten geographic and anthropogenic variables were used as predictors of site characteristics (Table 2). Intact habitat (variable #11) was estimated from StreamCat by summing all land area within a site's local catchment not classified as row crop, developed, or pasture, and expressing the sum as a percentage of total land area as 'percent intact habitat'. All determinations used the most recent (2011) land cover data set. The percentage of each catchment occupied by non-native plants (12), percentage of site stream flow impounded by dams (13), percentage of catchment area under impervious surface (14), total length of roads (15), and number of people living within the catchment (16) were determined from StreamCat. Latitude (17), longitude (18), and elevation (19) were determined for each site using GE. The total watershed area upstream of each site (20) was determined from the NHDPlus database. To delineate regional differences in stream site physical characteristics, and to assess the relative importance of geographic and anthropogenic variables in predicting these differences, the 819 measured sites were ordinated with principal components analysis (PCA) using the default settings of PC-ORD v.7. The primary data matrix consisted of the 10 physical variables for each of the 819 stream sites. A secondary matrix of the 10 geographic and anthropogenic variables was then joint plotted with the determined ordination.
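The ordination step can be approximated outside PC-ORD with a short NumPy sketch. It assumes a correlation-based PCA (variables standardized before analysis), which may or may not match the PC-ORD defaults, and it uses the squared Pearson correlation between each secondary variable and each axis as a stand-in for the joint-plot statistic. The data are random placeholders, not the study's measurements, and the randomization test behind the axis p-values is not reproduced.

```python
import numpy as np


def pca_correlation(primary):
    """PCA on the correlation matrix of a sites x variables array."""
    X = np.asarray(primary, dtype=float)
    # Standardize each variable to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = (Z.T @ Z) / (Z.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                     # site scores on each axis
    explained = eigvals / eigvals.sum()      # fraction of variance per axis
    return scores, eigvecs, explained


def axis_r2(secondary, scores, n_axes=3):
    """Squared Pearson correlation of each secondary variable with each axis."""
    S = np.asarray(secondary, dtype=float)
    n_sec = S.shape[1]
    combined = np.hstack([S, scores[:, :n_axes]])
    r = np.corrcoef(combined, rowvar=False)
    return r[:n_sec, n_sec:] ** 2


# Synthetic stand-in data: 819 sites x 10 variables in each matrix.
rng = np.random.default_rng(0)
primary = rng.normal(size=(819, 10))
secondary = rng.normal(size=(819, 10))

scores, loadings, explained = pca_correlation(primary)
r2 = axis_r2(secondary, scores)
print("variance explained by first three axes:", explained[:3])
```

With real data, the rows of `r2` would correspond to the 10 predictor variables and the columns to axes 1-3, comparable in layout to a table of predictor-axis R² values.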
Based on the PCA results, value range maps of the classes of the most important variables were then produced in GE to show geographic and ecoregional trends. Class ranges were adapted from McManamay et al. (2018). 100% stacked column graphs were made of catchment-level intact habitat per variable class within each ecoregion to estimate differences in disturbance per stream type. Maps of percent intact habitat associated with each stream site were also produced. Intact habitat was measured on three scales: the local catchment as used in the PCA, the 100 m riparian mask on either side of each stream within the local catchment, and the entire land area upstream of each site. All intact habitat data were obtained from the StreamCat database.

Results
The PCA ordination produced three significant axes (Figure 2). Axis 1 (27.4% of variation explained, p = 0.001) generally delineated sites of the two ecoregions based on differences in stream temperature, base flow, and soil permeability. Axis 2 (19.7%, p = 0.001) corresponded to intraregional differences in gradient, distance to bedrock and water table, and CTI. Axis 3 (13.0%, p = 0.001) corresponded to intraregional differences in stream sinuosity. Three predictor variables: latitude (R² = 0.22), percentage of non-native plants (R² = 0.35), and percentage of intact catchment habitat (R² = 0.33) all corresponded to PCA Axis 1. No other predictor variable had R² > 0.1 with any axis (Table 3). Due to the covariance among many variables of both PCA axes (Figure 2), value range maps were made only of the most important variable for each of the three axes: base flow, stream gradient, and stream sinuosity. High and very high base flow (>60%) was strongly associated with the Upper Midwest ecoregion, with most very high base flow stream sites found in northwestern Lower MI (Figure 3a). Stream gradient was generally low (<1%) throughout most of both ecoregions (Figure 3b). Nearly all moderately-high or high gradient streams (>2%) were found in northern MI or MN, and drained into either Lake Superior or the Saint Croix River. Most streams had low to medium sinuosity (1-2) (Figure 3c). Occasional highly sinuous streams were found throughout the study area, but most commonly in the western portion of the Temperate Plains.
Streams with high or very high levels of intact upstream habitat (>60%) were found almost exclusively in the northeastern half of the Upper Midwest ecoregion (Figure 4a). Streams with high or very high intact local catchment habitat were found at a greater frequency throughout the study area (Figure 4b), as were streams with high or very high intact riparian habitat (Figure 4c).
Taking the lowest value for each site from the three different scales (Figure 4d) demonstrated that nearly all streams of the Temperate Plains had low or very low intact habitat (<40%). The only very high intact habitats (>80%) were found in northeastern MN, northern WI, northern MI, and a small area of southern IN. Similar trends were found throughout the classes of the most important determined variables and also among size and temperature classes of streams (Figure 5).
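The intact habitat calculation and the lowest-of-three-scales summary can be sketched as below. Several details are assumptions made for illustration: the land cover keys are hypothetical rather than StreamCat field names, the input percentages are assumed to be percentages of total land area, and the five 20-point classes (including the 20% boundary and the handling of exact boundary values) are inferred from the thresholds reported in the text rather than taken from the study's class table.

```python
DISTURBED = {"row_crop", "developed", "pasture"}


def percent_intact(land_cover_pct):
    """Sum the percentages of all cover types not counted as disturbed."""
    return sum(pct for cover, pct in land_cover_pct.items()
               if cover not in DISTURBED)


def habitat_class(pct):
    """Assign one of five assumed classes to a percent intact habitat value."""
    if pct > 80:
        return "very high"
    if pct > 60:
        return "high"
    if pct >= 40:
        return "moderate"
    if pct >= 20:
        return "low"
    return "very low"


def lowest_scale_value(catchment, riparian, upstream):
    """Most conservative estimate: the minimum of the three spatial scales."""
    return min(catchment, riparian, upstream)


# Hypothetical local catchment land cover (percent of land area).
example_cover = {"forest": 40.0, "wetland": 15.0, "row_crop": 30.0,
                 "pasture": 10.0, "developed": 5.0}
catchment_pct = percent_intact(example_cover)

# Hypothetical riparian and upstream values for the same site.
lowest = lowest_scale_value(catchment=catchment_pct, riparian=85.0, upstream=35.0)
print(catchment_pct, lowest, habitat_class(lowest))
```

Taking the minimum means a site is rated only as high as its most disturbed scale, so a well-buffered stream in a heavily converted watershed still falls into a low class.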

Discussion
Natural variation between streams of the two ecoregions was predominantly that of differences in base flow and temperature. High base flow indicates high groundwater recharge and stable cold temperatures; such high values are known throughout the Upper Midwest, particularly northern Lower MI (Holtschlag 1997; Wolock 2003; Kelleher et al. 2012; Mayer 2012; McManamay et al. 2018). Conversely, streams of the Temperate Plains have lower base flow and, thus, are more prone to flooding and warmer temperatures. Differences in stream temperature were also influenced by latitude and, thus, air temperature, as streams of the Upper Midwest were generally farther north and in cooler ecosystems than those of the Temperate Plains.
Differences in stream sinuosity and stream gradient were less important on an ecoregion scale, since most streams of both ecoregions were of similarly low gradient and moderate sinuosity. Both variables, however, were locally important and produced outlier streams. The tributaries of Lake Superior, for example, are known for being high gradient and for containing unique assemblages of aquatic insects (Houghton 2015). Conversely, the very low gradient landscape of the western Temperate Plains produced the most sinuous streams (Tester 1995).
Similarly, variation in intact habitat was primarily on a landscape scale, and did not vary much among stream sizes or other variable classes. Watershed-scale upstream habitat generally increased with latitude, with almost all undisturbed streams occurring in northern MN, MI, and WI. This pattern has been previously noted in MN (Houghton 2004), and renders difficult the separation of latitude and habitat effects. Intact habitat along the 100-m riparian mask of streams, while more abundant in northern MN, MI, and WI, was considerably more evenly distributed throughout the study area than was upstream habitat. Riparian buffers are frequently employed to mitigate the effects of agricultural input, and are frequently the only remaining forested habitat in agricultural watersheds (Craig et al. 2008; Burdon et al. 2020).
The primary exception to patterns of intact upstream habitat was a collection of streams in southern IN, protected by the Hoosier National Forest. Three streams had very high intact habitat based on the combination of riparian mask, catchment, and upstream habitat, and five had high intact habitat. These streams were ~500 km from streams with similar levels of intact habitat. While most of these streams were tiny headwater springs protected by small watersheds, three streams: Little Blue River (latitude: 38.2909, longitude: −86.4890), Negro Creek (39.0033, −86.2625), and Patoka River (38.4422, −86.4873), had watershed areas of 14-128 km². While this study is not an exhaustive inventory of streams in this region, these results do suggest southern IN as an area of high conservation priority due to containing some of the least disturbed streams of the Temperate Plains.
One challenge to discerning the important variables affecting stream conditions is the presence of multiple co-varying temperature gradients. While percent of base flow had the strongest association with stream site temperature (Figure 2), latitude and percent upstream habitat exhibited similar patterns. All three factors clearly delineated the two ecoregions along a single axis. Base flow, latitude, and intact habitat are all known to influence stream temperature, due to cold groundwater upwelling, effects on air temperature, and differences in shading and sunlight penetration, respectively (Ebersole et al. 2003; Allan 2004; Dugdale et al. 2020). Since all three variable gradients occur in the same direction, separating their relative importance will be difficult. Isaak et al. (2020) likewise found multiple temperature gradients when analyzing rivers of the western US.
While this study is not directly analogous to McManamay et al.'s (2018) study of the eastern US due to differences in variables measured and techniques used, some comparisons between results do emerge. In McManamay et al. (2018), over half of the watersheds of the eastern US had ≥10% agricultural land cover, and 37% had ≥25%. If results are comparable, then the situation is worse in the northcentral US. Two-thirds of Upper Midwest watersheds had ≥10% agricultural land cover and 44% had ≥25%. In the Temperate Plains, >99% of watersheds had ≥10% agricultural land cover and 72% had ≥25%.
This study represents the first attempt to classify streams of the northcentral US using modern databases and methods. Such results are likely broadly applicable to ecosystems throughout temperate biomes, especially those with high levels of anthropogenic disturbance. It also indicates the importance of variables, like stream sinuosity, that are not readily available in USEPA databases and probably should be added to future iterations of them. Further research is needed throughout the US and elsewhere on the effects of these natural and anthropogenic stream variables, and stream classification systems in general, on biological assemblages such as fish or benthic macroinvertebrates.

Disclosure statement
No potential conflicts of interest were reported by the author(s).

Data accessibility statement
All data used in this study are available on the Open Science Framework site. https://osf.io/gqjt3/?view_only=d4f9d5bbcf184d28881cc26b2a572e6d

Funding
No external funding was received for this study.

Notes on contributor
David C. Houghton is a professor of biology at Hillsdale College and the director of the G.H. Gordon Biological Station. His research involves the biological diversity of aquatic organisms, particularly caddisflies, and the effects of natural and anthropogenic impacts on their assemblages.