Investigation of soil microbiota reveals variable dominant species at different land areas in China

Abstract Soil microbiota is associated with plant growth and nutrition. Investigating plant–soil interactions is essential for revealing changes in soil microbial dynamics. Vegetation types and human activities, such as agriculture, severely affect soil microbial structure and function. In this study, 16S rRNA genes were analysed to identify microbial structures in the soil. The total organic carbon (TOC) and total nitrogen (TN) in these soil samples were also analysed. TOC and TN differed among the soil samples, which might be due to different vegetation types. The main phyla in these soil samples were Actinobacteria, Proteobacteria and Acidobacteria. Furthermore, the genera in these soil groups were highly diverse, and most of the bacteria could not be assigned to any known genus, indicating the presence of novel bacterial genera in these soil samples. A fraction of the dominant operational taxonomic units in the soil microbiota was identified, several of which play functional roles in soil nutrition. The linkage between the soil microbiota, especially the dominant species, and soil nutrients was analysed. Culturomics and other omics technologies would help to isolate novel microorganisms, which might lead to the recovery of functional microbial agents for plant growth.


Introduction
The soil microbiota contains highly diverse microbes, which play vital roles in soil nutrition and plant growth [1][2][3]. The dominant topsoil bacterial phyla include Proteobacteria, Actinobacteria, Acidobacteria, Planctomycetes, Chloroflexi and Verrucomicrobia [4,5]. Under different stress conditions, the root microbiome primarily regulates the interaction between soil and plants to adapt to the stress [6][7][8]. The dominant phyla and genera of microbiota differ significantly at a spatial scale [9,10]. Thus, it would be interesting to characterise soil microbiota across different geographical distances, which in turn would enhance the existing understanding of the relationship between plants and soil microbiota [11].
Various environmental factors, such as temperature, pH and humidity, have been shown to affect the structure of the soil microbiota [12][13][14][15]. Aridity, soil properties and vegetation types are drivers of the soil microbiota in the southern hemisphere, but the primary factor is temperature [16]. However, a previous park green experiment showed that soil pH determined microbial diversity and composition by altering soil nutrient content [14]. Moreover, pH was the primary factor correlated with soil microbiota in Chinese karst rocky desertification regions [17]. Thus, the soil microbiota in different areas are affected by specific environmental factors.
A global analysis of the topsoil microbiome suggested that the soil microbiota were strongly affected by environmental factors rather than the geographical location of the soil [5]. Specifically, the soil microbiota were primarily affected by local precipitation and soil pH [5]. Previous field surveys demonstrated that microbial diversity and abundance were low in drylands and decreased further as aridity increased [18,19]. Thus, the investigation of soil microbiota and its correlation with different environmental factors is of profound interest.
Soil carbon and nitrogen are important elements for plant growth, and their metabolism is associated with vegetation types [20]. The core microbiota of different crop fields varies remarkably. Even so, the role of microbes in maintaining soil nutrient levels remains largely unexplored [9]. Predicting the response of soil microbiota to crops would facilitate the management of soil bacterial communities for sustainable high crop yields via modulation of soil microbiota [21,22]. Some rhizosphere microorganisms are beneficial for plant growth, plant yields and disease resistance [23,24]. The global citrus rhizosphere microbiota comprises dominant species assigned to the phyla Proteobacteria, Actinobacteria, Acidobacteria and Bacteroidetes [25]. Certain beneficial microbes were also identified in that study, which might help to improve citrus production and fruit quality [25]. Synthetic microbiota is now used to provide new insight into plant-microbe interactions and can release the power of soil microbiota under dynamic and spatial resolution [26]. In one previous study, a synthetic microbiota composed of six Pseudomonas strains promoted plant growth [27].
In this study, the soil microbiota of the grass-root rhizosphere in 112 soil samples were assessed, and the structure and network of the species were investigated. In addition, the total organic carbon (TOC) and total nitrogen (TN) of these soil samples were analysed. Based on the data, the potential relationships between soil microbiota and soil characteristics were investigated and discussed.

Sample collection
The soil samples were collected from four different areas in central China and Xinjiang Province (Table S1). Based on the collection sites, the 112 soil samples were classified into four groups: ZMS, SS, XJ and ZR groups that contained 4, 5, 8 and 95 soil samples, respectively. All the soil samples were collected from the root rhizosphere of several different plant species (Table S1). The ZMS soil samples were collected from a wheat plantation; wheat is the main crop in central China. The SS soil samples were collected from wild grass on Songshan Mountain, Zhengzhou, Henan Province, China. The ZR soil samples were collected from the grass rhizosphere in Zhengzhou, Henan Province, China. The XJ soil samples were collected from the grass rhizosphere at remote places in Urumqi, Xinjiang Province, China.

Determination of total organic carbon and total nitrogen content
The water in the soil samples was removed by drying the samples in an oven at 50 °C. The inorganic carbon in the soil samples was removed using 2 M hydrochloric acid [28,29]. The TOC contents of the soil samples were measured with the CHNOS Elementar cube (Vario EL III) [28,29]. The TN content was measured in the dried soil without acid pretreatment [28,29]. The typical analytical error of the TOC and TN measurements was <2%, and the maximum error was <5%. The average TOC and TN contents were calculated as described previously [28,29].

Soil DNA extraction
The total DNA was extracted from 0.5 g of each soil sample using the DNeasy PowerSoil Pro kit (Qiagen, Germany) [30,31]. The DNA concentration was determined, and an appropriate amount (5-20 ng) of DNA was used for 16S rRNA gene analysis. The V3-V4 regions of the 16S rRNA genes were amplified with Phanta HS Super-Fidelity DNA polymerase, and the amplified fragments were used for high-throughput sequencing [32,33]. The raw reads were filtered as described previously [34]. The clean reads were assembled and used for diversity analysis [31,35]. A 97% identity threshold was used for operational taxonomic unit (OTU) classification [36]. The raw reads of these 112 soil samples were submitted to the SRA database (accession numbers SRR15429497-SRR15429591 and SRR15524480-SRR15524496).
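As an illustration of what a 97% identity threshold means in practice, the following minimal Python sketch groups sequences by greedy centroid clustering. It is not the pipeline used in this study: the function names `identity` and `greedy_otus` are hypothetical, the toy reads are invented, and identity is computed as the fraction of matching positions between equal-length, pre-aligned sequences, which is a strong simplification of real alignment-based clustering.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches at >= threshold.

    A sequence that matches no existing centroid becomes a new centroid (a new OTU).
    Returns the centroid list and, for each input sequence, its OTU index.
    """
    centroids = []
    assignment = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                assignment.append(i)
                break
        else:
            # No centroid was close enough: start a new OTU.
            centroids.append(s)
            assignment.append(len(centroids) - 1)
    return centroids, assignment


# Toy reads (invented): the second differs from the first at 2 of 100 positions
# (98% identity), the third at 10 of 100 positions (90% identity).
reads = ["A" * 100, "A" * 98 + "CC", "A" * 90 + "C" * 10]
centroids, assignment = greedy_otus(reads)
print(assignment)
```

In this toy case the first two reads fall into the same OTU and the third founds a new one. Greedy clustering is order dependent, so production tools typically sort reads by abundance before clustering.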

Co-occurrence analysis of the soil samples
The co-occurrence of the dominant OTUs was determined by co-occurrence network analysis [37], which employed Spearman's correlation (ρ > 0.6; P values < 0.05) to determine positive and negative correlations between the dominant OTUs [34,38]. The parameters of the network, such as edge number, degree and betweenness, were calculated using the igraph package, and the network was visualised in R [34,39].
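The correlation step can be sketched in Python as follows. This is a minimal illustration rather than the R/igraph workflow used in the study: the abundance values are invented, the function names are hypothetical, the threshold is applied to the absolute value of ρ (an assumption made so that negative edges can be recovered), and the P value filter is omitted for brevity.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are tied
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def cooccurrence_edges(abundance, threshold=0.6):
    """Return (otu_a, otu_b, sign, rho) for every pair with |rho| > threshold."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if abs(rho) > threshold:
            edges.append((a, b, "positive" if rho > 0 else "negative", rho))
    return edges


# Invented abundances of four OTUs across six samples.
abundance = {
    "OTU_A": [1, 2, 3, 4, 5, 6],
    "OTU_B": [2, 4, 5, 8, 9, 12],  # rises with OTU_A
    "OTU_C": [9, 7, 6, 4, 2, 1],   # falls as OTU_A rises
    "OTU_D": [3, 1, 4, 1, 5, 2],   # no clear monotonic trend
}
edges = cooccurrence_edges(abundance)
for a, b, sign, rho in edges:
    print(a, b, sign, round(rho, 2))
```

The resulting edge list can then be passed to a graph library to compute degree, betweenness and related network parameters.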

Characteristics of the soil samples
TOC and TN of the soil samples in each group were determined (Table 1 and Table S2). The TOC in the ZR group (1.49 ± 0.4%) was higher than in the other three groups. This might be due to the accumulation of organic carbon in the ZR group soil samples, as more organic carbon might be available in the city area [40] (Table 1). The TOC of soil samples from the ZMS, SS and XJ groups was similar (about 1%) (Table 1). Like TOC, the TN of soil samples in the ZR group (0.13 ± 0.03%) was higher than that of soil samples in the other three groups. In particular, the nitrogen contents in the SS and XJ group soil samples were low [41].

Microbial diversity of the soil samples
The 112 soil samples were divided into four groups, and a total of 10,613,869 high-quality 16S rRNA gene sequences were obtained (Table S3). These 16S rRNA gene sequences were classified into 15,905 OTUs based on 97% identity. The average number of sequences per soil sample was 94,766.7, and the average number of OTUs was 2472 (Table S3). In the ZMS group, the average OTU number was higher than in the other three groups, indicating that the number of microbial species in the ZMS group (2628 ± 819) was higher than in the other three groups (Table 2) [21]. The Chao1 estimates were similar to the richness values, suggesting that the sequencing depth was sufficient and that most microorganisms were covered [42] (Table 2). The high Shannon_2 index indicated that these soil samples had highly diverse microorganisms. In particular, the ZR group had higher diversity than the other three groups. The Simpson, dominance and equitability values suggested that there might be some dominant species in each sample (Table 2).
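For readers unfamiliar with these indices, the following Python sketch computes richness, Chao1, the base-2 Shannon index, a Simpson index and equitability from a vector of OTU counts. The definitions are commonly used ones but are assumptions here, since conventions differ between tools: Chao1 is the bias-corrected form, the Simpson index is taken as the sum of squared proportions, and equitability is the Shannon index divided by its maximum for the observed richness. The counts are invented.

```python
from math import log2


def alpha_diversity(counts):
    """Compute common alpha-diversity indices from OTU read counts of one sample."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    richness = len(counts)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    # Bias-corrected Chao1 estimate of total richness.
    chao1 = richness + singletons * (singletons - 1) / (2 * (doubletons + 1))
    proportions = [c / n for c in counts]
    shannon_2 = -sum(p * log2(p) for p in proportions)
    simpson = sum(p * p for p in proportions)
    # Equitability: observed Shannon index relative to its maximum, log2(richness).
    equitability = shannon_2 / log2(richness) if richness > 1 else 0.0
    return {
        "richness": richness,
        "chao1": chao1,
        "shannon_2": shannon_2,
        "simpson": simpson,
        "equitability": equitability,
    }


# Invented counts for a sample with four OTUs.
result = alpha_diversity([4, 2, 1, 1])
print(result)
```

A Chao1 estimate close to the observed richness, as reported above, means that few OTUs are seen only once, which is the basis for concluding that sequencing depth was sufficient.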

Phylum-and genus-level microbial distribution of the soil samples
The highly abundant phyla in these four groups were Actinobacteria, Proteobacteria and Acidobacteria, accounting for 79.45% of the total soil microbiota (Figure 1a). In the ZMS group, Actinobacteria, Proteobacteria and Acidobacteria accounted for 37.05%, 25.32% and 10.77%, respectively, of the total microbial numbers. In the SS group, Actinobacteria, Proteobacteria and Acidobacteria accounted for 37.73%, 35.45% and 20.83%, respectively, of the total microbial numbers. In the XJ group, Actinobacteria, Proteobacteria and Acidobacteria accounted for 44.75%, 25.37% and 6.59%, respectively, of the total microbial numbers. In the ZR group, Actinobacteria, Proteobacteria and Acidobacteria accounted for 32.68%, 34.29% and 16.99%, respectively, of the microbial phyla. In addition, Firmicutes, Thaumarchaeota and Bacteroidetes accounted for 4.99% of the soil microbiota (Figure 1a).
A total of 1113 genera were identified in these four groups, and the dominant genera (>2% of all the microorganisms) differed among the groups (Figure 1b). In the ZMS group, the dominant genera were Gaiella, Gemmatimonas, Nocardioides, Bacillus, Gp6 and Gp1. In the SS group, the dominant genera were Mycobacterium, Gp1, Gp16, Nitrososphaera, Streptomyces, Gp3 and Gaiella. In the ZR group, the dominant genera were Gp6, Gaiella and Nocardioides. In the XJ group, the dominant genera were Nocardioides, Gaiella and Streptomyces.

Dominant species in the soil microbiota
More than 50% of the OTUs in the four groups could not be assigned at the genus level. Meanwhile, some dominant OTUs showed >98% identity with known microbial isolates, indicating that a fraction of these dominant microorganisms could be cultured (Table 3). As in the genus-level analysis, the abundances of the dominant OTUs differed among the four groups (Table 3). OTU_4, which was predicted to be Bradyrhizobium lupini, a symbiotic nitrogen fixer in the soil, was abundant in the SS group (6.98%). OTU_27, which was predicted to be Blastococcus colisei, isolated from limestone, was abundant in the XJ group (6.98%). OTU_3, which was predicted to be Sphingomonas limnosediminicola, accounted for 1% of all the four groups [43]. OTU_2, which was predicted to be Pseudarthrobacter phenanthrenivorans, isolated from a creosote-polluted site, was abundant in the ZMS, XJ and ZR groups, but its abundance in the SS group was only 0.21% [44]. OTU_140, predicted to be Mycolicibacterium moriokaense, was abundant in the SS group (2.61%) [45]. Notably, OTU_1, which is an unknown species, was only distributed in the ZMS group (1.90%). As ZMS soil is used for wheat planting, this OTU might be associated with wheat growth. Other dominant OTUs (>0.48%) in these four groups were uncultured or common soil microbes, and their roles in the soil microbiota need further investigation.

Beta-diversity of the soil microbiota
The PCoA analysis suggested that the soil microbiota from several different areas had similar microbial diversity (Figure 2). However, the XJ group samples clustered together and were different from the other three groups. Meanwhile, the soil samples from the SS and ZMS groups were similar. One sample in the ZR group was similar to some samples in the SS and ZMS groups (Figure 2). The UPGMA analysis further showed that most soil samples from the same group clustered together (Figure 3). In particular, the soil samples of the ZMS, SS and ZR groups clustered together and were different from the soil samples in the XJ group (Figure 3). Moreover, two ZMS group samples, ZMS1 and ZMS86, were different from the other samples (Figure 3).
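The clustering logic behind a UPGMA dendrogram can be sketched in Python as follows. The distance metric used in this study is not stated in the text, so Bray-Curtis dissimilarity is assumed here purely for illustration; the sample names, abundance profiles and function names are likewise invented. UPGMA repeatedly merges the two clusters with the smallest average pairwise distance between their members.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles (assumed metric)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(u) + sum(v)
    return num / den if den else 0.0


def upgma(names, profiles):
    """Return the merges (cluster_a, cluster_b, distance) in the order they occur."""
    clusters = {name: [name] for name in names}  # cluster label -> member samples
    prof = dict(zip(names, profiles))

    def dist(a, b):
        # Average distance over all pairs of original members (average linkage).
        pairs = [(x, y) for x in clusters[a] for y in clusters[b]]
        return sum(bray_curtis(prof[x], prof[y]) for x, y in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        labels = sorted(clusters)
        a, b = min(combinations(labels, 2), key=lambda ab: dist(*ab))
        d = dist(a, b)
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
    return merges


# Invented profiles: two samples resembling each other in each of two groups.
names = ["XJ_1", "XJ_2", "SS_1", "ZMS_1"]
profiles = [
    [50, 10, 0, 40],
    [45, 15, 0, 40],
    [10, 40, 50, 0],
    [15, 35, 45, 5],
]
merges = upgma(names, profiles)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

In this toy example the two XJ-like samples merge first, the SS-like and ZMS-like samples merge second, and the two resulting clusters join last at a much larger distance, mirroring the pattern of a distinct XJ cluster described above.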

Co-occurrence network analysis of the soil microbiota
Analysis of the dominant OTUs using co-occurrence network analysis can predict the potential keystone taxa in the microbiota [31,34]. The 49 most abundant OTUs were selected for co-occurrence network construction, and each of the 49 OTUs composed >0.3% of the soil microbiota. The generated network had 250 edges; 215 edges were positive correlations, and 35 were negative correlations (Figure 4). The connectance of the network was 0.26, and the average degree was 11.36. The average path length of the network was 2.13, and the diameter was 5. The clustering coefficient was 0.73. The centralisation betweenness and degree were 0.11 and 0.29, respectively (Figure 4). The OTUs in the network were assigned to Acidobacteria, Actinobacteria, Proteobacteria, Firmicutes and Gemmatimonadetes (Figure 4). In addition, several OTUs could not be assigned to known phyla, showing that certain dominant OTUs in the soil microbiota were novel. The ZMS group had 95 soil samples, and most of the dominant OTUs were positively correlated in the network (Figure 4). Certain OTUs, especially those in the centre of the network, were positively correlated, suggesting that they were present in most soil samples. OTU_111, OTU_236, OTU_54, OTU_15, OTU_95, OTU_27 and OTU_46 were negatively correlated with other OTUs, indicating that they were dominant only in certain soil samples (Figure 4). OTU_111 was abundant in the SS, XJ and ZR groups, but it was only abundant in certain ZMS soil samples. The keystone taxa identified by the co-occurrence network, such as OTU_111, OTU_1 and OTU_10, are uncultured (Figure 4).
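The reported network parameters can be related to each other with two standard formulas for an undirected network of N nodes and E edges: the average degree is 2E/N, and the connectance (density) is E divided by the number of possible edges, N(N-1)/2. The short Python sketch below evaluates both for E = 250. It assumes that the reported connection value is the connectance in this sense, and the node counts tried are illustrative. The reported values (11.36 and 0.26) match N = 44 rather than N = 49, which would be consistent with five of the selected OTUs having no edge above the correlation threshold; this is a reading of the numbers, not something stated in the text.

```python
def average_degree(n_nodes, n_edges):
    """Mean number of edges per node in an undirected network."""
    return 2 * n_edges / n_nodes


def connectance(n_nodes, n_edges):
    """Realised edges as a fraction of all possible undirected edges."""
    return n_edges / (n_nodes * (n_nodes - 1) / 2)


EDGES = 250  # edge count reported for the network
summary = {
    n: (round(average_degree(n, EDGES), 2), round(connectance(n, EDGES), 2))
    for n in (49, 44)
}
for n, (deg, conn) in summary.items():
    print(n, deg, conn)
# prints:
# 49 10.2 0.21
# 44 11.36 0.26
```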

Discussion
The TOC, TN and microbiota of 112 soil samples in four groups were investigated. The TOC and TN in these four groups were different (Table 1). The TOC of the ZR group samples was the highest among the four groups. The SS and XJ group soils were located in remote areas, and the TOC of the SS and XJ group soil samples was low. As the agricultural area was used for food production, TOC might not have accumulated in the ZMS group samples [46,47]. Nitrogen fertiliser might be added to the agricultural or urban environment, which might have led to the higher TN content of the ZMS and ZR groups [41,48,49]. Moreover, the TOC and TN contents would have affected the structure and function of the soil microbiota, but the cause-effect relationships between the soil microbiota and the soil nutrients need further investigation [9,50,51].
The four soil groups investigated were collected from four different areas, and the OTU numbers of these four soil groups were different, showing the divergent distribution of the microbes (Tables 2 and 3) [52]. The ZMS group had higher OTU numbers than the other three groups, showing that the wheat rhizosphere microbiota was highly diverse and that specific species might be present. Moreover, the OTU numbers of the ZR group were similar to those of soil samples collected in the same area, suggesting that soil samples from similar environments might have similar microbiota [53]. The OTU numbers were lower than those of marine sediment environments, suggesting that lower nutrient levels favour more diverse microbes to sustain microbial survival [39]. Compared with some enriched artificial microbiota, such as wastewater treatment microbiota, the OTU numbers of the soils were high, as certain species become enriched in such systems during long-term operation [32][33][34]54]. As the crops in the ZMS samples varied, the high OTU numbers might reflect adaptation to the different agricultural environments.
The dominant phyla in these four groups were different, further indicating that the soils in the different groups had divergent microbiota. The dominant phyla were similar to those of soils in other areas [10,55,56]. The abundant Actinobacteria in the ZR group indicate that there might be novel gene clusters for natural product biosynthesis in the soil microbiota, which could lead to the identification of novel antibiotics and other medicines [30,57]. At the genus level, more than 50% of the OTUs could not be assigned to any known genus, indicating that most of the species in the soil microbiota might be novel. The dominant genera of these four groups were not the same, showing that the soils from the different groups contained diverse soil microbiota.
Specifically, Nitrososphaera species, which possess ammonia-oxidising ability, promote the absorption of ammonia to support plant growth [58]. Nitrososphaera species were more abundant in the SS and XJ groups than in the ZMS and ZR groups [59,60]. The soil samples in the SS group were collected from remote areas where ammonia was oxidised by Nitrososphaera for plant growth [61,62]. The high nitrogen content in the ZR soil samples further indicated that active Nitrososphaera species were present. In addition, the low nitrogen content of the XJ soil samples could be attributed to the low water content of the soils (Table 1). Since the ZMS and ZR groups were used as agricultural areas, fertiliser was added and nitrogen would be sufficient to grow wheat and other crops. Therefore, the abundance of Nitrososphaera species was low. Other functional nitrogen-fixing species might also be present in the soil microbiota (Figure 1b).
The OTUs were investigated, and some of the dominant OTUs had diverse functional roles in the microbiota. The abundance of OTU_4, predicted to be B. lupini, was high in the SS group, suggesting that nitrogen fixation might be necessary for plant growth in the mountain area [63,64]. The high abundance of OTU_27, predicted to be B. colisei, in the XJ group might be due to the low nutrient level of the soil samples [65]. As the SS samples were collected from a mountain with a low level of pollution, the abundance of OTU_2 was lower in the SS group than in the other three groups [66,67]. OTU_111 is an uncultured bacterium with low identity to known isolates, indicating that isolation of such bacteria might give further insight into the functions of soil microbiota [68]. Further network analysis identified OTU_111, OTU_1 and some other uncultured keystone species in the microbiota. In the future, the isolation of keystone species from soils based on omics technology could lead to the discovery of essential bacteria for plant growth [23,69].
In the ZMS group, ZMS1 and ZMS86 were different from the other samples. The reason for the difference between ZMS1, ZMS86 and the other soil samples is unknown. It might be due to an error during the soil sample collection process that adversely affected the microbial diversity in these two soil samples (Table S3). The TN and TOC of the XJ and SS soil groups were different from those of the other two groups, which might have led to the different distribution of soil microbiota (Table 1) [70][71][72]. Soil microbiota is associated with soil nutrition and plant growth; therefore, insight into the microbiota of different soil types and identification of microorganisms essential for plant growth can further improve plant growth [73,74].
In the future, the integration of different keystone microorganisms could improve plant fertiliser efficiency and mineral nutrient uptake, and help to decrease carbon release during agricultural production [75,76]. By isolating natural species and engineering other species for plant growth, a list of functional microbes could be obtained and used to build synthetic microbiota [77][78][79]. With the development of microbiome research and synthetic biology, engineering microbiota as efficient fertilisers or plant growth-promoting agents will help to meet the requirements of efficient crop management and green agriculture development [68,80].

Conclusions
In this study, we offered an insight into the microbiota of four soil groups and explored the main and keystone taxa in these soil samples. The main microorganisms in different soil types were found to be divergent, and different microorganisms had accumulated. The co-occurrence network analysis identified several keystone taxa in the wheat rhizosphere, and the majority of these microorganisms were uncultured. Omics technologies, especially culturomics, would help to isolate these keystone microorganisms and develop efficient synthetic microbiota for plant growth.