Investigation on the trends and characteristics of articles on submerged macrophytes: perception from bibliometrics between 1991 and 2018

Abstract
Submerged macrophytes, as one of the most important primary producers in shallow lakes and streams, have attracted significant research attention, with relevant studies showing dynamic evolution over the past three decades. Here, we investigated the trends and characteristics of 2,836 relevant articles published between 1991 and 2018 based on bibliometric analysis. Our geographic-based results indicated a scarcity of studies on submerged macrophytes in Africa, but strong international collaboration in this field, especially that between Denmark and China. The top two core journal sources (Hydrobiologia and Aquatic Botany) of studies on submerged macrophytes were determined, with another four important sources also identified. In addition, three trends in abstract word dynamics were found over the last three decades: i.e. most frequent word (‘lake’); increased words (those related to restoration, ecosystem state, model, and macrophyte community); and decreased words (those related to fish and specific macrophyte genera). Furthermore, our case study on four macrophyte genera showed a close relationship between Hydrilla and Vallisneria and between Potamogeton and Myriophyllum, as verified via the topic-article relationships determined using the topic model. The thematic evolution map of keywords used during the last three decades showed a clear shift in scientific fields; for example, Myriophyllum spicatum was an important keyword from 1999 to 2011 but not in other periods. We found that studies on submerged macrophytes developed with an increase in applied ecology topics, such as lake restoration, and basic scientific topics, such as freshwater ecosystems and plant physiology.


Introduction
Submerged macrophytes differ from other plant life forms as their leaves and roots are found underwater. Due to distinct scientific terms for marine algae and seagrass, submerged macrophytes in scientific studies commonly refer to submerged aquatic plants found in inland waters, such as lakes, rivers, and streams. Submerged macrophytes are important components of inland water ecosystems, especially through their maintenance of clean water in shallow lakes.  published a keystone paper on submerged macrophytes, clearly describing the structural role of these plants in shallow lakes. Many subsequent papers also discussed the cascade effects of submerged macrophytes through the food web (e.g. van Donk and van de Bund 2002; Jeppesen et al. 2005; Meerhoff et al. 2007). In addition to submerged macrophyte ecology, macrophyte physiology research has evolved rapidly with the advance of modern scientific equipment (e.g. mass spectrometer, chlorophyll fluorometer, and underwater in situ devices) and interdisciplinary integration (e.g. Hupfer and Dollan 2003; Jiang et al. 2018). Understanding the trends and characteristics of recent articles on submerged macrophytes should provide good guidance for future studies.
Bibliometric analysis can help elucidate the history of respective fields as well as highlight areas of research that are lacking and potential innovations. For example,  proposed future research directions, including eutrophication and climate change interactions in Lake Taihu, based on 1,582 papers related to Lake Taihu published over the last three decades. Peng et al. (2019) found that topics related to ecohydrology have transformed from a microcosmic to macroscopic perspective based on 21,753 papers published from 1900 to 2017. A variety of bibliometric analyses have also been conducted, including on aquatic plants (Qiu and Chen 2009; Liu et al. 2011). However, bibliometric analysis of studies with a core focus on submerged macrophytes has not yet been reported, despite the significant increase in studies on such species in recent years. Here, we carried out a descriptive analysis (e.g. geography and source) of previous studies on submerged macrophytes, analyzed word co-occurrence in published abstracts, provided a case study on important plant genera, and developed a thematic evolution map to explain changes in scientific topics. We aimed to provide a scientific track of the development of studies on submerged macrophytes.

Materials and methods
The data used in this study were retrieved from the SCI Expanded database in Web of Science (Thomson Reuters Corporation, USA), which includes a wide range of major journals and is the most popular database for bibliometric analyses. Summaries (e.g. headings and abstracts) were added into the document meta-data from 1991, so we confined the period of publication to between 1991 and 2018. To investigate subjects related to submerged macrophytes, we applied the following search strategy: TS=((('submerged macrophyte*') OR ('submersed macrophyte*') OR ('submerged vegetation') OR ('submersed vegetation') OR ('submersed plant*') OR ('submerged plant*')) AND (lake* OR river* OR stream* OR reservoir* OR 'fresh water' OR 'inland water*')), which included KeyWords Plus developed by Web of Science. Due to the search strategy and analysis method, the language was confined to English and document type was confined to article. Data retrieval was conducted on 3 June 2019.
The meta-data of each publication included author, title, source (journal of publication), country/region, keywords (and KeyWords Plus), address, subject category, and abstract. Articles for Hong Kong and Taiwan were classified as regions differing from China. All downloaded data were '.bib' files and were analyzed by the 'biblioshiny' function in the 'bibliometrix' package (Aria and Cuccurullo 2017) in R 3.5.1, if not mentioned otherwise. Our analysis included basic information, word co-occurrence analysis, case study on four macrophyte genera, and theme evolution map of the chosen articles.
The exponential curve of annual publication production was fitted with the function 'nls' in R. When analyzing the publication number of the corresponding author's country, two categories were used, i.e. single country publication (SCP) and multiple country publication (MCP). To understand the international collaboration on submerged macrophytes, the countries of the corresponding authors were chosen and further analyzed.
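As an illustration of how an exponential growth rate can be estimated from annual publication counts, the following is a minimal Python sketch. It is not the 'nls' fit used in this study: it swaps in ordinary least squares on log-transformed counts, which gives the same answer only for noise-free data. The series is hypothetical, and treating 6.5% as a year-over-year growth rate is an assumption about how the reported rate is defined.

```python
import math


def fit_exponential(years, counts):
    """Fit counts ~ a * exp(b * (year - years[0])) by least squares on log(counts).

    This is a log-linear approximation, not a nonlinear least-squares fit.
    """
    t = [y - years[0] for y in years]
    log_c = [math.log(c) for c in counts]
    n = len(t)
    mean_t = sum(t) / n
    mean_l = sum(log_c) / n
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    sxy = sum((ti - mean_t) * (li - mean_l) for ti, li in zip(t, log_c))
    b = sxy / sxx                      # slope on the log scale
    a = math.exp(mean_l - b * mean_t)  # fitted count in the first year
    return a, b


# Hypothetical, noise-free series (NOT the study's data): 30 articles in 1991,
# growing by 6.5% per year until 2018.
years = list(range(1991, 2019))
counts = [30.0 * 1.065 ** (y - 1991) for y in years]

a, b = fit_exponential(years, counts)
annual_growth = math.exp(b) - 1.0  # convert the log-scale slope to a yearly rate
print(f"a = {a:.2f}, annual growth = {annual_growth:.3%}")
```

With real, noisy counts the log-linear fit and a nonlinear least-squares fit weight early and late years differently, so the two estimates would generally not coincide.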
For word co-occurrence, word extraction was performed from the abstract only, if not explicitly mentioned otherwise, as the abstract contains more information about the study than keywords. The top 100 highest frequency words were used to plot the word cloud. Frequent words were also extracted per year to investigate word dynamics. Trend analysis (increasing or decreasing during the last three decades) was conducted using the function 'mk.test' in the package 'trend', and the significance level of the Mann Kendall test was set at 0.05 (see Peng et al. 2019).
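The Mann-Kendall trend test applied to yearly word frequencies can be sketched in plain Python as follows. This is an illustrative re-implementation using the normal approximation with a tie correction and a continuity correction, not the 'mk.test' function itself; whether these choices match the defaults of the 'trend' package has not been verified here. The frequency series is hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, z, two-sided p) for the Mann-Kendall monotonic trend test."""
    n = len(series)
    # S counts concordant minus discordant pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied values.
    tie_counts = {}
    for value in series:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, z, p


# Hypothetical yearly frequencies of one word (NOT the study's data).
freq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
s, z, p = mann_kendall(freq)
trend = "increasing" if (p < 0.05 and z > 0) else "decreasing" if (p < 0.05 and z < 0) else "no trend"
print(s, round(z, 3), trend)
```

A word is then labeled as increasing or decreasing only when p falls below the 0.05 significance level stated above.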
Latent Dirichlet allocation (LDA) is a type of latent semantic analysis (Landauer et al. 1998) and can be used for tracing latent topics in literature (Blei et al. 2003). Here, LDA was used to analyze the topic model in the abstracts with the package 'topicmodels' (Hornik and Grün 2011). K-fold cross validation (k = 3) was used to determine the optimum number of latent topics, taken as the point at which the change in perplexity became stable. The LDA model was fitted by Gibbs sampling and, after identifying latent topics, gave the probability of certain articles (or words) belonging to certain latent topics.
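For reference, the perplexity criterion can be written out as follows. This is the standard held-out definition given by Blei et al. (2003), stated here as a sketch; the exact estimator implemented in 'topicmodels' may differ in detail. The symbols $M$ (number of held-out documents), $\mathbf{w}_d$ (words of document $d$), $N_d$ (length of document $d$), and $K$ (number of topics) are notation introduced for illustration only.

```latex
\mathrm{perplexity}(D_{\mathrm{test}})
  = \exp\!\left\{ - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right\},
\qquad
\overline{\mathrm{perplexity}}(K)
  = \frac{1}{k} \sum_{f=1}^{k} \mathrm{perplexity}\!\left(D_{\mathrm{test}}^{(f)} \,\middle|\, K\right),
\quad k = 3 .
```

Lower perplexity indicates a better fit to held-out abstracts. Under this reading, the number of topics is chosen where the fold-averaged perplexity stops decreasing appreciably as $K$ grows.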
A case study was conducted on four commonly studied macrophyte genera, i.e. Myriophyllum, Hydrilla, Vallisneria, and Potamogeton, which were found at relatively high frequencies based on LDA analysis. Firstly, the four genera were used as core words to identify closely related words by calculating the correlations among words using the R package 'tm' (Meyer et al. 2008). Secondly, the latent topics closely related to the four genera were chosen to show paper-topic relationships using a Venn diagram.
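The word-correlation step can be illustrated with a small Python sketch. It assumes the association between two words is measured as the Pearson correlation of their per-document counts, which is how term associations are commonly computed from a document-term matrix. This is not the 'tm' code itself, and the four toy abstracts and the 0.9 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def term_associations(docs, core_term, min_corr):
    """Words whose per-document counts correlate with core_term at or above min_corr."""
    vocab = sorted({w for d in docs for w in d.split()})
    # Document-term counts: one vector per word, one entry per document.
    counts = {w: [d.split().count(w) for d in docs] for w in vocab}
    core = counts[core_term]
    assoc = {w: pearson(core, v) for w, v in counts.items() if w != core_term}
    ranked = sorted(assoc.items(), key=lambda kv: -kv[1])
    return {w: r for w, r in ranked if r >= min_corr}


# Hypothetical toy "abstracts" (NOT the study's corpus).
docs = [
    "hydrilla vallisneria tuber",
    "hydrilla vallisneria coverage",
    "myriophyllum potamogeton elodea",
    "potamogeton myriophyllum lake",
]
result = term_associations(docs, "hydrilla", 0.9)
print(result)
```

In this toy corpus only 'vallisneria' always appears alongside 'hydrilla', so it is the only word that passes the threshold.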
Keyword clusters were considered as themes in this study (article keywords were used because word extraction based on the abstract overloaded the vector size in the function 'thematicEvolution'). The evolutions of the themes were mapped in a two-dimensional diagram in seven-year periods.

Geographic information and international cooperation in articles
A total of 2,836 documents were found based on our search strategy (Figure 1). Annual scientific production has increased exponentially at a rate of 6.5% (supplementary material Figure S1 in appendix). Mainland China had the largest number of publications (1,270), followed by the USA (830), Netherlands (374), Germany (364), Denmark (340), and UK (287). Studies on submerged macrophytes were notably scarce in African countries. When only the country of the corresponding author was considered, China and USA again ranked highest for both SCP and MCP (Figure 2). The ratio of MCP to total publications for most of the Top 20 countries was approximately 20%-35%, but was nearly 50% for Denmark and Sweden, indicating that both countries exhibited a strong preference for international cooperation in regard to submerged macrophyte study. However, the most cited countries (citation number > 5,000) were ranked USA (9,445), Denmark (8,636), Netherlands (5,754), UK (5,592), and China (5,560) (supplementary material Figure S2 in appendix). Accordingly, average citations per article for China and Poland were relatively low compared with the other Top 20 countries. Our results indicated that articles published by Chinese authors were, as a whole, of high quantity. Xie et al. (2014) examined the factors related to China's continuing rise in scientific research in the past three decades, e.g. huge human capital base, strong willingness of the government to invest in science, and contributions from Chinese-origin scientists. However, equally high quality does not appear to be reflected in the citation number of these articles, an issue that has been discussed previously. For example, Jin and Rousseau (2004) stated that simple quantitative evaluations with a focus on publication number in China can stimulate growth in publications but have very limited influence on the quality of research.
However, this issue is improving, with Wang (2016) stating that high-quality research in China has seen an increase in high-impact journal publication in recent years (2006-2016).
Based on the corresponding authors, three different international cooperation clusters were found (Figure 3). China and Denmark showed very strong collaboration, likely due to the establishment of the Sino-Danish Center (Chinese Academy of Sciences) in China, which undertakes cooperative research in the field of water and the environment. China and USA also showed considerable international cooperation, which has been funded by various bilateral projects from governments and scientific organizations, such as the annual cooperative research project between the National Natural Science Foundation of China (NSFC) and National Science Foundation (NSF). Likewise, the Sino-Africa Center (an organization affiliated with the Chinese Academy of Sciences) is expected to play a critical role in advancing future studies on submerged macrophytes in Africa.

Source information and co-citation analysis of articles
Based on Bradford's law, Hydrobiologia, Aquatic Botany, Freshwater Biology, Ecological Engineering, Journal of Freshwater Ecology, and Journal of Aquatic Plant Management were identified as core sources (Figure 4). Further correlation analysis with two other fields (i.e. country and KeyWords Plus) showed that the Top 20 countries accounted for most of the papers from the core sources, except for Aquatic Botany, which was a very 'international' journal (supplementary material Figure S3). A large proportion of retrieved papers in Ecological Engineering and the Journal of Aquatic Plant Management were from China and USA, respectively. Source co-citation demonstrated two main clusters (red and purple in Figure 5). The red cluster showed strong co-citation for three core sources (Hydrobiologia, Aquatic Botany, and Freshwater Biology) and three new sources (Limnology and Oceanography, Ecology, and Canadian Journal of Fisheries and Aquatic Sciences). The other cluster consisted of relatively weak co-citation sources, e.g. Water Research, Science of the Total Environment, and Ecological Engineering. Thus, in addition to the six core journals, Limnology and Oceanography and Ecology are suggested as ideal candidate journals for submitting studies related to submerged macrophytes.
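The idea behind identifying core sources with Bradford's law can be sketched in Python as follows. The sketch assumes the common three-zone formulation, in which sources are ranked by article count and split into zones that each hold roughly one-third of all articles, with the first zone taken as the core. The source names and counts are hypothetical, and the exact zone boundaries used by 'bibliometrix' may differ.

```python
def bradford_zones(source_counts):
    """Assign each source to Bradford zone 1, 2, or 3 by cumulative article share."""
    # Rank sources from most to least productive (ties broken by name).
    ranked = sorted(source_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(source_counts.values())
    zones = {}
    cumulative = 0
    for name, n_articles in ranked:
        # The zone is decided by the cumulative count reached before this source.
        if cumulative < total / 3:
            zones[name] = 1
        elif cumulative < 2 * total / 3:
            zones[name] = 2
        else:
            zones[name] = 3
        cumulative += n_articles
    return zones


# Hypothetical article counts per source (NOT the study's data), 120 articles in total.
article_counts = [30, 15, 10, 10, 8, 7, 5, 5, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1]
source_counts = {f"Source {i:02d}": n for i, n in enumerate(article_counts, start=1)}

zones = bradford_zones(source_counts)
core_sources = [name for name, zone in zones.items() if zone == 1]
zone_sizes = [sum(1 for z in zones.values() if z == k) for k in (1, 2, 3)]
print(core_sources, zone_sizes)
```

In this toy example a few highly productive sources make up the core, while the remaining two-thirds of articles are spread over progressively larger groups of sources, which is the pattern Bradford's law describes.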

Word co-occurrence analysis of articles
The word cloud revealed a close relationship between articles on submerged macrophytes and lakes, with 'lake' also being the most frequent word in most years over the last three decades (supplementary material Figures S4 and S5 in appendix). Word dynamics showed contrasting monotonic patterns for the 70 chosen words, after excluding those that were considered meaningless (Figure 6; supplementary material Figure S6). The increasing frequency of words such as 'restoration', 'ecosystem', 'nutrient', 'shallow', and 'state' is a reflection of studies regarding the role of submerged macrophytes in maintaining a clear water state and mitigating nutrient increases in shallow lakes. The words 'richness', 'composition', and 'community' are indicative of increasing attention on the macrophyte community.
Lake restoration and recovery of submerged macrophytes are long-term processes that can last several decades (Søndergaard et al. 2007, 2008), the modeling of which requires long-term datasets. Thus, we observed increasing frequency in the word 'model' over the past three decades of field and mesocosm studies on submerged macrophytes. For example, the newly developed water ecosystem tool (WET) is a sophisticated state-of-the-art aquatic ecosystem model (Nielsen et al. 2017), which has been used extensively in scenario simulation of macrophyte recovery and lake management.
Our analysis showed that studies related to 'food' and 'production' have diminished over the last several years. Furthermore, the use of the word 'submersed' decreased, whereas the frequency of 'submerged' remained stably high. In addition, the frequencies of 'biomass' and 'fish' decreased. The unimodal pattern of 'zooplankton', though not monotonic, indicated reduced studies on the interaction between zooplankton and macrophytes, whereas 'phytoplankton' showed high frequency throughout the investigated period (supplementary material Figure S5). Interestingly, periphyton and epiphyton were not identified in the annual Top 100 frequent words from 1991 to 2018.
Three species of Vallisneria were frequently investigated, i.e. V. natans, V. spiralis, and V. americana. It should be noted that V. spiralis was wrongly used in China, with the correct name subsequently identified as V. natans after checking the traits of male flowers (Zhou et al. 2016). Hydrilla co-occurred with Vallisneria at a high frequency.
The Myriophyllum genus was linked with plant taxa such as Ceratophyllum, Elodea, and Potamogeton, with M. spicatum found to be the most prevalent species in the studied papers. For the fourth genus (Hydrilla), H. verticillata was the only target species found with closely related words, e.g. Vallisneria, tuber, and coverage.
Through K-fold cross validation (supplementary material Figure S7), we identified approximately 240 latent topics, with the first 100 topics covering 70% of the whole dataset in the study. The constructed Venn diagram (Figure 8) also showed a close correlation between Hydrilla and Vallisneria and between Potamogeton and Myriophyllum.

Thematic evolution map of keywords in articles
The thematic evolution of keywords during the last three decades showed a clear shift in scientific fields. Fish-related keywords disappeared after 2005 (Figure 9). Lake eutrophication and restoration are long-term processes, and fish removal has long been considered as a way in which to improve water clarity and macrophyte recovery (Jeppesen et al. 1990), which probably led to fewer studies correlating fish and macrophytes. Herbivory (grazing, macroinvertebrates) and Myriophyllum spicatum were prevalent keywords from 1999 to 2011.
In addition, phosphorus (water quality) and eutrophication (nutrients) were important keywords throughout the three decades. Although nitrogen is still debated as an important factor for lake eutrophication, our results showed that excessive phosphorus was the main investigated symptom of lake eutrophication. Future studies on both nitrogen and phosphorus could contribute to improvements in macrophyte recovery as ammonia nitrogen alone is toxic to aquatic plants (Cao et al. 2018).
Basic scientific studies on the ecology and physiology of submerged macrophytes were found to be perpetual areas of interest, whereas hot topics were much more dynamic. Our study showed that specific species or genera (e.g. Myriophyllum spicatum) were hot topics for a certain period, whereas other areas of interest, such as phosphorus, eutrophication, and clear-turbid state in lakes, have remained long-lasting topics over the last three decades. As a recommendation for future study, keywords could include macrophyte communities and restoration.