DNA barcoding and faunistic criteria for a revised taxonomy of Italian Ephemeroptera

Abstract Mayfly (Ephemeroptera) systematics has considerably changed over the years, but many questions have yet to be answered. The synergistic connection between traditional knowledge and new data sources, producing increasingly complex information, has become a compelling issue for modern taxonomy. Molecular tests and the use of reliable reference sequence libraries may constitute effective complements to the traditional method in guiding recognition of species and giving information about taxonomic incongruences which require further examination. In the present study, we sought to verify the current Italian mayfly nomenclatural system through DNA barcoding and relevant points to reliably manage the available amount of morpho-ecological and molecular data are discussed. We investigated COI (Cytochrome oxidase I) sequence variation in 163 individuals of Italian mayflies, 126 of which were previously assigned to 24 morphologically recognised species, and 37 could be attributed only to generic taxonomic entities (“sp.”, “cf.” or “gr.”). DNA barcoding statistical tests for species delimitation hypotheses based on genetic distances and inferred gene trees were integrated with GenBank searches and surveys of the historical literature to better understand the knowledge acquired on the status and diversity of the investigated taxa. Combined criteria to define three categories of reliability were then assessed. Concurrent data allowing unambiguous identification were attained for only eight species. High intraspecific genetic distances (> 3%) and a lack of reliable reference material or convincing taxonomic information evidenced 29 critical states, deserving further investigation. Solid species names, potential cryptic species and entities about which little is known are pointed out for a future upgrade/reorganisation of the taxonomy of Italian Ephemeroptera.


Introduction
The taxonomy of mayflies has considerably changed in the last two centuries. In Europe, different periods corresponding to improvements in knowledge can be traced through its advancement. In the 19th century, in-depth anatomical and morphological analyses were poorly considered, and a Linnean taxonomic scheme was mainly followed based on tegument colours and general aspect of the adults. The most representative scientist of that period was A. E. Eaton, with numerous studies published in the last quarter of the century (Eaton 1883(Eaton -1888. Later on (in the first half of the 20 th century), closer attention was given to the description of more detailed morphological traits (i.e. larval and genital characters : Grandi 1960;Landa 1969) but taxonomic nomenclature did not follow at the same pace, and proper species identification was often uncertain. In the second half of the 20th century, new progress was made in taxonomy from the morphological analysis based on type specimens, with closer attention to nomenclatural aspects. Müller-Liebenau introduced this process with her work on the genus Baetis (Müller-Liebenau 1969). Progressively, many other authors followed this approach in the last part of the century and summarised their results in comprehensive checklists (e.g. Thomas & Belfiore 2004;Bauernfeind & Soldán 2012).
In the third millennium, the massive utilisation of molecular approaches (e.g. nuclear and mitochondrial DNA sequences, microsatellite variation; see Monaghan & Sartori 2009 for a review) has produced a great deal of new data and further progress towards a better comprehension of mayfly systematics, phylogeography and biodiversity. However, in the long run it has also suggested that biological complexity (i.e. Rhithrogena and Baetis genera) was far greater than previously considered in the taxonomic schemes currently in use (Sroka 2012;Webb et al. 2012;Gattolliat et al. 2015;Vuataz et al. 2016). The accuracy of the present nomenclature was definitely challenged by several well-aimed studies indicating the presence of crypticism in many species (Williams et al. 2006;Bisconti et al. 2016;Leys et al. 2016;Rutschmann et al. 2017). Today, all gathered data strongly point towards the clear, large occurrence of a previously overlooked taxonomic complexity in many families and genera, and prompt new efforts to redefine part of the current systematics.
Taxonomic attribution is fundamental for appropriate divulgation of biological research, to compare different studies and improve socio-economic applications such as habitat and territory preservation (Adamowicz 2015). Nonetheless, a reliable and stable taxonomic framework for mayflies has yet to be defined. All faunistic data concerning the Ephemeroptera (i.e. those available today) provide only extremely old, indicative values, both from a nomenclatural point of view and in terms of the definition of specific entities. In fact, the Italian mayfly checklist, mainly based on data acquired in the last few decades of the last century (currently listing ca. 110 species: Buffagni et al. 2003), is outdated and approximate. As such, correct evaluation of mayfly biodiversity and future ecological applications (see for instance Gill et al. 2016;Morinière et al. 2017) is likely to be precluded.
New groundwork for an unambiguous and agreedupon species detection method should be therefore laid down, putting in order the amount of molecular data that is progressively increasing, alongside the traditional morphological taxonomy. Instrumental and synergistic key points would be: (1) to increase the utilisation of standard molecular data; (2) to determine taxonomic entities based on clearly defined and thoroughly described morphological characters (e.g. Belfiore 1994;Haybach 1999;Sartori et al. 2016), while also giving proper weight to the analysis of intraspecific variation; and (3) to name taxa based on reliable reference material (e.g. type specimens).
The DNA barcoding approach, utilised for species recognition in numerous animal groups (Paz & Crawford 2012;Daneliya & Väinölä 2014;Evangelista et al. 2014), can effectively complement the traditional taxonomy and preliminarily point out possible nomenclatural incongruences or the existence of cryptic differentiation within a species. This method requires a correct taxonomic assignment of the specimens to generate reliable barcode (based on the Cytochorme C Oxydase mitochondrial gene sequence, COI) reference data. Ideally, the assignment of organisms to a species is correct only when the analyses are conducted with the species holotype as reference (see for instance Webb et al. 2012). Nevertheless, this is not always possible due to the limited availability of holotypes. Alternative solutions could provide decreasing levels of reliability depending on the specimens used as reference (i.e. paratypes, or other specimens sampled in the type locality). In the case of non-type reference material, the species name can be confidently attributed based on the specimen origin (e.g. nearby localities), the concurrence of unambiguous morphological traits, the known geographical distribution and the absence of similar co-occurring species. This approach could improve the problem of using badly assigned material in generating reference libraries.
In a preliminary work (Cardoni et al. 2015), we concluded that DNA barcoding is a powerful approach for taxonomic research in Mediterranean mayflies, ameliorating the available biodiversity inventories. Nevertheless, the practical and unambiguous disclosure of this approach is still hampered by the availability of current faunal lists that are not in line with the emerging molecular complexity, especially with the potential crypticism that has been reported, or is suspected to occur, based on the detected divergence of intraspecific lineages. The general scarcity of Mediterranean Mayfly (with unambiguous species name) COI sequences in the world reference databases that could drive a correct identification is an additional hindering factor.
In this work, the DNA barcoding reference of Italian Ephemeroptera was extended in terms of regions, species and intraspecific samples; whenever possible, specimens with unambiguous identification at type localities or nearby sites were included in the field collection. The accuracy of the nomenclature currently in use was tested against molecular and statistical DNA analyses, and confronted with literature surveys.
The gathered data suggest a reappraisal of the level of nominal reliability. Some useful criteria to challenge and eventually rearrange the nomenclature system currently in use have been proposed, such as combining taxonomical, faunistical and biogeographical issues with DNA barcoding data. This will be an advantage for future projects based on DNA barcoding, and will lay down the experimental groundwork for a sound and durable taxonomy of mayflies.

Sampling and laboratory analyses
One hundred and sixty-three specimens of mayflies at the nymph stage were collected in 18 rivers across Peninsular Italy and the main islands in November 2014, and April and July 2016 (Supplementary Figure S1). The sampling was carried out with a clear strategy, targeted to the collection of selected species in their type locality or nearby areas, and to gradually expand the available reference data set of the Italian mayflies (Cardoni et al. 2015; Table I). A priori species identification was performed based on morphology and current taxonomic literature. DNA barcoding was then used to test the potential congruence of species identification and genetic variability. DNA extractions, Cytochrome Oxidase subunit I region amplification and general sequence analyses were performed as in Cardoni et al. (2015).

DNA barcoding statistical tests
The absence of stop codons in the obtained sequences was verified with MEGA 7.0.14 (Kumar et al. 2016) and multiple alignments were generated with Clustal X (Thompson et al. 1997). Efficacy of taxon discrimination in our data set was assessed by means of (1) sequence similarity, (2) genetic distance and (3) phylogenetic/clustering approaches.
Operational taxonomic units (OTUs) were generated using Mothur (Schloss et al. 2009) with different similarity thresholds to compare the obtained OTUs with the species identified based on morphology. The R-package SPIDER (Brown et al. 2012) was used to calculate Kimura two-parameter (K2P) genetic distances within and among congeneric species (dis.dna function), and to test the occurrence of the "barcoding gap" (maxInDist and nonConDist function)that is, the assumption that the maximum intraspecific genetic distance is smaller than the minimum between species. RAxML (Stamatakis 2006) was used to generate maximum likelihood dendrograms under the GTRCAT model, on 252 sequences of Italian mayflies and on all COI sequences of the conspecific European mayflies available in GenBank (874 sequences); node support was evaluated with 1000 bootstrap replicates. Tree topology was inspected to check that individuals belonging to the same species were clustered together.
Finally, the Poisson tree process (PTP: Zhang et al. 2013) was performed as an additional test to delimitate putative species through Bayesian inference (BI) with 100 000 MCMC (Markov Chain Monte Carlo) generations on 252 COI sequences of Italian Ephemeroptera.

Taxonomic reliability and nomenclatural congruence
All publicly available European conspecific COI sequences were downloaded from GenBank and integrated in our data set (entries and relative information provided as Supplementary Table SI). As a first step, intra-and interspecific K2P genetic distances were calculated for the whole data set (626 sequences). We adopted 0-3% as the indicative range to group individuals into the same species, based on the inter-/ intraspecific K2P divergence boundaries generally observed in Ephemeroptera (Ball et al. 2005;Gattolliat et al. 2012Gattolliat et al. , 2015Kjaerstad et al. 2012;Webb et al. 2012), and considering Hebert et al. (2003). Subsequently, taxonomic literature information was surveyed (e.g. type or paratype locality of the a priori identified species). Considering the origin of the investigated material (type material or type locality), the reported distribution and taxonomic history of the studied entities, we defined three categories of nomenclatural consistency for the groups defined with the K2P distance range: A. "reliable species names": individuals resulting from analyses conducted on (in order of decreasing reliability): (1) the type material; (2) material collected at nearby sites, with no concurring syntopic similar species; (3) members of species with a circumscribed distribution and with solid diagnostic characters. These criteria imply no explicit taxonomic problems (Table II); B. "difficult name attribution, potential occurrence of cryptic species": individuals to which the criteria stated in A cannot be attributed, indicating a possible non-correspondence with the given species name, despite the presumed cospecificity of the investigated samples with a consistent GenBank reference (K2P < 0.03); intraspecific genetic distance > 0.03, indicating the possible occurrence of cryptic species. These criteria suggest a highly likely necessity for taxonomic revision; C. "entities with unclear taxonomic status": the identification cannot be reliably achieved due to the large occurrence of taxonomic problems, with few or no barcoding sequences available for comparison. These criteria imply further investigation is required at morphological, ecological and molecular levels before the current species name is accepted or rejected.

Data set description
With the morphological keys currently in use, 24 morphospecies (13 genera, four families) were recognised.  Figure S1.
Taxon Location Accession no.
Electrogena hyblaea, R. sibillina, R. johannis and R. nuragica were sampled in their type locality. Current taxonomic knowledge did not allow the identification of 37 specimens at the species level. These specimens were attributed to generic taxonomic entities (labelled with "sp.", "cf." or "gr.": see Table I). COI sequences were obtained for all 163 investigated samples. The multiple alignment was unambiguous (no gaps occurred in the whole data set and no stop codons were observed in the translated protein sequences). The final matrix consisted of 639 characters (primer sequences excluded). The reference library of Italian Ephemeroptera (Cardoni et al. 2015) was extended with 83 new COI sequences (GenBank accession numbers reported in Table I).

Evaluation of the molecular data set
The 100% identity threshold generated 127 OTUs; of these, 102 corresponded to single individuals and 25 grouped individuals assigned to the same morphospecies (Supplementary Table SII). The 99-98% identity thresholds reduced the number of OTUs to 50 and 40, respectively. The 97% identity threshold generated 38 OTUs; of these,   Table SIII).
Maximum likelihood inference based on the 252 Italian mayfly COI sequences (corresponding to 64 morphospecies), generated by data from this and the previous work (Cardoni et al. 2015), grouped all sequences into clades corresponding to the morphologically identified species (Figure 1). Moreover, the RaxML phylogram integrated with all European conspecific sequences from GenBank, relative to the Italian Ephemeroptera here and previously investigated species (874 total sequences), confirmed the previous clades and showed a clustering well in line with the acknowledged mayfly taxonomy (Supplementary Figure S2). However, in both dendrograms some "minor clades" with low bootstrap support were produced, corresponding to the morphospecies characterised by high K2P distances (Table II). Finally, the PTP analysis conducted on the total Italian data set (252 sequences) predicted 81 putative species (Supplementary Figure S3).

Data gathering and grouping
Four hundred and sixty two GenBank sequences belonging to the same morphospecies investigated here (163 COI sequences) were used to integrate the K2P genetic distance analysis in an expanded data set (626 total sequences). We found intraspecific K2P distances < 3% in 13 species (95 individuals) and K2P values > 3% in 22 species (529 individuals) ( Table II). The remaining two species were represented only by one specimen. Graphic visualisations of the K2P distances are reported in Supplementary Figure S4.
Based on the criteria defined in the Material and methods section, the three categories of nomenclatural consistency defined eight, 19 and 10 species or undefined taxa, respectively, as reported in Table II.

Discussion
A correct and truly effective application of DNA barcoding requires the use of a referenced, official benchmark data set to provide the correct (or the most plausible) species identification through sequence comparison. The construction of a comprehensive sequence reference library is a very long process requiring progressive and complementary approaches (field and lab work, specimen first identification, DNA analyses and data processing, historical and geographical surveys). To overcome the difficulties that often arise due to the lack of traditional taxonomic information, some authors (Gill et al. 2016;Morinière et al. 2017) introduced the use of BINs (Barcode Index Numbers; Ratnasingham & Hebert 2013) for temporary species assignation. However, in many cases multiple BINs were assigned to the same identified species, suggesting, as in the present work (see below), the likely occurrence of potential cryptic species. Indeed, this confirms how far we are from a real representation of mayfly diversity. It stresses also once more the necessity of a more reliable nomenclature. In our work, we found robust congruence among estabilished taxonomy, morphology, literature information and molecular data in only eight species. Contrasting or incomplete taxonomic information, and high genetic distances suggesting the potential occurrence of cryptic species, were found in 22 cases. These, and seven further cases for which there is scarce information from both taxonomic and molecular literature, coupled with some cases of difficult K2P-level intepretation, highlight the general necessity of additional sampling efforts and investigations to achieve reliable taxonomic assessments.

DNA barcoding for a mayfly benchmark taxonomy
Sequence similarity and the genetic distance analysis were concordant in indicating 3% as a general boundary to separate different species or taxonomic units, assigning 95% of the identified morphospecies to distinct OTUs. The remaining 5% corresponds to four entities (Rhithrogena sp. cf. fiorii, R. gr. hercynia, Epeorus assimilis and Baetis fuscatus) included in the B category (difficult name attribution). The occurrence of barcoding gaps was observed across all congeneric morphospecies (Supplementary Table SIII). The RAxML reconstruction on the overall 874 sequences was generally congruent with the acknowledged mayfly taxonomy (Supplementary Figure S2). No COI sequences of a single species clustered into a different species' cluster, and all clades proved to be monospecific, mostly with high bootstrap support (> 70%). The high number of putative species exceeding those morphologically identified with the PTP analysis suggests the hypothesis of cryptic species largely occurring in the investigated data set. Remarkable evidence (BI > 0.987) was provided by E. assimilis, B. fuscatus, B. alpinus, Takobia mutica [Takobia Novikova and Kluge 1987= Alainites Waltz and McCafferty 1994(in Waltz et al. 1994): Kluge & Novikova 2014] and Habrophlebia eldae Jacob & Sartori, 1984. Conversely, the separation of the two Electrogena grandiae (Belfiore, 1981) samples (BI = 0.390) was probably due to uneven sampling (Zhang et al. 2013).

Taxonomic issues
The currently available estimate of the diversity of this order in Italy is not sufficiently accurate.
The obtained molecular results combined with the literature survey showed serious deficiencies in the current taxonomy of mayflies (i.e. underestimated diversity and species names not correctly assigned), for which we believe a critical revision is required. Therefore, we propose some criteria to evaluate the reliability of the current species names that can be used for reconsidering the taxonomic lists. This could be useful also for future practical applications in the light of the newly emerging bio-taxonomic tools (Deiner et al. 2016).
Based on the criteria stated for the A category (Table II), the names of eight species of the investigated data set could be reliably ascertained. Of these, five were sampled in their type locality (Rhithrogena adrianae Belfiore, 1983, R. nuragica, R. johannis, R. sibillina, E. hyblaea) and one was investigated in our previous study (E. grandiae, Cardoni et al. 2015). In addition, for all these species there are no relevant taxonomic problems. Based on the low K2P distance (< 0.028) calculated with the type material (Latium, Belfiore 1981;Cardoni et al. 2015), the three specimens we sampled in the Marche region and preliminarily attributed to E. grandiae could be unambiguously assigned to this species. Moreover, the low divergence between material from the type locality (Sicily, Belfiore 1990) and peninsular sequences (High Marecchia Valley) of R. johannis indicates that all of the samples belong to the same species. The samples of R. sibillina were collected very close to the species type locality (Metzler et al. 1985). The high genetic (> 98%) and morphological similarity to Rhithrogena reatina (type locality Latium; Sowa & Belfiore 1984) support the synonymy R. reatina Sowa & Belfiore 1984= R. sibillina Metzler et al. 1985 Choroterpes borbonica (type locality R. Mingardo, southern Campania: Belfiore 1988) has a relatively restricted distribution (southern Apennines and Sicily), clear diagnostic characters and low intraspecific genetic variability (Table II). Finally, we assigned Epeorus yougoslavicus (Šámal, 1935) (type locality: Macedonia, Šámal 1935) to category A, due to the low intraspecific genetic distance displayed by the several (14) sequences available for comparison, and considering the marked distinctiveness of the species.
Fifty-nine total COI sequences referred to these eight species are now made available in GenBank to serve as benchmarks for future species attributions. Of these, five species (41 sequences) were previously not represented in GenBank (E. hyblaea, C. borbonica, E. yougoslavicus, R. adrianae and R. johannis).
The largest part of the investigated data set (82 individuals, a priori attributed to 19 morphospecies) was assigned to category B (Table II; Supplementary Figure S4). All these entities displayed taxonomic problems based on the standards proposed for the A category, challenging the correctness of their current names. In accordance, they evidenced marked intraspecific genetic distances (K2P range: 0.050-0.253) with high numbers of conspecific sequences available in GenBank. Strong lineage divergence might suggest the existence of potential cryptic species (Ball et al. 2005;Jackson et al. 2014), as also supported by the PTP results (Supplementary Figure  S3). Exhaustive molecular studies and detailed comparison with reliable reference material are therefore necessary to definitely assess the new nomenclatural status of these entities, described as follows.
The Sicilian specimens identified as Takobia sp. were characterised by six tracheal gills as Takobia albinatii Sartori & Thomas, 1989 (type locality from Corsica: Sartori & Thomas 1989). Nevertheless, they were clearly separated from the Corsican T. albinatii specimens and Sardinian samples of T. sp. cf. albinatii (K2P ≥ 0.196 and ≥ 0.210, respectively). High divergence was also observed between the Corsican and Sardinian samples (K2P > 0.188), supporting the hypothesis of a species complex, possibly with an evolutionary history similar to that of Baetis gr. rhodani (Bisconti et al. 2016).
The high COI divergence (> 0.090) within B. fuscatus could also suggest cryptic species. The B. fuscatus neotype is known in Sweden (Brinck & Müller-Liebenau 1965), but no COI sequences from that area are available in GenBank. Therefore, the nomenclatural status of these entities remains unresolved.
The Italian specimens of Baetis sp. cf. lutheri are morphologically different from the types of B. lutheri Müller-Liebenau, 1967 (loc. typ. Germany) (C. Belfiore unpublished data), and their COI sequences are highly divergent from those available in Genbank (K2P = 0.173-0.253); therefore, they may represent another new species possibly deserving the species rank. Furthermore, divergence values of 0.074-0.094 separated the specimens assigned to B. sp. cf. melanonyx (High Marecchia Valley) from B. melanonyx (Pictet, 1843) collected in Switzerland, at about 50 km from the type locality of this species (Savoy near Samoens, France: Müller-Liebenau 1969) (Leys et al. 2016).
In B. alpinus we observed two lineages (K2P > 0.052). Based on the sequence comparisons, one lineage has a wide distribution (Sicily, Latium and other European regions: K2P < 0.023) and the other was only found in Central Italy (sequence nos. 74-75, Supplementary Table SI: K2P = 0.015). It is reasonable to hypothesise that the two lineages might represent different species. Further molecular studies on expanded data sets will certainly contribute to understanding the well-known taxonomic complexity of this group, as already pointed out in previous works in which different lineages were found in sympatric coexistence (Leys et al. 2016(Leys et al. , 2017. This is a typical example in which names should be assigned by analysis of types or by conventional, arbitrary choices. Three samples labelled Electrogena sp. and one Rhithrogena sp. (from Emilia Romagna and HMV: Table I), very similar to E. grandiae and R. adrianae, respectively, showed high genetic divergences from the type locality material of these two species. Furthermore, Rhithrogena sp. was also highly divergent from another species of the diaphana group (Rhithrogena sp. cf. adrianae from the HMV area: K2P > 0.120), and, in contrast, it showed high affinity with Rhithrogena sp. 31 from the Alps (99% similarity; Vuataz et al. 2011). Rhithrogena sp. cf. adrianae from HMV is also highly divergent from R. adrianae collected in the type locality (K2P = 0.145-0.151).
In Sicily and Corsica, the COI divergence recorded in H. eldae might indicate the possible occurrence of two different entities, both differing also from the continental H. eldae (holotype described from Bulgaria: Jacob & Sartori 1984) here investigated (see Supplementary Figure S4).
The Italian samples of Baetis buceratus Eaton, 1870, B. vernus and Rhithrogena semicolorata (Curtis, 1834) showed low intraspecific divergence but were highly divergent from their European conspecifics. Specimens of B. vernus with high COI divergence were also observed in Germany and assigned to different BINs (13.94% intraspecific divergence by Morinière et al. 2017). Since the type localities of these three species are located in the United Kingdom (Curtis 1834;Kimmins 1960), it would be reasonable to assume that the Italian samples belong to different species. However, no sequences from the United Kingdom are available to clear up affinity/divergence patterns with the continental species, except for R. semicolorata whose only sequence available from United Kingdom (see Vuataz et al. 2016) confirms the genetic divergence from Italian specimens. The same conclusion is conceivable for both the single Italian investigated sample assigned to Nigrobaetis digitatus (Bengtsson, 1912) and Torleya major (Klapálek, 1905) specimens whose type localities are in Sweden (Müller-Liebenau 1969) and Bosnia Herzegovina (Klapálek 1905), respectively, highly divergent from all other European members of these species. Furthermore, the highly similar (K2P < 0.005) Sicilian samples of Electrogena lateralis (Curtis, 1834), very divergent from French, British and Swiss sequences, could belong to a different biological entity. In Ecdyonurus venosus (Fabricius, 1775), the current unavailability of the type locality hinders the nominal recognition of the three lineages identified (roughly corresponding to Italy, Germany and France; Supplementary Figure S4), separated by medium-low K2P distances (0.030-0.050). Finally, the possible occurrence of crypticism in E. assimilis (intraspecific K2P distance > 0.089), already discussed in Cardoni et al. (2015), and in Baetis pavidus Grandi, 1951 (K2P > 0.109), is supported by the high genetic divergence observed from the European conspecific samples. For the 60 specimens (10 morphospecies) included in category C (Table II; Supplementary Figure S4), the lack of data from the type localities, the taxonomic uncertainties and the very limited sequence availability in GenBank prevent any inferences being made regarding their status. For instance, with regard to Acentrella sinaica (Bogoescu, 1931) there are too few reference sequences available (see Table I;  Supplementary Table SI) and all collections are so far from the type localities that no considerations about its status can be made. In addition, naming the Italian entities with seven gills identified within Takobia, divergent from the European ones (K2P > 0.162) but with high affinity with one COI sequence from France (seq. no. 11, Supplementary Table SI), is currently difficult due to the K2P distance bieng hardly interpretable (close to the 3% conventional limit), and due to the type unavailability (Müller-Liebenau 1969).
Somewhat unexpected is the presence of Rhithrogena gr. hercynia in Sardinia. It was collected in the R. nuragica type locality (Gennargentu: Belfiore 1987). The two taxa are clearly distinguished by morphological characteristics (i.e. Supplementary Figure S5), corroborated by a very high interspecific genetic diversity at the COI locus (K2P > 0.100). Interestingly, the R. gr. hercynia samples showed 99% sequence similarity with R. gratianopolitana Sowa, Degrange & Sartori, 1986 from Germany (Morinière et al. 2017). In any case, it is a species new to Sardinia. We also compared another species of R. hercynia group, R. sp. cf. fiorii from Latium, with R. gr. hercynia samples from Sardinia and R. gratianopolitana. The high similarity detected (K2P = 0.027-0.032 and K2P = 0.023-0.030, respectively) requires further data (i.e. nuclear markers and thorough morphological analyses) to clarify the taxonomical status of this second Rhithrogena in Sardinia. The same investigation is necessary for other cases such as Rhithrogena sp. from Emilia Romagna, Habroleptoides sp. and Caenis gr. macrura. Specifically, in the latter group, which also includes Caenis sp. cf. luctuosa (from Sicily), three different Italian entities were found to be highly different from each other (K2P > 0.139), in turn highly separated from European sequences belonging to the same group.
Nineteen COI sequences were deposited in GenBank to increase the current availablity of the entities included in category C. Of these, nine sequences represent three taxa for which no data were previously available on GenBank: Quatica ikonomovi (Puthz, 1971), Habroleptoides pauliana and R. sp. cf. fiorii. In future works, comprehensive surveys are required, especially focusing on the type localities; detailed morpho-ecological descriptions will obviously be essential complements to clearly understand their species status.
In conclusion, the taxonomy of Italian mayflies appears more complicated than what is currently recognised. This can affect the actual evaluation of territorial mayfly biodiversity with implications for the compilation of Red Lists and running water quality assessments. We concur with the opinion that the synergy of disciplines which were once studied separately, as taxonomy and molecular biology, will contribute to elucidate some of the outstanding issues. In this context, type material represents a fundamental reference to drive species recognition, and as such it should be adequately considered in subsequent DNA barcoding campaigns. This integrated approach (cf. Schlick-Steiner et al. 2010) may take a long time, but it can provide a stronger and more reliable nomenclature as compared to the traditional methods used so far. This is expected to bring clarity and precision to the classification of nominal species. Thus, by using correct species names, official data reference becomes more stable and univocal even for further studies and future applications based on molecular approaches with innovative techniques (i.e. next-generation sequencing platforms with environmental DNA for biomonitoring assessment). This study has shown that there are taxonomic uncertainties in numerous species, evidenced by the applied molecular and statistical approaches. Given a sampling strategy that was mainly concentrated on the selected species, it will be necessary in further studies to increase the amount of samples and species. It will also be necessary to take advantage of a synergy of multiple data for each taxon, including ecological, morphological, historical and additional genetic descriptors.

Disclosure statement
No potential conflict of interest was reported by the authors.

Supplemental data
Supplemental data for this article can be accessed here.