Establishing demographic reference condition procedures for Italian river fish bioindicator

Abstract The Water Framework Directive 2000/60/EC (WFD) requires Member States of the European Union (EU) to use a pool of bioindicators, that is, Biological Quality Elements (BQEs), including fish species and each Member State shall, in fact, certify the compliance with the ecological quality goals of the WFD. The New Italian Index of the Ecological State of Fish Communities (NISECI) is the WFD-compliant index for the analysis of the fish species BQE in Italian river water bodies, both wadable and non-wadable, that are not (using WFD definitions) heavily modified or artificial. The WFD requires each index to relate the conditions found in the field to reference conditions corresponding to the absence of anthropogenic alterations or to minor anthropogenic alterations. The present study defines the logical basis on the model of refinement for the reference conditions of the metrics and submetrics of NISECI relating to the biological condition of indigenous and non-indigenous species. The model was tested on an empirical basis using data from 284 ichthyological samplings taken from river basins throughout Italy. In order to clean random errors and stochastic noise due to possible different sampling methods, the data was only collected using protocols that guarantee adequately homogeneous and reliable data of abundance (overall and by size). The developed approach provided comparable and homogeneous results throughout Italian watersheds; this allowed a standardized and robust refinement of the threshold values of the demographic parameters of NISECI. The procedure used, consistent with the spatially based reference condition and expert judgement reference condition approaches envisaged by the WFD, improves the analytical performance of the NISECI by increasing its power of resolution and is applicable in every Italian zoogeographic and ecological context.


The NISECI in European and Italian legislation
The Water Framework Directive 2000/60/EC (WFD) is the operational instrument of the European Water Policy. This Directive aims to prevent the deterioration of the status of water bodies in the European Union (EU) and possibly to improve their condition to achieve rivers, lakes, transitional waters, marine-coastal waters, groundwater that are in 'good status'. Regarding perennial surface waters, this Directive requires each Member State to use a pool of bioindicators, called Biological Quality Elements (BQEs), including fish species. In compliance with this Directive, the Italian legislation (particularly Legislative Decree 152/2006, as amended and supplemented by Ministerial Decree 260/ 2010) had initially identified the Index of the Ecological State of Fish Communities (ISECI) by Zerunian et al. (2009) as the index for the analysis of the fish fauna BQE in river water bodies. The ISECI has undergone a national validation process that has led to the development of the New Italian Index of the Ecological State of the NISECI Fish Communities (Macchio et al. 2017). The latter, as envisaged by the implementation process of Directive 2000/60/EC (Common Implementation Strategy, CIS) has, therefore, passed the European intercalibration exercise. The legislative replacement of the ISECI by the NISECI was ratified by Commission Decision (EU) 2018/229 of 12 February 2018 establishing, pursuant to Directive 2000/60/EC of the European Parliament and of the Council, the values of the Member States' monitoring system classifications resulting from the intercalibration exercise. In compliance with this Decision, the NISECI is the index to be used for the analysis of the fish species EQR in river water bodies, both wadable and non-wadable, which are not heavily modified or artificial. The NISECI sampling protocol for the collection of ichthyological data in the wadable water bodies (Macchio and Rossi 2014) has been formalized by the Italian Institute for Environmental Protection and Research (ISPRA) and included in the Manual 'Biological Methods for inland surface waters'. Known as ISPRA 2040, this sampling protocol is essentially a removal method (Zippin 1958) which, in successive capture steps, allows the estimation of the catchability and numerical abundance of individual populations.

The NISECI structure and functionality
The main criteria used by the NISECI to assess the ecological status of a waterway are the naturalness of the fish community (understood as the completeness of the composition in indigenous species expected in relation to the zoogeographic and ecological framework) and the biological condition of the populations present (quantified positively for the expected indigenous species and negatively for the non-indigenous ones), in terms of abundance and population structure such as to guarantee the ability to reproduce and to have normal ecological-evolutionary dynamics. These criteria are linked to the requirements of the Water Framework Directive, 2000/60/EC, reiterated in the corresponding rules of transposition at national level (Legislative Decree 152/06 and subsequent amendments and additions), which envisage that, for the definition of the ecological status of river water bodies, the fish fauna BQE must be considered, assessing specific composition, abundance and age structure.
The NISECI is made up of three main indicators: X 1 Presence/absence of species; X 2 Biological condition of the populations of expected indigenous species; X 3 Presence of non-indigenous or hybrid species, structure of the relative populations and numeric ratio (in terms of number of species and number of individuals) compared to the expected indigenous species detected.
Ecological state values are then expressed as EQRs after the equation Since WFD ecological indexes have to describe ecological status in 5 ordinal classes, EQR NISECI values are then grouped in five EQR NISECI CLASSES on the basis of threshold values of EQR NISECI . Thresholds values were defined after the intercalibration exercise required by WFD in order to standardize the performances of the pool of river fish bioindicators available in the European Community.
After subdivision of Italian territory in Alpine area (northern of the Po-Tanaro rivers line) and Mediterranean area (the remaining Italian territory), threshold values are defined as in the following Table 1. 1.2.1. Indicator X 1 It compares the specific composition of the indigenous fish community observed with that expected. The species belonging to Salmonidae sensu Nelson (including Thymallus thymallus), Esocidae and Percidae are defined as species of major ecological and functional importance. The value of the indicator is given by the function: where: n i ¼ number of expected and sampled indigenous species of major ecological and functional importance; n a ¼ number of other expected and sampled indigenous species; m i ¼ number of expected indigenous species of greater ecological-functional importance; m a ¼ number of other expected indigenous species.

Indicator X 2
The biological condition of each of the indigenous species present is given by the integration of population structure and demographic consistency (numerical abundance) that can assume the scores 0, 0.5 and 1 corresponding to three levels ( Table 2). The three levels are assigned with reference to threshold values that have been experimentally identified on Italian nationwide datasets.
The final value of the indicator is given by the function: Where: n ¼ number of expected indigenous species sampled; i ¼ single indigenous species sampled; X 2,a ¼ submetric related to the population structure in age groups; X 2,b ¼ submetric related to numerical abundance.
1.2.2.1. Submetric X 2,a : population structure. The assessment of the age of the individuals sampled is not carried out directly (analysis of scales or otoliths), but using an indirect method, based on the relationship between age and length, considering the second as a proxy for the first. All species of fish have been assigned to one of four size groups based on the typical size of the species; five different size classes (CLz, with z ¼ f1, 2, 3, 4, 5g) were identified within each size group, to which the individuals sampled are assigned on the basis of body length (Table 3).
Once the individuals sampled have been assigned to the size classes, the population structure in age groups is assessed by integrating two criteria (Table 4), each of which expresses a response based on three scores (1, 2, 3) expressed on the basis of pre-established objective rules: Criterion A -Distribution by size classes: the score is assigned in function of the number of length classes populated in the sample; Criterion B -Adults/juveniles numerical ratio: the score is assigned in function of the ratio between the number of adults AD (CL4 þ CL5) and the number of juveniles JUV (CL2 þ CL3). The criterion identifies four threshold values and assigns the optimal values in the central range of distribution. Values and corresponding levels of judgement. Table 3. Size groups of fish species and associated size classes.
Very small fish Small fish Medium fish Large fish CL1 < 3.5 cm CL1 < 4.5 cm CL1 < 8.0 cm CL1 < 25.0 cm 3.5 cm CL2 < 4.5 cm 4.5 cm CL2 < 9. 1.2.2.2. Submetric X 2,b : demographic consistency. The assessment of demographic consistency is carried out by assigning the values observed to one of three categories of abundance. These categories have been defined in Macchio et al. (2017) on the basis of threshold values for the distribution of frequency of density values obtained from regional fish maps. The threshold values used to separate the three categories of abundance were (Table 5) the 1st tertile of the distribution of frequency (cumulative percentage of the sample ¼ 33%) and the 2nd tertile (cumulative percentage of the sample ¼ 66%).

Indicator X 3
Non-indigenous species are divided into three groups according to their harmfulness, defined on the basis of the decreasing level of impact on the indigenous fish species listed in three lists (List 1: Species with high harmfulness; List 2: Species with medium harmfulness; List 3: Species with moderate harmfulness). The lists of species belonging to the three different groups are defined on the basis of the assessments carried out by Zerunian et al. (2009). The value of X 3 is determined as follows: State 1) absence of non-indigenous species X 3 ¼ 1 State 2) Presence of species belonging to list 1, with at least one well-structured population X 3 ¼ 0 State 3) Total number of non-indigenous fish ! total number of indigenous fish (belonging to expected species) X 3 ¼ 0 State 4) In all other cases, the following formula is calculated: where: Abbreviations: AD ¼ adults; JUV ¼ juveniles. i ¼ Proportion of non-indigenous species with a well-structured population compared to the total number of non-indigenous species present, multiplied by 0; ii ¼ Proportion of non-indigenous species with a moderately-structured population compared to the total number of non-indigenous species present, multiplied by 0.5; iii ¼ Proportion of non-indigenous species with a destructured population compared to the total number of non-indigenous species present, multiplied by 1.

Refining the NISECI calculation parameters
The original publication of the NISECI defines calculation parameters and application modalities identified on a historical-bibliographic basis and experimentally validated at national and district level. As reported in the Ministerial Decree 260/2010, in the original publications (Zerunian et al. 2009;Macchio et al. 2017) and in the dedicated literature (Rossi et al. 2015(Rossi et al. , 2016De Bonis et al. 2017), the NISECI is an index designed to be applicable in all Italian hydrographic basins through a process of calibration of the calculation parameters depending on the different local, ecological and zoogeographical characteristics. This procedure, made possible by the explicit logical and mathematical structure of the index and by the partially 'open source' nature of its calculation parameters, allows the refinement of the functionality of the NISECI without compromising its legislative validity. Thanks to the fact that the formulation of the NISECI is constructed using a deductive approach (Top-Down), while the values of the coefficients of the calculation parameters are processed according to an inductive approach (Bottom-Up), it is possible to model variables of different scales (at Community level and at Population level, respectively), dramatically reducing the amount of experimental data needed to obtain a model using the inductive approach alone. For each detailed zoogeographicalecological zone, therefore, the theoretical fish communities, described in the original publication, must be refined through ecological observations on the actual or potential habitats and historical-bibliographic analysis of the knowledge of the fish species. As stated in Ministerial Decree 260/2010: 'The Regions that, following the aforementioned surveys, have refined the expected fish communities, shall forward the results of the surveys carried out and the corresponding information, together with the supporting scientific documentation, to the Ministry of the Environment and Protection of the Territory and Sea'. Similarly, the reference conditions for the biological condition of individual species can be refined in relation to the geographical and ecological context of the water body under consideration (spatially based reference condition). A key requirement of Directive 2000/60/EC is that the reference conditions must correspond to the absence of anthropogenic changes or minor anthropogenic changes. These conditions can be defined through existing situations, when available; through a modelling approach, when allowed by the quantity and quality of the data; through expert judgement (pending the consistency of the data) only when the previous options are not possible. Before application of the NISECI for monitoring purposes, the following must be defined for each individual fish species in each individual water body: (1) whether the species is autochthonous, that is, indigenous, to its zoogeographic context; (2) whether it should be expected in the local ecological context; (3) the threshold values for associating quality judgments with the observed values of the metrics and submetrics relating to the biological condition of indigenous and non-indigenous species. Starting from experimental case studies, the present study defines and generalizes the modalities for refining the parameters used to calculate the NISECI, in the various Italian geographical contexts, in relation to: (1) population structure of the indigenous species (Criterion A and Criterion B of submetric X 2,a ); (2) demographic consistency of the indigenous species (submetric X 2,b ); (3) population structure of the non-indigenous species (State 2 and State 4 of indicator X 3 ). Integrating the analysis of experimental data and the scientific synaecological and autoecological knowledge of fish species, said refinement methods allow the definition of the spatially based and expert judgement reference conditions for every single Italian ecological and zoogeographical context. The aim of this study is to provide those responsible for refining the NISECI with a unique and standardized methodology capable of simplifying the refining process and delivering comparable and homogeneous results throughout the country.

Area of study
The methods of refinement (i.e., logical model of refinement) of the calculation parameters (i.e., threshold values of submetrics X 2,a and X 2,b ) have been elaborated on a rational basis, taking into account the need not to create logical distortions when the biological characteristics of the different species vary (e.g., annual or multi-year biological cycle, early or late sexual maturation, type of body size). The calculation parameters were then calculated and tested on an empirical basis using data from 284 ichthyological samplings from the hydrographic basins of Lombardy, Veneto, Emilia-Romagna, Tuscany, Marche, Basilicata and Calabria either embodied in the databases of the Department of Biological, Geological and Environmental Sciences of University of Bologna, of the Regional Agency for the Protection of the Environment (ARPA) of Basilicata, of the academic Spin-Off Hydrosynergy, of one of the in Authors (De  or available in literature (Confortini et al. 2008).

Sampling procedures
Since many the ISPRA 2040 protocol, not all the disposable data were obtained from ichthyological samples caught with the sampling procedure formalized by ISPRA. In order to contain random errors and stochastic noise due to the different sampling methods, only data collected using protocols that guarantee data of abundance, overall and by size, adequately homogeneous and reliable for the refinement of the biological condition of the populations, have been considered. In addition to the data collected with the ISPRA 2040 protocol, other data found in the bibliography and collected using different protocols were also used, as follows: sampled stretches of at least 50 m in length for average wetland widths of less than 5 m and of at least 100 m for average widths of more than 5 m; censuses by electrofishing with removal methods (Moran 1951;Zippin 1956Zippin , 1958; registration of the single body size with a precision of 1 mm. These conditions were considered sufficient to ensure performances similar to the ISPRA 2040 protocol (see quantitative section) exclusively for the analysis of demographic consistency and population structure in classes of length. Moreover, given the incomplete homogeneity of the representation of data from different sources, not all censuses were used to refine each of the parameters considered. With regard to the refinement of the reference community and the abundance thresholds, the works carried out with protocols that included all the information B of Table 6 were considered sufficient while, for aspects linked to the population structure in age groups, a stricter criterion was applied (only data from protocols contemplating the information C of Table 6). The information on the population structure in age groups by scalimetric survey or analysis of the growth rate (e.g., Von Bertalanffy Growth Model), allowed the estimation of the relationship between body length and age groups and the analysis of the relationship between the abundance of juvenile and adult age groups. The workflow for the dataset preparation is supplied as flowchart diagrams in Supplementary material ( Supplementary Figures 3-4).

Experimental design
The two main variables that can influence the threshold values of the Total Lengths between different age groups are the geographical origin of the populations and the local ecological context. The populations of each species have therefore been subdivided according to the geographical area of origin and to the environmental and geomorphological characteristics of the wet river channel of the sampling stations (channel-reach scale morphology) as defined by Montgomery and Buffington (1997) and described in the System for Stream Hydromorphological Assessment, Analysis and Monitoring (IDRAIM) methodology (Rinaldi et al. 2016). To eliminate self-referentiality in the test phase, the data available were divided into a training dataset (74% of the data available), with which the values of the calculation parameters were obtained, and a test dataset (26% of the data Table 6. (A) Information for the refinement of the NISECI parameters obtained from data collected using the ISPRA 2040 protocol, (B and C) minimum information for the refinement of individual NISECI metrics required for the use of bibliographic data collected using verse protocols from ISPRA 2040. available), with which the validity of the model was tested. The latter dataset is fully compliant with the ISPRA 2040 sampling protocol.

General criteria for the estimation of demographic consistencies
The estimation of the demographic consistencies was carried out by a classical Maximum Likelihood models on the basis of data collected by the removal method (Zippin 1958) at s ¼ f2, 3, 4g successive trapping steps. These models were chosen, despite the existence of more modern statistical alternatives, because they can be applied to large datasets using simple spreadsheets and without the need for time-consuming calculations. The general formula for estimating numerical abundance (N) according to the Maximum Likelihood estimates of Moran (1951) and Zippin (1956Zippin ( , 1958 is: (1) where: s ¼ number of trapping steps; For two and three-sample methods, mathematical functions with explicit solutions are available in literature.: Two-sample method (Seber and Le Cren 1967) Three-sample method (Junge and Libosvarsky 1965) Where: For the four-sample method, as no explicit mathematical functions are available, the general model of the Maximum Likelihood estimates of Moran (1951) and Zippin (1956Zippin ( , 1958 was used, providing implicit solutions (iterative algorithms or graphic procedures) in the joint determination of N and p starting from the equations: and in the specific case of s ¼ 4 provides a graphic procedure for the estimation of p (and, therefore, of N) through the correlation with the variable R; this procedure involves the calculation of R, the determination on the graph shown in Figure 1 of the corresponding value of p and, therefore, the estimation of N.
To improve the calculation of p and obtain numerical solutions on large datasets using simple spreadsheets, the graph for s ¼ 4 has been reconstructed by calculating the values of R corresponding to 21 different catch values perfectly constant in the four trapping steps p 21 j¼1 ¼ f0.1, p 19 i¼1 =20, 0.99g and the relationship between the variables p and R obtained by polynomial interpolation. To calculate R, the number of specimens captured in each ith step for each jth perfectly constant catch value was obtained as:  Moran (1951) and Zippin (1956Zippin ( , 1958 starting from the calculated values of the variable R for the four-sample method (s ¼ 4); from Zippin (1956).
For s > 2: Correlation between the catch parameter (p) and the variable R in the Maximum Likelihood estimates of Moran (1951) and Zippin (1956Zippin ( , 1958 for 4-step catch sampling.
Assuming a normal error distribution, the 95% confidence intervals can be calculated as: It must, however, be considered that, when p and N are too small, the distribution of N is asymmetrical and V[N] too narrow, and so the confidence intervals calculated as above are closer to 90% than to 95% (Zippin 1956;Robson and Regier 1968).
All the models used require trapping to be constant or for all the trapping steps to be comparable. Slight deviations from this requirement, due to stochastic factors, are tolerable while a net change in electrofishing efficiency between the various trapping steps leads to unrepresentative estimates. This situation can be found, for example, with small benthonic species; at the first step with electrofishing, passive swimming induced by the electric current can have such a direction that many of the specimens remain in the den; these specimens can then come out of the shelter passively and slowly as a result of the water current or by fleeing voluntarily and quickly when their motor skills are fully restored. At the second step, these animals are, therefore, 'bewildered and in the open field' and it is common to observe a greater number of them than at the first step. To assess the applicability and reliability of these models, it is therefore necessary to consider some mathematical, logical and statistical aspects. These models are not mathematically applicable when (Seber and Whale 1970): R > s/2 meaning: For R ¼ s/2, the models are logically inapplicable, because they render a zero-catchability value. For s > 2, it is possible to test the validity of the assumption of constant catchability through a goodness of fit test. White et al. (1982) describe the procedure by means of a simple chi-square test considering the expected values, resulting from the estimation of abundance and catchability. As pointed out in Lockwood and Schneider (2000), the chi-square test must be performed with s-2 degrees of freedom because N and p are estimated quantities; and for this reason, the test is not applicable when s ¼ 2. If the chisquare test is significant, the difference between expected and observed data is of such magnitude that the model cannot be applied. In cases where a model on s trapping steps is not applicable, an attempt may be made to achieve an estimate by using only the partial sample obtained with fewer trapping steps. In fact, providing at least two trapping steps, it is mathematically possible to get the estimate excluding data from the last trapping steps. It is also possible, to obtain an estimate excluding data from the first trapping steps. At the end, the number of individuals actually caught in the steps excluded from the application of the model must be added to the expected number of individuals obtained by applying the model with the partial sample. Conversely it is not correct to exclude data obtained from intermediate trapping steps. The applicability of the model must obviously be tested also for each attempt with partial samples. For the purpose of refining the threshold values using the procedures indicated, all cases in which the conditions of applicability are not met or the goodness of fit test is significant must therefore be excluded. It is also necessary to exclude cases that lead to excessively wide confidence intervals; as a practical rule, cases in which the coefficient of variation of the estimate (cv) is greater than 30% should be excluded (Zippin 1956 Lastly, given the reduced efficiency of the calculation of the variances when p and N are small, as reported in Seber (1982), it is necessary to exclude cases where: N Â p 3 < 16 Â q 2 Â ð1 þ qÞ 2.4.2. Submetric X 2,adefinition of Size Classes (CL) and refinement of threshold values between Size Classes (CL) Submetric X 2,a envisages the subdivision of each sample in five different size classes (CL) according to the body length of the specimens caught. These size classes must be functional to the representation of the population structure in age groups for which they are proxies. This subdivision (data binning) was deemed necessary to make the submetric as resistant as possible to seasonal variations of individual dimensions.
The breakdown of the population structure into a finite number of size classes entails a likelihood (variable depending on the species and the geographical and ecological context) that more than one age group can fall, at least partially, into a single length class and vice versa.
The association between size classes and age groups must also be made for taking into account the definition of Criterion B of Submetric X 2,a (ration of adultsjuveniles) . According to this criterion, the larger size classes (CL4 and CL5) are to be associated with adult forms (AD), while it is assumed that the immediate smaller size classes (CL2 and CL3) are mainly populated by juvenile forms (JUV). According to this scheme, all or most of the individuals included in CL4 should have reached their first sexual maturity. In the calculation of the juvenile/adult ratio using the total length classes, the smaller size class (CL1) corresponding to the fry (AV) is excluded; this exclusion is motivated by the extreme inter and intra-annual numerical variability (which is also usually affected by high inaccuracy in the sample estimate) of the cohort 0 þ (individuals under one year of age), a component that is assumed to dominate CL1 (Thompson and Rahel 1996;Dolan and Miranda 2003;Miranda and Dolan 2003;Korman et al. 2009;Millar et al. 2016;Hedger et al. 2018). The age groups (CE) above 0 þ have been divided into juveniles and adults on the basis of the years required to reach sexual maturity, information gathered from the main bibliographic texts (Holcik 1986;Gandolfi et al. 1991;Hoestlandt 1991;B an arescu 1999;B an arescu and Paepke 2001;B an arescu and Bogutskaya 2003;Miller 2003;Zerunian 2004) complemented by evidence of reproductive maturity recorded in the field (nuptial tubercles, gamete emission, etc.). If first sexual maturity (CE sex ) is reached at different ages in the two sexes, the age at which both sexes are able to reproduce with each other for the first time has been taken as a precautionary reference. For species in which both sexes are mature from the fourth year (CE 3 þ ) onwards, the CL were assigned according to the following criteria (scheme for merging age groupslength classes): The workflow for the definition of Size Classes (CL) is supplied as flowchart diagram in Supplementary material (Supplementary Figure 5).
The refinement of the length threshold values between the size classes (CL) was carried out on the basis of the distributions of frequency of the total body lengths (LT) for narrower class intervals than those used in the original draft of the NISECI, i.e., in intervals of size (InT) of 0.5 cm for very small and small species and 1 cm for medium and large species; it follows that each CL of NISECI is divided into several InT. The smallest InT (InT z , min ) and biggest (InT z , max ) values were identified for each CLz of each population; then the median values of the distributions of InT z , min (Me_InT z , min ) and InT z , max (Me_InT z , max ) for all the available populations were identified. The species-specific threshold value between two adjacent size classes CLz and CLz þ 1 was then calculated as: (Me_InT z , max þ Me_InT zþ1 , min )/2. Only populations for which the length/age ratio was available were considered for this purpose (by scalimetric survey and/or analysis of the growth rate), with a sample of specimens corresponding to at least one intermediate population size in indicator X 2,b and characterized by a continuous distribution of the sizes (Figure 3).
In species in which sexual maturity is reached at or before the 3rd year of age (CE sex 2 þ ), age differences of less than one year are considered significant for the purpose of identifying threshold values for separating length classes.
The following criteria are adopted: A. If first sexual maturity is reached at the 3rd year of age (CE sex ¼ 2 þ ), juvenile individuals (JUV) consist exclusively of CE1 þ , which in this case would combine CL2 and CL3. Individuals of CE1 þ were divided between the two classes using the median value of LT distribution of CE1 þ as the discriminating value. B. If first sexual maturity is reached at the 2nd year of age (CE sex ¼ 1 þ ) all the individuals belonging to age groups above CE0 þ were divided between CL2, CL3, CL4 and CL5, in this case, the first three quartiles of the Lt distribution of individuals over the age of CE0 þ were used as the discriminant threshold LT. C. If sexual maturity is reached at the 1st year of age (CE sex ¼ 0 þ ) all the individuals in the population were divided between CL1, CL2, CL3, CL4 and CL5 using the first four quintiles of the Lt distribution as the threshold LT. The synoptic framework of the breakdown of the age groups (CE) into size classes (CL) is shown in Table 7.
The workflow for the refinement of threshold values between Size Classes is supplied as flowchart diagrams in Supplementary material (Supplementary Figures 6-7).
2.4.3. Submetric X 2,a -Criterion B refinement of the threshold values of the AD/ JUV ratio For each species, the threshold values of the ratio between the number of adult and juvenile individuals were defined on the basis of the sextiles of the frequency distribution of this parameter in the training dataset, assigning the optimal values to the central sextiles. This choice was made so as to comply with Criterion B of Submetric X 2 , which involves four threshold values, and in order to evenly distribute the cases available in six symmetrical scoring ranks with respect to the optimal value. The extreme cases corresponding to the complete absence of one of the two categories were considered as outliers and therefore excluded from the analysis (AD ¼ 0 or JUV ¼ 0). The scores were then assigned as in Table 8.
Cumulative catches have been used for the definition of sextiles, regardless of the number of subsequent trapping steps.
In conditions of constant catchability between successive steps and assuming negligible differences in the likelihood of capture p between the age groups above 0 þ , that is: p CL2 % p CL3 % p CL4 % p CL5 using (1) the general formula for estimating numerical abundance (N) and indicating as C sCLz , the total catches in the s electrofishing steps carried out for the zth size class,we obtain the following formulation for the AD/JUV ratio: The workflow for the refinement of the threshold values of the AD/JUV ratio is supplied as flowchart diagram in Supplementary material (Supplementary Figure 8) Table 7. Synoptic framework of the scheme of merging age group-length classes. 2.4.4. Submetric X 2,brefinement of the demographic consistency threshold values The density threshold values were calculated according to the procedure identified by Macchio et al. (2017). In order to standardize the data obtained from sampling carried out under different conditions of catchability and possibly with a different number of trapping steps, estimated density values (Table 9), from demographic consistencies calculated as reported in subsection 2.1.1., were used instead of the densities observed.
The workflow for the refinement of the demographic consistency threshold values is supplied as flowchart diagram in Supplementary material (Supplementary Figure 9).

Results and discussion
Quantitative data available, mostly collected in wadable watercourses, have made it possible to refine the threshold values of populations distributed among the ecological zones of salmonids and rheophilic cyprinids. The test of the efficiency of the proposed refinement methodology was then carried out on 74 population data not used for the processing of the model. For this purpose, only data collected using the ISPRA 2040 protocol and referring to the ecological zones of salmonids and rheophilic cyprinids were considered. To assess the efficiency of the refinement method proposed here, the frequency distributions before and after refinement of the variables were compared: values of submetric X 2,a ,values of the NISECI, values of the EQR NISECI , classes of the ecological state of the EQR NISECI (Figure 4).
The efficiency of the methodology for refining submetric X 2,b with respect to standard threshold values has not been tested here; the original publication of the NISECI does not in fact provide standard threshold values to be refined but assumes that the refining for this submetric must be compulsory.
The following have been considered as positive evaluation criteria: 1. the obtaining of a more symmetrical and less flattened distribution for indicator X 2 , with a median value closer to 0.5 and with greater dispersion of values; this condition was considered to be consistent with the spatially based reference condition approach in which the best possible conditions found in the geographical and ecological context of the water body in question are adopted as reference condition; 2. a statistically significant difference between the before and after refinement condition in the distribution of values of all the metrics considered; Table 8. How to refine threshold values and assign scores for criterion B of submetric X 2,a .

Evaluation criterion 1
Table 10 shows the descriptive statistics of the frequency distribution of submetric X 2,a in conditions before and after refinement. Submetric X 2,a after refinement shows a distribution of values with a greater dispersion (Variance ¼ 0.06; Interquartile Range ¼ 0.29), more symmetrical (Adjusted Skewness ¼ 0.61, median value 0.4) and more bell-shaped (Excess Kurtosis ¼ 0.03). The form acquired by the values of submetric X 2,a after refinement indicates a greater  discriminatory power between similar but not identical situations, that is, a better descriptive capacity of the different possible cases. After refinement, submetric X 2,a is able to describe both optimal and critical situations, as opposed to before refinement, when almost exclusively poor and intermediate situations are described.

Evaluation criterion 2
The variables of interest for each individual sample have a before and after condition with non normal data distribution (Table 11).
To assess the statistical significance of the effect of refinement on the results of the variables of interest, 2-paired-samples non-parametric tests were used. In particular, given the non-symmetrical distribution of the before and after differences (data not shown) for all the variables, both continuous and ordinal, the paired-samples sign test was applied (Sprent 1993). For the 'classes of the ecological quality of the EQR NISECI ' variable, being ordinal, the marginal homogeneity test of Stuart-Maxwell (Stuart 1955;Maxwell 1970;Everitt 1977) and Bhapkar (1966), were also used as confirmation. The first two tests were carried out using the PASW Statistic 17 software, the last one using the Bhapkar test.xlsx spreadsheet (Stikker 2017).
The results of the tests are summarized in Table 12. For all the variables, the refinement of the threshold values was found to have a highly significant effect. In other words, as a result of the refinement process, the variables considered acquired new and more efficient capacities for assessing fish populations.

Evaluation criterion 3
The results of the assignment of quality scores (classes) of the EQR NISECI before and after refinement are shown in comparative form as a table with two entries (Table 13). The refinement of the threshold values has a numerically significant effect on the assignment of quality scores. As a result of the refinement, the changes in the score led to an improvement of a quality class in 35% of the stations and to the deterioration of a class in 3% of the stations.

Conclusions
The present study has allowed the identification of methods for the refinement of the threshold values of the demographic parameters of indigenous and non-indigenous fish species necessary for the calculation of the NISECI and the assignment of quality scores (classes) of the EQR NISECI . The procedure used, consistent with the spatially based reference condition and expert judgement reference condition approaches of the WFD, is applicable in every zoogeographic and ecological Italian context and on the basis of the three assessment criteria used, improves the analytical performance of the NISECI by increasing its power of resolution. In other words, the NISECI acquires a greater ability to describe and classify fish populations by identifying differences in greater detail. The procedure used is designed to refine the descriptive capacity of the ecological quality of the fish species with the NISECI with further data, available in the bibliography (e.g., fish maps or atlas). The bibliographic examination of the data available at Italian nationwide level highlighted products for which, for example, demographic consistency values collected in a sufficiently robust way were not coupled with structural analysis in appropriate age groups and vice versa. The criteria for judging whether and which bibliographic data can be used to refine individual demographic parameters have therefore been defined here. This procedure maximizes the possibility of having a sufficient critical mass of data to correctly represent the demographic variability of individual territorial contexts. This study also defines the minimum level of detail that must be guaranteed to allow the refinement of the demographic threshold values for classification purposes. It should be noted that the model presented here is designed to be used and implemented in different Italian contexts in operational mode, without prejudice to the need for ichthyological and ecological skills. The summary of the procedures identified, broken down into subroutines, is shown in the form of a flowchart as Supplementary material ( Supplementary  Figures 1-9). The approach used here represents a first method for refining the NISECI; in the near future, further models will be tested in terms of cost benefits (e.g., amount of data needed to ensure sufficient reliability) and technical applicability. For the latter, it is necessary to envisage that complex models can be used as an alternative to those proposed here, as long as they are developed by staff with statistical, ichthyological and ecological skills that guarantee the validation of the results. Stefano Macchio: Biologist with specialization in applied ecology and environmental monitoring through vertebrate bioindicators. Engaged for more than 20 years in field activities and data analysis in ISPRA.
Gian Luigi Rossi: Senior Researcher at Research Center ENEA Saluggia, in the Laboratory of Biodiversity and Ecosystem Services of the Department for Sustainability. His work fields are river ecology, environmental quality assessment through the use of bioindication and synthetic biotic indices, planning and management of protected areas, management of wetland environments.
Stefania Balzamo: Chemist with more than 20 years of experience in the field of environmental chemistry and sampling; she has worked for the Environmental Ministry to implement the WFD in Italy; she attends to the WG of ECOSTAT and now she is the head of Laboratory Department of ISPRA.

Data availability statement
The Authors do not have the permission to share the main part of their data because was collected for third parties. The freely shareable data that support the findings of this study are available from the corresponding author, [GR], upon reasonable request.