DNA barcode approaches to reveal interspecies genetic variation of Indian ungulates

Abstract In the past two decades, identification of species from noninvasive sampling has turned out to be an important tool for wildlife conservation. In this study a total 93 specimens representing 22 species of ungulates were analyzed from partial sequences of mtDNA COI and Cytb genes. All the species showed unique clades, and sequences divergence within species was between 0.01–3.9% in COI and 0.01–13.7 in Cytb, whereas divergence between species ranged from 2.2 to 29.5% in COI and 2.3 to 28.8% in Cytb. Highest intraspecific divergence was observed within the Ovis aries in COI and Porcula salvania in Cytb. Bayesian (BA) phylogeny analysis of both genes combined distinguishes all the studied species as monophyletic criteria. The Indian rhinoceros (Rhinoceros unicornis) exhibited closer relation to horse (Equus caballus). No barcode gap was observed between species in COI. This study demonstrates that even short fragments of COI and Cytb generated from fecal pellets can efficiently identify the Indian ungulates, thus demonstrating its high potential for use in wildlife conservation activities.


Introduction
Ungulates are amongst the most vulnerable group of mammals (Ceballos et al. 2005). These are also known as the hoofed animals' distinction due to the shape of their toe. Cattle, sheep, goats, deer and pigs belong to family Artiodactyla, horses and rhinos are part of another family Perissodactyla. There are 39 species of ungulates present in India (Sankar and Goyal 2004). Among these, many species are in the extremely endangered category (Schipper et al. 2008), which are declining due to environmental changes, impacts of anthropogenic pressure on wildlife habitats, and poaching (Maisels et al. 2013). Some of the species are completely protected under the schedule of the Wildlife Protection Act of 1972. There are many species which are highly endangered with only single populations in the entire distribution range, for example, the Kashmir stag or hangul (Cervus elaphus hanglu), the Manipur brow-antlered deer or sangai (Cervus eldi eldi), the Central Indian race of the swamp deer or barasingha (Cervus duvauceli branderi) and the Indian wild ass or khur (Equus hemionus khur) (Daniel 1991). Cervus elaphus wallichi has disappeared from Sikkim (Sankar and Goyal 2004). The decline of these populations of ungulates to adapt to environmental changes decreases their chances of long-term survival. Terrestrial mammals are threatened to the risk of extinction due to hunting pressure, habitat fragmentation, and habitat modification (Karanth et al. 2010), and around 50% of them are showing a declining trend in the population size from their native range (Channell and Lomolino 2000;Ceballos et al. 2005). Hence, these populations need a higher priority of conservation. As for other wildlife species of India, they are facing severe threats due to alarming increase in the human population (Karanth et al. 2009). Conservation success largely depends upon identifying vulnerable species and understanding the environmental factors that support their persistence in human-dominated landscapes (Kumar et al. 2017). More recently, genetic comparisons with the non-invasive sampling have led to greater understanding of lineages of related species, especially at higher taxonomic levels, where derived morphological characteristics can be difficult to determine owing to ancient divergences, thus leading to often radically different phylogenies and species groupings (Waits and Paetkau 2005).
The identification of species with non-invasive sampling without disturbing the animals or putting them at health risk still stands as one of the most basic but important issues in a forest. In a recent study, molecular taxonomy has helped in resolving the phylogeny of cervids resulting in clarity on species distribution and relatedness for effective conservation planning (Gilbert et al. 2006). However, the studies indicate further revision in the molecular phylogeny (Groves and Grubb 2011). Successful conservation efforts depend upon the identification of evolutionary significant units (ESU) of vulnerable species. In the present study, we examined COI and Cytb diversity within and among 22 species of Indian ungulates with the goal of testing the utility of DNA barcoding as a tool to identify species. The mitochondrial DNA (mtDNA) cytochrome c oxidase I (COI) and cytbchrome b (Cytb) has been widely used as a barcode for biological identification and phylogenetic studies (Hebert et al. 2003). In this study, we determine levels of interspecific variation within COI and Cytb between closely related species and provide an unbiased analysis using the same criteria for each and will make recommendations based on their use in phylogenetic reconstruction and species discrimination in between the 22 ungulates of India from the DNA extracted from noninvasive and highly degraded samples.  (Table 1). All samples were fixed in 95% ethanol and stored at 20 C until the analysis. A downloaded sequence of eight species from NCBI is also included in this study. A total of 22 species of Indian ungulates were included in the present study (Table 1). Genomic DNA was extracted from fresh fecal samples, by using QIAamp DNA Stool (QIAGEN, Hilden, Germany) with a little modification in temperature. A partial fragment of the COI-1 gene was amplified using the following primers: COI (F 2 -GTACCGCTAATAATTGGTGCTCC), COI (R 2 -GGGTGGCCAAAGAATCAGAACAAGTG) (Kumar et al. 2017), Cytb Bongo forward 5 0 -GAT ACGTCCTACCATGAGGACAAATAT-3 0 , and Cytb Bongo reverse 5 0 -GGGTGTATTAAGTGGGTTTG-3 0 (Faria et al. 2011). PCR amplification of the COI and Cytb gene was performed in a total volume of 25 mL reaction, containing 1X PCR Buffer (5 mM MgCl 2 ; 10 mM dNTPs; 5 pmol of each primer; 1 U Taq polymerase (CinnaGen)). Negative controls were included in all PCR amplification. PCR reactions were carried out in Eppendorf Thermo Cycler and amplification conditions were 94 C for 5 min followed by 35 cycles at 94 C for 30 s, annealing 50 C (T a ) for 30 s and 72 C for 1 min, with the final extension of 72 C for 10 min. PCR products, that yielded a clear band on agarose gel electrophoresis, were used for sequencing bidirectionally, using an automated capillary sequencer (ABI377) following the manufacturer's instructions.

Data analysis
All the sequences were individually checked manually using the program BioEdit and ClustalW (http://www.clustal.org/ clustal2/). Each sequence was systematically analyzed to find out the identity through Basic Local Alignment Search Tool (BLASTn; https://blast.ncbi.nlm.nih.gov). All the sequences obtained were submitted to NCBI to obtain the respective accession numbers. We have retrieved sequences of three species from GenBank from the whole mitochondrial genome. Alignments were then performed using BioEdit (Hall 1999) and ClustalW (http://www.clustal.org/clustal2/) trimmed  The mean genetic distance with in species are represented in bold numbers on diagonal (COI/Cytb).
to 408 bp. All statistical parameters, sequence composition and substitution pattern for the entire data set, genetic divergence, variable sites, transition, and transversion rates were calculated using the program MEGA 6 (Tamura et al. 2013). The Bayesian tree was built in Mr. Bayes 3.1.233, the program Modeltest was used to find the suitable model for data test by selecting parameters nst ¼ 6 for GTR þ G þ I model with four metropolis-coupled Markov Chain Monte Carlo (MCMC) and run for 1,000,000 cycles with 25 burns (Ronquist and Huelsenbeck 2003). The generated BA tree was represented by the FIGTree software. The neighbor-joining (NJ), maximum-parsimony (MP), and maximum-likelihood (ML) were also generated by MEGA 6 (Tamura et al. 2013). The haplotype data were generated using DnaSP5.10 (Librado and Rozas 2009). Automatic barcode gap discovery analysis (ABGD) was implemented online (www.abi.snv.jussieu.fr/public/abgd/abgdweb.html, Puillandre et al. 2012) and was run by selecting Kimura 2-parameter distance (K2P) with transition/transversion ratio (TS/TV) equal to 2 and with a FASTA file input of the alignment, with default values for P min , P max , and relative gap width. The database sequence of Pteropus giganteus (MG821199) was used as an out-group in the phylogenetic study for making the Bayesian tree.

Results and discussion
A total number of 83 fecal samples out of these 67 samples obtained good amplification in COI and 65 in Cytb gene, the other samples might have yielded low DNA or were highly degraded samples. Most of the sequences generated 470 bp in COI and 420 bp but trimmed to 408 bp, as few shorter sequences were downloaded from NCBI, in both COI and Cytb. The generated DNA sequences of 14 species of Indian ungulates were submitted to NCBI with accession numbers given in Table 1   With respect to the pairwise distance among the 22 ungulates species, the highest interspecific genetic divergence observed was 0.295 (29.5%) between Equus caballus and Ovis aries in COI and 0.288 (28.8%) between E. caballus and Moschiola indica in Cytb and the lowest genetic divergence was 0.022 (2.2%) in COI and 0.023 (2.3%) in Cytb in R. d. branderi and R. duvaucelii ( Table 2). The overall mean divergence was estimated at 17.2% in both COI and Cytb. The highest intraspecific variation was observed in O. aries (3.9%) and lowest (0.01) in Antilope cervicapra, Axis axis, and Muntiacus muntjak (Table 2) in COI. In the Cytb gene, the highest intraspecific sequence divergence was 0.137 (P. salvania) and the lowest 0.001 (O. aries, Bos grunniens, A. cervicapra) ( Table 2).
The estimated ABGD analysis of COI ( Figure 1) revealed a total 22 MOTUs within the studied barcode data in the dataset. One of the species Bos indicus showed similar results in ABGD analysis and BA tree topology showing two MOTUs with two sister's clades. However, the other two species depicted inconsistent results in ABGD analysis, R. d. branderi is the subspecies of R. duvacelii (Groves and Grubb 2011) but in ABGD analysis results, R. duvacelii and R. d. branderi were considered a single species. This could be confirmed by many more markers.
The topology patterns are almost alike in all the treebuilding methods (NJ, ML, and BA) examined for the studied dataset of 22 species of Indian ungulates with high bootstrap values in COI and combined sequences of both genes COI and Cytb (Figures 1 and 2). The bayesian tree is produced by combined sequences of two genes falling into four major clades. Rhinoceros unicornis and E. caballus are clustered in Clade 1. In Clade 2, P. salvania is close to Sus scrofa. Moschiola indica alone is separated in Clade 3. Clade 4 comprises the family Bovidae and Cervidae (Figure 1). In a separate analysis of COI, there are 16 sub-clades identifying the species in the entire tree with 22 distinct lineages representing all the 22 separate species in COI (Figure 2). Sub clade 1: Moschiola indica; sub clade 14: S. scrofa; Sub clade 15: R. unicornis; and Sub clade 16: E. caballus are separated as paraphylitic group from all the ungulates species (Figure 2). The rhino (R. unicornis) is closer to horse (E. caballus). Pygmy hog (P. salvania) and wild boar (S. scrofa) population are sisters to each other. Recent findings of a genomic analysis on pygmy hog reveal extensive interbreeding of wild boar (Liu et al. 2019). Indian spotted deer (A. axis) and hog deer populations are clustered together as sister species in clade 3 but paraphyletic with swamp and Sambar deer. The clade two suggested two subspecies of swamp deer population. A similar finding was reported by (Kumar et al. 2017). Rucervus duvacelii branderi is the subspecies of R. duvacelii (Groves and Grubb 2011) but the genetic distance is low (0.023) ( Table 2). We found three nucleotide deletions in R. duvaucelii compared to R. d. branderi. Both species are separated by a high bootstrap value (98%) (Figure 2). The Clade 11, four-horned antelope T. quadricornis is the sole member of the genus Tetracerus, and is placed under the family Bovidae is clustered with the nilgai (Boselaphus tragocamelus) in the Boselaphini, (Leslie and Sharma 2009). Tetracerus quadricornis and B. tragocamelus (Nilgai) are clustered together as a monophyletic group and this cluster is again paraphyletic cluster with genus Bos in Bovidae family. Antilope cervicapra and Gazella bennettii form a paraphyletic group closer to (sheep) O. aries and Capra hircus (goat) in Bovidae family. Our study compared barcode data of COI and Cytb in Indian ungulates and may serve as a baseline for future analyses of genetic diversity of ungulates (Ramon-Laca et al. 2014).
Therefore, the present study provides significant contributions toward the taxonomic identity confirmation, phylogenetic studies that can be used for better planning of conservation and management of Indian ungulates. In India, very few data are present on many species of ungulates. This study will be helpful to strengthen the global database with barcode sequences of accurately identified other mammalian species from Fecal DNA. Thus, the improvements of both taxonomic studies, generated barcode data are mandatory for more reliable and accurate results. The DNA sequences of COI and Cytb genes revealed that the obtained sequences are very helpful to delineate the Indian ungulates (Bergsten et al. 2012).