Pet and turtle: DNA barcoding identified twelve Geoemydid species in northeast India

Abstract Geoemydid turtles are one of the most imperilled fauna on the planet, with nearly half of them are threatened with extinction due to bushmeat crisis, traditional medicine, and the illegal pet trade. Classical taxonomy often fails to identify the pet-kept turtle specimens due to amorphous form, unusual shell colouration owing to poor storage in captivity or intensely tinted for high demanding value. The DNA barcoding technique has evidenced as a supportive tool for accurate species identification in systematics research and discerned the nameless taxa in forensic sciences. We tested the effectiveness of DNA barcoding tools for identifying the pet-kept Geoemydid turtle in northeast India. The 36 generated sequences are readily delineated into 12 Geoemydid species using molecular data. The overall mean genetic distance of the studied Geoemydid turtles dataset is 15.3% and ranges from 3.4% to 22.6% between the species. The NJ, ML and Bayesian phylogeny also resulted monophyletic clustering and discriminated all the studied species. The present study contributes DNA barcode sequences of Geoemydid turtles in the global database and also affirms the on-going illegal pet trade of highly threatened species in northeast India.


Introduction
Geoemydid turtles are one of the most ornamental and highly threatened Chelonian groups in the world (van Dijk et al. 2000;Buhlmann et al. 2009). Since the sixteenth century, the animals have been associated with the human for several mythological beliefs and recreational purposes (Klemens 2000;Fordham et al. 2007;Chen et al. 2009). The family Geoemydidae comprises of 71 species within 19 genera and two subfamilies worldwide (Turtle Taxonomy Working Group 2017). Among them, 13 species within eight genera are reportedly distributed throughout northeast India (Das 2001;Buhlmann et al. 2009). They mostly inhabit in the wild vegetation and have the burrowing nature of living under the leaf litters with exception of safe exposure to sunlight for basking. Most of them are dwellers surrounding freshwater ecosystems, and a few are adapting to estuarine or terrestrial habitats (Ernst et al. 2000). The genera Batagur and Hardella are reported to be confined to the river, while Pangshura, Melanochelys, Geoclemys, Cuora, Cyclemys, and Morenia are found in small hill streams or stagnant water bodies (Praschag et al. 2007).
Northeast India is also known as a turtle trading hub and dozens of turtles are being hunted by the local peoples in every year. The peoples in this region consume turtle meat for therapeutic practices, use burning ash of the turtle shell as traditional medicine, keep the dry shell as holy things and live specimens as pets. Thus, the overexploitation of turtles has been causing massive declination of the wild population in this region (Kundu et al. 2013(Kundu et al. , 2015(Kundu et al. , 2016(Kundu et al. , 2018. Several baseline studies have been fulfilled to know the morphology, genetic information, and distribution pattern of wild living Geoemydid turtles in northeast India (Das 2001;Praschag et al. 2007;Kundu et al. 2016). However, investigation of petkept turtles and their genetic identity was never being assessed in this region to estimate illegitimate threats on this threatened taxa.
Besides, the practice with classical taxonomy sometimes fails to identify the pet-kept turtle species due to the absence of important phenotypic characters. At this juncture, the interventions of molecular approaches are worthwhile to identify the species (Alacs et al. 2007). The molecular tools, DNA barcoding is evidenced as a supporting method in classical taxonomy and systematics research and have rendered clearer understanding to identify the extant fauna throughout the globe (Hebert et al. 2003). The DNA based species identification has been attempted to recognize the freshwater turtles worldwide, including India (Reid et al. 2011;Kundu et al. 2013Kundu et al. , 2015Kundu et al. , 2016Kundu et al. , 2018. Nevertheless, the DNA barcode information of Indian Geoemydid turtles is still deficient in the global database. Thus, the study aimed to identify the pet-kept Geoemydid turtles through DNA barcoding. The generated DNA data would enrich the global database as well as helps to identify Geoemydid species and track the illegal turtle trade hereafter.

Materials and methods
Study site, ethical concern, and sampling The pet-kept Geoemydid turtles were surveyed throughout northeast India after acquiring prior permission from the wildlife authority during 2010-2013. Most of the studied samples were collected from the villages of Assam state and a few individuals were collected from Tripura and Mizoram (Table 1, Figure 1A). The animals were handled with sufficient care and a meagre amount of blood or saliva was collected from each animal; solely for scientific research. The dry tissue samples were also collected from the carapace shells, are being kept after consuming the meat in the rural houses. The blood samples were collected from the hind limb by insulin syringe and stored into EDTA containing vial. The saliva samples were collected from the buccal cavity by swabbing and dipped into the 200 ml TES buffer (50mM Tris-HCl, 25mM EDTA, 150 mM NaCl). The samples were stored at 4 C in the field and subsequently at À20 C in the laboratory before DNA based investigation.

DNA isolation, PCR, and sequencing
The total genomic DNA was extracted followed by QIAamp DNA Mini Kit standard protocol. The published primer pair, FishF1-5 0 TCAACCAACCACAAAGACATTGGCAC3 0 and FishR1-5 0 TAGACTTCTGGGTGGCCAAAGAATCA3 0 (Ward et al. 2005) was used for amplification of partial mitochondrial cytochrome c oxidase subunit I (mtCOI) ($650 bp) gene segment in a Veriti V R Thermal Cycler (Applied Bio systems, Foster City, CA). The 25 ml PCR mixture contains 10 pmol of each primer, 100 ng of DNA template, 1 Â PCR buffer, 1.0-1.5 mM of MgCl2, 0.25 mM of each dNTPs, and 0.25 U of Platinum Taq DNA Polymerase High fidelity (Invitrogen, Life Science Technologies). PCR conditions were: initial denaturation at 94 C (2 min) followed by 30 cycles at 94 C (45 s), 50 C (45 s), and 72 C (1 min), and a final elongation at 72 C (8 min). The PCR amplified products were checked in 1% agarose gel containing ethidium bromide (10 mg/ml). Further, the PCR products were purified using QIAquickR Gel extraction kit (QIAGEN Inc., Germantown, MD), and cycle sequencing products were cleaned by using standard BigDye X Terminator Purification Kit (Applied Biosystems, Foster City, CA). Sequencing was done bi-directionally in 48 capillary array 3730 DNA Analyzer (Applied Biosystems, Foster City, CA) following Sanger sequencing methods.

DNA barcode sequence quality control measures and analysis
The generated chromatograms that represent sequences of both DNA strands were obtained for each sample. The noisy sequences were trimmed at both end and greater than 2% ambiguous bases were discarded from the generated chromatograms, using a quality value of >40 for bidirectional reads. The SeqScape software version 2.7 (Applied Biosystems Inc., Foster City, California, USA) was used to analyze to obtain the consensus sequences from the forward and reverse chromatograms. The sequences were submitted to the GenBank database for acquiring accession numbers. The homology search of the generated sequences was performed through nucleotide BLASTn search in the GenBank database. Based on the similarity search, the reference sequences showing highest identity matches for each of the studied species (n ¼ 11) were also retrieved from the GenBank. The generated and acquired sequences were aligned with the ClustalX program (Thompson et al. 1997) to make a comprehensive dataset with equal length and common start position. The mean genetic divergences were calculated using Kimura 2 parameter (K2P) and the best evolutionary model (HKY þ G) was selected with the lowest Bayesian information criterion (BIC) score (7490.52) by MEGA6.0 (Tamura et al. 2013). The DAMBE5 software was used to test the sequence substitution saturation of mtCOI gene within the studied Geoemydid species (Xia 2013). Phylogenetic analysis was performed under the optimality criteria of Neighbour-Joining (NJ) and Maximum Likelihood (ML) by using PAUP Ã 4.0b10 (Swofford 2002) with 1000 bootstrap support and Bayesian analysis (BA) using MrBayes 3.1 (Ronquist and Huelsenbeck 2003). For BA, Markov Chain Monte Carlo (MCMC) was performed with four chains for 1,000,000 generations, with trees sampled every 100 generations (the first 1000 trees were discarded as 'burn in'). MCMC analysis was stationary when maximum standard deviation of split frequencies reached below 0.01 and potential scale reduction factor (PSRF) approached 1.0. Sequence of Manouria emys (family: Testudinidae) was incorporated to the dataset as the out-group in the Phylogenetic analysis.

Results and discussion
The molecular taxonomy successfully delimitated most of the globally distributed extant Geoemydid species (Spinks et al. 2004;Praschag et al. 2007). The utility of mtCOI gene also successfully tested to identify the Geoemydid species (Reid et al. 2011;Kundu et al. 2016). Therefore, having found the competency of this molecular tool in accurate species identification and delineation, it is essential to generate DNA barcode data (Ihlow et al. 2016). The pet turtles are generally kept alive inside the small artificial water tank, barrel or aquarium without proper management and the shells are often intensely tinted by colours or chemicals for recreation purposes or high demanding commercial value. Thus, due to the lack of stable morphological characters, the classical taxonomy has its limitations to identify the pet-kept species occasionally. However, we have not observed any unusual morphological characters in the studied specimens which led to assume the possible hybridization and assure through more molecular markers. In this context, the current study dealt with the DNA data of pet-kept Geoemydid species collected from northeast India for accurate species-level identification and genetic distinctiveness. The study generated a total of 36 DNA barcode data of 12 pet-kept Geoemydid turtles collected from northeast India. The 11 samples shows 98%-100% identity match with the reference sequences in the GenBank databases and confirmed as P. tentoria (n ¼ 6), P. smithii (n ¼ 2), and P. tecta (n ¼ 3). The two generated sequences shows 93% similarity with P. smithii, which infers an insignificant identity match due to lack of reference barcode in the database. In order to resolve this issue, the specimens were revisited and examined carefully to collect the necessary morphological data which finally led to identifying the species as P. sylhetensis. This study provides the first DNA barcode data information for P. sylhetensis from its known distribution region in northeast India. This contribution in the global database further assists to estimate the deep phylogenetic relationship among Pangshura congeners and intraspecific genetic distance from other geographical areas. The other 25 samples also shows 98%-100% similarity with the reference sequences in the GenBank database and identified as Batagur dhongoka (n ¼ 1), Hardella thurjii (n ¼ 2), Morenia petersi (n ¼ 3), Geoclemys hamiltonii (n ¼ 3), Cuora amboinensis (n ¼ 4), C. mouhotii (n ¼ 6), Melanochelys tricarinata (n ¼ 2), and M. trijuga (n ¼ 2). Thus, the present investigation recorded three endangered, five vulnerable, two lower risk/least concern, and two lower risk/near threatened Geoemydid turtles in northeast India (IUCN 2018). Due to the lack of knowledge about the species, their habitats, and proper awareness, the rural peoples in this region are often hunting the highly threatened freshwater turtles unknowingly during fishing practice or dry leaf and wood collection in wild. After interaction with the local peoples, the study also perceived that the population of lower risk categorized species in IUCN red list of threatened species are becoming rare in the wild. The overall mean genetic distance of the studied Geoemydid turtles dataset is 15.3% and ranges from 3.4% to 22.6% between the species. The genus Pangshura shows 3.4%-7.8% genetic distance between four known species in the dataset, whereas Cuora and Melanochelys with their two congeners show 8.3% and 4.2% genetic distance, respectively. In the total dataset, the six studied species, P. smithii, P. tecta, B. dhongoka, H. thurjii, M. petersi, and M. tricarinata shows 0% genetic divergence within the species in this current dataset. Further, the other six species, P. tentoria, P. sylhetensis, G. hamiltoni, C. amboinensis, C. mouhotii, and M. trijuga shows 0.2%, 2%, 1.2%, 2.1%, 1% and 0.4% within species genetic distance, respectively. The less genetic divergence (0%) within few studied species revealed their restrict gene pool in the studied locality, which warrants further sampling throughout their known distribution to perceive more clear understanding. The high genetic distance (!2%) within the group of P. sylhetensis and C. amboinensis depicts the possible cryptic diversity within the species. The four congeners of Pangshura, P. tentoria, P. smithii, P. sylhetensis, and P. tecta shows 3.4%-7.8% genetic divergence between the species. Further, the C. amboinensis and C. mouhotii shows 8.3% and M. tricarinata and M. trijuga shows 4.2% genetic distance between the species in the current dataset. Sequence saturation analysis of mtCOI gene of the studied sequences showed the increase of frequency of both transitions and transversions linearly along with the divergence value. The NJ, ML and Bayesian phylogeny of the studied dataset shows cohesive monophyletic clustering of all the studied species with 100 bootstrap supports and 0.9-1 posterior probability ( Figure 1B). The database reference sequences of the representative species also depicted cohesive clustering with the generated sequences in the phylogenetic tree. The accurate identification of any taxa is vital before initiating any other allied assessment, i.e. physiology, ecology, behaviour, population estimation, and conservation. To assess the population structure of any species, researchers are not only concerned about the wild living individuals but also concerned about their threats estimation, records of enforcement seizures, pet trade etc. within and beyond their known geographical regions (Alacs et al. 2007;Mendiratta et al. 2017). For example, many new Geoemydid species have been described from the food and pet markets in the last two decades (Kou 1989;Parham and Shi 2001) and the genetic data corroborated the presence of three non-native turtles in northeast India kept as pets (Kundu et al. 2016). Thus, the present review of pet-kept Geoemydid turtles through molecular approaches has not only identified the species but also estimated illegitimate threats, pet trade scenario in the studied area and contributed genetic information in the global database. The results specified the serious concern in view of illegal wildlife hunting as well as protection of threatened Geoemydid turtle population in natural settings. The current approach can be useful for monitoring trade, the origin of seizures, and directing enforcement to protect overexploited turtle populations in the future.