Bioarchaeological study of ancient Teotihuacans based on complete mitochondrial genome sequences and diet isotopes

Abstract Background The Teotihuacan civilisation was the largest one in ancient Mesoamerica. The Teotihuacan city was born in the north-eastern Basin of Mexico around the second century BC, reached its peak in the fourth century AD, and had cultural influence throughout Mesoamerica. At its peak, the size of the city reached more than 20 km2, and the total population is estimated to have increased from 100,000 to 200,000. However, knowledge of the genetic background of the Teotihuacan people is still limited. Aim We aimed to determine the mitogenome sequences of the Teotihuacan human remains and compare the ancient and present Mesoamericans. In addition, we aimed to identify the food habits of ancient Teotihuacans. Subjects and methods We determined the mitogenome sequences of human remains dated to 250–636 cal AD using target enrichment-coupled next generation sequencing. We also performed stable isotope analysis. Results We successfully obtained nearly full-length sequences newly unearthed from a civilian dwelling in the Teotihuacan site. Teotihuacan mitochondrial DNA was classified into the haplogroups in present and ancient Mesoamericans. In addition, Teotihuacan individuals had a diet dependent on C4 plants such as maize. Conclusion Genetic diversity varied among the Teotihuacans.


Introduction
Teotihuacan, an ancient mega-state in northern Mesoamerica, is one of the largest pre-Columbian metropolitan centres of the New World and a UNESCO world heritage.It was a flourishing metropolis from CE1-550 and is one of the most anthropogenically altered landscapes in the ancient New World.Teotihuacan had more than 150,000 inhabitants at its peak, making it the largest city in pre-Columbian America.The grand axial spine of this UNESCO World Heritage Site, known as the Avenue of the Dead, is flanked by the Moon Pyramid, the Sun Pyramid, and the Ciudadela (Citadel), which encloses the Feathered Serpent Pyramid.Despite being ravaged for nearly two millennia since their foundations were laid, these staggered silhouettes are of great interest.Being grand and magnificent, their characteristics of state monumentalism attract researchers, tourists, and locals (Sugiyama et al. 2021).The scale, density, and orthogonal cityscape of Teotihuacan cities are unparalleled in the New World.The cities, which supported an estimated 100,000 inhabitants, were structured into well-defined multi-ethnic and specialised neighbourhoods spread across an area of approximately 20 km 2 (Cowgill 2015;Manzanilla 2015Manzanilla , 2017;;Smith et al. 2019).
Agriculture is believed to have enabled human beings to obtain a stable food supply, thereby causing the population to increase, and it became the driving force behind the birth of civilisation.Wheat, rice, and maize are the three major grains consumed in the world (Sauer 1952).While rice and wheat are the two major grains consumed in the Old World, maize is the only major grain consumed in the New World (Sauer 1952;Van der Merwe and Vogel 1978).In particular, in ancient Mesoamerica, including Aztec and Teotihuacan, maize was produced in large quantities that led to the rise of civilisation.
The Teotihuacan region is not favourable for long term DNA preservation because of high temperature fluctuation in the region owing to the presence of volcanic ash.Therefore, there are few studies on DNA analysis of human remains belonging to this area.To investigate the Teotihuacan society and its origin based on the genetic origins of the individuals, we analysed the mitochondrial genome (mitogenome) of the human remains belonging to the Teotihuacan civilisations.

Samples
Human bone remains of six individuals from the Teotihuacan and Quetzalcoatl archaeological sites were sampled for ancient genome analyses (Table 1).11-Pro, 15-Pro, and 16-Pro were excavated from Teotihuacan city.13-lav and 14-lav were excavated from the la Ventilla complex.la Ventilla was near the ancient city of Teotihuacan.It is one of the ancient Teotihuacan urban areas where many people lived (Castro 1996).8-PTQ was a sacrifice at the Quetzalcoatl.The sample locations are shown in Figure 1.The need for approval to conduct this study was waived off by the Ethics Committee of the Toho University School of Medicine (A18114).

Radiocarbon dating and stable isotopic measurements
The surface of each femoral fragment was cleaned by sandblasting to remove contaminants, such as sediment.Bone samples were ultrasonicated with deionised water and immersed in acetone for 8 h.Then, the samples were immersed in a 0.2 M NaOH alkali solution for 8 h to remove humic acids.After washing with distilled water, the bone samples were lyophilised and pulverised.The pulverised bone samples were decalcified with 10 ml of 1.0 M HCl at 4 °C in cellulose tubes for 12 h.After rinsing with distilled water, the samples were lyophilised, and the gelatine collagen for each bone sample was extracted.Radiocarbon dating of each extracted collagen sample was performed using a compact accelerator mass spectrometry system based on a 0.5 MV Pelletron accelerator (the National Electrostatics Corporation) at yamagata University Accelerator Mass Spectrometry (yU-AMS), Japan (Tokanai et al. 2013). 14C dates were converted into calibrated ages using the program OxCal v4.4 (Ramsey 2009) and the international calibration datasets of IntCal20 (Reimer et al. 2020).The ratios of stable carbon and nitrogen isotopes in an extracted collagen were measured using an elemental analyser (EA, Vario MICRO cube, Elementar) coupled with an isotope ratio mass spectrometer (IRMS, IsoPrime) at yU-AMS (Kato et al. 2014).The carbon isotope ratio ( 13 C/ 12 C) and nitrogen isotope ratio ( 15 N/ 14 N) are shown as delta values (δ 13 C and δ 15 N) in comparison with the international standards.

DNA extraction, next-generation sequencing (NGS) library preparation, and enrichment of the ancient mitochondrial genome and sequencing
DNA was extracted from the femoral bones, and the NGS library was constructed as described in our previous study (Mizuno et al. 2021).While performing DNA extraction, purification, and NGS library construction, all possible precautions were taken to avoid contamination.Experiments were performed in a laboratory that was exclusively dedicated to ancient DNA research and physically isolated from other molecular research laboratories.All the experiments were performed in a laminar flow cabinet that was routinely irradiated with UV light.Frequent surface cleaning was performed routinely before and after the experiments.Facemasks, head caps, and clean laboratory coats were always worn, and gloves were frequently replaced.All procedures were performed using sterilised disposable tubes and filter pipette tips.All non-disposable glass and metallic materials were dry-heat sterilised at 160 °C for 2-4 h.
After exhaustive brushing to eliminate dirt and exogenous contaminants, the outer bone surface was mechanically removed using a sanding machine (Dremel) to remove surface contaminants.Then, the clean bone was cut into small pieces (approximately 0.5-1 cm 3 ) using an electric drill cutter (Dremel).The bone fragments were cooled with liquid nitrogen and powdered finely by grinding them in a mill (Multi-Beads Shocker MB601U; yasui Kikai).
The powdered samples (100-500 mg) were decalcified in 0.5 M EDTA (pH 8.0) for 2 h at 56 °C in a rotating hybridisation oven, and the supernatant was removed by centrifugation.The decalcification step was performed thrice, followed by DNA extraction.DNA was extracted using phenol: chloroform: isoamyl alcohol (25:24:1) followed an extraction with an equal volume of chloroform.After centrifugation, the aqueous solution was removed and concentrated to a final volume of 200 μl by centrifugation dialysis using the Amicon Ultra-15 centrifugal filter (Merck Millipore).The DNA solution was purified using the silica-based MiniElute spin columns (Qiagen) and quantified using the Quant-iT dsDNA HS assay kit (Invitrogen), according to the manufacturer's protocol.Using the DNA obtained, we prepared single-and double-stranded NGS libraries.For the double-stranded libraries, we treated the DNA with the PreCR Repair Mix (New England Biolabs).We performed in-solution target enrichment using the SureSelect custom kit designed for the human mitogenome (Agilent Technologies), as described in a previous study (Kihana et al. 2013;Mizuno et al. 2020;Mizuno et al. 2021).We built a single-stranded library and a double-stranded library from each individual, and both libraries were enriched, respectively.These libraries were sequenced on the MiSeq platform (Illumina, USA) using the MiSeq Reagent Kit v3 150 cycles.

Data processing and ancient DNA authentication
Raw sequencing reads were trimmed by removing adapter sequences and low complexity sequences using fastp ver.0.20.0 (parameters: -l 30 -y -detect_adapter_for_pe) (Chen et al. 2018).The trimmed reads were mapped against the human mitogenome reference sequence rCRS (Andrews et al. 1999) using the Burrows-Wheeler Alignment tool (li and Durbin 2009) with parameters optimised for ancient DNA (Schubert et al. 2012).Reads with mapping quality <30 were filtered out, and high-quality mapped reads were retained using SAM tools ver.1.9 (li et al. 2009).To minimise the effects of nucleotide mis-incorporations on building a consensus mitogenome sequence, the first two bases on each end of the read were clipped using BamUtil ver.1.0.14 (Jun et al. 2015).Concordance percentages were estimated using MitoSuite ver.1.0.9 (Ishiya and Ueda 2017).Finally, consensus sequences were built using MitoSuite, and their mitochondrial haplogroup assignments were designated as HaploGrep2 To assess whether the results obtained could be affected by the present-day human DNA contamination, sequences with ancient DNA-specific cytosine to thymine substitutions (C to T) at the 5′ or 3′ ends were analysed.The deamination patterns are shown in Figure S1.DNA contamination was estimated using the MitoSuite ver.1.0.9 (Ishiya and Ueda 2017).Owing to hyper-variability, C homopolymeric tract polymorphisms in regions 303-315 and 522-523 and variation at 16,519 were disregarded for tree reconstruction.Additionally, C homopolymeric tract polymorphism in region 16,182-16,188 was disregarded, except for variation at 16,189, which is one of the defining mutations of haplogroup B.  S2, locations are shown in Figure 1.

NGS data processing from previous study
We could not obtain their fasta sequences; hence, we obtained the fastq data and applied the same bioinfomatic pipeline used for the Teotihuacan samples analysed in this study.
A median-joining network (Bandelt et al. 1999) was constructed using PopART v1.7 (Ronquist et al. 2012) with the parameter "epsilon = 0".Together with the aforementioned dataset, we aligned the 13 ancient mitogenome sequences.Then, the variant sites in the coding region of the mitogenome were used as input for the network analysis (Tables S2 and S3).The resultant network was manually modified to improve visibility.
A Maximum Parsimony (MP) tree was constructed for whole mitochondria sequences of the 13 ancient and 122 present-day Mesoamericans by using MEGA11 (Tamura et al. 2021).There were a total of 16,557 positions in the final dataset (Codon positions for 1st + 2nd + 3rd + Noncoding).The MP tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm (pg.126 in Nei and Kumar 2000) with search level 1 in which the initial trees were obtained by the random addition of sequences (10 replicates).A bootstrap test with 1000 replicates was conducted (Felsenstein 1985).The tree is drawn to scale, with branch lengths calculated using the average pathway method (see pg. 132 in Nei and Kumar 2000).

Radiocarbon dating of Teotihuacan
The measured atomic carbon and nitrogen ratios (C/N), 14 C ages, and calibrated ages of the six bone collagen samples are summarised in Table S1.The C/N ratio ranged from 3.3 to 3.4.These values were within the recommended range (2.9-3.6) for well-preserved bone collagen (DeNiro 1985), suggesting that the extracted collagen samples were suitable for radiocarbon dating.The 14 C age of the 13-laV sample was 1,494 ± 20 years BP, which was the latest 14 C age among the six collagen samples.The calibrated ages (95.4% probability) were in the range of 548-607 cal AD (90.8%) and 625-636 cal AD (4.6%).Figure 2 shows the probability distributions of the calibrated data for the six collagen samples.Median sample ages range between 332 and 583 cal AD.

Stable isotopic measurements
The measured δ 13 C and δ 15 N values of the six bone collagen samples are summarised in Table S1. Figure 3 shows the relationship between δ 13 C and δ 15 N isotope values in bone samples compared to that determined from previous data (Kennett et al. 2017;Kennett et al. 2020).The δ 13 C values ranged between −9.0‰ and −7.8‰, and the mean and SD were −8.4 and 0.3, respectively.The δ 15 N values ranged between 8.1‰ and 11.4‰, and the mean and SD were 9.2 and 0.5, respectively.The stable isotope analysis results indicate that the Teotihuacans in this study were highly dependent on C4 plants (Figure 3).

Mitogenome sequences of Teotihuacan
We showed the deamination pattern (Figure S1) with the highest C-to-T misincorporation in most 3′ and 5′ ends of DNA fragments.The deamination patterns observed in DNA indicate an ancient provenance of the sequence molecules.DNA contamination was estimated (Table 1).We successfully obtained nearly full-length sequences newly unearthed from civilian dwellings in the Teotihuacan site.All the sequences belonged to haplogroup A2, B2, or D1, which are representative for the present-day native Americans.Among the four individuals belonging to haplogroup A2, their sequences were classified into known haplogroups:   A2f2 (16-Pro), A2ae (8-PTQ), and A2d1 (14-lav and 13-lav).Although 14-lav and 13-lav belonged to the same haplogroup A2d1, they showed different haplotypes.Furthermore, 11-Pro belongs to haplogroup D1i, and 15-Pro belongs to haplogroup B2c1; among them 11-Pro and 8-PTQ had more singletons than the other Teotihuacan individuals, even though their depths were sufficient.More singletons could be due to sequencing errors, but we consider that unlikely due to the high sequencing depth.All the sequences obtained had an average depth between 22 and 129×, and the percentage of concordance across the mitogenome was > 0.987 (Table 1).The percentage of concordance shows the number of bases concordant with the assembled consensus sequence at each site.
Figure 4 shows a median-joining network for the 13 ancient (this study and Morales-Arce et al. 2017, 2019), and 122 present-day Mesoamerican individuals (Mizuno et al. 2014(Mizuno et al. , 2017)).Mutations were scored relative to rCRS, and coding regions were extracted.Only the ancient sequences with a coverage depth of more than 15 were used (Table S3).The present-day sequences obtained using PCR and Sanger sequencing belonged to nine Maya, 25 Mazahua, and 88 Zapotec populations.All the ancient and present-day sequences were divided into haplogroups A, B, C, and D. Haplogroup A being the largest group followed by groups B, C, and D. In this study, there is no observation that the present-day sequence is branching off from any ancient sequence's node in the network.
Figure 5 shows one of the most parsimonious trees (length = 491).The evolutionary history was inferred using the Maximum Parsimony method with a total of 16,557 positions.The consistency, retention, and composite indices are 0.835031, 0.963431, and 0.804494, respectively for all the sites, while they are 0.714789, 0.963431, and 0.688650, respectively for the parsimony-informative sites.And, in parentheses, for the parsimony-informative sites.There is no observed present-day sequence branching off of any ancient sequence's node in the tree.Thus, no ancient haplotype in the dataset is a direct ancestor of any of the modern haplotypes so far observed.

Discussion
Mitogenomes of native Americans have been traced back to four major haplogroups and a fifth haplogroup.These haplogroups were initially named A, B, C, D, and X (Torroni et al. 1993a, 1993b, Brown et al. 1998) but are now detailed subdivisions of haplogroups based on an increasing amount of genetic data; under which framework all the native American haplotypes fall into five clusters/haplogroups known as A2, B2, C1, D1, and X2a (Forster et al. 1996;Bandelt et al. 2003).Complete mitogenome sequences have been published for indigenous American people (Ingman et al. 2000;Kumar et al. 2011) allowing the identification of other different sub-haplogroups as ancestral founding lineages (Achilli et al. 2008).The Teotihuacan mitochondrial DNA sequences obtained in this study were classified into haplogroups A2, B2, and D1.These haplogroups are found in the present native American people.
We previously proposed the presence of two types of population clusters based on haplogroup frequencies among present Mesoamerican indigenous people: Centro-Mesoamerican and Pan-American clusters (Mizuno et al. 2014).The former cluster was located from central Mesoamerica to Central America, and the latter cluster was located outside Centro-Mesoamerica.The indigenous populations in the north-western external areas of Mesoamerica, such as Mazahua, Cora, Huichol, Pima and Tarahumara, form a cluster with those residing in the region spanning from the Isthmian land bridge to South America, such as Wounan, Embera, Zenu, Ingano, and Piapoco.The indigenous populations classified into the Centro-Mesoamerican and Pan-American clusters show higher frequencies of haplogroups A and B, respectively.Four out of six ancient Teotihuacan individuals examined here have haplogroup A, and the remaining two have haplogroups B and D, respectively.Recently, Kennett et al. (2022) proposed the northward movement of ancient Mesoamerican people from the Isthmian land bridge to South America, probably near Costa Rica and Panama, correlating with maize introduction from the South.This Isthmian land bridge corresponds to the southern end of the Centro-Mesoamerican population cluster.Our previous and present results are well consistent with their scenario of human dispersal in Mesoamerica.
The transition from hunting and gathering to agriculture is believed to have driven the population growth.Agricultural cultivation in East Asia included rice, millet, and foxtail millet, while in Mesoamerica maize was primarily cultivated.Maize was first cultivated in Mesoamerica approximately 9000 BP using theosinte/teosintes as the original species (Doebley 1990;Bennetzen et al. 2001;Matsuoka 2002;Smalley and Blake 2003).Based on the appearance of maize excavated from archaeological sites and its radiocarbon dates, the first steps in maize cultivation are believed to have occurred in the Balsas region of Mexico in approximately 9000 BP.In the Mexican highlands, genes for cultivation traits were not fixed in 5300 BP, but there were maize varieties productive enough to become staple grains by 4300 BP, indicating that maize did not become a staple grain until around this period (Kennett et al. 2017;Kennett et al. 2020).Since the δ 13 C and δ 15 N isotope values almost belong to the group of the staple maize diet, it is indicated that the Teotihuacans in this study were highly dependent on C4 plants (Figure 3).
In this study, we described the mitogenome sequences and eating habits of the ancient Teotihuacans.The collagen analysis suggested that cultivated maize was the staple food of the ancient Teotihuacans.

A
total of 15 sequences from GenBank PRJNA 360145 (Morales-Arce et al. 2017) and 13 sequences from GenBank PRJNA 423230 (Morales-Arce et al. 2019) were compared with the sequences obtained in this study.Jicaro is in northwest Costa Rica (800-1250 AD), Paquime and Convento in northwest Mexico (700-1,450 AD) (Morales-Arce et al. 2017), and Tlatelolco is located just north of Tenochtitlan (1350-1519 AD) which was the commercial centre of the Aztec Empire, so these two cities have been considered as twin cities (Morales-Arce et al. 2019).Sample information is shown in Table

Figure 2 .
Figure 2. Probability distributions for the calibrated dates of six collagen samples from Teotihuacan city, Templo de Quetzalcoatl, and la Ventilla.The plus mark indicates the median of the each.

Figure 3 .
Figure3. between δ 13 C and δ 15 n for six ancient samples from Teotihuacan city, Templo de Quetzalcoatl, and la Ventilla sites (this study) together with ancient mesoamerican stable isotope analysis results.Black circle for Teotihuacan, black rhombus for Templo de Quetzalcoatl, and black triangle for la Ventilla.grey circle, square, Triangle were pre-maize diet, Transitional maize diet, and staple maize diet, respectively.These isotopic values from Classic Period maize agriculturalists from across the maya lowlands(Kennett et al. 2020).

Figure 5 .
Figure 5. maximum Parsimony tree of whole mitogenomes for 13 ancient and 122 present-day mesoamericans.one of the most parsimonious trees (length = 491) is shown.The consistency index is 0.835031 (0.714789), the retention index is 0.963431 (0.963431), and the composite index is 0.804494 (0.688650) for all sites and parsimony-informative sites (in parentheses).The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates and higher than 50%) are shown next to the branches.Dark purple: ancient mesoamericans including Teotihuacan city, Templo de Quetzalcoatl, and la Ventilla, Paquimé, Jicaro, Convento, and Tlateloco.green: present-day maya.magenta: present-day mazahua.Blue: present-day Zapotec.mitogenome haplogroups A2, B2, C1, D1, and D4 were assigned.The other possible trees equally parsimonious are shown in figure s2.

Table 1 .
ngs results of mitogenome of Teotihuacan city, Templo de Quetzalcoatl, and laVentilla individuals.
a Concordance represents consistent base per site of an assembled consensus sequence.b mtDnA contamination is based on estimetied mismatch percentages.