DNA methylation signatures in cord blood associated with birthweight are enriched for dmCpGs previously associated with maternal hypertension or pre-eclampsia, smoking and folic acid intake

ABSTRACT Many epidemiological studies have linked low birthweight to an increased risk of non-communicable diseases (NCDs) in later life, with epigenetic proceseses suggested as an underlying mechanism. Here, we sought to identify neonatal methylation changes associated with birthweight, at the individual CpG and genomic regional level, and whether the birthweight-associated methylation signatures were associated with specific maternal factors. Using the Illumina Human Methylation EPIC array, we assessed DNA methylation in the cord blood of 557 and 483 infants from the UK Pregnancies Better Eating and Activity Trial and Southampton Women’s Survey, respectively. Adjusting for gestational age and other covariates, an epigenome-wide association study identified 2911 (FDR≤0.05) and 236 (Bonferroni corrected p ≤ 6.45×10−8) differentially methylated CpGs (dmCpGs), and 1230 differentially methylated regions (DMRs) (Stouffer ≤0.05) associated with birthweight. The top birthweight-associated dmCpG was located within the Homeobox Telomere-Binding Protein 1 (HMBOX1) gene with a 195 g (95%CI: −241, −149 g) decrease in birthweight per 10% increase in methylation, while the top DMR was located within the promoter of corticotropin-releasing hormone-binding protein (CRHBP). Furthermore, the birthweight-related dmCpGs were enriched for dmCpGs previously associated with gestational hypertension/pre-eclampsia (14.51%, p = 1.37×10−255), maternal smoking (7.71%, p = 1.50 x 10−57) and maternal plasma folate levels during pregnancy (0.33%, p = 0.029). The identification of birthweight-associated methylation markers, particularly those connected to specific pregnancy complications and exposures, may provide insights into the developmental pathways that affect birthweight and suggest surrogate markers to identify adverse prenatal exposures for stratifying for individuals at risk of later NCDs.


Introduction
Birthweight can be affected by multiple intrauterine exposures, such as maternal smoking [1,2], maternal body mass index (BMI) [3][4][5], hypertensive disorders of pregnancy [6,7], gestational diabetes [5,8], and maternal nutrition [9]. All these factors influence foetal growth and, as a consequence, birthweight. Many observational and epidemiological studies have shown a U-shaped relationship between birthweight and an increased risk of later noncommunicable diseases (NCDs), including cardiometabolic disease [10][11][12][13], cancer [10,14] and increased mortality [10,15], with babies born at both low and high birthweight being at increased risk. Although a relatively crude marker of foetal growth, birthweight may nonetheless act as a proxy measure for the quality of the intrauterine environment and of detrimental prenatal exposures.
Several mechanisms have been suggested to underlie the associations between birthweight, and by extension the intrauterine environment, and risk of later NCDs, including DNA methylation. A role for DNA methylation in mediating the effects of early life environment on later health is supported by a substantial body of evidence from experimental models [16,17] as well as by large epigenome-wide association studies (EWAS) in humans. The latter have linked adverse early life environments induced by maternal smoking [18,19], dysglycaemia [20,21], hypertension [22] and unbalanced nutrition during pregnancy [23] to altered DNA methylation in neonatal blood and then to increased susceptibility to later NCDs. Consistent with birthweight being a proxy for prenatal exposure to environmental factors, it has also been shown to be associated with neonatal DNA methylation [24]. To date, the majority of the published work on the associations between DNA methylation and birthweight, including the meta-analysis in the PACE consortium by Küpers et al. [24], have been carried out using the Illumina Human Methylation450K array, which measures the methylation at ~450,000 CpGs in the genome [25]. This represents approximately 1.6% of the total CpGs in the genome and was designed to cover regulatory regions involved in transcriptional regulation of gene expression, and as such were located in CpG islands 5 UTR, 3 UTR and bodies of annotated genes, gene promoters, and a limited number of enhancers. With the updated EPIC array, which measures ~850,000 CpGs, including 90% of the CpGs found on the 450 K array, there is increased coverage across the genome, including a much greater number of probes at regions identified as potential enhancers by both the FANTOM5 [26] and ENCODE [27] projects. Therefore, we sought to investigate the associations between neonatal blood DNA methylation, as measured on the EPIC array, with birthweight. As well as looking at methylation changes at the individual CpG level, we sought to identify differentially methylated regions (DMRs) associated with birthweight. We then investigated the relationships between the birthweight-associated methylation changes and the differentially methylated CpGs (dmCpGs) previously reported to be associated with maternal smoking, obesity, folate levels and pregnancy hypertension/preeclampsia. Finally, we examined whether the methylation at the dmCpGs associated with birthweight may play a functional role by assessing the overlap between these dmCpGs and published cis-expression quantitative trait methylation (cis-eQTMs).

UPBEAT -UK Pregnancies Better Eating and Activity Trial
This prospective cohort study was a secondary analysis using data from UPBEAT (isrctn.org registration number 89,971,375). UPBEAT was a multi-centre RCT of a complex dietary and physical activity intervention designed to prevent GDM in obese women and reduce the incidence of large-for-gestational-age (LGA) infants [28]. The cohort comprised 1555 women, older than 16 years of age with a BMI of 30 kg/m 2 or higher, recruited between 2009 and 2014, who were randomized between 15 weeks 0 days' and 18 weeks 6 days' gestation (15 +0 and 18 + [6] weeks' gestation) to either a behavioural intervention superimposed on standard antenatal care or standard antenatal care. As the primary outcomes (GDM and LGA infants) did not differ between intervention and control arms (p = 0.68 and p = 0.40, respectively) [28], all women recruited to the trial were treated as a single cohort for the purposes of this study. Educational attainment was categorized using the highest educational qualification achieved (None, GCE (or equivalent), vocational qualification, A level (or equivalent), first degree, higher degree). Gestational weight gain was characterized as adequate, inadequate or excessive according to the USA National Academy of Medicine 2009 categories (NAM), formerly known as the Institute of Medicine (IOM). [29]. The trial protocol required an Oral Glucose Tolerance Test (OGTT) at 27 +0 to 28 + [6] weeks' but for this study a clinically pragmatic approach was adopted with OGTTs at 23 + [2] to 30 +0 weeks' (mean 27 + [5]) included. Diagnosis of GDM was according to International Association of Diabetes and Pregnancy Study Groups (IADPSG) criteria (fasting glucose ≥5.1 mmol/l and/or 1 hr ≥10.0 mmol/l and/or 2 hr ≥8.5 mmol/l in response to a 75 g oral glucose load) All aspects of the trial, including the analyses in the present study, were approved by the National Health Service Research Ethics Committee (UK Integrated Research Application System; reference 09/H0802/5) and all participants, including women aged 16 and 17 using Fraser guidelines, provided written informed consent.

SWS -Southampton Women's Survey
The SWS is a prospective mother-offspring cohort study that has assessed the diet, body composition, physical activity, and social circumstances of nonpregnant women aged 20-34 years living in Southampton, UK. Comprehensive details of SWS have been previously published [30]. Educational attainment was categorized using the highest educational qualification achieved (None, CSE, O levels, A levels, Higher national diploma, degree). Gestational weight gain was characterized as adequate, inadequate or excessive according to the USA NAM categories. The SWS cohort did not require an OGTT to be carried out for all participants, therefore, GDM status was obtained from clinical records. Follow-up of the children and sample collection/analysis was carried out under Institutional Review Board approval (Southampton and South West Hampshire Research Ethics Committee, references 276/97, 307/97, 153/99 w, and 10/H0504/30) with written informed consent.

DNA extraction
Genomic DNA (gDNA) was extracted from the buffy coat of umbilical cord blood samples from UPBEAT using the QIAamp Blood DNA mini kit (Qiagen), and using a standard high salt method for the SWS umbilical cord blood samples. Quality of the genomic DNA was assessed by agarose gel electrophoresis and quantity of gDNA was checked on the NanoDrop ND-1000 (NanoDrop Technologies).

Infinium Human OmniExpress genotype arrays
SNP genotyping was carried using Human OminiExpress-24v1.2 at the Edinburgh Clinical Research Facility. Genotyping was analysed in Illumina Genome Studio 2.0.2 using genotyping Module 2.0.2 following the manufacturer's technical note. PLINK version 1.9 beta [31] was used for SNP data management and quality control. Genotyping imputation was carried out using Sanger imputation (https://imputation.sanger.ac.uk). The UK10K [32] and 1000 Genomes Phase 3 [33] reference panels were used with the EAGLE2 imputation pipeline [34,35]. Imputed SNPs were further filtered using imputation info score (INFO >0.8), genotyping call rate 0.05, missingness 0.01, minor allele frequency 0.01, and Hardy-Weinberg equilibrium 1 × 10 −6 . SNPs with a minor allele frequency (MAF) <5% in the whole dataset were excluded from downstream analysis. GEM [36] was used to investigate the genetic influence on methylation.

Infinium Human MethylationEPIC BeadChip array
DNA methylation using the Infinium Human MethylationEPIC BeadChip array was used to interrogate DNA methylation including 17 technical replicates. 1 µg of the genomic DNA was treated with Sodium Bisulphite using the Zymo EZ DNA Methylation-Gold kit (ZymoResearch, Irvine, California, USA, D5007) and processing of the Human MethylationEPIC (Infinium Methylation 850 K; Illumina, Inc. CA, USA) platform was carried out by the Centre for Molecular Medicine and Therapeutics (CMMT, Canada) (http://www.cmmt.ubc.ca).

Infinium Human MethylationEPIC BeadChip array data processing
Infinium 850 K data was then processed using the Bioconductor package minfi [37] in R (version 3.4.2). 557 arrays from UPBEAT and 483 arrays from SWS were combined for the purpose of this study with cohort being adjusted for in the analysis (see below). The two cohorts were combined for the initial analysis as power calculations using the pwr package in R calculated that to identify robust associations that pass a Bonferroni adjustment (0.05/775,539) explaining 10% of the variance in birthweight, 758 samples would be required for 90% power. Beta-mixture quantile (BMIQ) normalization was applied to remove array biases and correct for probe design. Probes with a detection p-value >0.01 (n = 18,745) and beadcount <3 (n = 263) were removed from the dataset. CpGs known to cross-hybridize to other locations in the genome [38] (n = 14,759), coinciding with SNPs (n = 77,261), aligning to the sex chromosomes (n = 17,063) and non-CpG probes (n = 2905) were also removed from the dataset. Duplicate samples were included and hierarchical clustering grouped the pairs together. Participant sex was predicted using the getSex() function in minfi. Data was further assessed by visualization of methylation density plots and calculation of median absolute deviation (MAD) scores. Duplicates were removed after normalization but before inference (the duplicate with the lowest MAD score was removed). Eleven samples showed aberrant methylation densities and MAD scores lower than −5 so were removed from the analysis, while nine samples showed aberrant grouping on a multidimensional scaling (MDS) plot, separated by infant sex, and sex discrepencies were removed from subsequent analysis. After excluding preterm births (<37 weeks, n = 35), 985 samples were taken forward for further analysis. ComBat was applied to remove batch (plate) and chip effects [39], and the batch-corrected methylation values used for downstream analysis. The removal of batch effects using ComBat was sufficient to account for the two cohorts used in this study, with PCA analysis showing after BMIQ normalization but before adjustment for chip and cohort of each sample, PC1 separates the two cohorts from each other, while after ComBat the samples are no longer separated by cohort, removing the variation due to cohort from the data ( Figure S1). Model assumptions were assessed by visual inspection of QQ plots and p-value histograms and calculation of genomic inflation factor lambda (λ) value, defined as the ratio of the median observed chi squared value distribution to the expected distribution. Genomic inflation (λ) >1.2 was suggestive of test statistic bias and inflation; therefore the bacon package [40] in R was used to reduce this inflation. Raw data used in this analysis has been deposited in the Gene Expression Omnibus (GEO) under accession numbers GSE141065 (UPBEAT) and GSE154915 (SWS).

Infinium Human MethylationEPIC BeadChip array data analysis
As cord blood comprises a heterogeneous cell population, and no cell composition data was available for these samples, a reference-based prediction of the cell composition was carried out to obtain predicted cellular compositions using the algorithm by Houseman et al. [41], using the FlowSorted.CordBlood.450k [42] package in R, which utilizes the reference for cord blood cell compositions estimated by Andrews and Bakulski [42]. Robust linear regression models using limma [43] were run with methylation as the outcome variable. All models included the following as covariates: Maternal age, maternal smoking during pregnancy, maternal ethnicity, neonate sex, parity, Sentrix position, GDM status (30% of mothers in UPBEAT were diagnosed with GDM by IADSPG criteria [44]), gestational age, and the predicted values for B-cells, CD4 T-cells, CD8 T-cells, granulocytes, monocytes, natural killer cells, and nucleated red blood cell composition. The analysis was controlled for multiple testing with the Benjamini--Hochberg adjustment for false discovery rate (FDR). Sensitivity analyses were carried out to determine the effect of baseline maternal BMI and educational attainment on any genomewide methylation changes observed.

Statistical analysis
All statistical analysis was carried out in R (version 3.4.2). For the purpose of the combined analysis, educational attainment was combined into 5 levels: None, CSE/GCE/O levels, higher national diploma (HND)/vocational qualification, A levels, and Degree. Fisher exact test was used to determine over/under-representation of dmCpGs amongst genomic locations. The hypergeometric distribution probability test was used to determine enrichment of dmCpGs.

Participant characteristics
We used data from 995 neonates from the SWS and UPBEAT studies, representing mainly participants of European heritage (SWS = 98.5%, UPBEAT = 73.7%, combined = 85.4%), with a minority of African and Asian ethnicities. There were similar proportions of male and female infants in the two cohorts (52.7% males in the combined dataset). While maternal age and parity were similar in the two cohorts, the average pre-pregnancy BMI in the mothers was higher in UPBEAT (35.2 kg/m 2 (6.0)) compared to SWS (24.0 kg/m 2 (5.2)), as was the incidence of GDM (30% of women in the UPBEAT cohort compared to 0.9% in SWS), and maternal smoking (16.6% in UPBEAT, 13.7% in SWS). Birthweight (3522.2 g in SWS infants, 3559 g in UPBEAT infants) was not different between the two cohorts, however, gestational age was slightly less in UPBEAT compared to SWS (39.9 and 40.1 weeks, respectively, p = 0.046).

Sensitivity analysis
As a number of maternal factors such as maternal BMI [45,46], gestational weight gain [47], and educational attainment [48,49] have previously been shown to be associated with birthweight, we carried out a series of sensitivity analyses, analysing DNA methylation with respect to birthweight with adjustment for each of these factors in the original model as additional covariates. After the addition of maternal BMI into the regression model, 2911 CpGs showed a significant association with child's birthweight (FDR<0.05, table S2)), of which 243 survived the more stringent Bonferroni correction for multiple comparisons; there was a 91.6% overlap amongst the dmCpGs with an FDR <0.05 from the original analysis, and a 53.91% overlap amongst the Bonferroni adjusted birthweight dmCpGs. Of the dmCpGs with an FDR<0.05, there was also a strong correlation between the effect estimates of the birthweight associated CpGs (Pearson correlation coefficient = 0.999, ICC = 0.999, p < 2.2 x 10 −16 , figure S2a). After the addition of gestational weight gain (GWG) into the original regression model, 1195 dmCpGs were associated with birthweight (FDR<0.05, Table S3), of which 87 remained after the more stringent Bonferroni adjustment. All 87 of the Bonferroni adjusted and 98.3% of the FDR<0.05 dmCpGs in GWG adjusted analysis were also significantly associated with birthweight in the main analysis. Of the dmCpGs with an FDR<0.05, there was also a strong correlation between the effect Furthermore, when the model was additionally adjusted for maternal educational attainment, 2922 CpGs showed a significant association with child's  When restricting the analysis to participants of European descent, the findings were consistent with the main analysis; 73.6% of the CpGs with an FDR<0.05 in the restricted analysis on participants of European descent were found to be associated with the birthweight dmCpGs in the main analysis on all participants, with a strong correlation between the effect estimates of the birthweight-associated CpGs of the two analyses (Pearson correlation coefficient = 0.996, ICC = 0.991, p < 2.2 x 10 −16 , figure S2d, Table  S5). Similarly, when restricting the correlation to the 236 CpGs from the main analysis that survive Bonferroni correction, there was a strong correlation between the effect estimates between the main analysis and the restricted analysis on participants of European descent (Pearson correlation coefficient = 0.998). Due to the low number of African and Asian participants in this data set (UPBEAT n = 90 and SWS n = 25), it was not possible to repeat the analysis restricted to these populations. The findings were also consistent when neonates who were born preterm were included, with 99% of the original birthweight associated dmCpGs (at both FDR and Bonferroni <0.05, Figure S2b+d, Table S6) also being associated with birthweight when preterm infants were included. There was also a strong correlation between the effect estimates of the birthweight associated CpGs (Pearson correlation coefficient = 0.992, ICC = 0.98, p = p < 2.2 x 10 −16 , figure S2e).

Overlap between the birthweight-associated dmCpGs in the UPBEAT and SWS cohorts
As the main analysis was based on combining data from the SWS and UPBEAT cohorts, we also analysed the two cohorts separately. In the SWS only analysis, there were 1238 birthweight-associated dmCpGs (FDR<0.05), of which 76.2% overlapped with the main analysis, while in the UPBEAT only analysis, of the 928 birthweight-associated dmCpGs (FDR<0.05), 52.0% overlapped with the main analysis ( Figure 2, Table 3). Comparison of the overlap between the single cohort analyses revealed 33.1% of the dmCpGs in the UPBEAT single cohort analysis were also differentially methylated in the SWS single cohort analysis, of which 97.4% were significantly associated with birthweight in the main analysis (Figure 2c).

Differentially methylated regions in the infants' methylome are associated with birthweight
Regional analysis of differential methylation associated with birthweight identified 1250 DMRs with a Stouffer<0.05 (Table 4 and S7). The DMRs were spread throughout the genome, and not localized to a particular region, with 41.8% of the significant DMRs showing greater methylation with increased birthweight. The top DMR was located in the promoter region of the corticotropin-releasing hormonebinding protein (CRHBP), consisting of 11 CpGs spanning a region of 1448bp (Stouffer = 3.05 x 10 −24 ). Of the 1250 birthweight-associated DMRs, 40 contained a dmCpG (Bonferroni<0.05) previously reported by Küpers et al. [24] to be associated with birthweight, with a further 28 DMRs associated with the same gene for which a dmCpG had previously been reported [24].
There were no KEGG terms enriched amongst the birthweight-associated dmCpGs.
To determine whether the birthweightassociated dmCpGs showed a functional correlation with gene expression, we compared our list of 1206 dmCpGs, the sites found on both the 450 K and 850 K arrays (FDR<0.05), with a published list of 18,881 cis-expression quantitative trait methylation (cis-eQTM) [51], CpG sites known to correlate with gene expression from adult whole blood samples. Of the 1206 dmCpGs, 93 were reported to be a cis-eQTM, associated with 123 genes (enrichment p-value = 4.439 x 10 −6 ).

Maternal factors
A meta-analysis has been previously carried out by the PACE consortium to determine the effect of maternal factors, including smoking during pregnancy [18], BMI [22], obesity [22], folate levels [52], and gestational hypertension/pre-eclampsia [53], on the DNA methylation of the offspring at birth. These analyses were carried out using the 450 K array. Restricting the birthweightassociated dmCpGs (FDR<0.05) from our study to those CpGs present on the 450 K array (1206/ 2911), we investigated whether the birthweightassociated dmCpGs in this study were enriched amongst the dmCpGs found to be associated with There was a significant enrichment of the birthweight-associated dmCpGs in our study with the PACE study dmCpGs associated with gestational hypertension/pre-eclampsia (14.51%, p = 1.37 x 10 −255 ), maternal smoking (7.71%, p = 1.50 x 10 −54 ) and maternal plasma folate levels during pregnancy (0.33%, p = 0.029). Meta-analyses with respect to maternal BMI and obesity have only to date shown CpGs that pass Bonferroni correction; comparing these dmCpGs to those identified in our study we found no evidence of enrichment of maternal BMI-associated CpGs amongst the birthweightassociated dmCpGs. Similarly, restricting the analysis to only the birthweight-associated dmCpGs that remain after Bonferroni correction, we found a significant enrichment of CpGs associated with gestational hypertension/ pre-eclampsia (13.33%, p = 1.25 x 10 −34 ) and maternal smoking (7.78%, p = 1.44 x 10 −10 ), but not with maternal BMI or plasma folate levels during pregnancy, BMI, or obesity.

Influence of genetic variation on the birthweight-associated dmCpGs
As DNA methylation can be driven by genotype and act as an integrator of the individual's genotype and environmental exposure, we investigated the potential effect of genotype on the birthweightassociated dmCpGs (Bonferroni<0.05) and any interaction between genetic variation and birthweight, by carrying out a genome-wide mQTL screen using the GEM package. 9109 significant mQTLs (FDR<0.05, table S8) were identified, with 224 of the birthweight-associated dmCpGs significantly associated with the genotype at one of 8506 unique SNPs. 1153 of the mQTLs identified were cis-mQTLs, while 7956 were trans-mQTLs. Of these mQTLs, 95 have been previously reported in the ARIES mQTL dataset [56] in cord blood at birth. The top mQTL identified was rs4484654, with each copy of the alternative C allele at this position being associated with an increase in methylation level at cg14908202, located in the body of the ADAM5 gene (Figure 3a). Despite rs4484654 affecting DNA methylation levels at cg14908202, the association between DNA methylation at cg14908202 and birthweight remained significant after stratification by genotype at rs4484654 (Figure 3b), or when genotype was included as a covariate in the regression model (p = 8.09 x 10- −11 ). Similarly, the association between the remaining 221 dmCpGs, for which genotype was shown to influence DNA methylation, and birthweight remained after the inclusion of the relevant mQTLs as a covariate in the regression models.

Discussion
In this epigenome-wide study, using the Human MethylationEPIC array, we report novel neonatal DNA methylation signatures associated with birthweight. Moreover, the birthweight-associated dmCpGs were enriched amongst previously published cis-eQTMs, CpG loci known to correlate with gene expression, as well as amongst genes associated with developmental pathways, potentially suggesting a functional role for these dmCpGs in the offspring. The birthweight-related dmCpGs were also enriched for CpGs sites previously linked to maternal smoking during pregnancy, hypertension/pre-eclampsia, and maternal folate levels, suggesting that early life environmental factors and stressors during pregnancy may be important determinants of neonatal DNA methylation and potentially contribute to variations in birthweight. The identification of birthweight- associated methylation markers, particularly those connected to specific pregnancy complications and exposures, may provide valuable insights into the developmental pathways that affect birthweight and allow the identification of methylation signatures associated with adverse prenatal exposures to stratify individuals at risk of later NCDs.
Birthweight was associated with changes in neonatal DNA methylation at both the CpG and regional level, with dmCpGs being enriched amongst open seas and, when associated with a gene, under-represented within the transcriptional start site (TSS), suggesting enrichment of dmCpGs within enhancer or long range regulatory regions. Moreover, 93 of the 1206 dmCpGs (these were the birthweight-associated dmCpGs covered on both the 450 K and 850 K arrays) were reported to be a cis-eQTM, and associated with 123 genes, suggesting that differential methylation at these sites may be associated with functional changes in gene expression. In this study, we only investigated the overlap between the birthweightassociated dmCpGs and reported cis-eQTMs. Although the majority of our dmCpGs were not reported to be cis-eQTMs, a large proportion of the dmCpGs were found to be associated with trans-mQTLs. Bonder at al. [51] reported that trans-mQTL CpGs are enriched around TSSs and within Hi-C interchromosomal contacts, overlapping the binding sites of transcription factors that regulate chromatin architecture (e.g., CTCF, RAD21). This suggests that although the majority of the dmCpGs are not direct eQTMs, they may play a role in regulating chromatin architecture, which in turn regulates transcription and gene expression.
A number of previous studies, including the metaanalysis by the PACE consortium [24], have reported changes in neonatal DNA methylation signatures with respect to birthweight using the 450 K array. Of the 1206 birthweight associated dmCpGs identified in this study that were also present on the 450 K array, 415 (34.41%) were sites identified previously by Küpers et al. to be associated with birthweight, suggesting that there are methylation signatures associated with birthweight that are robustly replicated across cohorts. However, 58% of the 2911 birthweight-associated dmCpGs (FDR<0.05) identified in this present study were unique to the 850 K array, consistent with the enrichment of the birthweight associated dmCpGs to regulatory or enhancer regions, regions with enhanced coverage on the 850 K array. The top dmCpG associated with birthweight in this study was HMBOX1. This CpG is not present on the HumanMethylation 450 K array. HMBOX1 encodes a transcriptional repressor [57], which has been reported to play roles in the differentiation of bone marrow stromal cells (BMSCs), and embryonic stem cells to endothelial cells [58,59], inhibition of apoptosis [60] and telomere maintenance [61]. A dmCpG within the body of the T Cell Immune Regulator 1, ATPase H + Transporting V0 Subunit A3 (TCIRG1) gene was associated with the largest change in birthweight. This gene is a member of a family of ATPdependent proton pumps that are responsible for the acidification of intracellular compartments involved in processes, such as receptor-mediated endocytosis [62]. TCIRG1 and HMBOX1 have not previously been linked to birthweight so whether they are functionally involved in foetal growth, or simply markers of altered birthweight is unknown. The top DMR associated with birthweight was located within CRHBP. Eight out of the eleven CpGs in the CRHBP DMR are not present on the 450 K array. Of the three that are on the 450 K array, one was identified by Kupers et al [24]. CRHBP encodes for the corticotropin-releasing hormonebinding protein, which inactivates corticotropinreleasing hormone (CRH) and has been suggested to be involved in regulating foetal pituitary-adrenal function; in humans, placental CRH is known to play an important role in determining the length of gestation and the timing of parturition [63] and has been implicated in low birthweight, with higher placental CRH production in mid-gestation shown to predict lower birthweight [63]. More recently, SNPs within CRHBP have been shown to be associated with variations in birthweight in three ethnically diverse populations [64]. The differential methylation of this gene suggests that not only genetic but also epigenetic regulation of this gene and hence the CRH pathway may be an important determinant of foetal growth and birthweight, via the foetal pituitary ACTH and adrenal cortisol secretion.
Maternal factors have been strongly linked to DNA methylation changes in the infant. Interestingly, there was no overlap amongst the birthweight-associated dmCpGs with metastable epialleles suggesting that the factors that influence birthweight and associated methylation patterns operate throughout pregnancy, rather than in the very earliest stages of pregnancy. Indeed we found that the birthweight-associated dmCpGs were enriched for CpGs previously identified by the PACE consortium to be associated with maternal hypertension [53], smoking during pregnancy [18] and folate status [52] suggesting that a proportion of the dmCpGs, and birthweight itself, may be driven by these maternal factors. Of the dmCpGs enriched amongst the smoking, hypertension, or folate associated CpGs, the majority were only associated with one of these maternal factors, suggesting specificity of effect. There was, however, no enrichment amongst the birthweight-associated dmCpGs with CpGs associated with maternal BMI [22]. This may be because the analysis was restricted to the 450k birthweight-associated CpGs identified by the PACE consortium analysis, however, adjustment for maternal BMI in our analysis had minimal effect on the number or location of the significant dmCpGs, with over 90% of dmCpGs in the adjusted analyses being present in the unadjusted analyses. Furthermore, comparison of the birthweight-associated dmCpGs in the combined cohort analysis with those associated with birthweight in the UPBEAT cohort alone showed a considerable overlap between the two analyses, even though all women in the UPBEAT cohort had a BMI of over 30, consistent with the majority of identified dmCpGs being largely independent of maternal BMI. It has been suggested that maternal BMI may programme offspring adiposity through birthweight and changes in the foetal epigenome [65][66][67]. However, the finding that the birthweight associated methylation changes are not related to previously reported methylation changes associated with maternal BMI suggests either the maternal BMI/birthweight/offspring adiposity axis is unrelated to methylation, or birth weight per se is not the mediator. Consistent with this, recent mediation analyses have shown that most of the effect of prepregnancy obesity on childhood weight-related anthropometric outcomes is not mediated through offspring's birthweight [68,69] . DNA methylation can be driven by both the environment and genotype and, while we found that the methylation status of the majority of the birthweight associated dmCpGs were associated with genotype, only a minority of these showed a significant interaction with birthweight and genotype. Further large studies will be required to determine the precise contribution that both genotype and early life environmental factors make to these methylation changes and their impact on birthweight.
A strength of this study is that we analysed DNA methylation on the 850 K platform rather than the 450 K platform, giving a far greater coverage of gene regulatory regions which may have functional significance. Moreover, by combining the analysis for birthweight across the SWS and UPBEAT cohorts, the analysis included infants born to women across a much wider range of body mass indices, from 17 to 59 kg/m 2 . The majority of studies to date have used cohorts consisting solely of normal weight individuals, excluding individuals with a BMI>30 kg/m 2 . Importantly, we show that although the birthweight-related methylation signatures were associated with a number of maternal environmental factors, they were largely independent of maternal BMI.
There are some limitations to this study. Firstly we analysed DNA methylation in cord blood and the functional consequences of altered methylation at such marks are unknown. However, a number of the dmCpGs were eQTMs, suggesting that at least some of the changes in DNA methylation are associated with a change in gene expression. Secondly, although we found that many of the birthweight-associated dmCpGs were also associated with CpGs previously shown to be associated with specific maternal factors, whether these dmCpGs play a causal role in mediating the effect on birthweight or are simply markers of the change in birthweight is unknown. Even if they are markers of birthweight rather than mechanistically involved in affecting foetal growth, they may still be useful in identification of risk or exposures. The third limitation is the relatively modest sample size, although we did find robust associations with birthweight which passed a stringent Bonferroni correction, and there was considerable overlap with the meta-analysis of birthweightassociated dmCpGs previously found using the 450 K array [24].

Conclusion
Here we show that birthweight is associated with widespread changes in DNA methylation. These were for the most part independent of genotype and enriched for CpGs associated with a range of environmental exposures including maternal smoking during pregnancy, hypertension/preeclampsia and folate status, supporting the concept that early life environment is an important determinant of variations in birthweight. The identification of birthweight-associated methylation markers, particularly those associated with specific maternal factors, may provide valuable insights into the developmental pathways involved [70] and/or act as markers to identify adverse exposures and which could be used to stratify individuals at risk.