Genome-wide oxidative bisulfite sequencing identifies sex-specific methylation differences in the human placenta

ABSTRACT DNA methylation is an important regulator of gene function. Fetal sex is associated with the risk of several specific pregnancy complications related to placental function. However, the association between fetal sex and placental DNA methylation remains poorly understood. We carried out whole-genome oxidative bisulfite sequencing in the placentas of two healthy female and two healthy male pregnancies generating an average genome depth of coverage of 25x. Most highly ranked differentially methylated regions (DMRs) were located on the X chromosome but we identified a 225 kb sex-specific DMR in the body of the CUB and Sushi Multiple Domains 1 (CSMD1) gene on chromosome 8. The sex-specific differential methylation pattern observed in this region was validated in additional placentas using in-solution target capture. In a new RNA-seq data set from 64 female and 67 male placentas, CSMD1 mRNA was 1.8-fold higher in male than in female placentas (P value = 8.5 × 10−7, Mann-Whitney test). Exon-level quantification of CSMD1 mRNA from these 131 placentas suggested a likely placenta-specific CSMD1 isoform not detected in the 21 somatic tissues analyzed. We show that the gene body of an autosomal gene, CSMD1, is differentially methylated in a sex- and placental-specific manner, displaying sex-specific differences in placental transcript abundance.


Introduction
The placenta is a crucial, yet short-lived, organ at the interface between mother and the developing fetus. Among its many functions, it mediates nutrient uptake and gas and waste exchange, and acts as protective barrier against some pathogens. Many adverse pregnancy outcomes, such as stillbirth, neonatal death, growth restriction, cerebral palsy (due to oxygen deprivation at birth), and preterm labor can be attributed to compromised placental function [1]. Moreover, the intrauterine environment is also a key determinant of the risk for cardiovascular disease in adult life [2]. Fetal sex is itself often associated with increased risk for specific pregnancy complications [3] and, since the placenta is a fetal organ, it becomes a suitable tissue to study the molecular mechanisms that may underlie sex-specific adverse pregnancy outcomes.
DNA methylation is one of the most intensely studied epigenetic regulatory mechanisms and determines cellular phenotypes, commonly through the regulation of transcript abundance. Previous studies have demonstrated the role for DNA methylation in placental genomic imprinting and X inactivation [4][5][6]. Studies that applied whole-genome bisulfite sequencing to analyze the placental methylome have either focused on understanding genomic imprinting [7] or been limited to the analysis of a single placenta [8] and could not, therefore, address sex-specific patterns. Although a few studies have examined sex-specific differences in DNA methylation, they have either assayed DNA methylation for only 1-2% of CpG sites [9] or specifically focused on X chromosome differences [4].
Given the vital role of the placenta in maintaining a healthy pregnancy and the association between fetal sex and specific pregnancy complications, we sought to characterize sex-specific differences in global DNA methylation and identify sex-specific differentially methylated regions from placental whole-genome oxidative bisulfite sequencing data sets (WGoxBS). We tested the hypothesis that sex-specific differential methylation is associated with transcript abundance using an extensive placental RNA-seq data set.

Genome-wide methylation differences by fetal sex
We carried out whole-genome oxidative bisulfite sequencing (WGoxBS) in two female and two male term placentas from well-characterized healthy pregnancies of the Pregnancy Outcome Prediction Study Cohort [10][11][12] (Supplementary  Table S1). We obtained a total of 1.7 billion uniquely mapped reads, this equates to an average of 26x genome coverage per sample (Supplementary Table S2). For our initial analyses, we used the 14,960,649 strand-specific CpG sites that were sequenced at a depth of at least 10x and shared among all four individuals (Supplementary Figure S1). As previously documented [8], the placenta has lower methylation levels than other tissues, with a global CpG methylation value of 53.5% (autosomes: 53.6%, X-chromosome: 46.6%). Methylation of single CpG sites showed a typical bimodal distribution pattern ( Figure 1A), as previously reported [8,13], whereas CpG islands ( Figure 1D) were globally hypomethylated, as expected [14]. Gene bodies ( Figure 1B) were the most methylated regions at 54%, followed by CpG island shores (47%), enhancers (35%), promoters (20%) ( Figure 1C), and CpG islands (16%). In addition to finding that male placentas were 1.6% more methylated than female placentas, lowly methylated (<50%) single CpG sites, gene bodies, and promoter regions were more common in females ( Figure 1A-1D). This global male-hypermethylation was reproduced using an in-solution target capture methodology (SureSelect) for the two samples assayed by WGoxBS (technical replicates; 1% more methylated in male). It was also reproduced in six additional samples, three from female placentas and three from male placentas (i.e., biological replicates; 1.5% more methylated in male) (Supplementary Table S3). However, further investigation of eight other tissues from the Schultz et al. [15] datasets suggests this male hypermethylation could be tissue-specific, as female-hypermethylation is observed in the adrenal and spleen (Supplementary Table S3).
Having observed global placental methylation differences between the sexes, we investigated the chromosome-wide methylation differences between females and males ( Figure 2 and Supplementary Figure S3). Chromosome X CpGs were globally hypomethylated in females, whereas there was little evidence for sex-specific differences in overall methylation in the autosomes ( Figure 2). Next, we determined whether this hypomethylation of the X chromosome in females varied according to the genomic feature. We found that the X chromosome was globally hypomethylated in females, except at CpG islands, where females were hypermethylated compared to males; this was also reproduced in the SureSelect datasets (Supplementary Figure S4).

Sex-specific differentially methylated regions
In order to identify sex-specific differentially methylated regions (DMRs), we analyzed our data using the Bioconductor package RnBeads [16]. We created Manhattan plots analyzing DMRs by 5 kb tiles, gene bodies, promoters, and CpG islands ( Figure 3A-3D). Consistent with the chromosome specific analysis of global methylation, most highly ranked DMRs were located on the X chromosome (Supplementary Table S4). For autosomal regions, apart from isolated regions with highly  Table S3. ranked sex-specific DMRs, there was strong evidence for a cluster of multiple sex-specific DMRs located on chromosome 8 ( Figure 3A).
The most highly ranked 5 kb DMR on chromosome 8 shows average methylation values of 36% and 72% in females and males, respectively. This DMR and 19 additional 5 kb DMRs ranked amongst the top 100 differentially methylated regions (Supplementary Table S4). All 20 DMRs were located within a 225 kb locus (8:2,795,000-3,020,000) of the CUB and Sushi Multiple Domains 1 (CSMD1, 8:2,792,875-4,852,494) gene, which is located on the minus strand ( Figure 4). Consistent with the highest ranked chromosome 8 DMR, the average methylation value across these twenty 5 kb DMRs was 42% and 71% in females and males, respectively.
We then studied the CSMD1 and X chromosome DMRs using SureSelect technical (n = 2) and biological replicates (n = 6) (Supplementary Table S1). The in-solution target capture methodology yielded 2.4 and 1.3 million strand-specific CpG sites that were sequenced at a depth of at least 10x and shared among the two technical and six biological replicates, respectively (Supplementary Table S5). We observed good concordance (R 2 = 0.81 females, R 2 = 0.95 males) in the level of methylation between technical and biological replicates of X chromosome CpGs present in the 100 top-ranked 5 kb tile, gene-body, promoter, and CpG island regions shown in Figure 3A-3D (Supplementary Table S4 and Figure S5).
Females were consistently and significantly hypomethylated compared to males over the extensive differentially methylated region of CSMD1 ( Figure 4). With the in-solution target capture methodology, the level of methylation of individual CpG sites within these 20 chromosome 8 DMRs was statistically significantly different between females and males, both for the technical and biological replicates (P = 9.2 £ 10 ¡38 and P = 1.1 £ 10 ¡29 , respectively, Fisher's Exact test). More importantly, the average level of methylation within the chromosome 8 DMR in females and males (8:2,795,000-3,020,000) was consistent among all three data sets: WGoxBS (47% vs. 71%, n = 4), technical replicates (52% vs. 72%, n = 2), and biological replicates (52% vs. 68%, n = 6). Having validated this extensive sex-specific DMR, we then determined whether this sex-specific methylation pattern is placenta-specific or observed in other tissues. We obtained data from Schultz et al. [15] and studied the same region in 8 somatic tissues, comparing methylation levels in 1 male and 1 female. Interestingly, sex-specific differential methylation of the CSMD1 region was only observed in the placenta and not in the other 8 tissues studied (Supplementary Figure S6). The average level of methylation in females and males in these tissues was 82% and 81%, respectively. We also examined the methylation profile in 6 other tissues obtained from the NIH Roadmap Epigenomics Mapping Consortium [18]. Although only a single male [brain germinal matrix (87%), hippocampus (89%), testis spermatozoa (84%)] and a single female [fetal muscle (79%), fetal thymus (87%), breast luminal epithelium (80%)] sample were available, no obvious malefemale methylation differences were observed. The 14 non-placental tissues were highly methylated at this locus (range: 79% to 89%), regardless of the sex and the site where the bisulfite sequencing data was generated (Supplementary Figure S6).

Sex-specific CSMD1 differential methylation and transcript abundance
To understand the potential functional significance of this extensive and sex-specific differentially methylated region of CSMD1 in the placenta, we quantified the abundance of CSMD1 RNA from a placental RNA-seq data set generated from 64 healthy female and 67 healthy male placentas (Supplementary Tables S6 and S7). At the gene level, female samples had 1.8-fold lower transcript abundance than male samples (P = 8.5 £ 10 ¡7 , Mann-Whitney test). As transcript abundance typically is negatively correlated with the level of methylation in the promoter [19], we compared the methylation level of the annotated CSMD1 promoter region (8:4,851,994-4,853,994) in female and male placental samples. We did not observe sex-specific methylation differences at the promoter region (Supplementary Figure S7). Intriguingly, we noted that most of the placental RNA-seq reads aligned towards the 3' end of the CSMD1 gene, the same region where the 20 differentially methylated regions were identified ( Figure 5). The first canonical exon [i.e., first exon near the annotated transcription start site (TSS), 8:4,851,994)] of CSMD1 did not have a single placental RNA-seq read. This suggests that the CSMD1 placental transcript does not start with the canonical TSS.
Using RNA-seq data from 131 individual placentas, we reconstructed the CSMD1 transcript. This showed that the first exon, and thus the TSS, is actually 1.8 mega base pairs downstream the canonical first exon (Supplementary Figure S8). This suggests that the placenta uses an alternative promoter that is different from the reference gene annotation (e.g., GENCODE). Then, we re-examined the methylation level at the likely promoter region (1,500 bp upstream and 500 bp downstream the putative TSS) and found that females had higher methylation values (38%) compared to males (23%) in the WGoxBS data set (P = 1.7 £ 10 ¡13 , Supplementary Figure S9). This pattern of higher methylation in females was also validated technically (38% vs. 29%) and biologically (48% vs. 29%) using the in-solution target capture methodology (P = 1.7 £ 10 ¡86 and P = 1.5 £ 10 ¡65 , respectively, Fisher's Exact test, Supplementary Figure S9). In addition, the putative promoter region is annotated as a DNase hypersensitive region, a transcription factor binding site, and a transcription start site (5' cap analysis gene expression, CAGE), which suggest the putative promoter region is biologically relevant (Supplementary Figure S10). This region was hypermethylated (>79%) in both sexes in all eight tissues from Schultz et al. (Supplementary Figure S11).

CSMD1 placenta-specific isoform
We further investigated whether this potential isoform of CSMD1 is specific to the placenta. We obtained RNA-seq data of five placenta-derived tissues from the NIH Roadmap Epigenomics Mapping Consortium (REMC) [18], 18 somatic tissues from Schultz et al. [15], and three additional tissues from the REMC, respectively (Supplementary Table S8). We quantified CSMD1 mRNA abundance at the exon-level and compared the placenta-related tissues, including our own data, with the other 21 tissues. All the placenta-derived tissues clustered separately from the 21 non-placental tissues and they shared the same exon (ENSE00002127078), which is the first exon covered with at least 8 uniquely mapped reads ( Figure 6). This is the candidate first exon predicted as an alternative transcript start site by the transcript reconstruction using our 131 RNA-seq samples. Interestingly, none of the other nonplacental tissues, except the cerebellum, have reads covering this exon, suggesting that they do not use this putative first exon. We therefore propose that the placental transcript of CSMD1 has a different TSS from other tissues and it is a novel isoform specific to the placenta. The exon usage comparison with other fetal and embryonic tissues [20] also supports this putative TSS and a novel form of placenta CSMD1 transcript (Supplementary Figure S12).
The putative protein encoded by the placental-specific transcript is predicted to contain 4 CUB and 18 Sushi extracellular domains instead of the full 14 CUB and 28 Sushi domains, as expected from the reference sequence (Supplementary Figure S13). CSMD1 exon-level expression analysis from the    Figure 4). Canonical and placenta-specific transcription start sites are shown in blue and red, respectively. 64 female and 67 male placental RNA-seq data identified 15 exons that were significantly differentially expressed (P<0.05, Mann-Whitney test) in a sex-specific manner (Supplementary Table S9). Consistent with the gene-level quantification, females showed lower expression compared to males, from 14 exons out of the 15 significantly differentially expressed exons.

Discussion
Whole-genome oxidative bisulfite sequencing (WGoxBS) was used as a discovery method to assay CpG methylation at single nucleotide resolution in two female and two male healthy term placentas with an average of 26x genome coverage per individual yielding a data set of »15 million CpGs (Supplementary  Table S2 and Supplementary Figure S1). The majority of methylation studies of the placenta have used Illumina BeadChip arrays, which assay up to 450,000 CpGs, or have focused on identifying differences between the placentas of healthy and complicated pregnancies (reviewed by Bianco-Miotto et al. [9]). As a discovery method, we carried out WGoxBS with validation in technical and biological replicates using in-solution target capture bisulfite sequencing. Finally, using an RNA-seq data set generated from 64 female and 67 male healthy term placentas we determined gene-and exon-level transcript abundance associated with a 225 kb sex-specific DMR located within the CSMD1 gene.
In agreement with other studies, we showed that placental DNA is hypomethylated compared to that from other somatic tissues [4,21,22]. After examining the global methylation profiles within various genomic features, we observed that males are generally hypermethylated compared to females, and the differences are most evident on the X chromosome. There is an exception, however, to this global male hypermethylation, namely, X chromosome CpG islands hypermethylated in females. Our own analysis of eight tissues from the wholegenome bisulfite sequencing data of Schultz et al. showed that X chromosome hypermethylation in CpG islands of females is not unique to the placenta. Female hypermethylation of X chromosome CpG island regions seems to be a common finding among different tissues, which was reported in a number of other studies of X chromosome inactivation, DNA methylation role at CpG island regions, and meta-analysis of human peripheral blood leukocytes [23][24][25][26].
Many (35) of our top 100 sex-specific 5 kb DMRs are on the X chromosome. A similar observation was previous reported by Yousefi et al. [27] The authors found, in umbilical cord blood, that the majority (74%) of gender-associated significantly differentially methylated CpGs were located on the X chromosome, with 64% of them being hypermethylated in females. Also, a recent paper reports a similar observation [28]. Surprisingly, we found that 20 of the top 100 sex-specific 5 kb DMRs were located within a 225 kb region of the gene-body of CSMD1, with females being hypomethylated. The differential methylation in this large region was not evident when analyzing individual CpGs, CpG islands, or promoters. This suggests that the use of 5 kb tiles alongside these other methods is able to reveal additional chromosomal features that may be of functional relevance. This also highlights the advantage of using whole-genome analysis, as the number of CpGs evaluated is higher (»30-fold in this study) than when using current arraybased methods. This observation was confirmed using technical and biological replicates. We did not observe sex-specific methylation of CSMD1 in any of the eight somatic tissues, which suggests that this sex-specific differential methylation is restricted. Demonstration of true placental specificity will require systematic analysis of this CSMD1 DMR in all tissues and developmental stages in replicate female and male samples.
Schultz et al. applied whole-genome bisulfite sequencing (WGBS), a method in which 5-methylcytosine and 5-hydroxymethylcytosine are indistinguishable; therefore, the two modifications are reported in aggregate [29,30]. As WGoxBS measures 5-methylcytosine alone [31,32], we were able to detect the CSMD1 DMR in the absence of 5-hydroxymethylcytosine. Despite our discovery data set being derived from WGoxBS, we technically validated our sex-specific CSMD1 DMR with an in-solution methodology using bisulfite conversion alone. We also validated this observation in 3 additional healthy female and male placentas (biological replicates) using the same technique. Thus, we are confident that the placental and sex-specific CSMD1 DMR is due to differences in 5-methylcytosine and not 5-hydroxymethylcytosine.
We analyzed CSMD1 RNA-seq reads obtained from 64 female and 67 male placental samples and showed a sex-specific difference in transcript abundance, where females have lower transcript abundance compared to males. Our analysis of transcript reconstruction based on the 131 placental RNA-seq samples suggests that the placenta may have a novel CSMD1 transcript that starts 1.8 mega base pairs downstream the canonical first exon. While no sex-specific methylation differences were detected in the canonical promoter, females had higher methylation values in the newly identified putative promoter region, which may explain overall decreased transcript abundance in females when compared to males. This putative placental promoter region was hypermethylated (>85%) in the eight tissues from the Schulz et al. dataset with no evidence of use of the putative first placental exon. While sex-specific transcript abundance of CSMD1 can be explained by differences in DNA methylation at the putative promoter region, the 225 kb CSMD1 DMR, located 20 kb downstream the putative placental-specific first exon, might also affect gene-level transcript abundance.
Comparative analysis of RNA-seq data of CSMD1 at the exon-level between the placenta-derived tissues and other adult and fetal tissues indicates that the placenta has a different first exon and its location is consistent with our proposed model of a novel transcript. The Ensembl genome browser annotates that one of the CSMD1 transcripts (ENST00000520451) starts with the same exon (ENSE00002127078), which we identified as the putative first exon in the placenta-derived tissues. However, the Ensembl transcript consists of only five exons with no evidence of an open reading frame, whereas our model of a novel placental CSMD1 transcript spans 35 exons and its protein-coding potential is highly marked by the Coding-Potential Assessment Tool [33]. However, very little is known about the possible function of CSMD1-it has neither a Gene Ontology biological process nor a molecular function term associated with it and the cellular components terms ("membrane" and "integral component of membrane") are only inferred from electronic annotation. Nonetheless, it has been linked in genome-wide association studies with Parkinson's disease, Alzheimer's disease, and schizophrenia (for an example see [34]). A fragment of recombinant rat CSMD1 inhibited the classical but not the alternative complement activation pathways [35]. A recent study showed that suppression of CSMD1 by short hairpin RNAs significantly increased the proliferation, cell migration, and invasiveness of MCF10A cells compared to controls [36]. It is not known what possible role CSMD1 might have in the placenta, but regulation of complement [37] and cell invasiveness are known to be important [38].
The 225 kb DMR described above covers the exons (and introns) that comprise this novel transcript, which is also differentially expressed in male and female placentas. Gene body methylation is an informative predictor of gene activity [39], although detailed mechanistic understanding of this, and particularly how it may differ between males and females, is currently lacking.
Whole-genome oxidative bisulfite sequencing and in-solution target capture carried out in the placentas of healthy female and male replicates allowed us to identify and validate a sex-specific autosomal 225 kb differentially methylated region in the CSMD1 gene which was hypermethylated in the male placenta. RNA-seq analysis of 64 healthy female and 67 healthy male placenta samples showed that CSMD1 had sex-specific differential transcript abundance (»1.8-fold higher in male). Analysis of DNA methylation and RNA-seq data from publically available data sets indicate that the differential methylation is likely to be placental-specific, with the presence of a placental-specific CSMD1 isoform. Further studies will be required to elucidate the cause, consequences, and role for sexspecific differential methylation, transcript abundance, and placental-specific isoforms of CSMD1 in the placenta. These data illustrate the benefit of analyzing comprehensive RNA-seq data in parallel with DNA methylation data.

Placental samples
All samples were obtained as part of The Pregnancy Outcome Prediction study (POPs)-a prospective cohort study of nulliparous women attending the Rosie Hospital, Cambridge (UK). The study and sample collection have been previously described in detail [10][11][12]. Ethical approval for the study was obtained by the Cambridgeshire 2 Research Ethics Committee (reference number 07/H0308/163). All participants gave written informed consent.

DNA extraction
Frozen placental tissue samples from four separate regions from the basal plate of each term placenta were combined to provide a single 20-24 mg sample for DNA extraction. This was processed using the Qiagen DNeasy Blood & Tissue Kit (Qiagen), according to the manufacturer's instructions including the RNase treatment. To achieve buffer compatibility for oxidative whole-genome bisulfite conversion, placental DNA was eluted in UltraPure water (Cambridge Epigenetix). All DNA samples were quantified using the Qubit® dsDNA HS Assay Kit (Life Technologies).

Whole-genome oxidative bisulfite sequencing
Placental genomic DNA (4 mg) from 4 healthy pregnancies was processed to achieve 10 kb fragments with the g-Tube (Covaris), according to the manufacturer's instructions. Fragmented DNA was concentrated using GeneJET purification columns, according to the manufacturer's instructions (Thermo Fisher Scientific). This fragmented and purified DNA (1.5 mg) was taken forward for oxidation + bisulfite treatment using the TrueMethyl Kit (Cambridge Epigenetix). The supplied sequencing and digestion controls were added at least to 1% (w/w) to the fragmented DNA. Buffer exchange, denaturation, oxidation, bisulfite conversion, cleanup, and qualitative assessment of 5-hydroxymethylcytosine oxidation was carried out according to manufacturer's instructions (version 3.1). Bisulfite-converted DNA (50 ng) was used as input for library generation using the EpiGnome TM Methyl-Seq Kit (Epicentre) and EpiGnome TM Index PCR Primers (Epicentre), according to manufacturer's instructions. All double and single stranded DNA was quantified using the Qubit dsDNA HS Assay Kit and Qubit ssDNA Assay Kits, respectively, according to the manufacturer's instructions (Life Technologies), and were sized with the Agilent Bioanalyzer 2100 High Sensitivity DNA assay. Molarity of each of the libraries was determined by qPCR using the KAPA Illumina ABI Prism Library Quantification Kit (Kapa Biosystems). Indexed libraries were pooled at an equimolar ratio. To increase the number of uniquely sequenced reads, two independent libraries were generated for each individual. Multiplexed sequencing was carried out on the Illumina MiSeq, HiSeq 2000, and HiSeq 2500 instruments with 2 £ 100, 2 £ 150 and 2 £ 125 cycles using MiSeq Reagent Kit v3, HiSeq SBS Kit v3 and HiSeq SBS Kit v4, respectively. The methylation conversion rate of the supplied sequencing controls was checked with bsExpress (https://bitbucket.org/cegx-bfx/cegx_b sexpress_docker).

In-solution target capture library preparation and sequencing
Placental genomic DNA (3.5 mg) from 8 healthy pregnancies (including 2 from the WGoxBS) was fragmented by the Covaris S220 system according to the SureSelect Methyl-Seq target enrichment protocol (Agilent). As oxidative bisulfite sequencing was unsupported, library preparation, hybridization, bisulfite conversion, indexing, and sample pooling were carried out according to manufacturer's instructions. After the 3' end adenylation step, we excluded DNA fragments >500 bases by performing an AMPure XP bead selection with a 0.6x bead to DNA ratio discarding the larger fragments that were bound by the beads. We continued the prescribed purification by adding an additional 1.2x volume beads to the supernatant to bring the final ratio to 1.8x. Resultant libraries were sized with the Agilent Bioanalyzer 2100 High Sensitivity DNA assay and molarity of each library was determined by qPCR using the KAPA Illumina ABI Prism Library Quantification Kit (Kapa Biosystems).

Placental biopsies and RNA extraction
Placental biopsies (n = 64 female placentas, n = 67 male placentas) were selected from healthy pregnancies from the POPs cohort. These patients had no evidence of hypertension at booking and during pregnancy, did not experience pre-eclampsia, Hemolysis, Elevated Liver enzymes, and Low Platelets (HELLP) syndrome, gestational diabetes, or diabetes mellitus type I or type II and other obstetric complications. They delivered live babies with a birth weight percentile in the normal range (20-80 th percentile), with no evidence of slowing in fetal growth trajectory. Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNA later (Applied Biosystems) and stored at -80 C. Total placental RNA was extracted using mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue were homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4 C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using DNA-free DNA Removal Kit (Ambion), aliquoted, and stored in -80 C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer.

Library preparation for RNA-Seq
Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN 7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. The kit contains 96 uniquely indexed adapter combinations in order to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries have been pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument.

Sequencing data processing and analysis
Both WGoxBS and in-solution target capture data went through the quality control process (i.e., trimming primer sequences and poor quality bases toward 3' end) by Trim Galore! [40] after checking the quality metrics of raw reads with FastQC [41]. Then, the high quality trimmed reads were mapped to the human genome reference (hg19) using Bismark [42]. Reads of poor mapping quality (MAPQ = 0) and duplicated reads (i.e., those mapped at the same start and end position within the same chromosome) were removed from the alignment files before calling methylated CpGs by Bis-SNP [43], which is a genotype-aware methylcytosine caller enabled with a base-quality recalibration function built on top of the Genome Analysis Toolkit [44]. The VCF files generated by Bis-SNP were converted into BED format as inputs for RnBeads [16] to find differentially methylated regions. RnBeads considers both the statistical significance (i.e., P value) and the effect size (i.e., absolute and relative methylation difference) and takes the worst rank (combined rank) from the three individual ranks of aforementioned measurements. In Manhattan plots (Figure 3), to make highly ranked DMRs (smaller values) appear in top, the original combined rank is converted by subtracting the log-scale of combined rank from the log-scale of maximum rank. CpGs of at least 10x coverage and shared among all four individuals were considered and DMRs were filtered if they had fewer CpG sites compared to the median of the CpGs found in the dataset: 17 (5 kb tiles), 61 (gene-bodies), 18 (promoters), and 51 (CpG islands).
The definitions of CpG island regions and the gene model were from the UCSC (hg19) table browser [45] and the Ensembl 73 equivalent with the GENCODE version 18, respectively, which were accompanied with RnBeads package. The promoter regions were defined as 1,500 bp upstream and 500 bp downstream regions of the transcription start sites. The CpG island shore regions were defined as 2,000 bp flanking regions from both ends of the CpG island regions. The '5,000bp' regions (5 kb) are defined as a set of genomic regions where each chromosome is tiled by non-overlapping 5 kb windows.
Technical and biological validation of the sex-specific methylation patterns was performed using Fisher's exact test on a 2 £ 2 contingency table by testing independence of sex and the cumulative count of methylated CpGs over the 5,000 bp region. For technical validation, one female and one male sample were tested by both WGoxBS and in-solution capture methodologies. For biological validation, all 3 female and all 3 male samples assayed by the in-solution capture methodology were used. A two-sided P value <0.05 was considered statistically significant.
The Schultz et al. [15] methylation data for 8 tissues [adrenal gland (AD), aorta (AO), esophagus (EG), adipose tissue (FT), gastric (GA), psoas muscle (PO), small bowel (SB), and spleen (SX)] were downloaded from http://neomorph.salk.edu/SDEC_ tissue_methylomes/processed_data/allc_tissues.tar. CpGs with at least 10x coverage detected in individual 2 (a female) and individual 3 (a male) were used. The cytosine methylation level was measured as recommended by Schultz et al. [46]. Briefly, for single CpG sites, the ratio of reads with methylation (C base) out of the total number of reads covering the position (sum of C and T reads) was calculated; within regional contexts (e.g., gene-body, CpG island, promoter, etc.), the weighted methylation level (i.e., the sum of reads with methylation out of the sum of the total number reads over the region of interest) was determined. Other Roadmap [18] whole-genome bisulfite sequencing datasets generated at the Broad Institute and UCSF-UBC were downloaded from the GEO website (https:// www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics), as follows: fetal muscle (fMU, GSM1172596); fetal thymus (fMU, GSM1172595); hippocampus (HC, GSM1112838 and GSM916050); brain germinal matrix (GM, GSM941747); breast luminal epithelial (LE, GSM1127125), and testis spermatozoa (SP, GSM1127117). The list of tissues and downloaded file names are available in Supplementary Table S8. A quality control process was also applied for the RNA-Seq datasets: reads were trimmed with Trim Galore!, which uses cutadapt internally [47] and were mapped to the same version of human genome reference (hg19). TopHat2 [48], a spliceaware mapper built on top of Bowtie2 [49] short-read aligner, was used in the mapping process in which so-called two-pass (or two-scan) alignment protocol was applied to rescue unmapped reads from the initial mapping step [50]. In the second mapping, previously unmapped reads were re-aligned to the exon-intron junctions detected in the first-mapping by TopHat2 and were combined across all 131 placenta samples. The initial and second mapped reads were merged by samtools [51] and the uniquely mapped reads were counted by the htseq-count of HTSeq [52] against the same gene-model (Ensembl 73) used to calculate the degree of methylation at the gene-body or promoter region. The transcript abundance of CSMD1 was measured in fragments per kilobase of transcript per million (FPKM) mapped reads using DESeq2 Bioconductor package [53]. The average depth of RNA-seq coverage was measured by taking the mean of each coverage calculated by the 'genomecov' of Bedtools [54]. The exon-level expression of CSMD1 was measured by counting mapped reads against 98 unique exons (i.e., Ensembl exons with a distinct start and end position) of CSMD1 using featureCount of subRead [50] and 'coverageBed' of Bedtools. The exon-level FPKM values were transformed by the following equation: log e (FPKM/1000+1) and used to generate the heatmap ( Figure 6). The CSMD1 transcripts were reconstructed based on the RNA-seq data obtained from the 131 placenta samples using StringTie [55] and Cufflinks [56]. The RNA-seq data from Schultz et al. [15] and two other participating institutes from the NIH Roadmap consortium (UCSF-UCB and The Broad Institute) were downloaded from the GEO with the following accession numbers: GSE16256 (Schultz et al.), GSE16368 (UCSF-UCB), and GSE17312 (Broad Institute). The list of tissues and downloaded file names are provided in Supplementary Table S8.
The summary statistics (read counts, coverage, mapping efficiencies, etc.) of the data processing for the WGoxBS, in-solution target capture methyl-Seq, and RNA-Seq data are available in Supplementary Table S2, Supplementary Table S5, and Supplementary Table S7, respectively. A list of software, version information, and non-default options used in this study is available in Supplementary Table S10. All the computational analyses were conducted using the Linux clusters at the University of Cambridge High Performance Computing Service and the Linux workstations of School of Biological Science computing service.

Disclosure of potential conflicts of interest
The authors declare that they have no conflict of interest.