Recommendations for a nomenclature system for reporting methylation aberrations in imprinted domains

ABSTRACT The analysis of DNA methylation has become routine in the pipeline for diagnosis of imprinting disorders, with many publications reporting aberrant methylation associated with imprinted differentially methylated regions (DMRs). However, comparisons between these studies are routinely hampered by the lack of consistency in reporting sites of methylation evaluated. To avoid confusion surrounding nomenclature, special care is needed to communicate results accurately, especially between scientists and other health care professionals. Within the European Network for Human Congenital Imprinting Disorders we have discussed these issues and designed a nomenclature for naming imprinted DMRs as well as for reporting methylation values. We apply these recommendations for imprinted DMRs that are commonly assayed in clinical laboratories and show how they support standardized database submission. The recommendations are in line with existing recommendations, most importantly the Human Genome Variation Society nomenclature, and should facilitate accurate reporting and data exchange among laboratories and thereby help to avoid future confusion.

Mammalian genomic imprinting is an epigenetic regulatory mechanism that results in parent-of-origin specific gene expression in diploid somatic cells (for a review, see ref. 1). Several features of the imprinting mechanism have been identified, including allelic DNA methylation, histone modifications, and noncoding RNAs. 2 Clustering and coordinate regulation are key features of imprinted domains, and much effort has been invested in understanding how multiple genes are regulated by long-range cis-acting differentially methylated regions (DMRs).
In 1993, two publications reported parent-of-origin specific methylation associated with imprinted domains. Both studies were in mouse: the first described the paternally methylated regions associated with the H19-Igf2 gene cluster, 3 and the second identified a region of methylation on the maternal allele within the Igf2r gene associated with the T-associated maternal effect (Tme) deletion. 4 Since these first pivotal reports, and with the advent of genome-wide methylation screening technologies, the number of imprinted DMRs identified in mammalian species has steadily increased, including those originating from the respective germlines and those that are somatically acquired.
Primary methylation defects of some well characterized imprinted DMRs 5 ). Reporting these epigenomic data from molecular tests, whether in laboratory reports or in publications, is hampered by the lack of a uniform nomenclature. In this article we recommend unified names for imprinted DMRs, give details of their precise locations, and suggest a nomenclature for describing results similar to that routinely used for DNA sequence variants.

Consensus for names of imprinted DMRs
From the earliest molecular descriptions of imprinting aberrations, it was obvious that recording methylation defects would be challenging. Some imprinted genes have historically had multiple names, since many were identified simultaneously by independent groups that named the transcripts and DMRs differently. This is exemplified by the maternally methylated region overlapping the promoter of KCNQ1OT1 within intron 10 of KCNQ1 on chromosome 11, which has more than five aliases (Table 1). Ultimately, this causes confusion when cross-referencing original literature and modern databases. To improve this situation, the 41 EUCID members from 22 countries have developed a uniform nomenclature system for reporting methylation aberrations. The final consensus, after careful consideration, was that an imprinted DMR should be named after the nearest transcript with an approved symbol from the Human Genome Organization (HUGO) Gene Nomenclature Committee (HGNC).
Furthermore, each DMR is named so that it gives basic information about its location relative to the nearest RefSeq transcript, using several prefixes outlined in Fig. 1 (e.g., TSS for transcription start site, IG for intergenic, Int for intronic, and alt-TSS for alternative transcription start site). The precise location of each imprinted DMR is derived from methyl-seq data from whole blood samples as described by Court et al., 6 which have base-pair resolution. To ensure that the same genomic regions are identifiable in subsequent genome builds, all imprinted domains, including their corresponding DMRs, have been submitted to obtain Locus Reference Genomic (LRG) identifiers. 7 LRGs are manually curated reference sequence records specifically designed for the reporting of variants with clinical implications. The inclusion of stable and unique genomic, transcript, and protein reference sequences ensures that variants are unambiguously and consistently reported over time (www.lrg-sequence.org). The records will contain all relevant DMR annotations. Information regarding the recommended naming, localization, and size of each DMR is given in Table 1.
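The naming rule can be sketched in a few lines of Python. This is a minimal illustration rather than part of the recommendations: the function name `dmr_name` is hypothetical, the `<symbol>:<prefix>-DMR` layout is inferred from the example KCNQ1OT1:TSS-DMR used later in this article, and only the four prefixes quoted above are included, although Fig. 1 may define others.

```python
# Location prefixes quoted in the text; Fig. 1 may list additional ones.
LOCATION_PREFIXES = {
    "TSS": "transcription start site",
    "IG": "intergenic",
    "Int": "intronic",
    "alt-TSS": "alternative transcription start site",
}


def dmr_name(gene_symbol: str, prefix: str) -> str:
    """Compose a DMR name as '<HGNC symbol>:<prefix>-DMR' (assumed layout)."""
    symbol = gene_symbol.strip()
    if not symbol:
        raise ValueError("an HGNC-approved gene symbol is required")
    if prefix not in LOCATION_PREFIXES:
        raise ValueError(f"unknown location prefix: {prefix!r}")
    return f"{symbol}:{prefix}-DMR"


print(dmr_name("KCNQ1OT1", "TSS"))  # KCNQ1OT1:TSS-DMR
```

The sketch does not check that the symbol is actually HGNC-approved; in practice that would require a lookup against the HGNC records.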

Standardization of reporting exact sites of imprinted methylation
It has previously been discussed that, in order to allow correct identification and eventual reproduction of published observations, a universal system for describing the specific sites of DNA methylation tested needs to be employed. 8 In the case of imprinted DMRs, this is reasonably straightforward if laboratories use commercially available methods to analyze methylation, such as methylation-sensitive multiplex ligation-dependent probe amplification (MLPA) or high-density methylation arrays. In such cases, the precise location of the probe identifier, restriction site, or the interrogated CpG probes found on commonly used methylation profiling platforms can easily be identified, the genomic nucleotide tested accurately described, and methylation values reported. As an initial step to assist in this standardization, we have provided a resource listing all probes mapping to imprinted DMRs on the popular Infinium HumanMethylation450 BeadChips (Illumina, USA) (Suppl. Table 1), as well as the CpG dinucleotides interrogated by commonly used methylation-sensitive MLPA kits (MRC Holland, Netherlands) (Suppl. Table 2). For custom technologies, such as in-house pyrosequencing, different CpG positions within the imprinted DMRs may be examined. In such cases, we recommend that the genomic coordinates targeted by the assays be listed and the methylation status described as an average percentage of all CpGs analyzed. However, such a description lacks resolution at the individual CpG level and, for future standardized reporting, it would be advantageous to have this information, not only for methylation at imprinted DMRs but also for all CpG positions in the genome. Such an approach could be based upon the current annotation of genomic locations as recommended by the Human Genome Variation Society (HGVS), allowing methylation values to be paired to each CpG position. 9,10

Use of the suggested nomenclature
Following HGVS recommendations, methylation values at a specific region are described with (A) the chromosome number or LRG, followed by (B) a colon ":"; (C) the prefix "g." for genomic DNA; (D) the position of the cytosine nucleotide or the range of nucleotides tested for the CpG-containing interval; (E) the "|" character to indicate that it is a modification of the sequence, not a sequence variant; and (F) an abbreviation describing the specific modification.
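As a rough illustration of how parts (A) to (F) fit together, the following Python sketch averages the methylation percentages of the CpGs analyzed and assembles a description string. It rests on several assumptions that are not stated in the text: the separator in (E) is taken to be the vertical bar "|", ranges are joined with "_" as in HGVS sequence variant descriptions, and the methylation value is appended in round brackets. The function names, coordinates, and percentages are hypothetical, and the modification abbreviation is passed in as a plain string (the abbreviations themselves are introduced in the next paragraph).

```python
from statistics import mean
from typing import Optional, Sequence


def average_methylation(cpg_percentages: Sequence[float]) -> float:
    """Mean methylation (%) across all CpGs analyzed by an assay."""
    if not cpg_percentages:
        raise ValueError("at least one CpG measurement is required")
    if any(not 0.0 <= p <= 100.0 for p in cpg_percentages):
        raise ValueError("methylation percentages must lie between 0 and 100")
    return mean(cpg_percentages)


def describe_methylation(reference: str, start: int, end: Optional[int],
                         modification: str,
                         value_percent: Optional[float] = None) -> str:
    """Assemble (A) reference, (B) ':', (C) 'g.', (D) position, (E) '|', (F) modification."""
    if start < 1 or (end is not None and end < start):
        raise ValueError("positions must be positive and end must not precede start")
    # (D) a single cytosine position, or a range joined by '_' (HGVS-style)
    position = str(start) if end is None or end == start else f"{start}_{end}"
    description = f"{reference}:g.{position}|{modification}"
    if value_percent is not None:
        # Assumption: the value follows the abbreviation in round brackets.
        description += f"({value_percent:g}%)"
    return description


# Hypothetical coordinates and per-CpG values, for illustration only.
cpgs = [18.0, 22.0, 20.0]
print(describe_methylation("chr11", 1000, 1200, "bis", average_methylation(cpgs)))
# chr11:g.1000_1200|bis(20%)
```

Keeping the per-CpG values alongside the averaged description would preserve the single-CpG resolution that the averaged figure alone loses.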
In collaboration with the HGVS Sequence Variant Description Working Group (SVD-WG), it was decided to use the abbreviation "gom" to report a gain of methylation and "lom" for a loss. For non-specific methylation resistant to bisulphite conversion we suggest "bis" followed by a methylation value in brackets. If the molecular assay differentiates between 5-methylcytosine and its oxidative derivative 5-hydroxymethylcytosine, we propose the use of "met" and "hmt," respectively. This is consistent with the HGVS standard of using three-letter abbreviations that do not include the nucleotide, so that the modification can be added to any DNA base. When utilizing this format, it is important to mention the correct imprinted DMR name, the genome build used, and the technique used to measure the methylation status. This is because the EUCID COST action has previously reported that different methods targeting subtly different locations within the same imprinted DMR have different sensitivities. 11,12 Furthermore, to help characterize variation due to tissue mosaicism, the tissue source from which the DNA is derived should be stated in any report, because methylation levels can differ between tissues. 13 For example, the nomenclature for a bisulphite PCR targeting the KCNQ1OT1:TSS-DMR negative DNA strand: GRCh37/hg19 chr11:g.

Table 1. The extent of imprinted methylation defined by methyl-seq data sets, with the commonly used name for each imprinted DMR, the names proposed by EUCID using HGNC-approved gene names, previous aliases, and LRG identifiers. For completeness, the parental origin of the allelic methylation is given, as are any associated disorders and whether the methylation is germline or somatically derived. Secondary DMRs are regions of differential methylation, the establishment of which is often somatically acquired and dependent on hierarchical interactions with a neighboring germline DMR. All coordinates are given as GRCh37/hg19.
M, maternally derived methylation; P, paternally derived methylation; gDMR, germline DMR; PHP1b, pseudohypoparathyroidism type 1b; SRS, Silver-Russell syndrome; BWS, Beckwith-Wiedemann syndrome; AS, Angelman syndrome; PWS, Prader-Willi syndrome; MLID, multilocus imprinting disturbance; TS14, Temple syndrome; KOS14, Kagami-Ogata syndrome; TNDM, transient neonatal diabetes mellitus. All relevant DMR information and aliases can also be found in the "community" section of each LRG record.

In light of our suggestions, we encourage comments and discussion from clinical geneticists, molecular geneticists, and researchers from the epigenomics community, and trust that the recommendations we have made for a standardized reporting format will be useful for accurately communicating results. To give the wider epigenetics community the opportunity to be involved in the final discussions, the proposed gom/lom nomenclature is open for community consultation on the HGVS webpage (see http://varnomen.hgvs.org/bg-material/consultation/svd-wg005/). We hope that giving precise methylation values as percentages will overcome the difficulty of comparing results between laboratories, which often describe abnormalities using different methylation indexes.
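The accompanying information that a report should state (the imprinted DMR name, the genome build, the technique, and the tissue source) can be kept together with the description in a single record. The sketch below is one possible arrangement, not a prescribed format: the class name, field names, summary layout, coordinates, and sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethylationReport:
    """One reported methylation result with the context needed to interpret it."""
    dmr_name: str      # e.g. "KCNQ1OT1:TSS-DMR"
    genome_build: str  # e.g. "GRCh37/hg19"
    description: str   # description string for the tested position or interval
    technique: str     # method used to measure methylation
    tissue: str        # tissue source of the DNA

    def __post_init__(self) -> None:
        # Every field is required; reject empty or whitespace-only entries.
        for field_name, value in vars(self).items():
            if not value.strip():
                raise ValueError(f"{field_name} must be stated in the report")

    def summary(self) -> str:
        # Hypothetical one-line layout for a laboratory report.
        return (f"{self.dmr_name}: {self.genome_build} {self.description} "
                f"({self.technique}; {self.tissue})")


# Hypothetical coordinates and values, for illustration only.
report = MethylationReport(
    dmr_name="KCNQ1OT1:TSS-DMR",
    genome_build="GRCh37/hg19",
    description="chr11:g.1000_1200|lom",
    technique="bisulphite PCR",
    tissue="whole blood",
)
print(report.summary())
```

Making every field mandatory reflects the point that results from different methods or tissues are not directly comparable without that context.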
The next issue requiring consensus is the definition of criteria for applying the descriptions 'lom' or 'gom'. This is complicated because not only the statistical cut-offs need to be discussed (i.e., using mean ± standard deviation), but also the number of controls analyzed to define the normal range. Furthermore, applying fixed statistical criteria will be difficult in cases with mosaic epimutations, as methylation variance at different CpGs within a DMR needs to be taken into account, as does the reproducibility of the molecular techniques used. 11
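To make the open question concrete, the sketch below shows what a cut-off based on the control mean and standard deviation could look like. It is purely illustrative: no threshold has been agreed, the default of two standard deviations and the minimum of two controls are arbitrary assumptions, the function name and values are hypothetical, and the sketch ignores mosaicism, per-CpG variance, and assay reproducibility.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def classify_methylation(sample_percent: float,
                         control_percents: Sequence[float],
                         n_sd: float = 2.0) -> Optional[str]:
    """Return 'lom', 'gom', or None if the sample lies within the control range.

    The range is control mean +/- n_sd sample standard deviations; n_sd = 2 is
    an arbitrary illustrative choice, not a recommended criterion.
    """
    if len(control_percents) < 2:
        raise ValueError("at least two controls are needed to estimate a standard deviation")
    centre = mean(control_percents)
    spread = stdev(control_percents)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread
    if sample_percent < lower:
        return "lom"
    if sample_percent > upper:
        return "gom"
    return None


# Hypothetical control values (% methylation) for a single DMR.
controls = [48.0, 50.0, 52.0, 49.0, 51.0]
print(classify_methylation(20.0, controls))  # lom
print(classify_methylation(80.0, controls))  # gom
print(classify_methylation(50.0, controls))  # None
```

With few controls the estimated standard deviation is unstable, which is one reason the number of controls analyzed matters as much as the cut-off itself.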

Disclosure of potential conflicts of interest
No potential conflicts of interest were disclosed.

Funding
This work was supported by the European COST action under grant number BM1208, the Bundesministerium für Bildung und Forschung (BMBF) under grant number 01GM1513A, B, C and D, the Spanish Ministerio de