KDM2A/B lysine demethylases and their alternative isoforms in development and disease

ABSTRACT Aberrant levels of histone modifications lead to chromatin malfunctioning and consequently to various developmental defects and human diseases. Therefore, the proteins bearing the ability to modify histones have been extensively studied and the molecular mechanisms of their action are now fairly well understood. However, little attention has been paid to naturally occurring alternative isoforms of chromatin modifying proteins and to their biological roles. In this review, we focus on mammalian KDM2A and KDM2B, the only two lysine demethylases whose genes have been described to also produce an alternative isoform lacking the N-terminal demethylase domain. These short KDM2A/B-SF isoforms arise through alternative promoter usage and seem to play important roles in development and disease. We hypothesise about the biological significance of these alternative isoforms, which might represent a more common evolutionarily conserved regulatory mechanism.


Introduction
Epigenetics has become one of the most prominent fields of modern biomedical research and is now known to affect virtually all biological processes ranging from embryogenesis to aging [1][2][3][4][5][6][7]. In the nucleosome, the fundamental unit of chromatin, DNA is wrapped around two sets of core histones, H2A, H2B, H3 and H4, whose amino acid residues can be modified by various post-translational modifications such as methylation, acetylation or phosphorylation [8][9][10][11]. This creates a large number of potential combinations of various epigenetic marks, each of which has at least one functional consequence frequently distinct from those of other combinations [8,12,13,14]. This complex network of post-translational modifications of the core histones has been shown to be essential for a large number of cellular processes such as transcription, DNA replication, DNA damage repair, recombination, chromatin structure, cell cycle, and pre-mRNA splicing [1,3,4,8,13,15,16,17].
In this review we focus on mammalian KDM2A and KDM2B (KDM2A/B), the only two lysine demethylases whose genes have been described to produce both the full length demethylases, KDM2A/B-LF, and alternative shorter isoforms, KDM2A/B-SF, that lack any demethylation activity yet also seem to play important roles in various biological processes [44][45][46][47][48]. KDM2A and KDM2B are two closely related lysine demethylases that demethylate histone H3 at lysine K36 [49]. Methylation of H3K36 is an intriguing histone modification since it is involved in a number of nuclear processes such as transcriptional regulation, gene dosage compensation, pre-mRNA splicing, DNA replication, recombination and DNA damage repair [40,50,51,52,53]. As for transcriptional activity, H3K36 methylation is associated with actively transcribed gene bodies to prevent spurious transcription initiations [36,40]. However, H3K36me2 is localised more in the 5′ regions of transcribed genes, whereas H3K36me3 is present predominantly in the 3′ regions [54][55][56]. Moreover, elevated levels of H3K36me2 have been found to be associated with active gene promoters and removal of this epigenetic mark is associated with transcriptional repression of these promoters [57][58][59][60][61][62][63][64].
Although plant homeodomains (PHD domains) are generally known to recognise and interact with histone H3 tails [79,80], Zhou et al. did not find the KDM2A PHD domain to be able to interact with any of the 600 tested histone variations [81]. However, as opposed to KDM2B, KDM2A is able to interact with heterochromatin proteins HP1 [72], and the KDM2A PHD domain has been shown to be involved in this direct interaction [82]. The HP1 proteins are known to be involved in transcriptional repression by directly interacting with the repressive chromatin mark H3K9me3 [83][84][85][86][87][88], and the interaction of KDM2A with the HP1 proteins has been suggested to play a role in transcriptional repression of pericentromeric heterochromatin [72]. Furthermore, the ataxia-telangiectasia mutated Ser/Thr kinase has been shown to phosphorylate the KDM2A PHD domain [89], which results in a lower chromatin binding capacity of KDM2A at double strand breaks (DSBs). Consequently, DSBs exhibit higher levels of H3K36me2, which are necessary to attract the DSB repair machinery [89]. KDM2A is further involved in DNA damage repair by interacting with p53-binding protein 1 (53BP1), an essential regulator of DNA DSB repair [90,91]. KDM2A stimulates ubiquitination and stability of 53BP1 and promotes recruitment of 53BP1 to DNA breaks, and disruption of the interaction between KDM2A and 53BP1 leads to increased DNA damage-induced genomic instability [91]. As opposed to KDM2A, whose PHD domain does not interact with histones [81], the KDM2B PHD domain has been shown to mediate the interaction with H3K36me2 and H3K4me3 [59]. Interestingly, the KDM2B PHD domain also exhibits an E3 ubiquitin ligase activity [59]. KDM2A and KDM2B are the only members of the JmjC family that contain a C-terminal F-box domain [48,92,93,94]. As opposed to the KDM2A F-box domain, whose function is still elusive, the KDM2B F-box domain has been shown to mediate protein-protein interactions.
For example, using its F-box domain KDM2B interacts with the CUL1-RING ubiquitin ligase complex [95]. The KDM2B F-box and LRR domains are also necessary for interaction with the PcG proteins [96,97].

KDM2A and KDM2B in development and disease
KDM2A and KDM2B are both highly expressed in mouse embryonic stem cells (ESCs), where they bind to unmethylated CpG island-containing promoters [66,68,98,99,100]. KDM2B has been shown to form a complex with polycomb repressive complex 1 (PRC1) to repress unmethylated CpG island-containing promoters of developmental genes in order to keep mouse ESCs (mESCs) undifferentiated [68,98]. Consistent with these findings, depletion of KDM2B in mESCs induces aberrant differentiation [98,101]. Interestingly, in mESCs the KDM2B promoter is directly bound and activated by the pluripotent stem cell factors SOX2 and OCT4 [98]. KDM2B further functions as an anti-adipogenesis factor independently of its N-terminal demethylase domain by recruiting the PRC1 repressive complex to the promoters of adipogenesis and cell cycle genes, which results in their repression and in keeping preadipocytes undifferentiated [97]. Similarly, depletion of KDM2A in stem cells from apical papilla (SCAPs) leads to transcriptional derepression of the stem cell factors NANOG and SOX2, and consequently to differentiation of SCAPs into chondrocytes and adipocytes [102]. KDM2A and KDM2B thus act as stem cell factors that keep various stem cells undifferentiated [60,71,98,101,102,103], and have been shown to promote generation of induced pluripotent stem cells [100,104,105]. KDM2A is also involved in maintaining cell specific alternative splicing of the FGFR2 mRNA by forming a complex with the PcG protein complex PRC2 and the long noncoding FGFR2 antisense RNA [106].
Consistent with the high KDM2A expression levels during embryogenesis, the mutant mice lacking the full length KDM2A-LF protein, but retaining the short KDM2A-SF isoform, fail to develop beyond mouse embryonic day 10 and show growth retardation and neural tube closure defects [107]. On the other hand, the mice lacking the full length KDM2B-LF protein, but retaining the short KDM2B-SF isoform, exhibit an incompletely penetrant phenotype with 40% of the mutants exhibiting retinal coloboma and only 44% of knockouts displaying fatal neural tube closure defects [99]. Although three other studies describe more severe KDM2B knockout phenotypes, these phenotypes are most likely attributable to the loss of both KDM2B-LF and KDM2B-SF as further discussed in the next section [96,108,109].
KDM2A exhibits proliferative properties and is upregulated in lung, gastric and breast cancer [47,57,58,74,110]. However, the KDM2A properties are likely to be context dependent since in prostate cancer KDM2A is downregulated [72], and under stress conditions it exhibits anti-proliferative properties by repressing ribosomal RNA (rRNA) genes [70,75]. KDM2A promotes stemness and angiogenesis of breast cancer [64], and promotes silencing of tumor suppressor genes in breast cancer [111]. Overexpression of either KDM2A or KDM2B has been shown to immortalise mouse embryonic fibroblasts in a JmjC domain-dependent process [63,112], and KDM2A and KDM2B have been recently shown to be transcriptionally upregulated in hypoxia [113]. KDM2B is significantly overexpressed in pancreatic ductal adenocarcinoma, ovarian cancer and acute myeloid leukemia [114][115][116], and drives self-renewal of breast cancer stem cells [117]. Moreover, KDM2B has been recently shown to drive synovial sarcoma [118]. Similarly to KDM2A, the KDM2B properties are also likely to be context dependent since it exhibits anti-proliferative properties by repressing rRNA genes, and its expression is significantly decreased in glioblastoma [69].
Taken together, although KDM2A and KDM2B affect various biological processes through their enzymatic demethylation activities (e.g. direct demethylation of lysines), some processes are affected by these proteins independently of their demethylase domain through their protein partners (e.g. HP1 or PcG proteins) or other catalytic activity (e.g. the ubiquitin ligase activity of the KDM2B PHD domain).

KDM2A/B-SF: isoforms of lysine demethylases with no demethylation activity
Alternative promoter usage, one of the most widely used mechanisms involved in generating the enormous mammalian proteome complexity, is often responsible for creating protein isoforms lacking their N-terminal domains [119][120][121]. Since these N-terminal domains frequently have some important function, the canonical isoform and its alternative isoform are then likely to be functionally distinct [46,121,122,123]. In addition to the still poorly characterised canonical promoters that give rise to the full length KDM2A/B mRNAs (KDM2A/B refers to both KDM2A and KDM2B), the KDM2A and KDM2B loci also contain alternative intronic promoters, through which shorter alternative mRNAs are produced [44,46]. Although the alternative intronic promoters driving the expression of KDM2A/B-SF mRNAs (SF stands for short form) have not been characterised yet, they are clearly defined by the presence of the alternative first exons and by the epigenetic profile characteristic for promoters/regulatory elements. For example, the chromatin immunoprecipitation data from various cell types that are publicly available in the UCSC genome browser show that the intronic regions surrounding the alternative first KDM2A/B-SF exons exhibit elevated levels of H3K4me3 and H3K27Ac, epigenetic marks associated with active promoter regions [4,124] (Figure 1(b)). The alternative KDM2A-SF and KDM2B-SF mRNAs originate in introns 12 and 11, respectively, and lack the upstream exons that encode the JmjC demethylase domain (Figure 1(c)). The KDM2A/B-SF mRNAs thus encode shorter proteins that lack the N-terminal demethylase domain and that are not able to function as demethylases. However, KDM2A/B-SF still retain all the other functional domains including the CXXC, PHD, F-box and LRR domains (Figure 1(a)). As mentioned earlier, many biological processes are mediated by KDM2A and KDM2B independently of their demethylation activity through protein-protein interactions, e.g. interaction with HP1 or PcG proteins [72,82,96,98,101,117,125], or through the ubiquitin ligase activity of the KDM2B PHD domain [59]. KDM2A/B-SF are thus likely to interact with the same proteins as KDM2A/B-LF and to be involved in the same demethylation-independent processes as KDM2A/B-LF. This assumption has already been supported by the following studies. First, both KDM2B isoforms have been shown to be able to interact with PcG proteins and to recruit them to CpG islands [96]. Second, KDM2A-SF, which shares the HP1 interaction motif with KDM2A-LF [82], has been shown to complex with the repressive mark H3K9me3 in an HP1a-dependent manner [125]. In our recent study we showed that KDM2A-SF, unlike KDM2A-LF, forms distinct nuclear foci at pericentromeric heterochromatin in an HP1a-dependent way [46].
Since the JmjC demethylase domain of KDM2A/B functions as a 2-oxoglutarate oxygenase [126], its function is dependent on the oxygen levels. Therefore, under low oxygen levels (e.g. hypoxia) the function of the JmjC demethylase domain is compromised and the full length KDM2A/B proteins could then behave as the short KDM2A/B-SF isoforms. In this regard it is also important to mention two more proteins that are related to the KDM2A/B lysine demethylases, but lack any demethylation catalytic activity. Although the JmjC domain of JARID2, a member of the JmjC domain-containing protein family, is not active due to amino acid substitutions in comparison to catalytically active JARID1, its loss-of-function phenotype is embryonically lethal and JARID2 is essential for early development [127]. On the other hand, FBXL19, a member of the F-box protein family, resembles KDM2A/B-SF since it contains the CXXC, PHD, F-box and LRR domains, but completely lacks the JmjC demethylase domain [48,93,94,128]. Despite lacking demethylation activity, FBXL19 plays an important role as a substrate-recognition component of the SCF (SKP1-CUL1-F-box protein)-type E3 ubiquitin ligase complex [93,94], and it has recently been shown to be essential for mouse development [128].

KDM2A/B-SF in development and disease
Both KDM2B-SF and KDM2A-SF are highly expressed in mESCs and during embryogenesis [46,99]. As opposed to KDM2A-SF, for which knockout mutant mice have not been described yet, the knockout mice that lack KDM2B-SF, but retain KDM2B-LF, have been prepared and analysed [44]. Surprisingly, losing just the short KDM2B-SF isoform results in severe craniofacial and neural tube defects that are not seen in the KDM2B-LF mutants [44]. The phenotype of losing just KDM2B-LF is milder and not fully penetrant with only 44% of knockouts displaying fatal neural tube closure defects [99]. Although three other studies describe embryonically lethal KDM2B knockout phenotypes, both KDM2B-LF and KDM2B-SF were disrupted in these mutant mice and the described phenotype is thus most likely attributable to the loss of both isoforms [96,108,109]. Based on their different mouse loss-of-function phenotypes, KDM2B-LF and KDM2B-SF seem to be involved in different biological processes. However, similarly to KDM2B-LF, KDM2B-SF is also able to interact with PcG proteins and to attract them to CpG island-containing promoters [96]. Therefore, it is possible that the short and long isoforms also have some redundant roles. In comparison to KDM2A-LF, KDM2A-SF is strongly overexpressed in multiple breast cancer cell lines, exhibits proliferative properties and its knockdown inhibits breast cancer cell growth [47]. Unlike KDM2A-LF, which has been shown to repress rRNA genes under stress conditions [70,75], KDM2A-SF has been recently shown to promote proliferation by activating rRNA genes in the MCF-7 breast cancer cells [45]. Although the precise molecular mechanism of the KDM2A-SF action is not clear in this case, the authors show that binding of KDM2A-SF to the rDNA promoter results in decreased levels of H4K20me3, a transcriptionally repressive mark, and in activation of the rDNA promoter [45].

Conclusion
Since the full length KDM2A/B proteins and their shorter alternative KDM2A/B-SF isoforms share the same DNA binding CXXC domain (Figure 1(a)), they are likely to bind to the same CpG island-containing DNA regions. However, binding of KDM2A/B-SF, which lack the demethylase domain, cannot lead to demethylation of histone H3 lysines in these regions and to transcriptional repression of associated promoters. Although it needs to be experimentally verified, it is possible that by being unable to demethylate H3K36me2, KDM2A/B-SF would prevent some promoters from being demethylated at H3K36 and from being repressed. Therefore, KDM2A/B-SF would indirectly function as transcriptional activators of these promoters, which is consistent with KDM2A-SF being able to activate the rDNA gene promoter [45]. This assumption is also consistent with the recent study of FBXL19, a member of the KDM2 family that lacks the N-terminal demethylase domain similarly to KDM2A/B-SF. In their study Dimitrova et al. show that FBXL19 is essential for mouse embryogenesis by inducing the expression of developmental genes during ESC differentiation [128]. KDM2A/B-SF might play a role in fine tuning the transcriptional activity of CpG island-containing promoters in various spatio-temporal contexts. It is possible that certain promoters must be repressed by KDM2A/B-LF in certain cell types and at a certain time-point (one biological context), whereas the same promoters must be activated by KDM2A/B-SF in different cell types and at a different time-point (another biological context) (Figure 1(d)). Since this context dependent mechanism and the expression of KDM2A/B-SF are likely to be strictly regulated themselves, it will be important to identify the factors regulating both the KDM2A/B and KDM2A/B-SF promoters. It will also be interesting to analyse whether the KDM2A/B and KDM2A/B-SF promoters are disrupted by single nucleotide polymorphisms (SNPs) in human diseases or developmental defects.
Numerous genome wide association studies have shown that the majority of the SNPs associated with human diseases are located in introns [129,130], and some disease-causing SNPs have even been shown to disrupt transcription factor binding sites in intronic regulatory elements and to affect the expression of the corresponding genes [131,132].
Given that some studies on KDM2A and KDM2B have been done without distinguishing the roles of the long isoforms from those of the short isoforms (e.g. the KDM2B-LF and KDM2B-SF double knockout mice [96,108,109], or knockdown of both KDM2A-LF and KDM2A-SF [66,72,91]), it will be necessary to specifically study the roles of just the long or just the short isoforms (e.g. KDM2B-LF specific knockout [99] and KDM2B-SF specific knockout [44]).
As opposed to evolutionarily higher organisms, the fruit fly D. melanogaster contains only one KDM2 lysine demethylase. In Drosophila, the dKdm2 gene encodes a protein that contains the JmjC, CXXC zinc finger, PHD, F-box and LRR domains, and is able to demethylate both H3K36me and H3K4me [133][134][135]. It will be interesting to determine whether the KDM2 genes of various lower model organisms (e.g. D. melanogaster, C. elegans) also encode an alternative demethylation inactive isoform. Whether KDM2A/B-SF are the only demethylation inactive alternative isoforms of lysine demethylases, or whether this is a more common evolutionarily conserved regulatory mechanism and other alternative isoforms have not yet been identified due to their highly specific spatiotemporal expression pattern, remains to be elucidated in the future.