PHF13: A new player involved in RNA polymerase II transcriptional regulation and co-transcriptional splicing

ABSTRACT We recently identified PHF13 as an H3K4me2/3 chromatin reader and transcriptional co-regulator. We found that PHF13 interacts with RNAPII S5P and PRC2, stabilizing their association with active and bivalent promoters. Furthermore, mass spectrometry analysis identified ∼50 spliceosomal proteins in PHF13's interactome. Here, we discuss the potential role of PHF13 in RNAPII pausing and co-transcriptional splicing.


Introduction
The role of histone-binding and chromatin-modulating proteins in various transcriptional regulatory processes has been well documented. We recently identified PHF13 as a novel H3K4me2/3 chromatin reader and transcriptional co-regulator. 1 We found that PHF13 interacts with and tethers PRC2 (an H3K27 methyltransferase) to H3K4me3/H3K27me3 bivalent promoters and that its depletion led to increased expression at these targets and reduced PRC2 binding. In addition to bivalent targets, PHF13 was also found at a greater number of active (H3K4me3-only) promoters genome wide, and its depletion led to reduced expression from these targets. Together, these findings imply a general role for PHF13 in transcriptional regulation, an assumption supported by the finding that PHF13 also co-localized genome wide with promoter-associated RNA polymerase II (RNAPII) S5P and S7P. More conclusively, we demonstrated that PHF13 interacts with RNAPII and that its depletion reduced the interaction of RNAPII S5P with H3K4me3 and H3K27me3, indicating a role for PHF13 in stabilizing RNAPII S5P at H3K4me3 active and H3K4me3/H3K27me3 bivalent chromatin. Lastly, although we did not explore these interactions in detail, mass spectrometry-based analysis of PHF13-interacting proteins identified more than 50 different spliceosome proteins in its interactome, arguing for an additional role of PHF13 in co-transcriptional splicing. Here, we discuss the implications of PHF13's affiliation with RNAPII S5P at bivalent genes, with RNAPII S5P/S7P at promoter-proximal pause sites and with RNAPII S2P/S5P at splice junctions.

RNAPII, CTD modification and the transcription cycle
RNA polymerase II (RNAPII) is a highly conserved multi-subunit enzymatic complex consisting of 12 polypeptides that are required to transcribe DNA sequences into RNA, namely mRNAs and ncRNAs. 2 The largest subunit is Rpb1 (Polr2a), which is conserved from bacteria to mammals and harbors the enzymatic activity of the RNAPII complex. 2 Eukaryotic Rpb1 contains an exposed C-terminal domain (CTD) composed of tandem heptad repeats (Y1-S2-P3-T4-S5-P6-S7), ranging from 26 in budding yeast to 52 in vertebrates. 2 Some conserved substitutions exist in specific heptads, with only 21 of 52 heptads perfectly matching the consensus sequence. 2 The high conservation of the 52 heptad repeats in vertebrates indicates that polymorphisms in this sequence are not tolerated. It is supposed that the CTD of Rpb1 acts as an RNAPII code which, depending on the combinatorial modification pattern, recruits different histone and chromatin modifiers to impact transcription and RNA processing. To this end, we have recently identified the chromatin reader and effector protein PHF13 as a novel RNAPII S5P-interacting protein. 1 Modifications of all CTD amino acids have been described (namely phosphorylation, glycosylation, methylation, acetylation and isomerization), and their timely appearance is required for coordination of the transcription cycle. 3 The CTD of RNAPII can be phosphorylated at Y1, S2, T4, S5 and S7, and the phosphorylation of each of these residues is affiliated with distinct functions in the transcription cycle. 4 All of these residues are important for cell viability, and their substitution with the nonphosphorylatable residues phenylalanine or alanine is lethal. 3,4 Similarly, mutation of S2, T4, S5 or S7 to glutamate, an acidic phospho-mimic, is lethal, 5,6 indicating that RNAPII's ability to return to a hypo-phosphorylated state is as important as the sequential phosphorylation of its CTD.
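The heptad bookkeeping described above (how many repeats match the YSPTSPS consensus, and which positions deviate) can be illustrated with a minimal sketch. The function names and the toy sequence below are hypothetical and chosen only for illustration; the toy sequence is not the real CTD of any organism.

```python
CONSENSUS = "YSPTSPS"  # Y1-S2-P3-T4-S5-P6-S7


def split_heptads(ctd_seq):
    """Split a CTD-like sequence into consecutive 7-residue repeats.

    Any trailing residues that do not fill a complete heptad are dropped.
    """
    n_repeats = len(ctd_seq) // 7
    return [ctd_seq[i * 7:(i + 1) * 7] for i in range(n_repeats)]


def summarize_ctd(ctd_seq):
    """Count consensus heptads and list the deviating ones.

    Returns a dict with the total number of heptads, the number that
    match the consensus exactly, and a list of
    (repeat number, heptad, substituted positions) using 1-based indices.
    """
    heptads = split_heptads(ctd_seq)
    variants = []
    for number, heptad in enumerate(heptads, start=1):
        changed = [pos for pos, (a, b) in enumerate(zip(heptad, CONSENSUS), start=1)
                   if a != b]
        if changed:
            variants.append((number, heptad, changed))
    return {
        "total": len(heptads),
        "consensus": len(heptads) - len(variants),
        "variants": variants,
    }


# Toy input (an assumption for illustration only): three consensus repeats
# followed by two repeats carrying one substitution each.
toy_ctd = "YSPTSPS" * 3 + "YSPTSPK" + "YTPTSPS"
print(summarize_ctd(toy_ctd))
```

Applied to a real CTD sequence retrieved from a protein database, the same summary would show which heptads have lost a phosphorylatable position (for example S7 or S2) relative to the consensus.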
Hypo-phosphorylated RNAPII is recruited to promoters with the help of the general transcription factors (TFIIA-TFIIH) and the mediator complex, which assemble to form the pre-initiation complex (PIC). 7 Within the PIC, Cdk7, a component of the general transcription factor TFIIH, phosphorylates S5 and S7 of the RNAPII CTD, causing release of RNAPII from the mediator complex and its translocation to the TSS or just downstream of it. 4,8,9 RNAPII S5P/S7P is transcriptionally competent but is paused before elongation, a delay that is stabilized by the recruitment of two negative regulators of transcription, namely negative elongation factor (NELF) and DRB sensitivity inducing factor (DSIF). 10 For productive transcriptional elongation to commence, Cdk9 phosphorylates both NELF and DSIF, converting them into positive regulators of transcription, and concomitantly phosphorylates S2. 11,12 Early in transcription, elongating RNAPII is phosphorylated at S2 and S5; as elongation proceeds, S5P is successively removed whereas S2P gradually increases toward the 3′ end. 4 Therefore, the phosphorylation status of the CTD and of key regulatory interacting factors is controlled by specific Cdks, which in turn regulate the transcription cycle. Interestingly, PHF13 also contains several putative SP/TP Cdk phosphorylation sites and a Cdk consensus sequence S/T RX K/R, 13 raising the possibility that it may also be a substrate of Cdk7 and/or Cdk9.
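As a reading aid, the progression of CTD phosphorylation states through the transcription cycle described above can be written as a small lookup. This is a deliberately simplified sketch: the function name is hypothetical, only the S2, S5 and S7 marks are considered, and intermediate or mixed states are ignored.

```python
def ctd_stage(phospho_marks):
    """Map a set of CTD phospho-serines to the stage described in the text.

    `phospho_marks` is any iterable of the strings "S2", "S5" and "S7".
    Combinations not covered by the text return a fallback label.
    """
    stages = {
        frozenset(): "hypo-phosphorylated: recruited into the PIC",
        frozenset({"S5", "S7"}): "paused at or just downstream of the TSS (after Cdk7)",
        frozenset({"S2", "S5"}): "early elongation (after Cdk9)",
        frozenset({"S2"}): "late elongation toward the 3' end",
    }
    return stages.get(frozenset(phospho_marks), "not covered by this summary")


# Walk through the cycle in the order given in the text.
for marks in ([], ["S5", "S7"], ["S2", "S5"], ["S2"]):
    print(sorted(marks), "->", ctd_stage(marks))
```

A lookup keyed on the full set of marks, rather than on single residues, reflects the idea that the combinatorial pattern carries the information.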

Polycomb poising
Polycomb complexes (PRC1/PRC2) repress many developmental genes in embryonic stem cells (ESCs); these genes are characterized by bivalent histone marks (H3K4me2/3 and H3K27me3) and are associated with RNAPII containing only S5P. 14 ERK1/2, and not Cdk7, is responsible for S5 phosphorylation of RNAPII at bivalent repressed genes, 19 potentially explaining the absence of S7 and S2 phosphorylation on RNAPII at bivalent loci. RNAPII-mediated transcription is severely impaired in bivalent chromatin landscapes due to reduced histone acetylation 18 and the absence of the elongating S2P. 14 Coincidentally, PHF13 is most highly expressed in stem cells, 20 where PRC2 repression of developmental promoters is important for pluripotency and where PRC2-mediated repression of pluripotency factors is likewise required for differentiation. Knockout (KO) of either PHF13 or core PRC2 factors results in defects in stem cell differentiation and maintenance, arguing for their importance in these state transitions. 20,21 Furthermore, PHF13 forms a common complex with H3K4me2/3, PRC2 and RNAPII S5P in mouse ESCs, as demonstrated by their co-elution and co-precipitation in column chromatography. 1 Consistently, PHF13 co-localized at a subset of PRC2-bound genes and its depletion led to increased expression from these bivalent targets, arguing that it acts as a transcriptional co-regulator at a subset of polycomb-regulated genes. 1 Furthermore, PHF13 depletion led to the reduction of both PRC2 and RNAPII S5P at H3K4me3/H3K27me3 demarcated chromatin, arguing for a specific role of PHF13 in the recruitment or stabilization of these complexes. 1 Taken together, these findings strongly suggest a mutual relationship between PHF13, PRC2 and RNAPII S5P in the transcriptional co-regulation of specific targets.

Promoter/promoter proximal pausing
In contrast to bivalent-repressed chromatin, RNAPII that is localized to H3K4me3 active promoters and promoter-proximal regions is phosphorylated at both S5 and S7. RNAPII S5P/S7P is associated with promoter/promoter-proximal pausing that allows for 5′ capping of the nascent RNA transcript, 8 which can be visualized in ChIP sequencing as a bimodal accumulation of reads near the TSS and at the +1 nucleosome. 22 Interestingly, the mean exon length in mice and humans is 147 nt and corresponds to the length of DNA wrapped around a nucleosome. 23 For detailed reviews on RNAPII pausing, we refer the reader to Liu et al. (2015) and Jonkers et al. (2015). 24,25 RNAPII S5P/S7P and RNAPII that is hypo-phosphorylated at S2 highly correlate genome wide with H3K4me3 and PHF13 enrichment at or just downstream of TSSs. 1 Consistently, PHF13 co-elutes with H3K4me3, RNAPII S2P, S5P and S7P in column chromatography; however, it preferentially co-precipitates with H3K4me3, RNAPII S5P and S7P from overlapping fractions, in line with their stronger correlation genome wide. 1 These findings indicate an intimate relationship of PHF13 with promoter and promoter-proximal RNAPII, which is lost in the elongating form of RNAPII as S5P becomes depleted. Furthermore, PHF13 depletion specifically reduced the interaction of H3K4me3 with RNAPII S5P, and not with RNAPII S2P, with a concomitant decrease in gene expression from several of these targets. This suggests that PHF13 tethers promoter-affiliated RNAPII to H3K4me3 and has less of an impact on elongating RNAPII, which is enriched in gene bodies devoid of H3K4me3 and marked by H3K36me3. Together, these findings implicate PHF13 in stabilizing promoter-associated RNAPII and in the co-transcriptional regulation of active genes.
It remains to be determined, however, which activating transcriptional complexes PHF13 is involved with other than RNA polymerase II itself, and whether PHF13 can recruit H3K4 methyltransferases to active promoters in a manner similar to its ability to recruit PRC2 to bivalent promoters. Interestingly, PHF13 has previously been shown to play a role in the DNA damage response (DDR) and DNA repair, and to interact with several proteins important for DNA repair and surveillance. 26 One such factor, TRIM28 (also known as TIF1β and KAP1), has recently been shown to globally regulate RNAPII promoter-proximal pausing and release, depending on whether or not it is phosphorylated by DNA-PK and ATM, two important DDR kinases. 27,28 PHF13 interacts with all of these proteins to modulate the DDR, and we have recently shown that PHF13 interacts with TRIM28 and ATM in the absence of induced DNA damage (although DNA damage and recovery is a continuous and naturally occurring process, even without DNA damage-inducing agents). 1 This raises the possibility that PHF13 may cooperate with these proteins in RNAPII promoter-proximal pausing and release.

Splicing checkpoint
At splice junctions, RNAPII is phosphorylated at both S2 and S5. S5P is required for efficient co-transcriptional splicing and has been reported to be involved in spliceosome assembly at the 5′ and 3′ splice sites, triggering a splicing checkpoint and RNAPII pausing at intron-exon junctions. 15-17,29-31 Mammalian NET-seq demonstrated RNAPII S5P enrichment at 5′ and 3′ splice sites, 15 and phospho-specific RNAPII immunoprecipitations have revealed that RNAPII S5P interacts with key proteins involved in spliceosomal assembly. 15,16,29 For detailed reviews on co-transcriptional splicing, we refer the reader to Saldi et al. (2016) and Jonkers et al. (2015). 25,32 Spliceosome assembly is a multi-step process that occurs on the pre-mRNA transcript, starting with recruitment of U1 snRNP to the 5′ splice site (5SS) 33 and of U2 snRNP to the exonic nucleosome at the 3′ splice site (3SS) 34 and then to the branch point to form the pre-spliceosome (complex A). U4/U6 and U5 snRNPs are then recruited to form complex B, which is subsequently reorganized to displace U1 and U4 and form the activated spliceosome (complex C). Mass spectrometry of PHF13's interactome revealed that it co-precipitated with ∼50 spliceosome-related proteins that were enriched in complex A, B and C specific factors and U2 snRNPs (Fig. 1). PHF13 co-localized genome wide with H3K4me3 and RNAPII S5P at 5′ splice sites after the first exon of expressed genes (Fig. 2). A functional interdependence between H3K4me3, RNAPII S5P, spliceosome assembly and co-transcriptional pre-mRNA processing has been previously shown. 15,30,35 Considering that PHF13 depletion reduces binding of RNAPII S5P to H3K4me3, this suggests a possible role for PHF13 in coupling transcription and co-transcriptional splicing at H3K4me3. Interestingly, U2 snRNPs also bind to H3K4me3 at the exonic 3SS, 35 which may suggest a looping mechanism between the exons at the 5SS and 3SS during splicing and the formation of complex A.

H3K4me3 and co-transcriptional splicing
Several arguments support a role for H3K4me3 in co-transcriptional splicing: (1) H3K4me3 is found at active promoters, predominantly overlapping with hypomethylated CpG islands and localizing from the TSS to the 5′ splice site of the first exon. 36 (2) A recent study exploring alternative splicing, which used large data sets from ENCODE, demonstrated the presence of promoter-like signatures (H3K4me3, H3K27Ac and H3K9Ac) in exons, which marked them for inclusion. 37 (3) In vitro transcription assays have demonstrated that H3K4me3 is not essential for transcription, 38 indicating that it either requires other effector proteins to impact transcription and/or is important for co-transcriptional processes rather than transcription per se. (4) The deposition of H3K4me3 by H3K4-specific histone methyltransferases (HMTs) requires active transcription, 38 arguing that its positive correlation with transcription is a consequence rather than a cause, and suggesting that H3K4me3 may be more important for co-transcriptional processes. (5) H3K4me3 and pre-mRNA splicing are interdependent, as attenuation of one negatively impacts the other. 35,36,39 Together, these findings support the view that H3K4me3 is more relevant for co-transcriptional splicing than for transcription itself. Intriguingly, CHD1, an H3K4me3 reader and chromatin remodeler, was demonstrated to bridge H3K4me3 to the U2 snRNP, and depletion of either CHD1 or H3K4me3 perturbed pre-mRNA splicing. 35 This indicates that H3K4me3 found at splice sites of the first exon and at exons marked for alternative splicing acts as a molecular cue to direct the co-transcriptional splicing machinery via H3K4me3 molecular readers. PHF13 is another good candidate for such a role, making it tempting to speculate that PHF13 acts at the interface between RNAPII-mediated transcription and co-transcriptional splicing.

Tying it all together with PHF13
Serine 5 phosphorylation of RNAPII is associated with different forms of transcriptional pausing: at bivalent promoters, at promoter-proximal regions and at 5′/3′ splice sites. This correlation with paused transcription implies that PHF13, which is associated with RNAPII S5P in each of these contexts, may also contribute to RNAPII pausing or retention. We have schematically depicted the different roles and relationships of PHF13 with RNAPII S5P in Fig. 2 and have used data obtained from ChIP sequencing experiments to support these models (for accession codes see figure legend). We suggest that PHF13 may act in conjunction with TRIM28 to regulate RNAPII S5P promoter-proximal pausing, with PRC2 to enforce transcriptional pausing at bivalent loci, and with spliceosomal proteins at splice junctions to exert a transient splicing checkpoint. In conclusion, we speculate that PHF13 tethers and stabilizes RNAPII S5P at H3K4me3, which culminates in RNAPII pausing to allow specific and essential co-transcriptional regulatory processes to occur.
In the future, it will be important to see if PHF13 can interact with and recruit specific H3K4 histone methyltransferases (HMTs) to chromatin, similar to its described function with PRC2 (an H3K27 HMT), and to explore the impact of PHF13 depletion on H3K4me3 deposition and splicing, as both processes are transcriptionally coupled. Furthermore, it will clearly be important to experimentally and functionally characterize the specific impact of PHF13 on splicing and/or alternative splicing, and it will be interesting to explore the consequence of PHF13 depletion on RNAPII pausing and the downstream processes that are associated with it. Answers to these questions will provide substantial insight into the detailed mechanisms of PHF13's co-transcriptional regulatory functions.

Figure legend (fragment). […] SRR391040). The ChIP-seq reads for each experiment were individually mapped against the mouse genome (genome version mm10) using Bowtie2 with default parameters. ChIP-seq enrichment calling was done using the normR R package (Johannes Helmuth and Ho-Ryun Chung (2016). normr: Normalization and difference calling in ChIP-seq data. R package version 0.99.7. https://github.com/yourhighness/normR) for PHF13, RNAPII (8WG16 antibody), H3K4me3 and H3K27me3 against individual input controls (FDR 0.05). For further analysis, only PHF13-positive promoters were considered. Promoters enriched for H3K4me3 were regarded as active, while bivalent promoters were defined as H3K4me3 and H3K27me3 positive. The pausing index was calculated with a custom script using RNAPII ChIP-seq data; genes with a pausing index >10 were considered paused. For data visualization, the signals were normalized and plotted in R.
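The promoter classification and pausing-index threshold given in the figure legend text can be sketched as follows. The custom script itself is not described in the source, so this is only an illustration under stated assumptions: the pausing index is taken to be the commonly used ratio of RNAPII read density in a promoter-proximal window to the density over the gene body, and the window lengths, read counts and all names are hypothetical. Only the rules "PHF13-positive promoters only", "H3K4me3 = active", "H3K4me3 + H3K27me3 = bivalent" and "pausing index > 10 = paused" come from the text.

```python
from dataclasses import dataclass

PAUSING_THRESHOLD = 10.0  # from the legend: pausing index > 10 counts as paused


@dataclass
class Promoter:
    gene: str
    phf13: bool      # PHF13 enrichment called at the promoter
    h3k4me3: bool    # H3K4me3 enrichment called
    h3k27me3: bool   # H3K27me3 enrichment called


def classify_promoter(p):
    """Apply the legend's rules: PHF13-positive only; active vs bivalent."""
    if not p.phf13:
        return "excluded"
    if p.h3k4me3 and p.h3k27me3:
        return "bivalent"
    if p.h3k4me3:
        return "active"
    return "other"


def pausing_index(promoter_reads, promoter_len, body_reads, body_len):
    """Ratio of promoter-proximal to gene-body RNAPII read density.

    Assumed definition (not taken from the source). Window lengths are in
    base pairs and must be positive.
    """
    promoter_density = promoter_reads / promoter_len
    body_density = body_reads / body_len
    if body_density == 0:
        # No gene-body signal: treat any promoter signal as fully paused.
        return float("inf") if promoter_density > 0 else 0.0
    return promoter_density / body_density


def is_paused(index):
    return index > PAUSING_THRESHOLD


# Hypothetical example values for illustration only.
example = Promoter(gene="GeneA", phf13=True, h3k4me3=True, h3k27me3=False)
index = pausing_index(promoter_reads=600, promoter_len=300,
                      body_reads=1000, body_len=10000)
print(example.gene, classify_promoter(example), is_paused(index))
```

Using densities rather than raw read counts keeps genes of different lengths comparable; the handling of genes without gene-body signal is a design choice that a real analysis would need to state explicitly.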

Disclosure of potential conflicts of interest
No potential conflicts of interest were disclosed.