Shared transcription factors contribute to distinct cell fates

Genome-wide transcription factor (TF) binding profiles differ dramatically between cell types. However, little is known about the relationship between cell-type-specific binding patterns and gene expression. A recent study demonstrated how the same TFs can have functional roles when binding to largely non-overlapping genomic regions in hematopoietic progenitor and mast cells. Cell-type-specific binding profiles of shared TFs are therefore not merely the consequence of opportunistic and functionally irrelevant binding to accessible chromatin, but instead have the potential to make meaningful contributions to cell-type-specific transcriptional programs.

Transcription factors (TFs) are important regulators of cell-type-specific gene expression, and represent the paradigm for DNA-binding proteins that influence cellular development. Hematopoiesis has long served as a model to study the transcriptional control of cell type specification, and transcriptional regulatory elements for several major regulators of hematopoietic stem cells (HSCs) have been studied in detail using both mammalian and lower vertebrate model organisms. [1][2][3][4][5][6][7][8][9][10][11][12][13][14] Nevertheless, much remains to be learned about the cell-type-specific transcriptional mechanisms that govern hematopoietic cell type identity. A thorough investigation into how TFs can contribute to distinct transcriptional programs is, therefore, critical for understanding how cells acquire and maintain their identity. During hematopoietic stem cell differentiation and blood cell development, distinct lineages are well known to share many TFs. The PU.1 TF, for example, regulates its own expression in 2 different cell types (myeloid and B-cells) through co-operative interaction with distinct cell-specific TFs. 6 Furthermore, the SCL TF is required not only for specification of HSCs but also for differentiation along the erythroid and megakaryocytic lineages. 15,16 Interestingly, binding sites for the same TF in 2 cell types have been shown to be largely non-overlapping and to behave in a cell-type-specific manner. 17,18 These findings, therefore, raise the question as to whether both cell-type-specific and common TF binding patterns in the genome have functional consequences for defining cell fate. Several recent review papers have provided in-depth discussions of what may constitute functional TF binding, how functional enhancers might be discovered, and whether co-operative TF binding represents a continuum rather than just 2 states. [19][20][21][22] These discussions revealed many unresolved questions on the topic, including the fundamental need to identify binding activity that results in stable cellular states.
In a recent study, we examined the nature of binding site preferences and co-occupancy in 2 closely related cell types. 23 The cell types compared were primary mast cells and a multipotent hematopoietic progenitor cell line, HPC7, 24 which we have established as a useful model for studying early blood stem/progenitor cells. 25,26 Gene expression profiling by RNA-sequencing (RNA-seq) revealed many significant differences in the expression profiles of the 2 cell types. Nevertheless, many known TFs display similar expression in both cell types, including key regulators of hematopoietic stem cells (i.e., E2A, Erg, Fli1, Gata2, Lmo2, Meis1, PU.1, Runx1, and Scl). Generation of genome-wide TF binding maps by chromatin immunoprecipitation followed by sequencing (ChIP-seq) for those 'shared' TFs uncovered largely non-overlapping promoter and enhancer occupancy between the 2 cell types. For most of these HSC TFs, it was not known whether (and if so, how) they might have a role in the transcriptional regulation of mast cells.
Having established a unique large-scale dataset for comparative analysis, we next asked what role the 'shared' TFs play in controlling mast-cell-specific transcriptional programs.
We examined the observation of distinct binding patterns further by quantifying the differences in binding and their relationship to gene expression in a regression model (Fig. 1A). Regression models provide a useful and simple approach to quantify the relationship between multiple predictor variables (i.e., binding of the 'shared' TFs) and a response variable (i.e., gene expression). Furthermore, the availability of high-resolution genome-wide data (i.e., ChIP-seq and RNA-seq) allowed the construction of accurate predictive models. By considering genes bound by at least one of these TFs, these models describe gene expression as a function of the combinatorial effects of one or more relative TF binding strengths. Other studies have also used regression statistics to build a variety of prediction models that include, for example, TF binding data, histone modifications, and consensus binding motifs. 27,28 Until recently, these studies focused on predicting gene expression in one cell type, obtaining high levels of correlation with observed data. However, applying such a model to another cell type often results in poor accuracy, since static expression levels were used to construct the model. Our study, on the other hand, employed regression models for 2 cell types to predict changes in gene expression. When differential promoter and distal enhancer occupancy were encoded into the model, quantitative changes in TF binding were found to be predictive of quantitative changes in gene expression. Moreover, prediction accuracy improved when multiple binding events were taken into account.
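The idea of regressing expression changes on binding changes can be illustrated with a minimal sketch. It is not the model used in the study: the data are synthetic, the three TF columns, the effect sizes in `true_w` and the helper `fit_linear` are assumptions made purely for illustration, and ordinary least squares stands in for whatever fitting procedure was actually applied.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()               # valid because an intercept is fitted
    return coef, r2

rng = np.random.default_rng(0)
n_genes = 500
tf_names = ["Gata2", "PU.1", "Runx1"]              # illustrative subset of 'shared' TFs

# Synthetic differential binding (e.g. mast minus HPC7 signal per gene and TF).
dB = rng.normal(size=(n_genes, len(tf_names)))
true_w = np.array([0.8, -0.5, 0.3])                # hypothetical effect sizes
# Synthetic differential expression: weighted sum of binding changes plus noise.
dE = dB @ true_w + rng.normal(scale=0.5, size=n_genes)

coef_all, r2_all = fit_linear(dB, dE)              # all TFs as predictors
_, r2_single = fit_linear(dB[:, :1], dE)           # first TF only

print("R^2, single TF:", round(r2_single, 3))
print("R^2, all TFs:  ", round(r2_all, 3))
```

On such synthetic data the model using all TFs explains more of the variance in expression change than the single-TF model, which mirrors the qualitative observation that taking multiple binding events into account improves prediction.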
Many regression models described so far assume that TF binding and gene expression have a linear relationship. Although linearity simplifies computation, this relationship can be non-linear, at least for a subset of TFs. Indeed, gene expression has been shown to be a non-linear function of TF binding in an analysis of the K562 and GM12878 cell lines. 29 In that analysis, the authors used generalized additive models 30,31 (GAMs) to identify TFs that influence cell-type-specific gene expression. Similarly, in the HPC7 and mast cell data set, using GAMs to predict changes in gene expression improved prediction accuracy compared to linear regression in 2 ways: (i) by allowing non-linearity of TF binding; and (ii) by incorporating TF interaction terms to account for co-operative interaction between 2 TFs. Importantly, we could conclude from our modeling that 'shared' TFs play an important role in HPC7 and mast cell transcriptional programs, since their differential binding is predictive of expression.
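The two improvements named above, non-linear terms and pairwise interaction terms, can be sketched as follows. This is only an approximation of the idea: a true GAM fits penalized smooth functions, whereas the sketch substitutes a simple cubic polynomial basis for each TF, and the data, the saturating `tanh` response and the helper names are all assumptions for illustration.

```python
import numpy as np

def r_squared(A, y):
    """In-sample R^2 of a least-squares fit of y on design matrix A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def poly_basis(x, degree=3):
    """Polynomial basis x, x^2, ..., x^degree (a crude stand-in for a smooth term)."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])

rng = np.random.default_rng(1)
n = 1000
b1 = rng.normal(size=n)   # synthetic differential binding of a first TF
b2 = rng.normal(size=n)   # synthetic differential binding of a second TF

# Hypothetical ground truth: a saturating effect of TF 1 plus a co-operative term.
dE = np.tanh(2 * b1) + 0.6 * b1 * b2 + rng.normal(scale=0.3, size=n)

ones = np.ones(n)
linear = np.column_stack([ones, b1, b2])                          # linear terms only
additive = np.column_stack([ones, poly_basis(b1), poly_basis(b2)])  # non-linear terms
with_interaction = np.column_stack([additive, b1 * b2])           # plus TF1 x TF2 term

r2_linear = r_squared(linear, dE)
r2_additive = r_squared(additive, dE)
r2_interaction = r_squared(with_interaction, dE)

print("linear:          ", round(r2_linear, 3))
print("additive:        ", round(r2_additive, 3))
print("with interaction:", round(r2_interaction, 3))
```

Because the three models are nested, in-sample R^2 can only increase as terms are added; a fair comparison on real data would evaluate each model on held-out genes or by cross-validation.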
TF binding depends on interactions with DNA and other TFs at regulatory regions, although it has been suggested that on some occasions it can be largely driven by the cellular environment, with no functional consequences. We compared the motif content of common and cell-type-specific regulatory regions (Fig. 1B) under the hypothesis that the presence of binding motifs for 'shared' TFs at cell-type-specific bound regulatory regions may provide additional evidence of direct involvement of the 'shared' TFs in cell-type-specific programs. We uncovered large numbers of cell-specific and common binding regions that contained consensus sequence motifs for the 'shared' TFs. To further analyze the role of 'shared' TFs in cell-type-specific genetic programs, we followed 2 approaches: (i) we reduced the levels of some of the 'shared' TFs (E2A, Erg, Fli1, Gata2, Lmo2, PU.1) by performing shRNA perturbation experiments (Fig. 1C) and analyzed the effect on cell-type-specific genetic programs; and (ii) we mutated the putative binding motifs present in cell-type-specific promoters and analyzed the direct effect on gene expression. Our first approach showed that individual knockdown of these regulators affects large numbers of regulated targets, with significant changes in cell-type-specific gene expression. Notably, we also found that reducing the levels of one 'shared' factor can affect the recruitment of other 'shared' factors. Our second approach demonstrated that ablation of binding motifs for 'shared' TFs resulted in strong reduction or even complete abolition of promoter activity.
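Checking a bound region for a consensus motif, and confirming that a point mutation removes it, can be sketched in a few lines. The sketch below is an assumption-laden illustration rather than the motif analysis used in the study: it matches an IUPAC consensus string on both strands with a regular expression, uses the commonly quoted GATA-family consensus WGATAR as an example, and the short sequences are invented.

```python
import re

# IUPAC nucleotide codes translated into regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def has_motif(seq, consensus):
    """True if the IUPAC consensus occurs on either strand of seq."""
    pattern = re.compile("".join(IUPAC[c] for c in consensus))
    seq = seq.upper()
    return bool(pattern.search(seq) or pattern.search(revcomp(seq)))

GATA_CONSENSUS = "WGATAR"   # commonly quoted GATA-family consensus (assumption)

# Invented example regions; the names are hypothetical labels, not real loci.
regions = {
    "forward_site": "CCTAGATAAGGC",   # motif on the forward strand
    "reverse_site": "GGCTTATCTACC",   # motif on the reverse strand only
    "no_site":      "ACGTCCGGTCAA",   # no motif on either strand
    "mutated_site": "CCTAGCTAAGGC",   # forward_site with GATA changed to GCTA
}

for name, seq in regions.items():
    print(name, has_motif(seq, GATA_CONSENSUS))
```

Real analyses typically score position weight matrices rather than exact consensus strings, so this exact-match approach should be read only as the simplest possible version of the comparison described above.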
Although binding of 'shared' TFs without functional consequences cannot be ruled out for a given specific binding event, our results clearly indicate that the same TF can play an active and determinant role in different cell-type-specific transcriptional programs.

Conclusions and Future Directions
Our recent analysis of genome-wide binding sites and gene expression in HPC7 and mast cells has provided new insights into lineage-specific transcriptional regulation during differentiation. Furthermore, it reports comprehensive genome-scale data for primary mast cells, for which very little genome-wide data existed until now. The combined use of computational and experimental approaches provided several lines of evidence that occupancy of distinct regions in different cell lineages by the same TFs is functionally important and not just a consequence of the cellular environment (Fig. 2).
It is well known that combinatorial TF binding is prevalent in the regulation of metazoan gene expression. However, the specific rules governing these TF interactions and the effects of individual regulatory elements on gene expression remain largely unknown, even though they are recognized to have broad implications for cellular reprogramming. In a study of differentiating embryonic stem cells, for example, the Isl1 protein functions as a component of 2 different regulatory complexes that differ by one TF but lead to either spinal or cranial motor neuron formation. 32 TF binding site arrangements, genomic sequence context (e.g., flanking bases, GC content), enhancer composition (heterotypic or homotypic arrangements of TF binding), and DNA tertiary structure are all expected to influence transcriptional activity. Elucidation of the molecular mechanisms will, therefore, improve our understanding of how the same TF contributes to distinct cell fates. Research in this area has so far provided little consensus, as exemplified by the 'enhanceosome' vs. 'TF collective' debate. The classical example of TF complex formation, as proposed by the enhanceosome model, requires a precise configuration of multiple TFs to function as a unit of regulation. In the well-studied IFN-β enhancer, synergistic interaction between all essential TFs is necessary and leads to an 'all-or-nothing' response. 33 At the opposite end of the spectrum, the 'TF collective' model suggests that the effects of TF binding on gene expression are cumulative, with varying degrees of potency and redundancy. A recent study comparing TF binding in 2 mouse strains has shown that both 'shared' and cell-type-specific TFs are important for establishing the epigenetic and transcriptomic landscapes in mouse macrophages. 34 Of note, comparison of naturally occurring single nucleotide polymorphisms (SNPs) that differed between the 2 mouse strains revealed strain-specific binding site motifs that correlated with strain-specific gain or loss of TF binding and influenced the recruitment of cell-type-specific factors. The findings of that study also emphasized the presence of binding sites and nucleosome conformation as important features for co-operative TF binding, although defined distances between TFs were not crucial. Integrative analyses that take into account dynamic binding, sequence information and 3-dimensional DNA structure to infer general principles of transcriptional regulation would, therefore, help to resolve the disparate findings in this field.

Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.

Transcription, Volume 5, Issue 5, e978173. www.landesbioscience.com