Modeling the 3D genome of plants

ABSTRACT Chromosomes are the carriers of inheritable traits and define cell function and development. This is not only based on the linear DNA sequence of chromosomes but also on the additional molecular information they are associated with, including the transcription machinery, histone modifications, and their three-dimensional folding. The synergistic application of experimental approaches and computer simulations has helped to unveil how these organizational layers of the genome interplay in various organisms. However, such multidisciplinary approaches are still rarely explored in the plant kingdom. Here, we provide an overview of our current knowledge on plant 3D genome organization and review recent efforts to integrate cutting-edge experiments from microscopy and next-generation sequencing approaches with theoretical models. Building on these recent approaches, we propose possible avenues to extend the application of theoretical modeling in the characterization of the 3D genome organization in plants.


Introduction
The organization of chromosomes modulates the activity and efficiency of all DNA-related processes, from mitosis to DNA repair and from replication to gene expression [1][2][3]. Changes to the native structure of chromosomes by natural processes, such as mutation, transposition and recombination, or by transgenic approaches can lead to drastic changes in the activity of these processes. In humans, this has been shown to result in severe disease phenotypes [4][5][6].
The effect of the genome on transcription can be divided into three interwoven elements. First, the genome is a one-dimensional object consisting of an array of nucleotides that define genes and regulatory sequences (e.g., promoters, enhancers, and insulators) on the DNA molecule [7]. Second, the DNA is locally wrapped around histones to form chromatin. Both DNA and histones can be decorated by a regulatory layer of proteins (e.g., transcription factors, mediators) or chemical groups, which affect how the DNA is read and used in the nucleus without changing the sequence. The mechanisms of action of these DNA regulators and their inheritability are active research topics in the fields of epigenomics and epigenetics. Third, the genome is organized in chromosomes, which are the physical carriers of the genes and have a complex and dynamic three-dimensional (3D) folding. When completely stretched out, chromosomes are centimeters in length [8]; however, within the eukaryotic nucleus, they occupy a highly limited space of a few micrometers, undergoing extreme compaction and compartmentalization. As a result, chromosome folding may bring functional sequence elements that are distant along the genomic sequence close to each other in 3D space, thus constituting a fundamental layer of gene regulation (Bonev and Cavalli 2016).
Each layer exhibits remarkable flexibility and an inherent capacity to reorganize dynamically. Chromosomal regions may change their compaction state, from a highly condensed to a looser structure, and may localize to different areas of the nucleus. This characterization of the genome's 3D structure was made possible by synergistic experimental and theoretical approaches, which have allowed the analysis, interpretation, and modeling of experimental data, unveiling how the layers of organization interplay with each other and how the 3D genome relates to (epi)genetic features (Figure 1).
In plants, chromosome structure has historically been studied at the microscopic level. Since 2012, chromosome conformation capture (3C) technology, which uses the likelihood of contacts between distant chromosomal sites to infer chromosome folding, and its high-throughput variant Hi-C have been implemented in the field of plant sciences. In conjunction with high- and super-resolution microscopy, they have allowed major progress in elucidating the organization of chromosomes in the 3D space of plant nuclei [15]. Overall, plant chromosome organization follows the architectural principles observed in animals [15]. However, a set of unique features and significant heterogeneity across species have been noted in the chromosomal organization of plants. An array of recent review articles provides a comprehensive overview of our current understanding of the organization of chromosomes in the 3D space of the plant nucleus [16][17][18][19][20][21]. Here, we summarize the main characteristics of nuclear chromosome organization in plants and introduce a conceptual framework for applying theoretical chromosome modeling to expand our ability to interpret and understand the complexity of plant genome organization in the nuclear space.

Nuclear shape
Plant nuclei are characterized by a remarkable structural plasticity. In the model species Arabidopsis thaliana alone, nuclear form varies between spherical, spindle, oval, invaginated, flattened and rod-shaped [22][23][24][25]. The differences in shape are accompanied by variability in size, whereby cell size largely correlates with nuclear size [25]. In addition to cell-type-specific size differences, changes in nuclear size may occur in response to developmental and environmental conditions. For example, in plants, a decrease in nuclear size has been observed upon osmotic stress and seed dormancy, while the application of heat stress has been shown to be accompanied by an increase in nuclear size [26][27][28][29][30]. Components of the nucleoskeleton, cytoskeleton, nuclear envelope and nuclear pores as well as lamin-like structures adjacent to the nuclear envelope have been suggested to determine nuclear shape in plants [25,[31][32][33][34][35][36][37][38][39][40]. The nuclear envelope and the associated lamin and lamin-like structures have been shown to play important roles in the topological organization of genomes and transcriptional activity in both animals and plants [41]. Large sections of chromosomes are associated with the nuclear periphery [42][43][44]. These so-called 'lamina-associated domains' (LADs) and 'plant lamina-associated domains' (pLADs) are typically characterized by heterochromatic organization and low transcriptional activity [42,43,45]. Lamin and lamin-like proteins mediate the interaction between the nuclear periphery and chromatin. Loss of lamin-like CROWDED NUCLEI (CRWN) proteins in A. thaliana has been shown to lead to drastic changes in chromatin organization [31,44,46].

Territorial organization of chromosomes
In the interphase nucleus, individual eukaryotic chromosomes segregate into distinct regions [1,9]. This segregation into chromosome territories has been suggested to enable a functional compartmentalization of the nucleus [1]. In plants, chromosome territories have been identified in species of various genome sizes [22,[47][48][49]. Within these territories, plant chromosomes often adopt distinct configurations. The relatively small genome of A. thaliana (~135 Mb) arranges into a so-called 'rosette' structure, which is characterized by a core element consisting of heterochromatic chromocenters and emanating loops of euchromatic chromosome arms [47]. Larger plant genomes such as those of wheat, barley and oat are often organized in a 'Rabl' configuration [50,51]. Rabl chromosomal arrangements are characterized by the localization of centromeres and telomeres at opposite poles of the nucleus. Notably, these chromosomal configurations can vary across plant tissues [51]. A third major structural configuration, the 'bouquet', is associated with the early meiotic prophase and characterized by clustering of telomeres at the nuclear envelope [51][52][53][54][55].

Chromosome compartments
Within chromosome territories, chromosomal regions can be broadly divided into active A and repressive B compartments [10]. Largely defined by chromatin marks and transcriptional activity, these compartments reflect the preferential interaction of active with active and inactive with inactive areas of the genome [10,56]. Plant genomes show a similar partitioning into active and inactive compartments [46,57,58]. Analyzed at the whole-chromosome level, euchromatic chromosome arms represent active A compartments, and centromeric as well as pericentromeric regions correspond to repressive B compartments [46,57]. Examined within chromosome arms only, a further partitioning into sub A and B compartments can be observed. Referred to as loose and closed structural domains (LSDs and CSDs) in A. thaliana, these sub A and B compartments separate euchromatic and heterochromatic regions within chromosome arms [46,57]. The distribution of sub-compartments along chromosomes shows tissue-specific dynamics and may change upon activation and repression of chromosomal regions [59][60][61].

Local physical domains or TADs
TADs are organizational units of the 3D genome that show increased contact frequency within the domain [11,62]. While TADs are hallmarks of the mammalian and the Drosophila 3D genome, plant genomes show a more diverse TAD and TAD-like organization. Species such as A. thaliana and Arabidopsis lyrata lack a traditional TAD pattern, and TAD-like structures are limited to small and dispersed chromosomal regions [46,63,64]. In contrast, in other species, such as wheat, maize, rice and the liverwort Marchantia polymorpha, a more prominent TAD patterning of their respective genomes can be observed [49,57,58,65]. However, these TADs do not always neighbor each other; instead, non-TAD areas may be located adjacent to TADs [65]. On plant chromosomes, heterochromatic DNA elements are often enriched within TADs while TAD boundaries are marked by active genes [49,57,65]. Concia and collaborators have introduced the term ICONS (intergenic condensed spacers) to describe the non-canonical genetic organization within these TADs [49]. TADs can be further classified into different categories depending on their decoration with chromatin marks and association with transcription factors [57,66]. Unlike in other plant species, genes within TADs of M. polymorpha show an increased tendency for co-expression [65].

Chromatin loops
Chromatin loops describe short- and long-range interactions of chromosomal sites distant from each other on the linear sequence level. In plants, chromatin loops have been described in the context of distant regulatory site and promoter contacts, contacts between 5' and 3' ends of genes, and interactions of gene islands and heterochromatic islands scattered across the genome [46,[67][68][69][70][71][72][73][74][75]. For example, high-resolution Hi-C analysis has identified a high prevalence of short-range loops in the A. thaliana genome [76]. Furthermore, using Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) and in situ digestion-ligation-only Hi-C (DLO-Hi-C), a widespread formation of gene-to-gene, promoter-promoter and promoter-distal regulatory site loops has been identified in maize and rice [69,70,74,77]. Here, loops predominantly span regions between 100 and 500 kb, yet can connect DNA sites more than 2 Mb away from each other. Genes connected in gene-to-gene loops show a tendency for co-expression and are suggested to form spatial gene clusters in accordance with the concept of transcription factories as described below [49,71,78,79]. Gene-distal regulatory site loops in maize are established between single promoters or promoters of multiple genes and correlate with gene expression differences. Interestingly, such chromosomal loops partly overlap with intergenic quantitative trait loci in both maize and rice [69,70,74]. In addition to transcription factors and chromatin markings, long non-coding RNAs (lncRNAs) have been implicated in chromatin loop formation. Two recent reports provide mechanistic insight into the interplay between lncRNAs and chromatin loops in A. thaliana.
Here, it is shown that the lncRNA APOLO is involved in the regulation of auxin-responsive genes and the lncRNA MARS in the abscisic acid-induced expression of a biosynthetic gene cluster. APOLO interacts with LHP1, a homolog of the animal HP1 and component of the PcG complex, and associates with locally formed loops at multiple loci across the A. thaliana genome. The recognition of target sites by APOLO is suggested to be mediated by the formation of R-loops [80,81]. MARS interacts with LHP1 and modulates loop formation within the marneral biosynthetic gene cluster. Once expressed, MARS binds LHP1 and decoys it away from the cluster. The formation of this loop is driven by abscisic acid and is suggested to connect an enhancer element with its target gene [82].
An intriguing characteristic of nuclear chromosome structure in A. thaliana is the so-called KNOT chromosome structure [46,63]. Here, chromosome regions of 50 to 150 kb in size and enriched in transposable elements form a strong network of intra- and interchromosomal contacts in the 3D nuclear space. Recent findings indicate that KNOT regions have the potential to reorganize and incorporate newly integrated DNA elements into the KNOT structure [83]. This process has been proposed as a mechanism for silencing foreign DNA elements [83]. Interestingly, KNOT formation is altered in several chromatin mutant lines and shows organ-specific diversity [59,83,84]. A similar structure, called the 'compact silent center' (CSC), has been identified in rice genomes and shows similar potential for chromosomal reorganization in different cell types [61,75]. In M. polymorpha, a network of chromosomal regions labeled with H3K27me3 and showing extensive long-range interaction has been suggested to resemble KNOT regions in A. thaliana and rice [65].

Major structural units within the nucleus
Multiple major structural units can be identified in the nucleus in addition to the hierarchical units of chromosome territories, compartments and domains. These units anchor chromosomal regions and are of functional importance for chromosome processes such as transcription and replication. Structures like the nucleolus, chromocenters and telomeres are formed around major chromosomal sequence elements [21]. Other nuclear bodies, such as Cajal bodies and speckles, are associated with distinct epigenomic and transcriptional states of chromosomal regions [21].

Nucleolus
The nucleolus is the largest compartment in the nucleus. It is the site of ribosome biogenesis and is characterized by a high density of proteins [85]. In higher plants, the nucleolus is typically organized in a near-spherical shape that dynamically adapts form, size, and position within the nucleus according to the cell type, cell cycle phase, transcriptional activity and physiological state of the cell [86,87]. The nucleolus forms around active nucleolar organizer regions (NORs), tandem arrays of rRNA genes [88]. In addition to the NORs, a substantial fraction of the genome can dynamically associate with the nucleolus. These regions are collectively termed 'nucleolus-associated chromatin domains' (NADs) [89]. In humans, NADs comprise primarily gene-poor and heterochromatic chromosomal regions [89]. In A. thaliana, NADs contain actively transcribed rRNA genes, subtelomeric regions and hundreds of silenced genes [90]. It has been suggested that NAD composition is primarily defined by rRNA gene organization and transcriptional activity. Indeed, it has been observed that both loss of rRNA copies and changes in the rRNA expression state lead to changes in NAD composition [90][91][92][93]. The application of modest heat stress to A. thaliana seedlings results in reorganization of the nucleolus. This, however, is not associated with changes in NAD composition [94].

Chromocenters
Chromocenters are detectable as highly condensed chromosomal regions in the interphase nucleus. In A. thaliana, chromocenters are formed by heterochromatic centromeric and pericentromeric regions [47]. Here, chromocenters tend to be positioned at the nuclear periphery [22,47,57]. In Hi-C contact maps of A. thaliana chromosomes, chromocenters are characterized by strong interaction patterns [63]. Intra-centromeric and pericentromeric interactions are less pronounced in other plant species such as maize, tomato, rice and foxtail millet [57,58]. However, in genomes with Rabl configuration, significant enrichment for intercentromeric contacts is detectable [57]. In A. thaliana, intra-chromocenter interactions vary between tissues and plants grown under different environmental conditions. For example, heat stress is associated with reduced chromocenter interactions and a root-leaf comparison shows decreased chromocenter contacts in root nuclei [30,59,95]. Furthermore, chromocenter decondensation has been shown in A. thaliana mutants with reduced capacity for DNA methylation and histone H3 lysine 9 methylation [63,96]. Similar to chromocenters, so-called knobs are visible as condensed chromosomal areas in the interphase nucleus. However, in contrast to chromocenters, they are not associated with centromeres. Instead, they are composed of arrays of tandem repeats primarily positioned on chromosome arms [97][98][99][100]. Interestingly, circular chromosome conformation capture (4C) experiments that measure the genome-wide contact probabilities of a target site have shown an enrichment of interactions between the hk4s knob region and pericentromeres in A. thaliana [95]. The hk4s knob is derived from an inversion of a pericentromeric region and it is suggested that its 3D interactome reflects the original genome positioning of the knob [95].

Telomeres
Telomeres constitute the ends of chromosomes and are characterized by an array of repetitive elements. In both rosette and Rabl configurations, telomeres of different chromosomes are colocalized. In A. thaliana and sorghum, telomeres are positioned at the nucleolus, and in species such as wheat and barley they are polarized to one side of the nucleus [47,50,52,90]. Nucleosome decoration and arrangement have been implicated in the conformational properties of chromocenters and telomeres in A. thaliana. For example, loss of the linker histone H1 has been associated with a global chromatin decondensation particularly pronounced for the pericentromeric chromosome regions [101]. Loss of H1 was further shown to be associated with more frequent interactions between telomeres and their re-positioning away from the nucleolus [102]. A similar pattern of enhanced chromocenter decondensation and telomere interactions has been observed in the chromatin remodeling mutant morc6 [63,103].

Nuclear bodies
Nuclear bodies such as Cajal bodies, Polycomb bodies and transcription factories are detectable as small cytological structures interspersed throughout the nucleus [21,104]. The shape and formation of nuclear bodies depend on the developmental and physiological state of the individual cell. Typically, they provide microenvironments for specialized nuclear processes such as transcriptional regulation and DNA repair and are often enriched for distinct proteins [21,104].
Associated with the nucleolus, Cajal bodies contain components of the RNA processing machinery [105]. Variable in size and number across cell types, these subnuclear organelles play important roles in the processing of RNA species and ribonuclear proteins. In plants, Cajal bodies have been suggested to be involved in gene regulation, viral infections and the environmental stress response [106,107]. In animal systems, Cajal bodies have been implicated in genome organization and shown to associate with clusters of histone genes [105,108].
Polycomb bodies are enriched for Polycomb group (PcG) proteins. PcG proteins play major roles in the epigenetic silencing of genes and establish distinct foci in the interphase nucleus. Genomic regions bound by PcG proteins and marked with the histone modification H3K27me3 tend to cluster in the linear and three-dimensionally folded genome in both plants and animals [105,[108][109][110][111][112][113]. Impaired PcG activity in A. thaliana results in reduced contact probability between H3K27me3 marked domains [59,63,114].
In contrast to Polycomb bodies, transcription factories are associated with active transcription. Transcription factories are discrete nuclear foci that are composed of a transcriptional complex containing active RNA polymerase II. Genes that are adjacent on the linear sequence as well as genes distant in cis and trans may be positioned within a single transcription factory [78,79]. Recent findings by Concia and collaborators (2020) suggest that transcription factories are also established in the 3D genome of wheat [49]. Such sub-nuclear co-localization of genes in transcription factories is proposed to facilitate coordinated expression of genes [49,115].

Modeling 3D chromosome architecture
Together, classical experimentation and recent advances in microscopy and structural genomics have provided us with a solid knowledge base on the nuclear chromosome organization of plants. It is, however, worth noting that our current models of plant chromosome organization still largely lack a generalized interpretation and a robust understanding of the key elements driving nuclear chromosome folding. In the following section, we introduce the latest developments in structural (3D) computer modeling of chromosomes and highlight how numerical approaches have helped to analyze and interpret experimental results. Most applications of modeling have been aimed at animal species, but, notably, a recent application involved unveiling the constitutive mechanisms of the genome in A. thaliana [116]. Furthermore, we propose that modeling approaches could be extended to other plant species and help to unravel the specific complexity of their 3D genomes.

Data-driven modeling of the 3D genome architecture
In the past, the development of new experimental techniques in structural genomics has been complemented by theoretical approaches aimed at generating three-dimensional (3D) models of the genomic region of interest. An important example was the introduction of the 3C technique, which determined the folding of yeast chromosome III [117]. More recently, the introduction of the single-cell Hi-C (scHi-C) [118] technique was complemented with the modeling of the cell-specific entire X chromosome at a resolution of 500 kb, which allowed correlation of the scHi-C data with results from FISH imaging. The potential of new super-resolution imaging techniques has also been enhanced by combined experimental and modeling approaches. Nir and collaborators showed that by integrating OligoSTORM (Stochastic Optical Reconstruction Microscopy) and OligoDNA-PAINT (Point Accumulation for Imaging in Nanoscale Topography) imaging with Hi-C interaction maps, it was possible to reconstruct the structure of active and repressive (A/B) compartments. This allowed a quantitative examination of the compartment-type dependent degree of entanglement, which was not immediately accessible from either the images or the interaction data [119].
The modeling approaches discussed so far are part of the so-called data-driven (top-down) modeling. The latter encompasses a plethora of strategies in which the 3D organization is directly inferred from experimental data [120]. These approaches typically follow four methodological steps:

Data collection
Source data for modeling approaches are produced contextually or gathered from repositories such as GEO (Gene Expression Omnibus) [121] and subsequently formatted and analyzed to make them usable for a modeling pipeline [122,123]. Examples of data that can be used include the shape and size of nuclei [124,125], the positions and the spatial distances between genomic loci in the nucleus [119], and the interaction counts measured in 3C-based experiments [124,[126][127][128].

Data representation
The first important step of 3D genome modeling is to represent the chromatin fiber as a physical object (polymer) of consecutive particles (monomers). Most modeling strategies use spherical particles which, depending on the approach, can all have the same size and represent the bins of the experimental interaction map obtained in Hi-C experiments [122,127], or can have different sizes to describe TADs [128][129][130] and the regions probed during imaging experiments [131].
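As a minimal sketch of the two representations described above, the following Python snippet builds either equally sized beads (one per interaction-map bin) or variably sized beads (one per domain). The names `Bead`, `uniform_beads` and `domain_beads`, as well as the example coordinates, are hypothetical and chosen for illustration only; they are not taken from any of the cited tools.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Bead:
    """One monomer of the polymer, covering the genomic interval [start, end)."""
    index: int
    start: int  # genomic start in bp (inclusive)
    end: int    # genomic end in bp (exclusive)

    @property
    def size(self) -> int:
        return self.end - self.start


def uniform_beads(region_start: int, region_end: int, resolution: int) -> List[Bead]:
    """One bead per bin of an interaction map at the given resolution (bp)."""
    beads = []
    pos = region_start
    while pos < region_end:
        end = min(pos + resolution, region_end)  # the last bin may be shorter
        beads.append(Bead(len(beads), pos, end))
        pos = end
    return beads


def domain_beads(boundaries: Sequence[int]) -> List[Bead]:
    """One bead per domain, given sorted domain boundaries (bp)."""
    return [Bead(i, s, e) for i, (s, e) in enumerate(zip(boundaries, boundaries[1:]))]


# Illustrative values only: a 170 kb region at 10 kb resolution,
# and three hypothetical domains of unequal size.
bins = uniform_beads(0, 170_000, 10_000)
domains = domain_beads([0, 40_000, 100_000, 170_000])
```

In this sketch a bead only records which genomic interval it stands for; its physical diameter and its 3D coordinates would be assigned by the modeling approach.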

Model scoring
A mathematical function is defined to evaluate the consistency between each conformation of the models' particles and the experimental data. The aim of this is to favor the 3D models that recapitulate the input data (spatial distances or contact propensities) and penalize the ones that are not compatible. The definition of this so-called scoring function is typically one of the most delicate tasks of the approach and requires significant trial-and-error, since an inaccurate score might lead to inadequate solutions. This is especially true for modeling based on Hi-C data, because the definition of the scoring function requires the transformation of the interaction counts into spatial distance restraints, which is typically a complex task.
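To make the idea of a scoring function concrete, the sketch below converts interaction counts into target distances and sums the squared deviations of a conformation from those targets. The inverse power-law conversion and the parameters `alpha` and `scale` are assumptions made for illustration; the text does not prescribe a specific transformation, and published tools use their own, more elaborate schemes.

```python
import math
from typing import Dict, Sequence, Tuple

Coord = Tuple[float, float, float]


def counts_to_distance(count: float, alpha: float = 1.0, scale: float = 1.0) -> float:
    """Assumed conversion: more contacts imply a shorter target distance."""
    return scale * count ** (-alpha)


def score(coords: Sequence[Coord],
          counts: Dict[Tuple[int, int], float],
          alpha: float = 1.0,
          scale: float = 1.0) -> float:
    """Sum of squared differences between model and target distances.

    Lower is better; a score of 0 means every restraint is satisfied exactly.
    Pairs with non-positive counts carry no restraint in this sketch.
    """
    total = 0.0
    for (i, j), count in counts.items():
        if count <= 0:
            continue
        target = counts_to_distance(count, alpha, scale)
        distance = math.dist(coords[i], coords[j])
        total += (distance - target) ** 2
    return total


# Toy input: three beads on a line and two measured pairs.
toy_counts = {(0, 1): 1.0, (1, 2): 0.5}          # target distances 1.0 and 2.0
compatible = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
incompatible = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
```

With these toy values the first conformation satisfies both restraints, while the second places beads 1 and 2 one unit too far apart and is penalized accordingly. Changing `alpha` changes which conformations score well, which illustrates why the counts-to-distance conversion is a delicate choice.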

Model sampling
The possible model conformations are sampled using Monte Carlo or molecular dynamics methods to explore as many solutions as possible compatible with the scoring function. Finally, the sampled structures are ranked based on the scoring function and the models optimally satisfying the imposed data-driven restraints are deemed the ones representing the input data and retained for further analysis [120,127].
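The sampling and ranking step can be sketched with a basic Metropolis Monte Carlo scheme, shown below. This is a simplified illustration, not the procedure of any specific tool: the distance restraints, the step size, the temperature and the function names (`restraint_score`, `metropolis_sample`) are all assumptions chosen for the example.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def restraint_score(coords: Sequence[Coord],
                    restraints: Dict[Tuple[int, int], float]) -> float:
    """Sum of squared deviations from the target distances (lower is better)."""
    return sum((math.dist(coords[i], coords[j]) - d) ** 2
               for (i, j), d in restraints.items())


def metropolis_sample(coords: Sequence[Coord],
                      restraints: Dict[Tuple[int, int], float],
                      steps: int = 2000,
                      step_size: float = 0.2,
                      temperature: float = 0.05,
                      seed: int = 0) -> Tuple[List[Coord], float]:
    """Randomly displace one bead at a time and keep the best conformation seen."""
    rng = random.Random(seed)
    current = [tuple(p) for p in coords]
    current_score = restraint_score(current, restraints)
    best, best_score = list(current), current_score
    for _ in range(steps):
        k = rng.randrange(len(current))
        moved = tuple(x + rng.uniform(-step_size, step_size) for x in current[k])
        trial = current[:k] + [moved] + current[k + 1:]
        trial_score = restraint_score(trial, restraints)
        delta = trial_score - current_score
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


# Toy example: three beads that should end up roughly on a line, 1 unit apart.
start = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
targets = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 2.0}

# Independent runs (different seeds) are ranked by score, best first.
runs = sorted((metropolis_sample(start, targets, seed=s) for s in range(5)),
              key=lambda run: run[1])
best_model, best_model_score = runs[0]
```

Running several independent trajectories and ranking them mirrors the idea of retaining the models that best satisfy the restraints; in practice, an ensemble of top-ranked models rather than a single one is usually analyzed.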
Although data-driven approaches have been widely used in animals [120,132,133], applications to characterize the structural organization of plant genomes are limited. In rice, single-cell Hi-C (scHi-C) interaction maps have been used to obtain genome-wide models of eggs, sperm, unicellular zygotes (Z) and mesophyll (M) cells [61]. The 3D models were instrumental in characterizing cell-specific features of chromosome compartments and telomere/centromere configurations. In particular, the 3D genomes of the eggs and unicellular zygotes were found to contain a 'compact silent center' (CSC) that is absent in sperm cells. The CSC appears to be reorganized after fertilization, and may be involved in the regulation of zygotic genome activation [61].
In recent work [59], we used TADbit [122], one of the available modeling tools, to study the spatial organization of clusters of neighboring and coexpressed genes in A. thaliana. Using high-resolution capture Hi-C data as source data, our structural modeling of the major 3D domains associated with such a cluster indicated that the transcriptionally active cluster assumes a compact conformation in which the clustered genes are in spatial proximity. When transcriptionally silent, the gene cluster is more extended and incorporated into a chromatin loop, which brings the coexpressed genes in spatial proximity with a nearby region of unknown function (Figure 2) [59].

Bottom-up modeling of the 3D genome: a lesson from animal species
Theoretical strategies in chromosome modeling also include bottom-up (hypothesis-driven) approaches. These aim to build predictive models that test mechanistic hypotheses derived from experimental observations. By comparing predictions of genome structure to experiments, the models make it possible to invalidate or consolidate the assumed underlying mechanisms and, more interestingly, to propose and guide new experiments to obtain further insight. Relying on computer simulations and theoretical arguments as its primary tools, bottom-up modeling takes advantage of experimental data to parametrize the models and to validate the obtained results. The ultimate goal of this approach is to provide simple testable rules that can contribute, even partially, to understanding the complexity of genome architecture. The application of bottom-up modeling has helped to propose and test several hypotheses on the passive and active physical mechanisms regulating the structural organization of genomes at different scales (for references see for example [134][135][136]).

Territorial organization of chromosomes
At the scale of entire chromosomes, polymer physics arguments and computer simulations hypothesized that chromosomes might organize as unknotted and unentangled crumpled or fractal globules [10,134,137,138], which can recapitulate the average spatial organization of chromosomes from imaging [139] and Hi-C measurements [10]. Furthermore, building on the analogy between ring and long confined polymers, physics arguments can recapitulate the formation of chromosome territories in interphase [134,138,140,141].

Chromosome compartments
Within chromosome territories, bottom-up approaches have suggested that chromosome compartmentalization might be stabilized by epigenomic-driven interactions [135,[142][143][144]. Chromatin domains with the same epigenomic marks are proposed to interact with each other. The central idea is that phase-separation mediated by proteins, shown in vitro for heterochromatin protein 1 (HP1) [145,146], might also occur in vivo, leading to chromatin compartmentalization. The first study in this field focussed on Drosophila melanogaster [135]. It showed that block copolymer models built from the epigenomic landscape reproduce the formation of chromatin domains found in Hi-C interaction maps [62]. Additionally, these models suggested that epigenomic-driven chromosome domains are multi-stable as they can form and disassemble over time as well as interact dynamically with each other.
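The central ingredient of such block copolymer models, a short-range attraction between beads carrying the same epigenomic state, can be sketched as follows. The square-well form of the interaction, the values of `epsilon` and `cutoff`, and the exclusion of directly bonded neighbors are illustrative assumptions and do not reproduce the parameters of the cited studies.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def copolymer_energy(coords: Sequence[Coord],
                     states: Sequence[str],
                     epsilon: Optional[Dict[str, float]] = None,
                     cutoff: float = 1.5) -> float:
    """Total energy from same-state attractions between non-bonded beads.

    Each pair of beads that shares an epigenomic state and lies closer than
    `cutoff` contributes epsilon[state] (negative values are attractive).
    """
    if epsilon is None:
        epsilon = {"A": -1.0, "B": -1.0}  # assumed, equal self-attraction
    energy = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 2, n):  # skip i+1: bonded neighbors along the chain
            if states[i] != states[j]:
                continue
            if math.dist(coords[i], coords[j]) < cutoff:
                energy += epsilon.get(states[i], 0.0)
    return energy


# Four beads on the corners of a unit square.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
mixed_energy = copolymer_energy(square, ["A", "B", "B", "A"])
uniform_energy = copolymer_energy(square, ["A", "A", "A", "A"])
```

In a simulation, an energy of this kind would be added to the terms describing chain connectivity and excluded volume, so that conformations bringing like-marked beads together are favored.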

Local physical domains or TADs
The formation of physical domains or topologically associating domains [11,62,147] can be recapitulated by different mechanisms, including DNA supercoiling [148], loop-extrusion via active or passive mechanisms [136,149,150], or transcription factor-mediated contacts [151]. At the local scale, where promoter-enhancer contacts occur, the string-and-binders polymer model [152] has been employed to dissect the folding at several loci, such as Xist and HoxB [153,154]. In particular, the loop extrusion mechanism proposed a role for insulating proteins (CCCTC-binding factor or CTCF) and for proteins actively extruding chromatin (Cohesin). Interestingly, CTCF encoding genes are absent in plants and no functionally related proteins have been identified to date [19,155]. Cohesin proteins have been shown to be essential for chromosome pairing and meiosis in plants [156,157]. The role for cohesins in territory or domain formation in plant interphase nuclei, however, remains unknown.

Bottom-up modeling in plants
In plant species, only a limited set of modeling approaches has been proposed so far. For example, Pecinka and collaborators developed models of A. thaliana chromosomes at a resolution of 1 Mb per particle and used simulations to test whether the association of chromosome territories (CTs) in interphase nuclei could be ascribed to random chromosome pairing [22]. Specifically, chromosomes were organized initially as linear rods and allowed to decondense inside confined environments of different shapes and sizes, mimicking the diverse nuclei, until the available space was filled uniformly [22]. Interestingly, the models demonstrated that chromosome pairing could be ascribed to random association even though some chromosomes had high association rates in images (e.g., chromosomes 1, 3 and 5 were seen to associate in up to 70% of the spherical nuclei).
Figure 2. Left, 2D Capture Hi-C interaction maps of a 170 kb region in the A. thaliana genome that contains a gene cluster of co-expressed neighboring genes. Right, 3D modeling of the same 170 kb region using TADbit. In green, gene cluster. In (a), the gene cluster is silenced. In (b), the gene cluster is expressed. Adapted from Nützmann et al. [59].
More recently, the genome of A. thaliana was also explored by polymer-based modeling [158]. In particular, the authors tested which chromosome topology could recapitulate the positioning of chromocenters and the nucleolus at the periphery and center of the nucleus, respectively. Interestingly, the models suggested that only a chromosomal rosette conformation could recover the expected nuclear positionings. However, none of the tested models was able to reproduce the association between chromocenters. This suggested that additional mechanisms that were not implemented in the models play critical roles in chromocenter associations [158].
To deepen the understanding of the 3D genome organization in A. thaliana, we recently applied bottom-up modeling approaches integrating data on the length of the chromosomes and the nucleolar-organizing regions, the size and shape of the nucleus and the nucleolus, as well as epigenomic features [116]. Specifically, we incorporated the observation that chromosome regions hosting the same histone marks tend to co-localize in 3D space, forming compartments. Hence, we partitioned the genome into epigenomic states according to the enrichment in histone marks: active (A, enriched in H3K4me1/3 and H3K27ac), constitutive heterochromatin (CH, enriched in H3K9me2), facultative heterochromatin or polycomb-like (FH, enriched in H3K27me3), and undetermined (non-enriched) chromatin. We next tested several possible physical interactions between beads of the same or different epigenomic states and found that a specific set of interactions was needed to optimize the similarity with Hi-C contact patterns (Figure 3(a)). These included attractive interactions between the nucleolar organizing regions on chromosomes 2 and 4, repulsive interactions between constitutive heterochromatin and the other chromatin states, as well as self-attractive interactions of active and polycomb-like domains. Additionally, to maximize the correlation with the Hi-C data, we had to organize the initial chromosome conformations as V-shaped objects (Figure 3(b)). The latter is a sign of an interesting 'structural memory': chromosomes in interphase are partially reminiscent of the overall reorganization they undergo during anaphase when, during cell division, the two copies of each chromosome are pulled centromere-first to opposite poles of the mother cell.
These mechanisms allowed the recovery of several experimental features including the genome-wide Hi-C interaction pattern, the formation of the nucleolus in the nuclear center, the positioning of telomeres at the nucleolar periphery, and the enrichment of constitutive heterochromatin at the nuclear periphery (Figure 3(b,c)).
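The interaction rules described above can be summarized as a small lookup over pairs of bead types. The following is a minimal sketch, not the implementation used in [116]: the state labels, the function names `interaction_type` and `interaction_table`, and the treatment of every pair not named in the text (for example CH with CH, or NORs with any chromatin state) as neutral are assumptions made for illustration, and no interaction strengths are implied.

```python
# Bead types: the four epigenomic states named in the text plus the NORs.
# Labels are illustrative shorthand, not identifiers from the original study.
STATES = ("A", "CH", "FH", "UND", "NOR")

# Bead types that the text describes as attracting beads of the same type.
SELF_ATTRACTIVE = {"A", "FH", "NOR"}


def interaction_type(state_1, state_2):
    """Return 'attractive', 'repulsive' or 'neutral' for a pair of bead types."""
    for state in (state_1, state_2):
        if state not in STATES:
            raise ValueError(f"unknown bead type: {state}")
    if state_1 == state_2:
        return "attractive" if state_1 in SELF_ATTRACTIVE else "neutral"
    # CH repels the other chromatin states; pairs involving NORs are left
    # neutral here because the text does not specify them (assumption).
    if "CH" in (state_1, state_2) and "NOR" not in (state_1, state_2):
        return "repulsive"
    return "neutral"


def interaction_table():
    """Build the symmetric table of interaction types for all bead-type pairs."""
    return {(a, b): interaction_type(a, b) for a in STATES for b in STATES}
```

In a polymer simulation, each entry of such a table would then be mapped to a pair potential whose strength is tuned against the Hi-C data.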

Perspectives
So far, theoretical modeling approaches have not been widely employed in studies investigating plant genome architecture. To expand their application in top-down or bottom-up modeling strategies of plant genomes, a minimal set of data must be available for the species or condition of interest. Specifically, an estimate of chromosome length and ploidy state for the investigated cells is essential for model parameterization and for the accuracy and reliability of the quantitative predictions. Additionally, information on the size and shape of the nucleus is needed to define the confinement in genome-wide models. Epigenomic and transcriptomic data will enable the development of polymer models in which beads carry individual interaction characteristics. Importantly, a subset of the data can be left aside during initial model building and used instead for model validation.
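As a rough illustration of how these minimal data enter a model, the sketch below converts chromosome lengths and ploidy into a number of beads and estimates the fraction of a spherical nucleus that the beads would occupy. All function names (`beads_per_chromosome`, `total_beads`, `volume_fraction`) and all numerical values (bead resolution, bead diameter, nuclear radius, rounded chromosome lengths) are hypothetical placeholders rather than parameters taken from any published model.

```python
import math


def beads_per_chromosome(length_bp, bp_per_bead):
    """Number of beads needed to cover one chromosome copy (rounded up)."""
    if length_bp <= 0 or bp_per_bead <= 0:
        raise ValueError("lengths must be positive")
    return -(-length_bp // bp_per_bead)  # integer ceiling division


def total_beads(chromosome_lengths_bp, bp_per_bead, ploidy):
    """Total beads in the nucleus: every chromosome is present `ploidy` times."""
    return ploidy * sum(
        beads_per_chromosome(length, bp_per_bead)
        for length in chromosome_lengths_bp.values()
    )


def volume_fraction(n_beads, bead_diameter_nm, nucleus_radius_nm):
    """Fraction of a spherical nucleus occupied by spherical beads."""
    bead_volume = math.pi * bead_diameter_nm ** 3 / 6.0
    nucleus_volume = 4.0 * math.pi * nucleus_radius_nm ** 3 / 3.0
    return n_beads * bead_volume / nucleus_volume


# Placeholder inputs (rounded, illustrative values only).
lengths = {"chr1": 30_000_000, "chr2": 20_000_000, "chr3": 23_000_000,
           "chr4": 19_000_000, "chr5": 27_000_000}
n = total_beads(lengths, bp_per_bead=5_000, ploidy=2)
phi = volume_fraction(n, bead_diameter_nm=30.0, nucleus_radius_nm=3_000.0)
print(f"{n} beads, volume fraction {phi:.4f}")
```

Changing the ploidy or the nuclear radius in such a calculation shows directly how strongly these two quantities control the density of the simulated system, which is why they are listed among the minimal data.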
Several intriguing open questions in plant genome organization could be addressed using modeling approaches. It is feasible to extend the analyses performed in A. thaliana [116,158] to predict chromosome architecture across different cell types and plant species. An obvious direction would be to develop simulations that merge larger experimental datasets, covering epigenomics, Hi-C, and microscopy, with different nuclear shapes and sizes to study their impact on genome structural organization. Another route may be to simulate different chromosome lengths, as well as different positions and sizes of centromeres, to study their impact on Rabl, rosette or bouquet-like chromosome organization. This could be combined with experimental data from plant species with diverse chromosome sizes and centromere positioning. An important aspect that emerged from our previous work is the importance of 'structural memory' in A. thaliana chromosome organization and how the preferred V-shape of chromosomes is ultimately related to the presence of a single centromere, where the kinetochore assembles. It would be of great interest to model chromosomes with multiple centromeres [159] and characterize their role in genome-wide organization. Furthermore, our current modeling approaches have been unable to integrate KNOT structures in the optimal models for A. thaliana genome organization. Refinement in the preconditioning of these models and incorporation of novel experimental data on genomic features of KNOTs and their short- or long-range interactions may enable us to elucidate their functional and structural constraints within the nucleus.

Figure 3 caption (partial): Chromosomes are initially organized as V-shaped objects and, after molecular dynamics simulations in which the epigenomic-driven interactions are enforced, the system reaches a steady-state conformation where chromosomes spread within the spherical model nucleus. The contact maps computed on the models recapitulate experimental (Hi-C) data [76]. (c) The radial positioning of epigenomic regions in the optimal-interaction system is compared with a reference case (black curves) in which all interactions are dropped except those involving NORs and telomeres. The model predictions recapitulate what is expected from imaging experiments: the nucleolus (NORs) mainly occupies the nuclear center, telomeres localize at the nucleolar periphery (~1400 nm from the nuclear center), and constitutive heterochromatin is significantly enriched in the most peripheral shell of the model nucleus (two-sided Wilcoxon test p-value <0.0001). Adapted from Di Stefano et al. [116].
In plants, complex ploidy states and the extreme variance in genome sizes provide further tantalizing routes for chromosome modeling. Moreover, simulating the integration of novel DNA elements, such as transposons, introgressions and transgenes, into plant genomes will improve our ability to predict their impact on native genome organization. Altogether, these approaches may expand our fundamental understanding of eukaryotic genome organization and improve gene technology and breeding processes.
A wider adoption of modeling approaches in the plant kingdom will be favored by the expansion of existing databases, or the establishment of new ones, that host genomic, epigenomic and microscopy data for various plant species with unified quality standards and nomenclature. This would enable rapid input of data into simulation pipelines and significantly facilitate the development of optimized models.

Conclusions
Here, we reviewed essential aspects of plant chromosome organization and recent efforts of the experimental and modeling communities to unveil the principles regulating 3D genome organization and its interplay with the epigenome. Overall, we believe that further synergistic studies integrating experiments and modeling approaches will advance our understanding of the rules and constraints of plant chromosome organization. In our view, these studies should aim both to apply the tools developed and used to study the 3D genome in animals and to establish new modeling strategies that help address open questions in plant chromosome organization. Ultimately, this will allow us to build unified models of genome organization in the eukaryotic nucleus.