Comparative transcriptomics of cupuassu (Theobroma grandiflorum) offers insights into the early defense mechanism to Moniliophthora perniciosa, the causal agent of witches’ broom disease

ABSTRACT Cupuassu (Theobroma grandiflorum) is a fruit tree native to the Amazon region, presenting high social and economic value. Besides, owing to its suitability for agroforestry cultivation, cupuassu is useful for the conservation of the Amazon Forest. Cupuassu plantations are severely affected by Moniliophthora perniciosa. Thus, to gain insights into resistance against M. perniciosa, transcriptomes of susceptible (SG) and resistant (RG) genotypes of cupuassu were analyzed at the early stage of infection using RNA sequencing. A total of 21,441 unigenes were identified, and differentially expressed genes were detected in intra- (440) and inter-genotype (301) analyses. Gene expression was altered at 24 h after inoculation (HAI) in SG. This alteration was prominent at 48 HAI in RG. These datasets allowed the identification of genes potentially involved in defense mechanisms. Phytohormone signature analysis revealed a significant effect of hormones on genotype responses. The present study is the first large-scale transcriptomic analysis of cupuassu.


Introduction
Cupuassu (Theobroma grandiflorum (Willd. ex Spreng.) Schum.) is a fruit tree native to the Amazon region and is closely related to the cacao tree (Theobroma cacao L.). Both species are diploid (2n = 20) (Kuhn et al. 2010;da Silva et al. 2017) and belong to the Malvaceae family. Cupuassu presents immense economic potential due to multiple uses of its pulp and almonds in the food and cosmetic industries. Several products are manufactured from pulp, such as juices, ice cream, liqueurs, jellies, sweets, and cosmetics (Calzavara et al. 1984;de Nazaré et al. 1990;Alves et al. 2007;Salgado et al. 2011). Additionally, in almonds, which represent 15% of fruit weight, approximately 60% of dry weight is composed of an easily digestible thin fat, which primarily constitutes oleic and stearic acids (Vasconcelos et al. 1975;Lannes et al. 2003). From this thin fat, a product very similar to chocolate, named cupulate, can be obtained (de Nazaré et al. 1990). Furthermore, these trees are suitable for cultivation in sustainable agroforestry systems, highlighting their socioeconomic and environmental significance to the Amazon region (Reisdorff et al. 2000;Alves et al. 2014Alves et al. , 2021. However, cupuassu plantations are severely affected by witches' broom disease (WBD), a major disease of this crop and a limiting factor for fruit production (Alves et al. 2014). In both cacao and cupuassu, WBD is caused by the basidiomycete Moniliophthora perniciosa (Stahel) Aime & Phillips-Mora 2006 (Aime and Phillips-Mora 2005). It is a hemibiotrophic fungus with biotrophic and saprotrophic phases (Meinhardt et al. 2008). Initially, the fungus attacks the plant meristematic tissues (shoots, young pods and flower cushions), growing in intercellular spaces as a mononuclear mycelium of the biotrophic phase, and no apparent symptoms are observed up to 20 days in cacao (Silva et al. 2002;Sena et al. 2014). Then, plant physiological and morphological alterations occur, such as loss of apical dominance and abnormal bud proliferation, resulting in swollen shoots with many branches (green broom). As the disease progresses, the fungus shifts to its saprotrophic phase, with a dikaryotic mycelium. At this stage, M. perniciosa develops intracellularly, leaves on broom begin to necrose, followed by necrosis of its stems, resulting in their death (dry broom) (Purdy and Schmidt 1996;Meinhardt et al. 2008;Sena et al. 2014).
Cupuassu with different resistance levels are available, but they are primary clones collected from wild parents in the Amazon region, Brazil. Obtaining plants that combine resistance and productivity is a challenge for the cupuassu breeding program. However, the current knowledge of the T. grandiflorum molecular genetics and the molecular basis of its interaction with M. perniciosa is incipient, hampering the development of the breeding program.
Meanwhile, for T. cacao, genomic (Argout et al. 2011;Motamayor et al. 2013), transcriptomic, and proteomic data are available (Gesteira et al. 2007;Leal et al. 2007;Argout et al. 2008;da Hora Junior et al. 2012;Britto et al. 2013;dos Santos et al. 2020) supporting its breeding program. The resistant genotype of cacao presents an intense response produced by the expression of transcripts associated with signaling processes (e.g. receptor proteins) at the early stages of infection (48-120 h), which affects fungal colonization (Leal et al. 2007). Furthermore, disease resistance is associated with strong production of elicitors and reactive oxygen species (ROS) at the beginning of interaction, as well as, posterior signal transduction and ROS detoxification (da Hora Junior et al. 2012;dos Santos et al. 2020). In susceptible plants, infected tissues undergo massive metabolic reprogramming, characteristic carbon deprivation signature, downregulation of photosynthesis-related genes, and upregulation of genes associated with plant defense responses, which fail to prevent the disease (Leal et al. 2007;da Hora Junior et al. 2012;Teixeira et al. 2014;Royaert et al. 2016;dos Santos et al. 2020), as well as accumulation and degradation of calcium oxalate crystals, which are involved in plant cell death (Ceita et al. 2007). It is also reported that phytohormones play an important role in WBD development. The pathogenicity of M. perniciosa is associated with the increase of auxin (IAA) and salicylic acid (SA) during colonization of cacao leaves (Kilaru et al. 2007). In addition, Costa and collaborators (2021) suggest that M. perniciosa produces cytokinins that might interfere with host cytokinin biosynthesis, contributing to the disease development (Costa et al. 2021). Furthermore, phytohormone signature analysis reveals an increase in the levels of auxin, gibberellin, and ethylene in infected tissues of susceptible cacao plants (Teixeira et al. 2014).
Although these findings have shed light on the molecular basis of disease development in cacao and, consequently, in the Theobroma genus, specific data must be generated for T. grandiflorum. Recently, transcriptomic data from cupuassu seeds have allowed the development of expressed sequence tag (EST) microsatellite markers (Santos et al. 2016a), selection of reference genes for expression (Santos et al. 2016b), and analyses of specific cupuassu chitinases and their potential involvement in plant resistance to M. perniciosa (Santana Silva et al. 2020). Additionally, the first high-density genetic map of T. grandiflorum was published, and a quantitative trait locus (QTL) on chromosome 6 was identified to be linked with resistance to M. perniciosa (Mournet et al. 2020).
In the present study, aiming to increase the molecular knowledge about T. grandiflorum and its interaction with M. perniciosa, a large-scale RNA sequencing (RNA-Seq) was performed. These data allowed a functional genomic analysis of this plant-pathogen interaction at its initial asymptomatic phase. The differentially expressed genes (DEGs) were identified at the first two days after inoculation, aiming to detect early responses in resistant and susceptible cupuassu genotypes, as well as the intrinsic differences between them. The phytohormone signature and the enrichment of ontology terms were analyzed to better understand the differences in genotypes molecular responses to the presence of the fungus. Furthermore, the DEGs belonging to categories of genes known to be part of the resistance mechanism, such as transcription factors (TFs), pattern-recognition receptors (PRRs) and pathogenesis-related proteins (PRs) were analyzed in detail.

Biological material and inoculation
Clones of T. grandiflorum, one resistant and one susceptible to M. perniciosa, were grown in the field for approximately 3 years at the experimental campus of José Haroldo Genetic Resources Station (ERJOH) of CEPLAC -Executive Commission of the Cacao Farming Plan (Marituba, PA, Brazil), GPS coordinates 1°22 ′ S 48°17 ′ W. These were primary clones collected from wild parents in the Amazon State (Brazil). Clone 174/cultivar BRS Coari (resistant genotype -RG) was collected from the Coari County (AM) and clone 1074 (susceptible genotype -SG) from the Itacoatiara County (AM). M. perniciosa basidiospores were obtained from basidiocarps developed on brooms of cupuassu trees at ERJOH. Basidiospores were collected in 16% glycerol in 0.01 M 2morpholinoethanesulfonic acid (MES), pH 6.1 (Frias et al. 1995;Leal et al. 2010), and preserved in liquid nitrogen, in the Phytopathology Laboratory of ERJOH, until further use.
For cupuassu inoculation, plants have been previously pruned to induce new profusely branched shoots. Selected apical shoots were inoculated with 30 μL of M. perniciosa basidiospore suspension (10 5 spores mL −1 ) (Surujdeo-Maharaj et al. 2003), with germination rate higher than 95%, assessed by observation under a reverse-phase optical microscope, immediately before inoculation. The inoculated apical shoots were covered in plastic bags for 24 h to ensure a humid environment This artificial inoculation method has already been used successfully in cupuassu (Mournet et al. 2020). Afterwards, apical shoots were collected, trimmed to remove leaves and petioles, and the tip (0.5 cm) was cut and immediately immersed in RNAlater® solution (Invitro-genTM, Carlsbad, CA, USA). Each treatment (non-inoculated, 24 and 48 HAI), composed of five apical shoots, was performed with three biological replicates. The experiment was carried out once, in a completely randomized design. Therefore, the experiment comprised 2 genotypes (RG x SG) * 3 treatments (non-inoculated, 24, 48 HAI) * 3 replicates each, generating 18 samples, further used to construct the cDNA libraries. Non-inoculated shoots were collected on the same day of 24 HAI samples. The experiment was carried out in a dry season to prevent natural fungal sporulation (March 2016, at temperatures varying from 23 to 29°C and under photoperiod of 12 h/12 h, with total precipitation of 480 mm in that month). The investigation was approved by the Brazilian Genetic Heritage Management Council (CGEN) (SISGEN registration number AD93167).

RNA isolation
RNA isolation method was adapted from the one described for pine trees (Chang et al. 1993). Apical shoot samples were ground in liquid nitrogen. Next, 1 mL of CTAB buffer (2% CTAB, 4% PVP40, 100 mM Tris-HCl [pH 8], 25 mM EDTA [pH 8], 2 M NaCl, 2% β-mercaptoethanol) was preheated to 65°C and then added to 100 mg of macerated material. The samples were vortexed for 10 min, maintained at 65°C for 15 min, and then vortexed again for 1 min every 5 minutes. Then, 0.8 mL of chloroform: isoamyl alcohol solution [24:1 (v/v)] was added to each sample, followed by vortexing for 1 min and centrifugation at 12,000 × g and 4°C for 5 min. The aqueous phase was collected, and the previous step was repeated. Total RNA was precipitated by adding lithium chloride at a final concentration of 2 M and incubating for 2 h at −20°C. Following centrifugation at 12,000 × g and 4°C for 15 min, the precipitated pellets were washed with 70% ethanol and resuspended in 20 μL of DEPc-treated H 2 O. The samples were then treated with DNAse I (Fermentas, 1 U·mg −1 of RNA) and purified using the Reliaprep kit TM (Promega, Madison, WI, USA) according to the manufacturer's instructions.
The quantity and quality of RNA samples were assessed using a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Finally, 1 μg of RNA, stabilized in an RNAstable tube (Biomatrica, San Diego, CA, USA), was sent for mRNA sequencing to an NGS service provider.
Library construction, sequencing, annotation, and data processing Eighteen cDNA libraries were constructed from cupuassu apical shoots (three replicates for each of the three treatments [non-inoculated, 24, 48 HAI] and two genotypes [RG and SG]). The three treatments were labeled RT and ST for non-inoculated controls, R24 and S24 for 24 HAI, and R48 and S48 for 48 HAI. Libraries were prepared and sequenced at the Roy J. Carver Biotechnology Center of the University of Illinois at Urbana-Champaign (Illinois, USA). Each library was paired-end sequenced on a HIseq 2500 sequencer (Illumina platform) at 100 bp size, generating approximately 20 million reads per library. The data (Fastq) were analyzed at the Bioinformatics Laboratory of Embrapa Genetic Resources and Biotechnology. The reads were filtered based on positional quality over a flow cell using the FilterByTile program of the BBMap package (https://sourceforge.net/projects/bbmap/). Quality reads were mapped to the T. cacao V.2 genome using a STAR software package (Dobin et al. 2013), followed by HTSeq-count (Anders et al. 2015). Transcriptome completeness was accessed using the BUSCO software after merging STAR bam files with PICARD and extracting transcripts with Trinity.
The raw number of mapped reads calculated per sample was used as the input for the R package NOISeq (Tarazona et al. 2015), applied to produce normalized values accounting for sample-specific effects and to obtain the inference of differential gene expression.
Intra-and inter-genotype comparisons were performed. For intra-genotype analysis, inoculated samples (24 and 48 HAI) of each genotype were compared to their respective non-inoculated control samples (RT and ST). For inter-genotype analysis (SG × RG), the samples were compared to identify intrinsic differences between the control, 24, and 48 HAI. Genes were considered differentially expressed if the base 2 logarithm of fold change (log 2 FC) was >2 or < −2 and the probability value was >0.95. The gff file from T. cacao genome [Belizean Criollo genotype (B97-61/B2), version 2, available at https://cocoa-genome-hub. southgreen.fr] was used for transcript annotation.
A heatmap of differentially expressed genes (DEGs) from each genotype under different treatments was generated based on log 2 FC values using the ClustVis web tool (https://biit.cs.ut.ee/clustvis/).

Gene ontology (GO) annotation and enrichment analysis
GO enrichment hypergeometric analysis of DEGs was performed using the FUNC software following PFAM and GO term annotation. Terms that showed significant enrichment (FDR > 0.05) were considered for sampling frequency.
The dataset is deposited in the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/) under accession number (the deposit is in process).

Phytohormone signature analysis
The HORMONOMETER program (https://hormonometer. weizmann.ac.il/) (Volodarsky et al. 2009) was used to compare T. grandiflorum gene expression in response to M. perniciosa to Arabidopsis thaliana gene expression in response to exogenous application of phytohormones. T. grandiflorum genes orthologous to Arabidopsis genes were selected from using bidirectional BLAST analyses. Gene pairs were considered orthologous if bidirectional best hit yielded an e-value ≤ 1e-5 in both directions.
Experimental validation of differential expression data using qRT-PCR qRT-PCR (Applied Biosystems) was performed using SYBR Green I PCR Master Mix (Applied Biosystems) on the ABI 7900 Real-Time PCR System (Applied Biosystems, Foster City, USA). Primers used in the reactions are listed in Table S1 (Supplemental Material). The housekeeping gene malate dehydrogenase of T. grandiflorum was used to normalize the expression data, as previously described (Santos et al. 2016b). The formation of primer dimers and multiple products was controlled using melting-curve analysis. Fold changes in the expression of selected genes were calculated as Log10 relative quantity (RQ) values using the Pfaffl 2 −ΔΔCt method (Pfaffl 2001).

T. grandiflorum transcriptome assembly
Transcriptomic data for T. grandiflorum were obtained through Illumina sequencing, and ∼20 million raw reads were generated with a read length of 198 bp paired ends. The number of input reads varied between 8,608,976 and 21,222,319, and 81-87% of these were uniquely mapped. Around 15% reads were unmapped or mapped to multiple loci were (Table S2, Supplemental Material).
A total of 21,441 unigenes were identified. After applying a filter for count (cut-off = 10 reads), the number of transcripts was reduced to 17,413, which were then used for differential gene expression analysis. Moreover, 75% of cupuassu protein sequences shared over 94% similarity with cacao sequences.
BUSCO analysis of T. grandiflorum transcriptome assembly revealed a completeness score of 93.5%.

Differential gene expression analysis
To evaluate the pattern of gene expression at the initial stages of response to M. perniciosa infection (24 and 48 HAI) in SG and RG, gene expression under each condition (genotype and time of inoculation) was analyzed using the R package NOISeq, and the DEGs were plotted on a heatmap ( Figure  1A). A total of 440 DEGs were identified among the different conditions. Note that each genotype analysis used its own non-inoculated library as the control (ST vs S24, ST vs S48, RT vs R24, and RT vs R48).
Furthermore, inter-genotype analysis was performed to assess the intrinsic differences of the genotypes. In this case, each analysis compared both genotypes under each treatment (ST vs RT, S24 vs R24, and S48 vs R48). A total of 301 DEGs were identified ( Figure 1E and F).
A markedly different response pattern between the two genotypes was observed. The pattern of differential gene expression in RG was different from that in SG at 24 and 48 HAI. Both heatmap and Venn diagram ( Figure 1) revealed some noteworthy differences.
In intra-genotype analysis, genes were upregulated under four different conditions. However, the number of DEGs was greater in SG than in RG (84 for S24 vs 37 for R24 and 135 for S48 vs 115 for R48) (Figure 1A-D; Table  S3A; Supplemental Material). Most DEGs were differentially expressed under specific conditions ( Figure 1B). Under all conditions, the number of upregulated DEGs was greater than that of downregulated ones. However, the ratio of up-to downregulated (up/down) genes was higher in SG than in RG (84/23 = 3.6 for S24 vs 135/16 = 8.4 for S48 and 37/15 = 2.5 for R24 vs 115/104 = 1.1 for R48). The number of downregulated genes was greater in R48 (104 DEGs) than in the other conditions ( Figure 1A, C, D). RG showed a significant increase in the number of DEGs from 52 at 24 HAI to 219 at 48 HAI, representing an increase of 167 DEGs. In SG, the change was less pronounced (107-151 DEGs), with an increment of only 44 DEGs ( Figure 1A and B).
In non-inoculated samples, intrinsic differences were observed between the genotypes (inter-genotype analysis), with most DEGs exhibiting higher expression levels in SG than in RG (ST = 93 and RT = 62). In inoculated samples, most DEGs at 24 HAI exhibited higher expression levels in SG than in RG (S24 = 131 and R24 = 76) ( Figure 1E and F; Table S3B; Supplemental Material). Meanwhile, at 48 HAI, only seven DEGs were identified (S48 = 1 and R48 = 6).
RNA-Seq data were validated using the RT-qPCR of selected DEGs. The results are shown in Fig. S1A (Supplemental Material). The differences in gene expression between the two genotypes at 24 and 48 hours after M. perniciosa inoculation were compatible with the results of RNA-Seq, with a Pearson's correlation coefficient (R 2 ) exceeding 0.8 (Fig. S1A, S1B).
To identify genes involved in disease resistance, DEGs in specific categories were analyzed: PRRs, TFs, PRs, phytohormone signatures, and gene ontology differential term enrichment.
In SG, the expression of most differentially expressed TFs increased at 24 HAI ( Figure 3A, continuous ellipse) but decreased at 48 HAI to the base level. In RG, the expression of differentially TFs remained unchanged at 24 HAI but decreased at 48 HAI. However, ethylene-responsive transcription factor 9 (ERF9), JUNGBRUNNEN 1 (JUB1), and MYB82 exhibited opposite trends, with increased expression in RG at 48 HAI ( Figure 3A, dashed ellipse).
In inter-genotype analysis, eight differentially expressed TFs were identified. Specifically, ERF9 and BEE3 showed higher expression levels in non-inoculated SG (RT vs ST). In addition, seven differentially expressed TFs were detected between R24 and S24 ( Figure 3B).
Additionally, in R48, ERF9 expression level was nearly 100 times higher than that in RT and R24 ( Figure 3A). However, in SG, ERF9 was already expressed at this level in noninoculated controls (ST) and showed no change afterward ( Figure 3B).

Pathogenesis related proteins
A total of 288 PR-like genes were identified in T. grandiflorum. Of these, 37 were differentially expressed in intra-genotype analysis. Several PRs presented opposite expression trends depending on the genotype. For instance, some PRs were up-regulated in S48 ( Figure 4A, dashed ellipse) but downregulated in R48, whereas some others were exclusively upregulated in R48 ( Figure 4A, continuous ellipse).
In inter-genotype analysis, the expression of four PR-like genes, namely chitinase 1, thaumatin-like protein, major allergen Pru ar 1, and acidic endochitinase, was higher in ST than in RT ( Figure 4B). However, the expression levels of these PRs in R48 increased to values similar to those in S48.
The expression profile of S24 was strongly and positively correlated (index > 0.6) with the profile of Arabidopsis treated with MJ60 (derivative of JA), ABA180, and IAA30. In S48, the strength of correlation increased under treatment with SA180 but decreased under treatment with MJ30/60/ 180, ABA180, and IAA30/60/180 compared with that in S24. In contrast, the expression profile of R48 showed weak or no correlation with MJ and IAA but was correlated with ABA60/180 and SA180. Furthermore, the expression profile of R48 was positively correlated with Zeatin180 (cytokinin). Meanwhile, SG showed no correlation with Zeatin180 but showed a strong negative correlation with brassino30/60 and ACC30/60 (ethylene) ( Figure 5).
The proportion of SA/JA in R48 was higher than that in S48, with correlation index ratios of 0.6/0.2 and 0.8/0.4, respectively, indicating the favoring of SA over the JA pathway, which was more evident in R48.
ABA and GA showed antagonist behavior in both genotypes at 24 and 48 HAI.
Since auxin is an important player in WBD, the behavior of small auxin-upregulated RNAs (SAURs)-a set of auxinresponsive genes-was examined. Some SAURs were upregulated in S24 and S48, whereas most were downregulated in R48 ( Figure 6).

GO enrichment analysis
To evaluate the pattern of gene expression at the initial stages of response to M. perniciosa infection, DEGs obtained using RNA-Seq were subjected to hypergeometric GO analysis.
The intra-genotype analysis revealed two enriched GO terms of upregulated DEGs in SG (terpene synthase activity and DNA-binding transcription factor activity) and only one enriched GO term of downregulated DEGs (heme binding) in RG at 24 HAI. Meanwhile, at 48 HAI, more GO terms of upregulated DEGs were enriched (18 terms), most of which were different between SG and RG ( Figure 7A).
Regarding responses to stimuli, for instance, the GO terms were related to response to biotic stimuli (enriched by major allergen Pru ar 1, PR-4B, and MLP-like protein 31 genes) in R48 and response to wounding (enriched by Glu S. griseus protease inhibitor gene) in S48.
Furthermore, RG and SG differed in terms of oxidoreductase activity (GO:0016491). For instance, more protein detoxification genes were upregulated in SG. In R48, the most enriched GO term was peroxidase activity, which was not enriched in SG.
The GO term extracellular region was enriched by the upregulated genes expansin B1, PR1, xyloglucan endotransglucosylase/hydrolase protein 2 (XTH2), and putative metalloendoproteinase 2-MMP in R48. Two enriched GO terms of the downregulated genes (response to auxin and sexual reproduction) appeared in R48. Response to auxin was enriched by SAUR21 and SAUR71, while sexual reproduction was enriched by two expansin genes A1 and B2.
Inter-genotype analysis revealed seven enriched GO terms for controls, RT, and ST. The expression levels of most of these genes were higher in ST than in RT ( Figure  7B). At 24 HAI, nine GO terms were enriched, and expression of all these genes were higher in S24 than in R24 ( Figure 7B). No enriched GO term was identified at 48 HAI. Only two GO terms (enzyme inhibitor activity and ADP binding) were enriched for genes with higher expression levels in RT. The GO term enzyme inhibitor activity was enriched by the following DEGs: putative pectinesterase/pectinesterase inhibitor 54 (Tc09v2_g006040), 21 kDa protein (Tc03v2_g021470), and pectinesterase 2 (Tc02v2_g024980). The GO term ADP binding was enriched by genes encoding putative disease-resistance proteins in both genotypes. Two of the genes, namely putative disease resistance protein At4g11170 (disease resistance protein of the TIR-NBS-LRR class family) and putative disease resistance protein RPS2, which contains a nucleotide-binding site and leucine-rich repeats, showed the highest expression levels in RG.

Discussion
In the present study, depth sequencing and differential gene expression analysis were used to identify genes involved in resistance to M. perniciosa in T. grandiflorum for better understanding the molecular basis of disease resistance mechanism.

T. grandiflorum transcriptome assembly
Using RNA-Seq and T. cacao genome as the reference (Argout et al. 2011), genes expressed in meristematic apical shoots, one of the target tissues of M. perniciosa, were described on a large scale. The genome sequence of cacao has already been used as reference to construct a high-density genetic map of cupuassu through genotyping-bysequencing. Regarding to synteny, the average homology between the linkage groups of the two species has been reported to be 97.2% (Mournet et al. 2020). Furthermore, the transferability of molecular markers between the two species has been reported (Alves et al. 2006;Kuhn et al. 2010;Santos et al. 2016a;da Silva et al. 2017).
The transcriptome sequence assembled in the present study is a valuable resource, considering that only a few genes are described in the NCBI data bank, with a majority (126) being related to T. grandiflorum chloroplast genome (Niu et al. 2019) (https://www.ncbi.nlm.nih.gov/gene/?term = theobroma %20grandiflorum, accessed on September 30, 2021).

Differential gene expression
In response to pathogen challenge, the analysis revealed 440 putative genes associated with early resistance mechanisms, corresponding to 2.5% of the total unigenes. In a study on cacao, also performed at the early stages of infection (24-72 HAI) using cDNA macroarray, only a few DEGs were described (111 DEGs among 2,855 genes analyzed, i.e. 3.9%) (da Hora Junior et al. 2012). In the present study, the tested cupuassu genotypes exhibited a differential response at the initial stages of infection (up to 48 HAI). Specifically, in SG, the defense machinery was activated earlier than that in RG (24 HAI), as evidenced by the upregulation of most of its DEGs. This response may be related to the establishment of the pathogen in tissues. Interestingly, previous studies have shown that susceptible cacao manifests an early but ineffective defense response and is unable to halt the disease (Leal et al. 2007;da Hora Junior et al. 2012;Teixeira et al. 2014). Indeed, more M. perniciosa hyphae were detected in susceptible cacao (Sena et al. 2014). In the present experiment, RG appeared to respond later, with a significant increase in the number of DEGs from 24 to 48 HAI, with a more balanced ratio between upregulated and downregulated genes.
Therefore, susceptibility to M. perniciosa may be associated with differences in gene expression profiles. In SG, disease establishment may be related to the higher expression levels of susceptibility genes. In the present study, among non-inoculated controls, several genes in SG showed an expression level that was reached by genes in RG only at 48 HAI, such as expansin B1, ERF9, and thaumatin-like protein, which have been implicated in susceptibility. Specifically, in A. thaliana, expansin EXLA2 is required for Botrytis and Alternaria infection (Van Schie and Takken 2014), and ERF9 overexpression allows plant colonization by an endophytic fungus (Camehl and Oelmüller 2010). Thaumatinlike proteins are produced by M. perniciosa, and they likely facilitate competition with endophytes, allowing their own establishment (Franco et al. 2015).
Overall, this set of genes provides clues into effective defense mechanisms. Therefore, some key categories of genes, including PRRs, TFs, and PRs, were addressed in this work and DEGs were subjected to GO enrichment and phytohormone signature analyses.

Pattern-recognition receptors
PRRs are transmembrane receptors that perceive extracellular molecules and activate a defense program known as PAMP-triggered immunity (PTI). PRRs comprise receptorlike kinases (RLKs, or receptor kinases) and receptor-like proteins (RLPs) (Han 2019). These genes provide a greater ability to perceive the presence of a pathogen and thus trigger a response. Here, it was demonstrated that some PRRs ( Figure 8) were activated in RG at 48 HAI. Heterologous PRR expression enhances drug resistance. In sweet orange, the expression of predicted A. thaliana LRR-RLK, an elongation factor-Tu receptor, enhanced resistance to citrus canker (Mitre et al. 2021). In addition, a mutation in LysM RLK1, a PRR required for chitin signaling in A. thaliana, increased susceptibility to fungal pathogens (Wan et al. 2008). In the present study, the inter-genotype analysis revealed higher expression levels of some PRRs in RG than in SG. Thus, these cupuassu PRR genes may be involved in recognizing M. perniciosa and triggering host defenses to halt infection.

Transcription factors
TFs are proteins involved in a multitude of biological processes and are implicated in defense against plant pathogens Theobroma grandiflorum-Moniliophthora perniciosa interaction. A) Heatmap of variation in TFs expression in intra-genotype analysis. The color scale presents unit variance, with down-and upregulated genes under different conditions indicated in blue and red, respectively. B) Heatmap of variation in TFs expression in inter-genotypes analysis. Genes with higher expression levels in SG are indicated in blue, while those with higher expression levels in RG are indicated in red. R = resistant, S = susceptible, RT and ST = non-inoculated controls, R24 and S24 = 24 HAI, R48 and S48 = 48 HAI. (John et al. 2021). In the present study, analysis of TFs demonstrated marked differences in their expression patterns depending on the genotype and time of M. perniciosa inoculation.
Among the differentially expressed TFs, some belonged to the WRKY family, which is known to be involved in biotic and abiotic stress responses (Rushton et al. 2010). Here, it was demonstrated that the transcript levels of tgWRKY40 and tgWRKY60 were increased in S24 but decreased in R48. WRKY40 is associated with susceptibility to M. perniciosa in cacao (Silva Monteiro de Almeida et al. 2017) and to the hemibiotrophic fungus Dothiorella gregaria Sacc. in Populus trichocarpa (Karim et al. 2015).
In addition, WRKY72 was up-regulated in S48. Hou et al. (Hou et al. 2019) showed that WRKY72 suppressed JA biosynthesis in rice. Thus, tgWRKY72 likely plays the same role in cupuassu, since its expression level was higher in S48 and its phytohormone signature was weakly correlated with JA treatment. Additionally, this TF suppressed the gene encoding oxophytodienoate reductase, an enzyme involved in JA synthesis, as evidenced by its downregulation in S48.
MYB82 was downregulated in S24 and slightly upregulated in R48 ( Figure 3A). MYB82 is directly involved in the formation of trichomes (Liang et al. 2014), which is a site of M. perniciosa penetration (Sena et al. 2014). Considering cytokinin is potentially involved in trichome formation (Greenboim-Wainberg et al. 2005;Li et al. 2021) and resistance to pathogen (Albrecht and Argueso 2016; Akhtar et al. 2020), it is possible that a relationship between cytokinin and MYB82 could be part of a mechanism used to prevent fungal invasion.
ERF9 showed the largest fold-change among the TFs evaluated in intra-genotype analysis. Specifically, its expression was increased nearly 100-fold in R48. The inter-genotype analysis showed that in SG, ERF9 was already expressed at this level in the non-inoculated control, and this level persisted at 24 and 48 HAI. ERF9 is a negative regulator of stress resistance in A. thaliana (Maruyama et al. 2013), and it is involved in endophyte establishment (Camehl and Oelmüller 2010). RG could maintain a different endophyte microbiome, thereby preventing WBD development. In addition, M. perniciosa has been identified as an endophyte in a cacao-resistant variety (Lana et al. 2011). JUB1, which belongs to the NAC TF class, was also upregulated in R48. TFs of this class serve important functions in plant development and abiotic stress response and can also act as regulators of plant immunity against fungal pathogens (Nakashima et al. 2012;Yuan et al. 2019). In Arabidopsis, JUB1 (AT2G43000) was induced by infection of the biotrophic fungus Golovinomyces orontii (Chandran et al. 2009), whereas in grapevines, the JUB1 ortholog NAC-like transcription factor 42 [VIT_12s0028g00860] was responsive to infection by the biotrophic fungus Erysiphe necator (Schw.) Burr, which is the causative agent of powdery  mildew (Toth et al. 2016). Considering that M. perniciosa is in the biotrophic phase at the early stages of disease, JUB1 upregulation in R48 may be associated with defense against this fungus.
Overall, this dataset suggests that several TFs play pivotal roles in plant defense against M. perniciosa, indicating candidate genes for further studies.

Pathogenesis related proteins
Plant PR proteins are induced in response to biotic and abiotic stresses (Van Loon 1997;van Loon et al. 2006). In the present study, a group of PRs was upregulated in S48 but downregulated in R48, indicating opposite gene expression profiles according to the genotype. (Teixeira et al. 2014) studied the transcriptome of susceptible cacao infected by M. perniciosa and reported at least 67 upregulated PRs; however, they were not sufficient to prevent disease progression.
Another hypothesis is that a set of PRs expressed in SG may inhibit endophytic fungi, which normally colonize cupuassu tissues, and this inhibition may eventually favor M. perniciosa infection. This hypothesis is supported by the fact that M. perniciosa also expresses some PR-like proteins, such as MpPR-1 and MpTLP, and these PRs have been proposed to inhibit competing fungi (Teixeira et al. 2012;Franco et al. 2015).

Phytohormone signature analysis
The modulation between disease and resistance is related to the induction of hormonal changes and complex crosstalk  To gain insights into the hormonal aspect of resistance in cupuassu, phytohormone signatures were evaluated in RG and SG in response to M. perniciosa. A clear difference in hormonal changes between the genotypes following inoculation was observed. In a previous study, HORMONOMETER analysis was performed for cacao green brooms to detect hormonal imbalances during WBD development (Teixeira et al. 2014).
One of the mechanisms used by fungi to establish disease in cacao is linked to the presence of auxin (Kilaru et al. 2007). The HORMONOMETER analysis of cupuassu revealed that SG was correlated with the presence of auxin, while R48 was not. Furthermore, genes responsive to auxins, such as SAURs, were downregulated in R48. It is known that SAUR19-24 are positive regulators of cell expansion (Spartz et al. 2012). Thus, the lack of expression of these genes in RG can limit cell expansion-a process that occurs during M. perniciosa infection (Teixeira et al. 2014), thereby restricting pathogen establishment.
Next, the involvement of SA and JA pathways in resistance mechanisms was evaluated. SA is the most important phytohormone for the regulation of defense against biotrophic and hemibiotrophic pathogens, whereas JA enhances resistance to necrotrophic pathogens (Robert-Seilaniantz et al. 2011;Fu and Dong 2013). The antagonism between SA and JA is well-known (Pieterse et al. 2012;Verma et al. 2016). Here, the results indicated that R48 favored response via the SA pathway, antagonizing the JA pathway, while SG presents gene expression responses related to increased levels of SA and JA, as early as 24 HAI. This may be the core of susceptibility. Furthermore, the SG has higher levels of transcripts of genes known to be related to acquired resistance, a process known to be initiated by SA. It is already known that M. perniciosa produces and induces an increase of IAA and SA phytohormone in cacao leaves and the presence of SA or JA stimulates mycelial growth (Kilaru et al. 2007). Therefore, increasing levels of these phytohormones benefits the fungus.
A hormonal signature that calls attention is the significant correlation index of CK (zeatin180) in R48, unlike that in S24 and S48. This phytohormone plays pivotal roles in pathogen resistance (Albrecht and Argueso 2016;Akhtar et al. 2020). In rice, the synergism between SA and CK augments resistance to the hemibiotrophic fungus Magnaporthe oryzae (Jiang et al. 2013). Considering that M. perniciosa is also a hemibiotrophic fungus, this process may occur in cupuassu trees, as evidenced by the strong correlation between CK and SA signatures in R48. Moreover, a crosstalk was observed when the response of RG to BRs was analyzed. The response profile of R48 to BRs was negatively correlated with the response profile of Arabidopsis to this hormone. BRs are involved in plant growth and development (Cheng et al. 2017), and their responses to pathogens are based on the growth and defense balance (Huot et al. 2014). In addition, plants defective in their response to BRs show greater resistance to fungi (Ali et al. 2014;Goddard et al. 2014). Here, the results indicate that RG exhibited hormonal balance, inhibiting growth-promoting pathways, decreasing BRs, and increasing SA-induced defense responses. Furthermore, there was a negative correlation with ethylene, indicating the occurrence of an SA-mediated defense mechanism against biotrophic rather than necrotrophic fungi (Li et al. Overall, the signaling pathways and crosstalk between hormones begin very early after pathogen inoculation, and hormonal interplay is crucial for an efficient defense response. Of note, however, these responses are also affected by other stresses to which plants are subjected in the field, including insects or other microorganisms.

GO enrichment
To better understand the overall scenario of transcriptomic differences, either intra-or inter-genotype, GO enrichment analyses were performed.
GO data revealed significant differences in response to pathogen inoculation, and this difference was more evident at 48 HAI, as shown by intra-genotype analysis. In SG response, no GO term of downregulated genes was enriched. In contrast, in RG response, three terms of downregulated genes were enriched (response to auxin, heme binding, and sexual reproduction). The GO term sexual reproduction was enriched by two expansin genes (A1 and B2), which were downregulated in R48. Expansins play diverse biological roles in plant development, involving the loosening of cell walls (Marowa et al. 2016), and their suppression has been described in resistance to phytopathogens (Abuqamar et al. 2013;Ding et al. 2018). Thus, the downregulation of these genes in R48 may prevent cell wall expansion, thereby limiting M. perniciosa development. This speculation is consistent with the GO term extracellular region, which was enriched with upregulated genes in R48, comprising enzymes involved in cell wall construction and extracellular matrix degradation and remodeling, such as xyloglucan endotransglucosylase/ hydrolase protein 2 (XTH2) and putative metalloendoproteinase 2-MMP (Stratilová et al. 2020).
Interestingly, the GO term response to the stimulus showed a clear difference between the genotypes at 48 HAI. In SG, terms related to the response to wounding were enriched. Meanwhile, in RG, terms related to the response to biotic stimuli were enriched by genes, such as the major allergen Pru ar 1, PR-4B, and MLP-like protein 3. Therefore, these genes may be reliable candidates as determinants of resistance mechanisms and can serve as targets of further studies.
Differences between the responses of genotypes to oxidoreductase activity indicate an important role of ROS in resistance mechanisms. ROS accumulation is involved in susceptibility and resistance to pathogens in cacao, depending on the pathway triggered. Fister et al. (2015) (Fister et al. 2015) (Fister et al. 2015) (Fister et al. 2015) (Fister et al. 2015) reported that when plants were treated with SA to mimic pathogen infection, the expression levels of genes potentially involved in ROS accumulation were higher in the chloroplasts and mitochondria of a WBD-resistant genotype (Sca6) than in those of a susceptible one (ICS1) (Fister et al. 2015). Here, peroxidase activity was the most enriched GO term related to oxidoreductase activity in R48. Similar results have been reported in cacao, in which two peroxidases were expressed at higher levels in resistant cultivars (Leal et al. 2007). Similarly, Camillo et al. (2013) (Camillo et al. 2013) showed that cytosolic ascorbate peroxidase is involved in cacao resistance to M. perniciosa (Camillo et al. 2013).
Another important difference was related to the GO term terpene synthase activity. It was enriched as early as 24 HAI in SG and enriched only at 48 HAI in RG, although the genes in each GO term differed between the genotypes. Terpene synthases are involved in the production of precursors to numerous terpenoids, including phytohormones, such as ABA (Nambara 2017). This report is consistent with the results of T. grandiflorum hormone analysis, which showed that at 24 HAI, SG presented a response equivalent to ABA treatment. As genes in the enriched GO terms differed between the two genotypes, the composition of terpenes may have varied among the plants, leading to distinct pathogen responses.
In RG, two GO terms were enriched by upregulated genes, namely enzyme inhibitor activity and ADP binding. The former was enriched by pectinesterases/pectinesterase inhibitors, whose action affects the texture and rigidity of the cell wall (Pelloux et al. 2007). This result supports the notion that cell wall alterations are involved in molecular defense mechanisms.
Next, the GO term ADP binding was enriched in genes encoding putative disease resistance proteins in both genotypes. Here, disease resistance proteins of the TIR-NBS-LRR family and the putative disease resistance protein RPS2, harboring a nucleotide-binding site and leucine-rich repeats, were highlighted, as their expression levels were the highest in RG. In transgenic rice, RSP2 triggered a resistance response to the hemibiotrophic fungus M. oryzae (Li et al. 2019). Here, the higher expression of these receptors in RG indicates a mechanism of resistance.
This work suggests that the resistance and susceptibility in cupuassu may be related to the microenvironment that the fungus encounters in the first hours of interaction in the contrasting genotypes. This microenvironment could influence the colonization and manipulation of meristematic cells by M. perniciosa. In the susceptible plants, the fungus finds a typical environment of acquired systemic response, which can alter the endophytic community, which in turn could favor the pathogen. In 24 h, the increase of phytohormones, such as auxins, SA, and JA benefits mycelial growth. In the resistant plants, the fungus encounters a greater amount of proteins related to cell wall modification, which may hinder the manipulation of the cell wall by the fungus. Additionally, in RG, the fungus faces a greater amount of some PRRs-like proteins, which may be involved in the activation of differentiated and efficient defense cascades, resulting in differences related to auxin response, cell wall expansion, biotic stress responses, ROS detoxification and phytohormones balance (Figure 8). These insights indicate avenues for more detailed studies of genes and defense mechanisms involved in the Theobroma-M. perniciosa interaction.

Conclusions
T. grandiflorum transcriptome depth sequencing and mapping to the T. cacao genome identified 21,441 unigenes, generating the most comprehensive transcript database of cupuassu apical shoots so far. This dataset represents the first transcriptome of T. grandiflorum-M. perniciosa interactions at very early and asymptomatic stages of WBD. The results revealed critical molecular differences between contrasting cupuassu genotypes with respect to their susceptibility and resistance to WBD. These data can serve as a valuable resource for further analyses aimed at elucidating the mechanisms of resistance and susceptibility in cupuassu trees, supporting the development of cupuassu breeding programs and WBD control measures. Furthermore, considering the relatedness of T. grandiflorum and T. cacao, the cupuassu-M. perniciosa pathosystem can also benefit the understanding of the cacao-M. perniciosa pathosystem.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Notes on contributors
Loeni Ludke Falcão holds a degree in Agronomy from the Federal University of Santa Maria, Brazil (2000), master's degree from the University of Brasilia, Brazil (2003). Since 2002, she has been a research analyst at Embrapa Genetic Resources and Biotechnology. She is currently a doctoral student at the University of Brasilia.
Joseilde Oliveira Silva-Werneck the University of Brasília, Brazil (1985), a MSc degree in Molecular Biology from the University of Brasília (1997) and a PhD degree in Molecular Biology from the University of Cambridge, UK (2006). She is currently a researcher at Embrapa Genetic Resources and Biotechnology. She has experience in the areas of Molecular Biology, Biochemistry, Microbiology and Biological Control, working mainly on the following topics: characterization, cloning and gene expression, protein purification, plant-microorganisms interaction, transcriptome analysis, plant transformation and characterization.
Paulo Sergio Bevilaqua Albuquerque holds a degree in Agronomy from the Federal Rural University of Amazon, Brazil (1985), a Master's in Agronomy (Phytopathology) from the University of São Paulo, Brazil (1993) and a PhD in Agronomy (Phytopathology) from the University of São Paulo (2006). He is currently a researcher at the CEPLAC (Executive Commission of the Cacao Farming Plan). He has experience on Phytopathology, working mainly on the following topics: resistance of cacao to Moniliophthora perniciosa, linkage maps and identification of QTLs related to resistance to M. perniciosa.

Rafael Moyses Alves
He holds a degree in Agronomy from the Federal Rural University of Amazon, Brazil (1977), a Master's in Genetics and Plant Breeding from the University of São Paulo, Brazil (1985) and a PhD Genetics and Plant Breeding from the University of São Paulo (2003). He is a researcher at the Embrapa Eastern Amazon. He has experience in Agronomy, with emphasis on Plant Breeding, working mainly on the following topics: Theobroma grandiflorum, characterization of clones, cultivars, native fruit trees and genetic diversity.
Priscila Grynberg is graduated in Biological Sciences (2004) with a Master's in Parasitology (2005) and a Doctorate in Bioinformatics (2011) from the Federal University of Minas Gerais, Brazil. She is a researcher of the Bioinformatics Lab at Embrapa Genetic Resources and Biotechnology.
Roberto Coiti Togawa He holds a degree in Computer Science from the University of Brasília (1984) and a Ph.D. from the University of Bedfordshire, UK (2006). He worked in system management, software support, and systems development for several years. He is currently a research analyst at Embrapa Genetic Resources and Biotechnology since 1995 at the Bioinformatics Laboratory. Has experience in Computer Science, with emphasis on Programming Languages, working mainly on the following topics: Development of genome sequence analysis tools, and NGS sequence analysis. Development of tools for analysis of protein structures, Structural Bioinformatics, and Development of the tools for prediction and analysis of Membrane Proteins.
Marcos Mota do Carmo Costa He holds a degree from UniCEUB (University Center of Brasília), Brazil (1979), a master's degree from the Pontifical Catholic University of Rio de Janeiro, Brazil (1985), and a doctorate from the University of London (1990). He was a professor at the Catholic University of Brasilia. He has worked at Embrapa Genetic Resources and Biotechnology since 1980. He has experience in Computer Science, with an emphasis on Computer Theory, working mainly on the following subjects: distributed objects, automatic theorem proof, genomic analysis, bioinformatics, and program analysis, working in the field of Bioinformatics since 2001.
Dr. Marcelo Macedo Brigido is a molecular biologist and full professor at the University of Brasilia, Brazil.
Lucilia Helena Marcellino is a biologist with master's degree and doctorate in Molecular Biology at the University of Brasilia, Brazil (2002). She is a researcher at Embrapa Genetic Resources and Biotechnology, acting on the following subjects: functional genomics; development and application of genomic tools for plant breeding; gene prospecting for plant pathogen resistance; gene regulation and gene promoters. Coordinates and participates in projects related to functional genomics and plant-pathogen interaction with emphasis in Theobroma-Moniliophthora perniciosa.

Funding
This work is part of an international project (TheoBRomics) funded by a tripartite call involving Embrapa (Brazilian Agricultural Research Corporation), Capes (Brazilian Federal Agency for Support and Evaluation of Graduate Education) and Agropolis Foundation. Cupuassu population maintenance, basidiospores collection, storage and inoculation were funded by CEPLAC (Executive Commission of the Cacao Farming Plan).

Author contributions
LLF and LHM conceived and designed the experiments. LLF performed the experiments. PG, RCT, and MMCC performed bioinformatic analyses. MMB provided intellectual and editorial comments. PSBA provided the plant material and developed the inoculation method. RMA developed cupuassu clones. LHM and MMB are advisors to LLF. LLF, JOSW, and LHM performed data analysis and wrote the manuscript. LHM supervised this project. All authors have read and approved the submitted version of the manuscript.

Data archiving statement
The database was deposited in the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/) under accession number (the deposit is in process; the accession numbers will be supplied once available).