New insights into the evolution of host specificity of three Penicillium species and the pathogenicity of P. Italicum involving the infection of Valencia orange (Citrus sinensis)

ABSTRACT Blue and green molds, the common phenotypes of post-harvest diseases in fruits, are mainly caused by Penicillium fungal species, including P. italicum, P. digitatum, and P. expansum. We sequenced and assembled the genome of a P. italicum strain, which contains 31,034,623 bp with 361 scaffolds and 627 contigs. The mechanisms underlying the evolution of host specificity among the analyzed Penicillium species were associated with the expansion of protein families, genome restructuring, horizontal gene transfer, and positive selection pressure. A dual-transcriptome analysis following the infection of Valencia orange (Citrus sinensis) by P. italicum resulted in the annotation of 9,307 P. italicum genes and 24,591 Valencia orange genes. The pathogenicity of P. italicum may be due to the activation of effectors, including 51 small secreted cysteine-rich proteins, 110 carbohydrate-active enzymes, and 12 G protein-coupled receptors. Additionally, 211 metabolites related to the interactions between P. italicum and Valencia orange were identified by gas chromatography-time of flight mass spectrography, three of which were further confirmed by ultra-high performance liquid chromatography triple quadrupole mass spectrometry. A metabolomics analysis indicated that P. italicum pathogenicity is associated with the sphingolipid and salicylic acid signaling pathways. Moreover, a correlation analysis between the metabolite contents and gene expression levels suggested that P. italicum induces carbohydrate metabolism in Valencia orange fruits as part of its infection strategy. This study provides useful information regarding the genomic determinants that drive the evolution of host specificity in Penicillium species and clarifies the host-plant specificity during the infection of Valencia orange by P. italicum. IMPORTANCE P. italicum GL_Gan1, a local strain in Guangzhou, China, was sequenced. Comparison of the genome of P. italicum GL_Gan1 with other pathogenic Penicillium species, P. digitatum and P. expansum, revealed that the expansion of protein families, genome restructuring, HGT, and positive selection pressure were related to the host range expansion of the analyzed Penicillium species. Moreover, gene gains or losses might be associated with the speciation of these Penicillium species. In addition, the molecular basis of host-plant specificity during the infection of Valencia orange (Citrus sinensis) by P. italicum was also elucidated by transcriptomic and metabolomics analysis. The data presented herein may be useful for further elucidating the molecular basis of the evolution of host specificity of Penicillium species and for illustrating the host-plant specificity during the infection of Valencia orange by P. italicum.


Introduction
Host specificity refers to the ability of an organism to colonize another organism [1]. Some plant pathogens have a broad host range, whereas others can only colonize specific plant species or families. Even within the same genus, there are considerable differences in the host range among species, which may be related to pathogen evolution. During the evolution of plant pathogens, host shifts may occur to broaden or limit the host range [2]. In extreme cases, a pathogen evolves to colonize a new host that is phylogenetically distant from the original hosts (i.e., host jump), possibly resulting in the emergence of new fungal diseases [3,4]. Therefore, elucidating the molecular basis of the evolution of host specificity in fungal plant pathogens is critical for developing strategies to decrease the yield losses of economically valuable crops.
The genus Penicillium comprises ascomycetous fungal species that are widely distributed, with important implications for natural environments as well as in the food and drug industries. Some members of this genus produce penicillin, an antibiotic, whereas other species are used to make cheese. However, some Penicillium species are among the most important plant pathogens responsible for postharvest diseases. For example, Penicillium italicum, Penicillium digitatum, and Penicillium expansum cause losses of up to 10% of harvested crops. Of these species, P. italicum and P. digitatum exclusively infect citrus fruits, but P. expansum can infect diverse fruit and vegetable crops, such as apple, pear, peach, strawberry, and tomato, but not citrus fruits [5]. The genomes of P. italicum, P. digitatum, and P. expansum strains have been sequenced [6][7][8][9]. Additionally, Li et al. [8] and Ballester et al. [7] focused on the relationship between secondary metabolism and infections in P. expansum. However, the pathogenicity and evolution of host specificity in Penicillium species have remained largely uncharacterized.
The evolution of host specificity of fungi is mediated by a complex process involving multiple genes and mechanisms. A multidisciplinary approach may unveil new molecular determinants that drive the evolution of host specificity. In this study, we sequenced the genome of a P. italicum strain and then applied comparative genomics, dual-transcriptomics, and metabolomics analyzes to uncover the genomic determinants associated with the evolution of Penicillium species and elucidate the molecular mechanism underlying the host-plant specificity during the infection of Citrus sinensis by P. italicum. The results presented herein provide the molecular basis for clarifying the genomic determinants that drive the evolution of host specificity of Penicillium species and the host-plant specificity during the infection of Valencia orange by P. italicum.

Penicillium italicum genome sequencing and annotation
An Illumina HiSeq 2000 sequencing platform was used to generate 1,815 Mb of raw data for the P. italicum strain GL_Gan1 genome. The assembled genome contained 31,034,623 bp, with an average GC content of 45.83%, including 361 scaffolds ranging from 1,002 bp to 994,651 bp and 627 contigs ranging from 213 bp to 867,867 bp. The N50 values of the scaffolds and contigs were 316.59 kb and 196.54 kb, respectively (Table 1), which were considerably higher than those of the released genomes of P. italicum B3 and P. italicum PITC [7,8]. Moreover, the P. italicum GL_Gan1 genome comprised 9,447 genes, which consisted of 15,524,911 bp, including 13,960,971 bp of exons and 1,563,940 bp of introns. The repeat sequence contained 3,791,084 bp. The predicted genes included 7,136 core genes and 85 species-specific genes (Dataset S1), which represented 75.54% and 0.90% of the total number of genes, respectively. The genes were functionally annotated with the Gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Clusters of Orthologous Groups (COG) databases (Fig.  S1). The top matches for the GO, KEGG, and COG databases included 3,350 (35.46%), 1,172 (12.41%), and 1,447 (15.31%) genes, respectively, and the most abundant assigned terms were 'metabolic process', 'xenobiotic bio-degradation and metabolism', and "general function prediction only", respectively (Fig. S1).
Horizontal gene transfer (HGT) is a major evolutionary factor that enables organisms to quickly acquire new genes that enhance environmental adaptation and survival. In this study, we identified 383 transferred genes in the P. italicum GL_Gan1 genome (Dataset S1). Regarding the functional annotation with the COG database (Dataset S1), the top three categories were secondary metabolite biosynthesis, transport, and catabolism (49 proteins), amino acid transport and metabolism (48 proteins), and energy production and conversion (44 proteins). Similarly, the top categories for the functional annotation with the GO database (Dataset S1) were metabolic process (294 genes) and catalytic activity (291 genes). Interestingly, the most enriched KEGG pathways (Dataset S1) were xenobiotics biodegradation and metabolism (137 genes), amino acid metabolism (104 genes), and carbohydrate metabolism (97 genes). These results suggested that genes associated with metabolic processes and/or products are prone to be horizontally transferred [10]. Moreover, GL_Gan1_GLEAN_10000555 encodes a polyketide synthase (PKS) with two active domains (MDR and Qor). This PKS is a core enzyme responsible for catalyzing the first reaction during the biosynthesis of secondary metabolites (SMs) [11]. We constructed a phylogenetic tree based on PKS sequences from Aspergillaceae, Bacteria, Glomerellaceae, Nectriaceae, Clavicipitaceae, and so on. There were three major clades, Polyporaceae and bacteria, Aspergillaceae and others. Surprisingly, when PKS from seven Bacteria genes and seven Penicillium genes were constructed a phylogenetic tress, it was found that Paenibacillus mucilaginosus gene was clustered with Penicillium genes (Fig. S2). Paenibacillus mucilaginosus is able to promote crop growth and widely used as microbial fertilizer in China [12]. Possibly, horizontal gene transfer occurred for P. italicum PKS gene from Paenibacillus mucilaginosus, however, further experiment is required to verify the hyposthesis. A similar HGT was reported for the phytohormone biosynthetic genes in Fusarium species [13].
Plants interact with pathogens through microbial effectors and host immune receptors. The successful colonization of a plant pathogen is the result of the interaction between pathogen effectors and host immune receptors. Pathogen effectors repress the host immune response to facilitate pathogen growth and colonization, which are critical for the pathogenicity of filamentous fungi. Sanchez-Vallet et al. proposed that some small-size peptides for secretion, cell walldegrading enzymes, protease inhibitors, interactors with the ubiquitin-proteasome system, and disruptors of the hormone signaling pathway can act as effectors [14]. It has been reported that some fungal effector proteins identified are relatively small protein containing fewer than 200 amino acids and more than 2% cysteines, i.e. small secreted cysteine-rich proteins (SSCPs), constituting a common source of fungal effectors [15]. In the present study, a total of 444 putative effector genes were identified in the P. italicum GL_Gan1 genome, of which 51 genes encoding SSCPs (Dataset S1) possibly play roles in trigger resistance or susceptibility in citrus fruit [16].
Plant cell wall is a complex matrix of polysaccharides, including pectin, hemicellulose and cellulose, which provides mechanical support for cells and forms a barrier against fungal invasion. Plant pathogens are equipped with multiple carbohydrate-active enzymes (CAZymes), including glycoside hydrolases (GHs), glycosyl transferases (GTs), polysaccharide lyases (PLs), carbohydrate esterases (CEs) and carbohydrate-binding modules (CBMs), which facilitate fungal invasion via degradation or modification of plant cell walls [17,18]. P. italicum is a necrotrophic pathogen and only infects citrus fruit through peel injuries in the field, the packing house, or during commercialization chains [6]. Initiative infection of P. italicum is usually seen only after about 3 d of incubation at room temperature, with water-soaked and soft peel. Therefore, hydrolytic enzymes produced by P. italicum appear responsible for the maceration of the tissue during disease development [19]. In the present study, 1,290 genes encoding CAZymes were identified in the P. italicum genome, including 18 PLs, 383 GTs, 623 GHs, 141 CEs, and 274 CBMs, which was much more than that observed in the other published and annotated P. italicum genome [8]. The number of CAZymes in P. italicum was similar to that of the necrotrophic fungi Fusarium oxysporum (approximately 1200), but much more than those of saprotrophic fungi, such as Chaetomium thermophilum var. thermophilum DSM 1495 and Neurospora crassa OR74A (less than 600) [17]. We also noticed that GHs were the most abundant CAZymes in P. italicum genome, of which some subgroups, including GH76, GH78, GH92, GH3, GH16, GH51, GH55, GH43, GH5, GH2, GH17, GH71, were involved in degradation of cell wall polysaccharides (www.cazy.org). The distribution of CAZymes in P. italicum was similar with other  [17].

Comparative genomics analysis reveals the evolution of Penicillium species
To elucidate the evolution of host specificity and the pathogenicity of Penicillium species, the genomes of two P. italicum strains (GL_Gan1 and PHI1/PITC), one P. digitatum strain (PHI26), and three P. expansum strains (CMP1, Pd1, and MD8) were compared. A genome wide analysis of synteny showed that Pe_CMP1/Pi_Gan1 ( Figure 1a) and Pi_PHI1/ Pi_Gan1 (Figure 1d) had high levels of synteny, followed by Pd_PHI26/Pi_Gan1 (Figure 1e). A total of 7136 single-copy core genes were used to constructed phylogenetic tree (figure 1f), with Aspergillus nidulans_FGSCA4 as an outgroup species, to further elucidate the evolutionary relationships among P. italicum, P. expansum, and P. digitatum. The results showed that P. italicum Gan1 and PHI1 were cluster into one group, while P. expansum CMP1, Pd1, and MD8 were cluster into another group. Moreover, the phylogenetic distance between P. italicum Gan1 and PHI1 was shorter than those between PHI26 and P. expansum CMP1, Pd1, and MD8. The inconsistence between synteny and phylogenetic tree might be related to the application of genome and single-copy core genes for comparative analysis, respectively. However, the comparative analysis of core genome showed the close relationship between P. italicum and P. expansum. Our results were consistent with the species tree reconstructed using 524 single-copy genes present in seven Penicillium species [7] and 2,134 single-copy genes present in eight Penicillium species [20].
The comparative genomics analysis demonstrated the evolutionary plasticity of these Penicillium genomes. The core genomes comprised 7,690 genes (P. digitatum), 8,375 genes (mean) (P. italicum), and 9,342 genes (mean) (P. expansum) ( Table 2), suggesting that lineage-specific gene expansion was essential for P. expansum evolution and its adaptations to diverse hosts. Similar results were reported by Yoshida et al. [21] who proposed that gene gains and losses are two of the primary contributors to the genome shift essential for species evolution.
The expansion of protein families associated with fungal pathogenicity was speculated to force the speciation of these Penicillium species. Table 3 presents the differences in the protein content among Penicillium species. 6,309 and 6851 protein families were detected in P. digitatum and P. italicum, respectively, whereas 7,640 protein families were detected in P. expansum, implying that the expansion of protein families has occurred from P. digitatum and P. italicum to P. expansum, which might be related to the fungal pathogenicity of the three Penicillium species.
Pathogen-host interaction (PHI) are crucial for initiating the infections of a host by a pathogen, which is regulated by pathogenesis-related secreted proteins, such as CAZymes and SSCPs [22]. PHI-base database (http://www.phibase.org/) is to provide expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions. In the present study, searching against the PHI database identified 1728 putative PHI genes in the P. italicum GL_Gan1 genome which were homologous genes involved in pathogenicity in other fungi. A comparison of the number of PHI genes revealed that P. digitatum (average 1,503) has fewer PHI genes than P. expansum (average 2,214) and P. italicum (average 1,728) (Table 3). Furthermore, P. italicum, P. digitatum and P. expansum shared common subgroups (399) with differential numbers of PHI genes in some subgroups. Moreover, each species had its own specific PHI subgroups (Dataset S2). Possibly, the differential compositions and number in PHI genes among three Penicillium species were related to the different host ranges.
G protein-coupled receptors (GPCRs) are seventransmembrane domain-containing receptors that transduce extracellular signals and activate intracellular responses [27]. They have important functions related to fungal growth and survival. The number of GPCRs varies depending on the fungal species. For example, the Sporisorium scitamineum genome encodes only six GPCRs, whereas the Magnaporthe grisea genome includes 61 GPCR genes [28,29]. Penicillium species have a few GPCRs, and no GPCR gene expansion was detected during evolution.
Cytochrome P450 s (P450 s) belong to a superfamily of mono-oxygenases, which are distributed in a wide range of organisms. The diversity in the number of P450 genes and families among organisms reflects an evolutionary driving force toward speciation [30]. Compared with the number of P450 genes in P. italicum (average 29) and P. digitatum (average 12), a marked gene expansion occurred in P. expansum (average 43).
Secondary metabolites are necessary for fungi to produce a variety of chemical compounds, including toxins and antibiotics, which contribute to fungal virulence, host specificity, and adaptations to ecological niches [31]. The specialist P. digitatum has fewer SM biosynthetic gene clusters (average 32) than P. italicum (average 43) and P. expansum (average 65). Recently Wu et al. (2019) reported the similar composition of SM gene clusters in different Penicillium species, but further functional characterization of all these genes are required to be exactly identified by RNA interference and/or gene deletion studies [32].
Therefore, protein-family expansions in these Penicillium lineages were possibly associated with the evolutionary processes in these species. It seems that P. italicum is a transitional species in the Penicillium lineage that developed during the evolution of P. digitatum to P. expansum. Thus, P. expansum may be a newer species than P. italicum. This is consistent with the principle of speciation that suggests the newly evolved P. expansum should have a broader host range than the older related species. Additionally, our data are consistent with those of previous studies on seven Metarhizium species and three Fusarium species, which indicated that the genome and protein families expanded considerably in the transitional and generalist species [33,34]. Alternatively, these Penicillium species may have undergone a parallel evolution, but the divergence of P. italicum and P. digitatum occurred because of gene losses, particularly the loss of genes associated with virulence and metabolic products, resulting in major changes to P. italicum and P. digitatum that enabled them to infect new hosts. Accordingly, these two species may have been exposed to extreme external pressures that led to a rapid evolution of effectors for their colonization of a new host species. Similarly, a loss of genes associated with virulence and toxin biosynthesis in host-specific strains of Metarhizium species resulted in the divergence or rapid evolution of specialist Metarhizium species [35]. Moreover, gene losses rather than gains occurred in Melanopsichium pennsylvanicum, which expanded its host range from monocot to dicot hosts [36].
Genome restructuring may have contributed to the evolution of host specificity. Transposable elements (TE) cause gene mutations and the evolution of genomes [37]. We detected considerably more TEs in P. digitatum and P. italicum than in P. expansum (Table S1), implying that P. expansum has fewer mutations than P. digitatum and P. italicum after the reinforcement of reproductive isolation. Moreover, the plasticity of the P. digitatum and P. italicum genomes was greater than that of the P. expansum genome.
Using Ka/Ks > 1 and P < 0.01 as criteria, we detected 17 genes under positive selection pressure among the examined Penicillium species. Positively selected genes, most of which are related to fungal virulence (Dataset S1), were frequently detected between P. italicum and P. expansum, rather than in P. italicum itself or between P. italicum and P. digitatum. These results imply that these genes may have contributed to the divergence of the host specificity between P. italicum and P. expansum, thereby providing important evidence of the evolutionary processes of these Penicillium species as well as the adaptation of P. expansum to various hosts.

Dual-transcriptome analysis of P. italicum GL_Gan1 and Valencia orange during colonization
To explore the mechanism underlying the colonization and pathogenicity of P. italicum on Citrus sinensis, the transcriptome profiles of Valencia orange and P. italicum GL_Gan1 during the colonization were analyzed. We constructed RNA-sequencing (RNA-seq) libraries for the inoculated Valencia orange at 0, 1, 3, 5, and 10 days post-inoculation (dpi). The libraries were sequenced with the Illumina RNA-seq technology, and more than 20 M reads were generated for each library (Table S2). Additionally, 0.61%-57.27% of the clean reads were mapped to the P. italicum genome, in which 6,840-9,145 expressed genes were detected. Moreover, 12.07%-82.82% of the clean reads were mapped to the C. sinensis genome, in which 17,610--21,896 expressed genes were detected (Table S2). Furthermore, 6575 and 15,223 genes were expressed at all stages in P. italicum and Valencia orange, respectively ( Figure 3). Compared with the expression levels at 0 dpi, 228, 414, 376, and 336 genes in P. italicum and 301, 1,314, 1,955, and 479 genes in Valencia orange were differentially expressed at 1, 3, 5, and 10 dpi, respectively.
A short time-series expression miner (STEM) clustering analysis of gene expression during the infection of Valencia orange by P. italicum revealed that the fruit responses to P. italicum included cell wall and membrane degradation, accelerated polysaccharide metabolism and energy transformation, and increased metabolite production (Figure 4a). The genes associated with cellular components that exhibited the same temporal expression patterns were clustered together. Among these genes, the expression levels of the genes associated with the chloroplast, including stroma (95 assigned genes), envelope (95 assigned genes), thylakoid (17 assigned genes), and organization (19 assigned genes), were significantly down-regulated, suggesting that the chloroplast of the infected Valencia orange may have degraded or was otherwise adversely affected. Similarly, the expression levels of many genes related to protein synthesis, including the structural constituents of ribosomes (115 assigned genes) and translation (111 assigned genes), were down-regulated in the inoculated Valencia orange, indicating the infection impeded protein synthesis (Figure 4a). Additionally, glucosyltransferases and the related metabolism may be activated in Valencia orange. The expression levels of five of nine genes associated with fructokinase activity were up-regulated. It is possible that the growth of P. italicum exhibits a preference for glucose rather than fructose as an energy source. The impact of fungal pathogen on plant host transcriptome have widely revealed the altered gene expression patterns involved in defense and stress responses, cell wall structure, chloroplast biogenesis, carbon metabolism and transportation, and biosynthesis of secondary metabolites [38,39].
Plants have evolved sophisticated strategies to escape various pathogen attacks. Plant resistance genes (R genes) are important for resisting pathogen infections. These genes are structurally conserved in vertebrates and plants, with the following typical domains: NBS, LRR, TIR, and CC. A total of 636 R genes were identified in the Valencia orange genome (Table S3). The expression of most of these genes was down-regulated during infection ( Figure 5). We randomly selected 16 down-regulated R genes from the transcriptome data and compared their expression in non-inoculated and inoculated Valencia orange. Of the 16 genes, the expression of 12 was significantly lower in the inoculated Valencia orange than in the non-inoculated fruit at 3 dpi ( Figure 6). These results indicated that the P. italicum infection adversely affected R gene-based immunity. Houterman et al. [40] reported that Fusarium oxysporum secretes an effector, Avr1, which can induce and suppress R gene-based immunity in infected tomato plants. This effector activates the disease resistance of tomato plants via R genes (I or I-1), but suppresses the protective effects of two other R genes, I-2 and I-3.
Our STEM clustering analysis also confirmed that P. italicum exhibited accelerated growth and development, as the expression levels of all five genes  associated with a cell division control-related protein were up-regulated during infection. Similarly, many genes related to ATP hydrolysis also exhibited upregulated expression (Figure 4b). Therefore, P. italicum may be very actively absorbing nutrients, leading to accelerated growth.
P. italicum secretes many proteins and substances that enable it to infect and colonize Citrus sinensis, such as CAZymes, SSCPs, GPCR, and SMs. The CAZymes degrade plant cell walls to create an entry point in the plant host. In the present study, 110 genes encoding CAZymes were differentially expressed in P. italicum during infection, including genes belonging to the GH28, GH78, GH95, GH105, CE8, CE12, PL1, PL3, and PL4 families. These CAZymes help degrade the pectin, hemicellulose, and cellulose in plant cell walls. Specifically, the expression levels of several genes encoding CAZymes, including GL_Ganl_CLEAN _10003242, GL_Ganl_CLEAN_10005653, GL_Ganl_ CLEAN_10005681, GL_Ganl_CLEAN _10005669, GL_Ganl_CLEAN_10000302, and GL_Ganl_CLEAN _10007363, were substantially up-regulated in P. italicum during the infection of Valencia orange (Figure 7). Thus, these genes may be critical for the infection. Ballester et al. [7] reported that a large number of gene encoding cell wall-degrading enzymes including polygalacturonase, pectate lyase, glycosyl hydrolase and xylanase, are obviously up-regulated in P. expansum when infecting apple fruit at 2-3 dpi. The SSCPs represent a common source of fungal effectors [41]. In the current study, 51 genes encoding SSCPs were differentially expressed in P. italicum during infection (Dataset S3). The expression of a number of SSCP genes was down-regulated in P. italicum during infection. Additionally, 15 genes encoding SSCPs were not expressed at 0 dpi, but were expressed at 1 or 3 dpi. Moreover, the expression levels of four SSCP genes (GL_Gan1_GLEAN_10002725, GL_Gan1_GLEAN_ 10004237, GL_Gan1_GLEAN _10005638, and GL_Gan1_GLEAN_10001595) increased considerably during the early infection stage, implying these SSCPs influence P. italicum pathogenicity. Similarly, Bradshaw et al. [42] examined the gene expression dynamics of Dothistroma septosporum when it invades its host, and revealed the up-regulated expression of genes mainly encoding SSCPs, CAZymes, and SMs in three infection cycle stages. Alkan et al. [43] simultaneously analyzed the transcriptomes of Colletotrichum gloeosporioides and tomato fruits during their interaction, and determined that many SSCP genes are expressed in different stages. Overall, the RNA-seq results highlighted the arsenal of putative pathogenicity factors and identified specific candidates that should be functionally analyzed to further characterize their effect on P. italicum pathogenicity.

Metabolomics analysis of non-infected and infected Valencia orange by P. italicum
We used GC-TOF-MS to analyze the metabolome of the cuticle tissues in non-infected and infected Valencia orange fruits at 3 dpi. A total of 592 metabolites were identified and quantified, of which 207 metabolites had the similarity over 700, including sugars, polyols, organic acids, amino acids, and SMs (Table S4). The abundance of 51 metabolites was significantly different between the inoculated and non-inoculated Valencia orange fruits ( Table 4). Most of the differentially accumulated metabolites are related to carbohydrate metabolism, amino acid metabolism, fatty acid metabolism, secondary substance metabolism, and nucleic acid metabolism.
Trehalose is a widely distributed non-reducing disaccharide that has important effects on plant growth and development as well as responses to abiotic stresses [44]. There is emerging evidence that trehalose is also involved in plant responses to pathogens. For example,  exogenous trehalose acts as an elicitor to induce the resistance of wheat to powdery mildew disease [45]. Zhang et al. [46] applied virus-induced gene silencing to reveal the involvement of several putative trehalose-6-phosphate synthase/phosphatase genes in the resistance of tomato plants to Botrytis cinerea and Pseudomonas syringae. In the present study, the trehalose concentration of non-infected Valencia orange was 3.45 times higher than that in infected fruits at 3 dpi ( Table 4). The decreased accumulation of trehalose in infected Valencia orange was also confirmed by ultra-high performance liquid chromatography and triple quadrupole mass spectrometry (UPLC-QQQ-MS) (Figure 8a). The Valencia orange genome carries six genes (Cs3g05430, orange1.1t05805, Cs5g01775, Cs5g01780, Cs7g22230, and Cs5g01790) encoding trehalase, which is the enzyme responsible for the degradation of trehalose [47]. Of these genes, Cs3g05430, orange1.1t05805, Cs5g01775 and Cs7g22230 were more highly expressed in the infected Valencia orange than in the non-infected fruit at 3 dpi, whereas there were no significant differences in the expression of Cs5g01780   and Cs5g01790 (Figure 8b). In addition, we identified six trehalose-phosphate synthase genes (GL_Gan1_ GLEAN_10003645, GL_Gan1_GLEAN_ 10,003,647, GL_Gan1_GLEAN_10007719, GL_Gan1_GLEAN_ 10006325, GL_Gan1_GLEAN _10009252, GL_ Gan1_GLEAN_10007138) and one trehalase gene (GL_Gan1_GLEAN_10009387) in P. italicum GL_Gan1. Among these genes, expression of GL_Gan1_GLEAN_10003645 were significantly downregulated at 3 dpi compared with at 0 dpi, while expression of other genes showed no significant difference between at 0 dpi and at 3 dpi. The up-regulated expression of trehalase genes in Valencia orange and the down-regulated expression of trehalose-phosphate synthase in P. italicum was consistent with the decreased accumulation of trehalose in the infected Valencia orange fruit. Therefore, during the infection of Valencia orange, P. italicum induces the expression of trehalose degradation-related genes, which accelerates trehalose degradation and decreases disease resistance.
Salicylic acid is a key plant hormone that mediates host responses to microbial pathogens. A UPLC-QQQ-MS analysis indicated the salicylic acid concentration was 0.27 ng/mg in inoculated fruits and 0.055 ng/mg in non-inoculated fruits at 3 dpi, implying that a P. italicum infection induces salicylic acid biosynthesis in Valencia orange. Sphingosine is an important regulator of SA accumulation in plant cells [48]. Additionally, D-sphingosine levels in non-inoculated Valencia orange was significantly lower than that in inoculated Valencia orange (Table 4), which was consistent with the UPLC-QQQ-MS data (Figure 8a). Serine palmitoyl transferase, a heterodimer composed of LCB1 and LCB2 subunits, catalyzes the first reaction of the biosynthesis of LCB (sphingoid long-chain base) [48]. The LCB1, LCB2, and LCB2-like genes were expressed more highly in inoculated Valencia orange than in non-inoculated Valencia orange (Figure 8c). This explains the high D-sphingosine content in the inoculated Valencia orange. These results suggest that the infection of Valencia orange by P. italicum involves the signaling pathways associated with sphingolipid metabolism and SA accumulation.
The abundance of several disease resistance-related metabolites, including galactinol, ferulic acid, and benzoic acid, significantly decreased in the inoculated Valencia orange at 3 dpi, which likely contributed to the susceptibility of the fruits to a P. italicum infection [49,50].
An analysis of the correlation between metabolite accumulation and gene expression clarified the role of primary or secondary metabolism in the disease resistance of Valencia orange or the pathogenicity of P. italicum. The expression levels of several carbohydrate metabolism-related genes, including Cs7g28910 (UDP-apiose/xylose synthase), Cs4g15520 (hexokinase), Cs5g11560 (UDP-glucose 6-dehydrogenase), Cs5g22920 (fructokinase), and Cs9g16550 (fructokinase), were positively correlated with the accumulation of metabolites (Figure 9), indicating that carbohydrate metabolism was related to the infection by P. italicum.

Conclusion
We sequenced the genome of P. italicum GL_Gan1, which is a local strain in Guangzhou, China. A comparative genomics analysis among three Penicillium species, including P. italicum, P. digitatum, and P. expansum, suggested that the host range expansion from P. italicum and P. digitatum to P. expansum might be related to the expansion of protein families, genome restructuring, HGT, and positive selection pressure. In addition, we explored the molecular basis underlying the pathogenicity of P. italicum to Valencia Orange (Citrus sinensis). Diverse strategies mediating the pathogenicity of this fungus were revealed, including (1) the activation of effectors, such as CAZymes, SSCPs; (2) the suppression of host R genes; (3) the regulation of defense responses-related metabolites in host, such as trehalose and salicylic acid. However, in the present study, we have not compared diseased tissue to non-diseased tissue or fungal transcripts in pathogenic vs nonpathogenic interactions from a transcriptome perspective. More transcriptome analysis are necessary to elucidate the pathogen-host interaction. The data presented herein may be useful for further elucidating the molecular basis underlying the evolution of the host specificity of Penicillium species and for illustrating the the pathogenicity of P. italicum to Valencia Orange.

Fungal strain
Penicillium italicum GL_Gan1 was first isolated in Guangzhou, China from infected sweet orange exhibiting typical blue mold symptoms. The strain was purified as a monospore and stored at −80°C prior to use.

Fruit inoculation
The P. italicum GL_Gan1 conidiospores were suspended in sterile water containing 0.05% (v/v) Tween 80. The spore concentration in the suspension was adjusted to 1 × 10 7 spores/ml with a hemocytometer under an optical microscope (Olympus, Japan).
Valencia orange (C. sinensis) was purchased from a local market and sterilized with 75% ethanol. Five cross-like wounds (width: 0.5-1; depth: 1-2 cm) were made at the base of the Valencia orange with sterilized tips. The wounds were then inoculated with 10 µl spore suspension, after which the fruits were incubated in a controlled-environment room (26 ± 1°C, natural photoperiod, and 70%-80% humidity). Infected Valencia orange peel tissues (flavedo and albedo) were collected at 0, 1, 3, 5, and 10 dpi, and then immediately frozen in liquid nitrogen for the subsequent RNA-seq and quantitative real-time PCR (qPCR) analyses. After inoculation, the tissues was immediately collected as the sample at 0 dpi.

Extraction of DNA and RNA
Penicillium italicum GL_Gan1 was cultured on potato dextrose agar for 7 days, after which spores were harvested. Genomic DNA was extracted from conidiospores (80 mg) with the HiPure Fungal DNA kit (Magen, Guangzhou, China). High-quality DNA was submitted to BGI-Shenzhen (Shenzhen, China) for genome sequencing. Total RNA was extracted from the inoculated Valencia orange peel tissue with TRIzol® reagent according to the manufacturer's instructions (Invitrogen, China).

Genome sequencing and assembly
The P. italicum GL_Gan1 genomic DNA was used to construct 500-bp sequencing libraries. Sequencing data comprising 1,815 Mb (125-bp paired-end) were generated with the Illumina HiSeq™ 2000 system at BGI-Shenzhen (Shenzhen, China). To ensure the accuracy of the assembly, reads with 25 low-quality (≤ Q2) bases, 10% Ns, or a 15-bp overlap between the adapter, and duplicated reads were eliminated. The short reads were assembled with SOAPdenovo 1.05, with the assembly optimized with the key parameter K = 63.
The CAZyme database (http://www.cazy.org/) was used to identify proteins involved in carbohydrate metabolism. Pathogenicity-and virulence-related genes were identified using the PHI-base database (http://www.phibase.org/). We identified PKS and NRPS with SMURF [53]. Cytochrome P450 s were identified with the Cytochrome P450 database (http:// drnelson.uthsc.edu/CytochromeP450.html). The SSCPs of P. italicum GL_Gan1 were analyzed as described by Que et al. [28]. The GPCR sequences were evaluated regarding their seven transmembrane regions with Phobius [54] and the default settings of TMHMM 2.0. The R genes were identified as previously described [44]. The STEM program (version 1.3.9) [55] was used to compare and visualize the RNA-seq data.
The single-copy core genes of the above-mentioned six Penicillium species/strains and A. nidulans (outgroup species) were identified with Hcluster_sg [56]. Multiple sequences were then aligned with MUSCLE [57]. The phyml subprogram of TreeBeST (http://tree soft.sourceforge.net/treebest.shtml) was used to construct a phylogenetic tree, with default parameters and 1,000 bootstrap replicates. The core genome and pangenome of the six Penicillium species/strains were obtained based on a previous analysis of a Yersinia pestis population [58]. Additionally, Ka and Ks were estimated using an established method [59]. Horizontal gene transfer was analyzed according to a published procedure [60]. Protein sequences that were shorter than 150 amino acids were excluded. The BLASTP algorithm was used for similarity searches involving the bacterial and viral protein databases of NR (1e-5, identity > 40%). Moreover, phylogenetic trees with fewer than 10 species were excluded. Multiple sequences were then aligned with MUSCLE [57] MEGA v6.06 (https://www.megasoftware.net/) was used to construct Neighbor-joining tree with default parameters and 2,000 bootstrap replicates. With which the iTOL v4.0 (https://itol.embl.de) was employed to improve the display, annotation and management of phylogenetic trees.

RNA sequencing
Two independent biological replicates of inoculated Valencia orange fruits were used to construct 10 shortfragment libraries with the Illumina TruSeq RNA library construction kit (version 2) (Illumina, CA, USA) for an RNA-seq analysis. After low-quality raw reads were discarded, the remaining high-quality reads were aligned to the P. italicum GL_Gan1 and Valencia orange gene sequences with SOAP2 [61]. The RNA-seq data were analyzed and gene expression levels (as FPKM) were calculated with the RSEM software package [62]. The NOIseq method was applied to screen for differentially expressed genes between two groups [63]. An FDR ≤ 0.001 and an absolute value of the log 2 ratio ≥ 1 were used as the criteria for identifying differentially expressed genes.

qPCR analysis
The PrimeScript™ RT reagent Kit with gDNA Eraser (Takara, Otsu, Japan) was used to synthesize cDNA. Gene-specific primer pairs (Table S5) were designed with the Primer Express software (Applied Biosystems, Foster City, CA, USA). G3PDH and Actin were used as reference gene. The qPCR analysis was completed with three biological replicates. The output data were generated with the Sequence Detector program (version 1.3.1) (Applied Biosystems) and then analyzed according to the 2 −ΔΔCt method [64].

Metabolomics analysis
Valencia orange peel tissues infected with P. italicum were collected at 3 dpi for a GC-TOF-MS analysis as described by Dunn et al. [65]. Metabolites were extracted from 1.0 g citrus samples with a methanol: chloroform (3:1) solution, with 100 μl adonitol (0.2 mg/ ml) as an internal standard. After derivatization, the extracts were added with fatty acid methyl esters (C8-C24) as retention index markers derivatized and then were transferred to a 7890 gas chromatograph system (Agilent Technologies, USA) connected to a Pegasus HT time-of-flight mass spectrometer (LECO Corporation, USA). Additionally, a DB-5 MS capillary column was coated with cross-linked 5% diphenyl/95% dimethyl polysiloxane (30 m × 250 μm inner diameter, 0.25 μm film thickness; J&W Scientific, Folsom, CA, USA). The Chroma TOF 4.3X software (LECO, St. Joseph, USA) and LECO-Fiehn Rtx5 database were used for the raw peak extraction, baselines correction, deconvolution analysis and peak identification. Both of retention time index (RT) and mass spectral similarity were considered in metabolites identification. If the similarity is >700, we deem the metabolite identification is reliable. If the similarity is < 200, the compound name is defined as "analyte", and the similarity between 200 and 700, the compound name was considered as a putative annotation.
Principal component analysis (PCA) and orthogonal projections to latent structures-discriminant analysis (OPLS-DA) were used to display the similarity and difference of the origin data with the SIMCA-P 13.0 software package (Umetrics, Umea, Sweden).
To refine this analysis, the first principal component of variable importance projection (VIP) was obtained.
The VIP values exceeding 1.0 were first selected as changed metabolites. In step 2, the remaining variables were then assessed by Student's T test (T-test), P > 0.05, variables were discarded between two comparison groups.
Of the differentially accumulated metabolites, we quantified trehalose, salicylic acid, and D-sphingosine by UPLC-QQQ-MS. The standard chemicals were purchased from Sigma Chemical Co. (St. Louis, USA) and prepared as 100 µg/ml solutions. A 500-mg citrus peel sample was prepared and analyzed with the 6460 triple quadrupole LC/MS system with an electrospray ionization source (Torrance, CA, USA) as described by Chen et al. [66]. Samples were examined with six repetitions. Additionally, the Pearson correlation coefficient was used to assess the relationship between the metabolite changes and gene expression levels. Furthermore, a heat map was prepared with the R program and the PathPod mapping system as described by Hsu et al. [67].

Statistical analysis
Data are presented herein as the mean value of three biological replicates ± standard deviation. The significance of any differences among samples was calculated with SPSS (version 7.5) (SPSS, Inc., Chicago, IL, USA).

Data availability
The draft P. italicum GL_Gan1 genome sequence was deposited in the GenBank database (accession number LWEC00000000). The transcriptome sequence was deposited in the GenBank database (accession number SRP073474).