In silico prediction of type I PKS gene modules in nine lichenized fungi

Abstract Novel biologically active molecules could play a significant role in the treatment of human diseases. Natural products have been and continue to be a major source of pharmaceuticals, and lichen secondary metabolites are emerging as a rich source of bioactive molecules with a variety of pharmacological activities. Polyketides, which are synthesized by enzymes encoded by PKS genes, constitute the major group of these secondary metabolites. To date, little information is available on the identification of PKS gene modules in lichens. Functional validation studies in lichens are difficult because lichens grow slowly and most of their symbiotic partners cannot be grown in culture in the laboratory. Consequently, genome mining is becoming increasingly important as a tool for natural product discovery. Here, we used in silico screening tools to investigate type I PKS module candidates in nine publicly available genomes of lichen-forming fungi. We also predicted the putative secondary metabolites produced by these lichens, which points to the pharmaceutical potential of these nine lichen-forming fungi.


Introduction
Lichens are mutualistic symbioses between a fungus (the mycobiont) and an alga or cyanobacterium (the photobiont). They have unique characteristics and are valuable resources for bioactive secondary metabolites [1,2]. Lichens produce more than 1000 secondary metabolites, and most of these compounds are unique to lichens. These metabolites have bioactive properties and are used as natural drug molecules [3]. In recent years, researchers have increasingly focussed on lichens as a continuous source of unique bioactive compounds. According to numerous studies, the most important use of lichens and their constituents lies in their biopharmaceutical effects [4][5][6][7].
One of the most extensively studied groups of biocompounds is the polyketides, which are thought to originate from the biosynthetic activity of the fungal partner of lichens [8]. Polyketides are low-molecular-weight substances and are used for their antibiotic, antimycobacterial, antiviral, anti-inflammatory, analgesic, antipyretic, antiproliferative and cytotoxic effects and other pharmaceutical activities [9].
Fungal polyketides are synthesized by polyketide synthase (PKS) enzymes in the acetyl-polymalonyl pathway through sequential reactions [10]. Structural variation among polyketides arises from the number of carbon chain extensions, the degree of reduction, the type of carbon extender unit used and the mode of aromatization [11]. Studies have shown that there are three architecturally different PKSs in prokaryotic and eukaryotic organisms. Type I and II are found in bacteria and fungi, whereas type III is present in higher plants. Type I systems consist of multifunctional proteins that can be non-iterative (e.g. those producing erythromycin and rifamycin) or iterative. Fungal PKSs are mainly iterative type I enzymes consisting of multiple functional domains, and they produce particular polyketides by using their active sites repeatedly [12]. This enzyme system is organized modularly [13,14]. The PKS enzyme contains ketoacyl synthase (KS), acyltransferase (AT), acyl carrier protein (ACP), ketoreductase (KR), dehydratase (DH), enoylreductase (ER), thioesterase (TE) and C-methyltransferase (CMeT) domains [15]. The minimum domains that should be found in fungal PKSs are KS, AT and ACP, while the other domains are optional [16].
Production of polyketides is strongly influenced by environmental conditions, and lichens are well known for the diversity of secondary metabolites that they produce [17]. Lichens grow very slowly in their natural environment and their metabolic production is low. Several genetic and bioinformatics tools have been developed for natural product research. It has also been shown that metabolic genes are expressed at low levels and are not active under standard cultivation conditions. All these factors make it difficult to obtain lichen metabolites under laboratory conditions, and these difficulties have left experimental lichenology a largely unexplored science [16]. In the genomic era, in which the number of available genomes is increasing, genome mining combined with synthetic biology offers significant help in drug discovery [18]. Therefore, in silico prediction of the genes involved in the synthesis of these unique secondary metabolites is crucial [19][20][21][22]. Because they require less time and lower costs, in silico methodologies have become a principal step in the discovery and characterization of gene clusters over the past decade [18,20,21,23].
In this study, we aimed to identify type I PKS gene modules in nine lichen-forming fungi by in silico analyses. We also performed bioinformatics predictions of the putative secondary metabolites produced by these clusters, and the potential roles of the modules in secondary metabolite production were evaluated. Although there are studies focussed on PKS gene modules in a single lichen-forming fungal genome, there is still insufficient information from multiple genome analyses to allow a comprehensive comparison [24]. To our knowledge, this is the first study on PKS gene cluster identification in nine lichen-forming fungal genomes together. We chose these nine genomes because their sequences were assembled and publicly available. Analyzing all nine genomes together allows us to compare the architecture of the PKS modules of each organism, which helps us to further understand which PKS gene modules relate to the production of secondary metabolites. Another significance of the present study is that some known secondary metabolites were predicted in silico in these fungi for the first time. The results indicated that some metabolites produced in these nine lichenized fungi are unique. Exploration of the genes encoding the enzymes that produce these metabolites allows us to understand the biosynthetic potential of these nine lichen-forming fungi.

Lichen-forming fungi genomes
To make the study comprehensive, we analyzed nine available genomes of lichenized fungi. The whole-genome sequences were obtained from the National Center for Biotechnology Information (NCBI), Joint Genome Institute (JGI) and MycoCosm fungal genomics sequence archives.

Prediction of PKS genes
To identify the potential PKS gene clusters in the genomes of the nine lichen-forming fungi, version 5 of the antiSMASH program was used [25]. antiSMASH is a tool that searches a genome sequence to identify secondary metabolite biosynthetic gene clusters. The rule-based detection approach was used with default parameters, relaxed detection strictness and the ClusterFinder algorithm enabled [26][27][28]. This analysis predicts both the gene clusters involved in the production of secondary metabolites and the candidate metabolites.
We additionally performed bioinformatics analyses to confirm the presence of the antiSMASH-defined gene clusters and the domains in these clusters. First, the presence of each PKS gene cluster was confirmed by BLASTx analysis against the NCBI non-redundant protein database. Subsequently, the presence of the domains in these gene clusters was verified using the NCBI conserved domain search program and the protein sequence information in the Pfam database.
To verify that the domains required for a functional PKS module were present, we used the BLASTx search against the NCBI non-redundant protein database. The criteria used to determine the modules from the BLASTx results were as follows: an e-value of 0.01, an alignment length of 100 aa and an identity of 35%.
We also searched the NCBI Conserved Domain Database and the Pfam database to confirm the presence of the domains. The criteria were as follows: an e-value of 0.01, an alignment length of 100 aa and an identity of >35%.
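The filtering criteria above can be illustrated with a short Python sketch. This is not the pipeline used in the study: it assumes the BLASTx hits were saved in the standard 12-column tabular format (outfmt 6), and it assumes that the thresholds act as a maximum e-value, a minimum alignment length and a minimum identity. The query and subject names in the example are hypothetical.

```python
from dataclasses import dataclass

# Thresholds taken from the text. The comparison directions
# (<= for e-value, >= for length and identity) are assumptions.
MAX_EVALUE = 0.01
MIN_ALIGN_LEN = 100   # aligned amino acids
MIN_IDENTITY = 35.0   # percent identity


@dataclass
class Hit:
    query: str
    subject: str
    identity: float
    align_len: int
    evalue: float


def parse_outfmt6(lines):
    """Parse BLAST tabular (outfmt 6) lines into Hit records."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Default columns: qseqid sseqid pident length mismatch gapopen
        # qstart qend sstart send evalue bitscore
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10])))
    return hits


def passes(hit):
    """Return True if a hit satisfies all three criteria."""
    return (hit.evalue <= MAX_EVALUE
            and hit.align_len >= MIN_ALIGN_LEN
            and hit.identity >= MIN_IDENTITY)


def filter_hits(hits):
    return [h for h in hits if passes(h)]


# Hypothetical hits, for illustration only.
example_lines = [
    "cluster1\tprotA\t42.5\t350\t190\t4\t1\t1050\t1\t350\t1e-50\t300",
    "cluster1\tprotB\t30.0\t400\t270\t6\t1\t1200\t1\t400\t1e-20\t150",
    "cluster2\tprotC\t60.0\t80\t32\t0\t1\t240\t1\t80\t1e-10\t90",
    "cluster2\tprotD\t50.0\t120\t60\t0\t1\t360\t1\t120\t0.5\t30",
]
kept = filter_hits(parse_outfmt6(example_lines))
print([h.subject for h in kept])
```

In this example only the first hit is retained; the others fail the identity, alignment length and e-value criteria, respectively.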

Determination of the secondary metabolites produced by predicted PKS modules
The antiSMASH program was also used to predict the metabolites produced by the PKS modules. The possible secondary metabolites were identified for each genome.

Prediction of polyketide biosynthetic cluster genes
In fungi, the genes involved in the synthesis of a secondary metabolite are typically located next to each other in the genome. Therefore, a 3,000-base region upstream of each PKS gene was extracted and the sequences were annotated with the Blast2GO program (Figure 1).
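The extraction of an upstream region can be sketched as follows. This is a minimal illustration and not the procedure used in the study: the function names, the 0-based half-open coordinate convention and the handling of reverse-strand genes (taking the region after the gene end and reverse-complementing it) are assumptions introduced for the example.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(contig, start, end, strand, size=3000):
    """Return up to `size` bases upstream of a gene.

    contig : DNA sequence of the contig (forward strand)
    start, end : 0-based, half-open gene coordinates on the forward strand
    strand : "+" or "-"
    The window is truncated at the contig boundaries.
    """
    if strand == "+":
        lo, hi = max(0, start - size), start
        return contig[lo:hi]
    if strand == "-":
        # For a reverse-strand gene, upstream lies after `end`
        # on the forward strand.
        lo, hi = end, min(len(contig), end + size)
        return reverse_complement(contig[lo:hi])
    raise ValueError("strand must be '+' or '-'")


# Toy contig, for illustration only; the gene occupies positions 8..12.
toy = "AAAACCCCGGGGTTTT"
print(upstream_region(toy, 8, 12, "+", size=4))
print(upstream_region(toy, 8, 12, "-", size=4))
```

With the default `size=3000`, the same function returns the 3,000-base window described in the text, or a shorter window when the gene lies close to a contig end.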

PKS gene modules of the genomes
We predicted a total of 89 PKS gene modules. Of these, 10 gene sets were found in Cladonia grayi, 13 in Cladonia metacorallifera, 14 in Cladonia macilenta, 4 in Lasallia pustulata, 12 in Endocarpon pusillum, 12 in Umbilicaria muehlenbergii, 8 in Gyalolechia flavorubescens, 7 in Ramalina peruviana and 9 in Xanthoria parietina, at different locations in the genomes (Table 1). In addition to the antiSMASH analysis, BLAST and conserved domain analyses verified the presence of the PKS domains.
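As a simple consistency check, the per-genome counts reported above can be tallied in Python. The counts are those given in the text; the dictionary layout is only an illustrative choice.

```python
# Number of predicted PKS gene sets per genome, as reported in the text.
modules_per_genome = {
    "Cladonia grayi": 10,
    "Cladonia metacorallifera": 13,
    "Cladonia macilenta": 14,
    "Lasallia pustulata": 4,
    "Endocarpon pusillum": 12,
    "Umbilicaria muehlenbergii": 12,
    "Gyalolechia flavorubescens": 8,
    "Ramalina peruviana": 7,
    "Xanthoria parietina": 9,
}

total = sum(modules_per_genome.values())
print(total)  # 89, matching the reported total
```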
Among the PKS domains, the KR, DH and ER domains carry out the reductive steps in polyketide synthesis. Depending on the presence of these domains, iterative type I PKSs are grouped into non-reducing PKS (NR-PKS), partially reducing PKS (PR-PKS) and highly reducing PKS (HR-PKS) groups [29,30]. We examined the PKS modules under these three types. As expected, NR-PKS was the most common type in most of the fungal genomes. However, PR-PKS was the dominant type in C. metacorallifera and X. parietina. Although NR- and HR-PKS groups were predicted in each genome, PR-PKS was not detected in the genomes of L. pustulata, G. flavorubescens and R. peruviana.
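The grouping described above can be expressed as a small rule of thumb in Python. This sketch is an assumption-laden simplification and not the classification procedure used in the study: it treats a module with none of the reductive domains as NR-PKS, a module with all three (KR, DH and ER) as HR-PKS, and a module with only some of them as PR-PKS, and it first checks for the minimal KS, AT and ACP set.

```python
MINIMAL_DOMAINS = {"KS", "AT", "ACP"}    # minimal fungal PKS domain set
REDUCTIVE_DOMAINS = {"KR", "DH", "ER"}   # domains performing reduction


def classify_pks(domains):
    """Classify a PKS module from its list of domain names.

    Assumed rule: no reductive domain -> NR-PKS, all three -> HR-PKS,
    any proper subset -> PR-PKS. Modules lacking KS, AT or ACP are
    reported as "incomplete".
    """
    present = set(domains)
    if not MINIMAL_DOMAINS <= present:
        return "incomplete"
    reducing = present & REDUCTIVE_DOMAINS
    if not reducing:
        return "NR-PKS"
    if reducing == REDUCTIVE_DOMAINS:
        return "HR-PKS"
    return "PR-PKS"


# Hypothetical domain architectures, for illustration only.
print(classify_pks(["KS", "AT", "ACP", "TE"]))
print(classify_pks(["KS", "AT", "DH", "ACP"]))
print(classify_pks(["KS", "AT", "DH", "ER", "KR", "ACP"]))
```

Under this rule a module carrying only a DH domain in addition to the minimal set is assigned to the PR-PKS group.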

Secondary metabolites produced by the modules
The secondary metabolites produced by the nine fungi were predicted with the antiSMASH program (Table 2). Although many PKS gene clusters were identified in the genomes, metabolites were predicted for very few of the clusters. A total of 15 different metabolites were predicted, with two to four different metabolites predicted for each genome. According to our in silico analysis, some metabolites were predicted for most of the fungi, whereas azanigerone, chaetoviridin/chaetomugilin, compactin, depudecin, fusarubin, patulin, pestheic acid, pseurotin A and trypacidin were predicted for only a few fungi. Emericellin was the most widely predicted metabolite: it was predicted for all of the fungi except X. parietina.

Prediction of the flanking genes to the PKS gene clusters
The PKS genes are affected by the genes adjacent to them [31]. To identify the genes flanking the PKS clusters, BLAST searches were performed. A total of 69 genes flanking the PKS clusters were identified (Table 3). The analysis revealed that the neighbouring genes are associated with a wide range of fundamental biological processes such as transcription, cell cycle, transport and binding. The flanking genes also function in many other processes, such as lipid metabolism, stress response and carbohydrate metabolism. Moreover, some genes were found to be homologous to genes involved in secondary metabolism, such as those encoding cytochrome P450, NAD-specific glutamate dehydrogenase, integral membrane proteins, short-chain dehydrogenase reductase, glycosyltransferase and glycoside hydrolase.

Discussion
Secondary metabolites play important roles in medical, industrial and agricultural areas. Lichens produce a large number of metabolites with interesting biological functions. In recent years, researchers have shown that lichen-derived secondary metabolites are biologically active, and the advantages of these metabolites have made them attractive candidates for drug discovery studies [32][33][34]. Polyketides (e.g. anthraquinones, depsides and depsidones), which are biosynthesized by polyketide synthase (PKS) enzymes, represent a major group of these compounds. Polyketides are a diverse group of natural products of great significance. Advances in resolving fungal polyketide biosynthesis allow us to understand the metabolic diversity of lichen-forming fungi and to develop new drug molecules.
We used bioinformatics analyses to pinpoint the PKS genes in nine fungal genomes. The domain organization of the PKS genes of some of these genomes was also identified in previous studies [9,14,35,36]. Compared with the literature, the number of modules we identified in the genomes was slightly different. Previous studies reported that C. metacorallifera contains 31 putative PKS genes [37], E. pusillum contains 15 PKS genes [38] and U. muehlenbergii has 20 putative PKS genes [39]. However, since the PKS modules identified in these genomes were not detailed in the literature, we assumed that the search criteria differed between the studies.
Iterative type I PKSs comprise three major groups. NR-PKS is known to be the most widespread type I PKS group. In agreement with this, NR-PKS was the most abundant type in most of the genomes in our study. On the other hand, we found that all the genomes contain NR-PKS and HR-PKS types; however, PR-PKS modules were not present in the genomes of G. flavorubescens and R. peruviana. The PR-PKS module has a KR domain and catalyzes the production of small aromatic molecules such as 6-methylsalicylic acid. Moreover, some studies indicated that the PR-PKS module does not contain Product Template (PT) or TE domains [40], while other studies stated that this module contains only the KR domain [29]. However, our analysis showed that the PR-PKS modules of the lichen-forming fungi contain only the DH domain. It is known that fungal polyketides show huge differences as a result of variations in PKS structure [41]. Interestingly, the module organizations of these nine organisms differ from each other and from those known so far. In addition, lichen-derived secondary metabolites are known to be diverse among lichens. All these results suggest that these nine lichenized fungi possibly produce special polyketides that differ from the known ones. Consistent with this, we predicted only a few secondary metabolites using homology-based bioinformatics tools. However, we were able to predict some known secondary metabolites for the first time. Production of emericellin, terreic acid, compactin, trypacidin, patulin, pseurotin, aflatoxin, pestheic acid, asperfuranone, azanigerone, brefeldin, chaetoviridin, solanapyrone, depudecin and fusarubin was predicted for the first time in these nine lichen-forming fungi using bioinformatics tools. All of these metabolites show several important biological activities, such as antimicrobial, antifungal, antitumor, antiviral, antimitotic and cytostatic activities.
Since the predicted PKS gene modules were different from the known structures, the results indicated that the metabolites produced in these nine lichenized fungi are unique and need to be identified. In recent years, scientists have succeeded in isolating, culturing and maintaining mycobionts and photobionts, but there are many difficulties at the experimental stage and the experimental process is time-consuming. Therefore, it will be difficult to follow this long process to functionally validate bioinformatics predictions in the lichen species. In this study, we also aimed to provide new insight into the bioinformatics evaluation of PKS genes in the genomes of nine lichen-forming fungi. Further studies should be carried out with detailed investigations and verifications.
Since the PKS genes may be affected by the genes found in neighbouring regions, the flanking regions are of special interest. Our analysis revealed that a number of neighbouring genes encode proteins involved in secondary metabolism, such as cytochrome P450, NAD-specific glutamate dehydrogenase, integral membrane proteins, short-chain dehydrogenase reductase, glycosyltransferase and glycoside hydrolase [42][43][44][45][46]. Additionally, genes related to lipid metabolism were identified within the flanking regions. It is known that polyketides are assembled from simple carboxylic acid building blocks in a manner similar to fatty acids [47]. Therefore, the analysis indicates that these regions might be part of the biosynthetic cluster.

Conclusions
This study addressed, for the first time, the in silico identification of PKS genes and their flanking regions in the genomes of nine lichen-forming fungi. A bioinformatics-based approach could be used to understand the effect of PKSs on secondary metabolite production. The identification of PKS gene clusters represents a valuable resource for further investigating the potential of these lichens.