Genome-wide analysis of MATE transporters and response to metal stress in Cajanus cajan

ABSTRACT Multidrug and Toxic Compound Extrusion (MATE) proteins form one of the characteristic transporter families and play a key role in the detoxification of endogenous secondary metabolites and exogenous agents in both animal and plant cells. In this study, we identified a total of 67 MATE genes (CcMATEs) in the pigeonpea genome, performed bioinformatics analysis on them, and grouped them by phylogenetic analysis. Finally, eight representative CcMATE genes were selected for qRT-PCR analysis of tissue specificity and response to metal stress in pigeonpea. The results showed that CcMATE34 and CcMATE45 were both significantly up-regulated under Al, Mn and Zn stress, whereas CcMATE4 was up-regulated only in the roots. We speculate that the function of CcMATE34 and CcMATE45 might be related to the transport of alkaloids and harmful substances, and that the function of CcMATE4 might be related to the delivery of flavonoids.


Introduction
MATE (Multidrug and Toxic Compound Extrusion) transporters are highly conserved cation-coupled secondary transporters that are widely present in archaea, bacteria and eukaryotes (Brown et al. 1999). Most members of this family have 450-550 amino acid residues, although a few reach 700, with 9-12 transmembrane helices in the sequence (Borges-Walmsley et al. 2003). All MATE proteins share approximately 40% sequence similarity. AtALF5 (Arabidopsis thaliana aberrant lateral root formation 5) was the first MATE protein identified in plants (Diener et al. 2001), after which the first multi-specific MATE transporter, AtDTX1, was identified in Arabidopsis thaliana in 2002 (Li et al. 2002). In plants, the functions of the MATE protein family are diverse, and the family is involved in numerous aspects of growth and development. Reported functions include participating in root detoxification by transporting heterologous harmful substances; transporting salicylic acid, alkaloids, antibiotics and other toxic compounds; maintaining the balance of iron ions; regulating the development of lateral organs; enhancing aluminum tolerance by increasing the secretion of citric acid; and participating directly or indirectly in the detoxification of toxic compounds or metals (Nesi et al. 2001; Li, He, et al. 2002; Rogers and Guerinot 2002; Furukawa et al. 2007; Zhou et al. 2013; Li et al. 2014; Wu et al. 2014).
Pigeonpea (Cajanus cajan) is a tall, woody, perennial, diploid (2n = 22) leguminous crop and a member of the tribe Phaseoleae. It is predominantly cultivated in tropical and subtropical areas and is regarded as an important food legume (or pulse) crop. Among grain legumes, pigeonpea ranks sixth in the world in planting area and yield (Salunkhe et al. 1986; Varshney et al. 2012). Moreover, its planting range covers India, which is the most populous region in the world (Burns et al. 2001). Pigeonpea also has varied uses, including medicinal ones. Extracts or active ingredients of pigeonpea have been reported to have therapeutic effects on diabetes in India (Grover et al. 2002), on hepatitis and measles in Africa, and on dysentery in South America (Amalraj and Ignacimuthu 1998; Grover et al. 2002). In addition, pigeonpea leaves are used in traditional Chinese medicine to arrest bleeding, relieve pain, and kill worms. Pigeonpea is a hardy plant that tolerates drought as well as cold temperatures. It adapts to various types of soil and is suitable for planting in soils with a pH from 5 to 7. As a woody leguminous crop that is widely grown in acidic soil, pigeonpea has attracted much attention for its good aluminum resistance.
According to previous studies, MATE genes play an important role in the growth and development of organisms, as has been shown in Arabidopsis thaliana (Diener et al. 2001; Li, He, et al. 2002; Rogers and Guerinot 2002). However, there is little research on these genes in pigeonpea. In this study, we conducted a genome-wide search of the pigeonpea database to obtain all possible MATE transporters. A total of 67 MATE genes were identified in the genome. We then studied their chromosomal distribution and phylogenetic relationships, as well as the structure and basic properties of the genes and proteins. Subsequently, we analyzed the tissue specificity and the response to metal stress of selected genes from different subgroups. These results provide insights into the function of the MATE transporters and into the molecular mechanism of the response to metal stress in pigeonpea.

Identification of MATE proteins in C. cajan and gene basic information
From Phytozome v12.1 (https://phytozome.jgi.doe.gov/pz/portal.html), we obtained a total of 57 MATE (Pfam: PF01554) protein sequences of Arabidopsis (Goodstein et al. 2012). These 57 Arabidopsis protein sequences were used as probes in a BLASTP search against the C. cajan database in NCBI (https://www.ncbi.nlm.nih.gov/), which yielded 870 putative MATE protein sequences of C. cajan. We further filtered these putative MATE sequences for the conserved MATE domain (pfam01554) using Pfam (Finn et al. 2014) (http://pfam.xfam.org/) and the NCBI Conserved Domain (CD)-search tool (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), and identified a total of 67 proteins with MATE domains. The theoretical isoelectric point (pI) and molecular weight (MW) of the pigeonpea MATE proteins were calculated using the ExPASy proteomics server (Artimo et al. 2012) (http://expasy.org/) in automatic mode. The number of amino acids was also obtained from the same website.
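As an illustration only, the domain-based filtering step described above can be sketched in Python. The sketch assumes that the BLASTP candidates and the Pfam/CD-search results have already been parsed into simple in-memory records; the DomainHit type, the filter_by_domain function, and the example protein IDs are hypothetical and do not reproduce the workflow actually used in this study.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Pfam accession of the MATE domain, as given in the text.
MATE_PFAM = "PF01554"


@dataclass(frozen=True)
class DomainHit:
    """One domain annotation for one protein (hypothetical record type)."""
    protein_id: str
    pfam_accession: str  # may carry a version suffix, e.g. "PF01554.21"


def filter_by_domain(
    candidates: Iterable[str],
    domain_hits: Iterable[DomainHit],
    accession: str = MATE_PFAM,
) -> List[str]:
    """Keep candidates with at least one hit to `accession`.

    Input order is preserved and repeated candidate IDs (the same protein
    returned by several BLASTP probes) are reported only once.
    """
    wanted = accession.upper()
    with_domain = {
        hit.protein_id
        for hit in domain_hits
        if hit.pfam_accession.split(".")[0].upper() == wanted
    }
    kept: List[str] = []
    seen = set()
    for protein_id in candidates:
        if protein_id in with_domain and protein_id not in seen:
            seen.add(protein_id)
            kept.append(protein_id)
    return kept


# Toy input: IDs are made up for the example.
example_candidates = ["prot_A", "prot_B", "prot_A", "prot_C"]
example_hits = [
    DomainHit("prot_A", "PF01554.21"),
    DomainHit("prot_B", "pf01554"),
    DomainHit("prot_C", "PF00001"),
]
example_kept = filter_by_domain(example_candidates, example_hits)
```

In this toy input, prot_C is dropped because its only annotation is a different domain, and the repeated prot_A is reported once.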

Phylogenetic analysis and structure prediction of MATE proteins in C. cajan
To further analyze the alignment accuracy, we used the online tool PHYRE2 for homology modeling (Kelley et al. 2015) (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index). The full-length protein sequences of all 67 C. cajan MATEs and of 31 previously reported MATEs from other plants were subjected to multiple sequence alignment with MUSCLE in MEGA 6.0. Next, we built an unrooted phylogenetic tree using the Neighbor-Joining (NJ) algorithm with 1000 bootstrap replicates in MEGA 6.0. The amino acid substitution model was p-distance, and missing data were handled by pairwise deletion.
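To make the distance setting concrete, the following minimal Python sketch computes the p-distance (the proportion of differing sites) between aligned sequences with pairwise deletion, meaning that a site is skipped only for the pair in which at least one sequence has a gap or missing character. It is a simplified illustration, not MEGA's implementation; the function names and the toy alignment are hypothetical, and only "-" and "?" are treated as missing here.

```python
from itertools import combinations
from typing import Dict, Tuple

# Characters treated as gap/missing in this sketch (an assumption).
MISSING = {"-", "?"}


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites among sites present in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # pairwise deletion: skip this site for this pair only
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """p-distance for every unordered pair of sequences, keyed by sorted names."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Toy alignment of three short, made-up peptide fragments.
toy_alignment = {"s1": "MKT-LV", "s2": "MRTALV", "s3": "MKTAL-"}
toy_distances = distance_matrix(toy_alignment)
```

For s1 and s2, five sites are comparable and one differs, giving a p-distance of 0.2. A Neighbor-Joining tree would then be built from such a matrix.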

Chromosome location and structural analyses of CcMATEs
The chromosomal positions of the final 67 pigeonpea MATE genes and the predicted functions of their proteins were obtained from NCBI. The chromosomal locations of all CcMATEs were plotted using the tool MapDraw (Shitan, Minami, et al. 2014). The motifs of all 67 MATE proteins were analyzed on the MEME website (Bailey et al. 2009) (http://meme-suite.org/), with the maximum number of motifs set to 10.

Tissue-specific expression of C. cajan
From each subfamily, one Arabidopsis protein with a reported function was selected and used as a probe for a BLASTP search against the Cajanus cajan database in NCBI. The top hit in each result list was then selected. Using these eight CcMATE genes, we examined tissue specificity in pigeonpea by quantitative real-time PCR (qRT-PCR).
Pigeonpea (ICPL87119) was used as the experimental material in this research. One-year-old pigeonpea seedlings, planted in soil at room temperature, were selected to investigate the tissue specificity of the pigeonpea MATE gene family. Roots, stems, leaves, and flowers were collected, immediately frozen in liquid nitrogen, and stored at −80°C.

Plant stress treatment conditions
The pigeonpea seeds were disinfected with 75% alcohol for 30 s and with sodium hypochlorite for 6 min, and then soaked in sterile distilled water 5 times at 30 s intervals. The seeds were then planted in a medium containing MS basal salts and agar in a greenhouse. The greenhouse conditions were a 14 h/10 h light/dark cycle at 24°C with an illumination of 5,000 lux. After 3 weeks, the pigeonpea seedlings were subjected to metal stress. We selected the final treatment concentration for each of the three metals by observing root elongation of pigeonpea seedlings at different metal concentrations (Table S3). For aluminum treatment (Yokosho, Yamaji, Fujii-Kashino, et al. 2016), three groups of pigeonpea seedlings (three plants per group) were incubated with 50 μM AlCl3. For manganese treatment (Khan et al. 2000), three groups of pigeonpea seedlings (three plants per group) were incubated with 2 mM MnSO4. For zinc treatment (Pineau et al. 2012), three groups of pigeonpea seedlings (three plants per group) were incubated with 150 μM ZnSO4. Samples of roots and stems were collected at 0, 6, and 12 h after the start of stress treatment. All samples were immediately frozen in liquid nitrogen and stored at −80°C for RNA isolation.

RNA isolation and qRT-PCR
Each pigeonpea sample was ground to a fine powder in liquid nitrogen, and RNA was extracted from all samples by the CTAB method as described by Meng et al. (2014). One milligram of total RNA was reverse-transcribed to cDNA using the PrimeScript RT reagent kit with gDNA Eraser (Takara). Every qRT-PCR was performed in 3 replicates on a CFX Connect system (Bio-Rad, California, USA) using SuperReal PreMix (Probe) (TIANGEN BIOTECH, Beijing, China) according to the instruction manual. The primers were designed on the Sangon website (http://www.sangon.com/login). The primers used for qRT-PCR analyses are listed in Table S1 and are available as Supplementary Material to this paper.
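The text does not state how relative expression was calculated from the Ct values, nor which reference gene was used. One widely used option is the 2^-ΔΔCt method, sketched below in Python purely as an illustration; the function name, the reference gene, and all Ct values are hypothetical and are not data from this study. The method also assumes roughly equal amplification efficiency for the target and reference genes.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    ct_target_treated: Sequence[float],
    ct_ref_treated: Sequence[float],
    ct_target_control: Sequence[float],
    ct_ref_control: Sequence[float],
) -> float:
    """Fold change of a target gene by the 2^-ddCt method.

    Each argument holds the replicate Ct values of one gene in one condition.
    dCt  = mean Ct(target) - mean Ct(reference), within each condition
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt)
    """
    delta_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Made-up Ct values: the target amplifies 3 cycles earlier after treatment
# while the reference gene is unchanged, which corresponds to an 8-fold increase.
example_fold = relative_expression(
    ct_target_treated=[22.0, 22.0, 22.0],
    ct_ref_treated=[18.0, 18.0, 18.0],
    ct_target_control=[25.0, 25.0, 25.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
```

A value above 1 indicates up-regulation relative to the control (for example the 0 h sample), and a value below 1 indicates down-regulation.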

Identification of MATE proteins in C. cajan and gene basic information
A BLASTP search was performed on the NCBI website using the 57 MATE protein sequences of Arabidopsis thaliana (collected from Phytozome v12.1). A total of 870 putative MATE protein sequences of C. cajan were retrieved from the whole genome. These putative MATE sequences were filtered using the Pfam database according to whether they contained a conserved MATE domain. Finally, a total of 67 CcMATE genes were identified and named consecutively CcMATE1 to CcMATE67 according to their physical locations (Li, He, et al. 2002). The number of amino acids in the proteins encoded by these CcMATE genes ranged from 283 to 573, with an average of 499. The molecular weight ranged from 30.92 to 61.93 kDa, with an average of 54.48 kDa. The predicted isoelectric point values ranged from 5.14 to 9.68, with an average of 7.75. Detailed information on all 67 pigeonpea CcMATE proteins, including length, molecular weight, isoelectric point (pI), and predicted chromosome location, is listed in Table 1. Compared to the MATE proteins in Arabidopsis, whose lengths range from 400 to 700 amino acids, the MATE family in pigeonpea has shorter proteins (Li, He, et al. 2002).
To further analyze the alignment accuracy, we used the online tool PHYRE2 for homology modeling (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index). To increase the alignment accuracy, the structures of the 9 most highly homologous MATE proteins were modeled at >90% confidence. As shown in Figure 2, CcMATE10, 11, 16, 17, 24, 37, 39, 40, and 60 had the highest homology among the 67 CcMATEs. The secondary structure contained α helices and β sheets. These models lay an initial foundation for further exploring the function of the MATE proteins.

Chromosome location and structural analyses of MATE proteins
To explore the physical distribution of the CcMATEs on the pigeonpea chromosomes, we used MapChart software to map their chromosomal locations. A total of 51 MATE genes were located on 8 chromosomes of pigeonpea (Figure 3). The remaining 16 MATE genes could not be assigned to any particular pigeonpea chromosome and were therefore placed on a group labeled unplaced scaffold. Chromosome 7 contained 15 genes, CcMATE23 to CcMATE37, the highest number of MATE genes on any chromosome; chromosomes 3 and 5 carried the fewest, with only one gene each. Of the other 34 CcMATE genes, 11 were located on chromosome 1, 9 on chromosome 6, 6 on chromosome 8, 3 on chromosome 10, and 5 on chromosome 11. The conserved motifs in the CcMATE proteins were predicted using MEME software, and ten conserved motifs were identified (Figure 4). The pattern and sequence of motifs in the first two subfamilies were similar, but the third subfamily differed significantly from them. All CcMATE proteins contained motif 5, and subfamily 3 had the fewest motifs.
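The gene counts reported above can be checked with simple arithmetic. The short Python sketch below only restates the per-chromosome numbers given in the text and verifies that they add up; the variable names are illustrative.

```python
# Number of CcMATE genes per chromosome, as reported in the text.
genes_per_chromosome = {1: 11, 3: 1, 5: 1, 6: 9, 7: 15, 8: 6, 10: 3, 11: 5}
unplaced_scaffold_genes = 16  # genes not assigned to any chromosome

placed_total = sum(genes_per_chromosome.values())
family_total = placed_total + unplaced_scaffold_genes

# Chromosome carrying the most CcMATE genes.
richest_chromosome = max(genes_per_chromosome, key=genes_per_chromosome.get)

print(f"chromosomes with CcMATEs: {len(genes_per_chromosome)}")
print(f"placed genes: {placed_total}, total genes: {family_total}")
print(f"chromosome with most genes: {richest_chromosome}")
```

The counts are consistent with one another: 8 chromosomes carry 51 genes, and adding the 16 unplaced genes gives the 67 CcMATEs.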

Expression patterns of CcMATE genes in different pigeonpea tissues
From each subfamily, we selected one Arabidopsis protein with a reported function and used these proteins as probes for BLASTP searches in the NCBI database. The top hit in each result list was then selected. To investigate the tissue-specific expression of the CcMATE genes, we analyzed the expression of the eight CcMATE genes selected in this way in the root, stem, leaf, and flower of pigeonpea using qRT-PCR. The quantitative fluorescence data for this part are listed in Table S2. As shown in Figure 5, CcMATE45 and CcMATE55 were highly expressed in the root but weakly expressed in the flower. CcMATE6 and CcMATE32 were highly expressed in stems, especially CcMATE32. In flower tissue, CcMATE1 and CcMATE55 showed higher expression levels than in other tissues. Of the eight genes, only CcMATE1 and CcMATE45 were expressed in all four tissues, which suggests that these two genes may play a role in plant growth and development. These results indicate that most CcMATE genes are differentially expressed among tissues, which could help us to explore the functional diversity of these genes.

Expression of CcMATE genes response to metal stress
To identify the CcMATEs involved in the response to metal stress, 3-week-old pigeonpea seedlings were treated with solutions of ZnSO4, AlCl3 and MnSO4 at different concentrations. The results showed that most CcMATE genes were up-regulated under aluminum stress, more so in the roots (Figure 6). Only CcMATE4 and CcMATE32 were down-regulated in the stem after the 6 h treatment. We also found that CcMATE1 and CcMATE34 showed the same expression trend in leaves and roots under aluminum treatment. In roots and stems, CcMATE34 was up-regulated after the 6 h treatment and remained up-regulated at 12 h, although to a significantly lesser degree than at 6 h. CcMATE4 was strongly up-regulated only in the roots during the entire aluminum stress treatment.
Under manganese stress, the expression of all selected CcMATE genes was up-regulated in the roots after the 12 h treatment. CcMATE34 and CcMATE61 were up-regulated to varying degrees in roots after the 6 and 12 h treatments. In stems, the CcMATE1, 34, 45, 55 and
Under zinc stress, the expression of CcMATE4 was up-regulated 12-fold compared to controls in roots and down-regulated in stems after the 6 h treatment. In stems, the expression of CcMATE1, 4, and 32 decreased under zinc treatment, whereas CcMATE6, 34, 45, 55 and 61 were up-regulated. Across all metal stress treatments, both CcMATE34 and CcMATE45 were significantly up-regulated, and CcMATE4 was up-regulated only in roots. These results indicate that these CcMATE genes might be involved in pathways that counter aluminum, manganese, and zinc stress.

Discussion
Organisms have evolved many defense and removal mechanisms to resist the invasion of harmful substances from outside and to prevent the accumulation of toxic compounds produced by their own metabolism (Shitan, Kato, et al. 2014; Liang et al. 2018). One of the main ways in which prokaryotic and eukaryotic organisms excrete exogenous and endogenous toxic compounds that accumulate in cells is to use transmembrane transporters to move them out of the cell. Because these proteins can excrete different types of toxic compounds, they are called multidrug efflux pumps (Piddock 2006). The multidrug and toxin extrusion (MATE) protein family is one of the five families of multidrug efflux pumps discovered so far. The MATE family has been analyzed in many plants, such as tomato, soybean, poplar and Arabidopsis (Li, He, et al. 2002; Liu et al. 2016; Li et al. 2017; Santos et al. 2017). Pigeonpea is a high-protein crop of far-reaching importance to the Indian people. The study of MATE transporters is important for pigeonpea breeding, but few studies in this field exist at present. Analysis of the MATE family in pigeonpea helps to clarify the molecular genetic basis of pigeonpea genetic improvement and provides a basis for transgenic research.
In this study, we identified a total of 67 MATE-encoding genes in the pigeonpea genome, which can be divided into three major groups and eight subgroups (Figure 1). The length of the pigeonpea MATE proteins ranged from 283 to 573 amino acids. In Arabidopsis, the MATE proteins range in length from 400 to 700 amino acids, and in Populus from 120 to 608 amino acids. The Populus PtrMATE genes show marked regularity in gene structure and protein motifs, which indicates conservation within the Populus MATE family. Although pigeonpea has a larger genome, its number of MATE genes is similar to that in Arabidopsis (Li, He, et al. 2002). At least four rounds of whole-genome duplication (WGD) events have been discovered in the evolutionary history of Arabidopsis (Vision et al. 2000), which may help to explain this phenomenon.
At present, the MATE genes in rice and Arabidopsis have been predicted, and the MATE family genes are divided into four major subfamilies, MATE I to MATE IV, according to the topological structure and bootstrap values obtained from the protein sequences. In another study, 70 MATE genes randomly distributed across 13 chromosomes of Raymond cotton were divided into three subfamilies, MATE I, MATE II and MATE III, by phylogenetic analysis (Guo et al. 2017). In this study, we divided the MATEs in pigeonpea into three large groups and eight subgroups based on the phylogenetic tree. Subgroup 1-1 contains 24 sequences, including 18 CcMATE proteins and 6 MATE proteins from other plant species. Among them, AtFFT is thought to be a flavonoid transporter that regulates flavonoid levels in A. thaliana; in the absence of this protein, the normal growth and development of A. thaliana may be affected (Gomez et al. 2009; Seo et al. 2012). VvAM1 and VvAM3 are considered to be acylated anthocyanin transporters (Gomez et al. 2011), and MtMATE2 is considered to be a vacuolar anthocyanin transporter. Subgroup 1-2 contains 12 sequences, including 5 previously reported MATE proteins from other plant species. AtTT12 acts at the level of the flavonoid biosynthetic pathway and controls the vacuolar sequestration of flavonoids in the seed coat endothelium. NtMATE1 and NtMATE2 may transport tobacco alkaloids from the cytosol into vacuoles (Shoji et al. 2008). BrTT12 is related to differences in yellow seed traits (Chai et al. 2009). MtMATE1 acts as a membrane transporter in the procyanidin biosynthetic pathway in pod coats (Zhao and Dixon 2009). Overall, this MATE subfamily may play a role in the transport of secondary metabolites. In subgroup 2-1, AtALF5 increases resistance to toxins (Diener et al. 2001). AtDTX1 in subgroup 2-2 mediates the efflux of endogenous or exogenous toxic compounds from the plant cytoplasm (Li, He, et al. 2002). Nt-JAT1 (Morita et al. 2009) is responsible for unloading alkaloids in the aerial parts and depositing them in vacuoles. AtADS1 in subgroup 2-3 plays an important role in plant disease resistance. Overexpression of AtZF14 can increase the rate of leaf initiation, and the protein participates in the regulation of iron homeostasis. In subfamily 3, most of the identified MATE protein functions are associated with aluminum detoxification or iron translocation. At least 6 reported MATE transporters have related functions, including AtFRD3, AtMATE, OsFRDL1, SbMATE, TaMATE1 and ZmMATE1 (Maron et al. 2010; Garcia-Oliveira et al. 2014; Charlier et al. 2015; Chang et al. 2017; Doshi et al. 2017). Based on the reported protein functions in each subfamily, we found that MATE genes of the same subfamily have the same or similar functions, whereas genes of different subfamilies have completely different functions. This is consistent with the findings of Wang, Qian, et al. (2016). Therefore, the grouping of MATEs in this study provides a foundation for studying the functions of the MATE genes discussed below.
Figure 5. Expression profiles of 8 CcMATE genes in four pigeonpea tissues. The color bar represents the expression value. Each group has three biological replicates.
Figure 6. Relative expression levels of 8 CcMATE genes in response to three metal stresses, as determined by quantitative real-time PCR. Eight representative CcMATE genes were selected from the eight subgroups. The expression patterns of these CcMATEs in pigeonpea roots and stems were examined under aluminum, manganese and zinc stress. The roots and stems were harvested 6 and 12 h after treatment. Each group has three biological replicates.
In chromosomal mapping, we found that the 67 MATE genes were not distributed uniformly (Figure 3). Excluding the genes that could not be placed on chromosomes, up to 15 genes were located on chromosome 7. In contrast, chromosomes 3 and 5 carried only one gene each. This might be caused by expansion of the MATE family and of the pigeonpea genome (Jacquemin et al. 2014). Many genes could not be mapped to a chromosome, which may be because the pigeonpea genome assembly in NCBI is incomplete. Each subgroup includes at least one MATE motif (Figure 4). Within the MATE gene family, the conserved motifs of genes in the same subgroup are basically identical.
Most plant MATE genes exhibit a clear tissue-specific expression pattern. For example, Nt-JAT1 is expressed in leaves, stems and roots (Morita et al. 2009), whereas NtMATE1 and NtMATE2 are expressed only in the roots (Shoji et al. 2008). Arabidopsis thaliana AtFFT (flower flavanone transporter) is expressed in almost all tissues, including flowers and seeds (Thompson et al. 2010). Therefore, we selected one gene from each of the pigeonpea subgroups and used qRT-PCR to measure its expression in the roots, stems, leaves and flowers of pigeonpea to analyze its tissue specificity (Figure 5). As shown, CcMATE4, 6, 32, 34, 55 and 61 have obvious tissue specificity in pigeonpea, and CcMATE32 is expressed almost exclusively in stems and roots. Based on these findings, we speculate that CcMATE32 may have a specific role in roots and stems. CcMATE1 and CcMATE45 were expressed in all four tissues, suggesting that these two genes may play a role in plant growth and development.
The MATE transporters first isolated from plants are involved in the detoxification of foreign compounds. Cellular function analysis of E. coli transformed with AtDTX1 revealed that the transporter was able to excrete toxic substances such as antibiotics and cadmium, as well as two plant alkaloids (berberine and tetrandrine) (Li, He, et al. 2002). Another MATE transporter, AtALF5, identified in Arabidopsis, is considered able to excrete exogenous substances just like AtDTX1 (Diener et al. 2001).
As shown in Figure 6, the expression of CcMATE45 and CcMATE34 was up-regulated under all three metal stress treatments, which suggests that the two genes play a role in the resistance of pigeonpea to metal stress. In the phylogenetic analysis, CcMATE34 was highly homologous to AtDTX1, and CcMATE45 to AtALF5. We speculate that, under metal stress, CcMATE34 might have the same function as AtDTX1, which can efflux harmful substances or alkaloids. Meanwhile, CcMATE45 might resemble AtALF5 in transporting heterologous harmful substances and thus participate indirectly in relieving metal stress. Under the three metal stress treatments, CcMATE4 was up-regulated only in the roots, whereas in the stems it was down-regulated or did not fluctuate significantly, indicating that CcMATE4 might act specifically in the roots. CcMATE4 was highly homologous to MtMATE in the phylogenetic tree. MtMATE transports citric acid in the roots to resist aluminum stress, and we hypothesize that CcMATE4 may have the same function because, like MtMATE, it was specifically expressed in roots (Divya et al. 2001). Interestingly, although genes within a subfamily are expected to share functions, CcMATE32 in subfamily 3-3 did not show high expression, which may reflect a feature specific to pigeonpea. However, the specific functional mechanisms of these CcMATE genes in the resistance of pigeonpea to metal stress require further exploration. Our conclusions provide meaningful clues for these future studies.
In conclusion, our results show that the CcMATE genes may play a role in resistance to metal stress, and our genome-wide analysis of the CcMATE genes and their expression patterns provides important genetic resources for future research.
Dong Meng is an associate professor at the Forestry College of Beijing Forestry University and the Beijing Advanced Innovation Center for Tree Breeding by Molecular Design.
Zhihua Song is a graduate student at the Forestry College of Beijing Forestry University.
Litao Wang is a graduate student at the Forestry College of Beijing Forestry University.
Yue Jian is a graduate student at the Forestry College of Beijing Forestry University.
Xiaohong Fan is a graduate student at the Forestry College of Beijing Forestry University.
Mingzhu Dong is a graduate student at the Forestry College of Beijing Forestry University.
Qing Yang is a lecturer at the Forestry College of Beijing Forestry University.
Yujie Fu is a professor at the Forestry College of Beijing Forestry University, Key Laboratory of Forestry Plant Ecology of Northeast Forestry University, and Beijing Advanced Innovation Center for Tree Breeding by Molecular Design.