Bioinformatics analysis reveals the landscape of immune cell infiltration and immune-related pathways participating in the progression of carotid atherosclerotic plaques

Abstract
Atherosclerosis is a systemic disease associated with inflammatory cell infiltration and activation of immune-related pathways. In our study, we aimed to uncover immune-related changes and explore novel immunological features in the development of carotid atherosclerotic plaques. First, we applied integrated bioinformatics methods, including CIBERSORT and gene set enrichment analysis (GSEA). The gene expression matrices GSE28829, GSE41571, and GSE43292 were obtained from the Gene Expression Omnibus (GEO) database. After a series of data pre-processing steps, the resulting combined expression matrices were analysed using the CIBERSORT, GSEA, and clusterProfiler packages. Comparing carotid atherosclerotic plaques in the early and advanced stages, we found a higher percentage of activated memory CD4 T cells and a lower percentage of resting memory CD4 T cells in advanced-stage plaques. Moreover, activation of memory CD4 T cells can promote the development of carotid atherosclerotic plaques. Additionally, FOXP3+ Treg cell maturation can also participate in the progression of carotid plaques.


Background
Carotid artery disease, caused by a buildup of atherosclerotic plaque inside the arterial wall, is one of the most important causes of cerebrovascular disease [1]. During carotid plaque formation, early-stage lesions may be asymptomatic, but intermediate and advanced lesions are more severe and more likely to rupture, leading to thrombus formation in the carotid artery, a frequent cause of ischaemic cerebrovascular events [2]. Many factors, such as oxidative stress and inflammation, have been found to participate in the progression from early to advanced stage plaque [3]. The infiltration of inflammatory cells and activation of immune-related pathways are critical features in plaque progression [4]. Several types of immune cells, such as lymphocytes [5], natural killer T cells [5], and T cells [6,7], are involved in plaque progression. In addition, inflammatory cytokines such as TNF-α, IFN-γ, IL-1β, IL-6, and IL-8 have been reported to play a critical role in the development of plaques [6]. However, both the changes in immune cell types from the early to advanced plaque stages and the underlying mechanisms regulating the development of carotid atherosclerotic plaques are still unclear.
Many researchers have attempted to illustrate this progression. For instance, leukocyte cathepsin C was shown to promote atherosclerotic lesion progression by selective tuning of innate and adaptive immune responses [8]. Moreover, PELATON, a monocyte- and macrophage-specific lncRNA, was also shown to have an important role in plaque progression [9]. These molecular and cellular experiments confirmed the critical role of particular immune cells and immune-related pathways in the progression of carotid atherosclerotic plaques, but few studies have explored the correlation between genes and immune cells, or the overall landscape, in progression-related big data. Hence, an overall description of the immune-related landscape of early-stage and advanced-stage plaques is needed.
In recent years, advances in gene chip technology have enabled the analysis of changes in mRNA levels between different samples, which has helped identify novel and important genes when studying disease mechanisms. However, traditional microarray analysis is often restricted to differentially expressed genes (DEGs). Since carotid artery disease is closely associated with immune cells and the inflammatory system, a more comprehensive bioinformatics analysis of immune-related pathways should be applied. In this study, three microarray datasets downloaded from the Gene Expression Omnibus (GEO) were used to screen for enrichment in immune-related pathways and differentially infiltrated immune cells. Gene Set Enrichment Analysis (GSEA) was used to analyse enriched gene sets, Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathways, and immunological signatures, because it accounts for all expression data rather than DEGs alone. Finally, CIBERSORT was used to reconstruct the state of immune cell infiltration in carotid atherosclerotic plaques at different stages.
Through an in-depth analysis of expression array datasets taken from plaque samples in the early and advanced stages, our study aimed to explore the immunological mechanism of carotid atherosclerotic plaques comprehensively, and identify new immunological targets for preventing the progression of carotid atherosclerotic plaques.

Microarray datasets
In this study, the raw gene expression profiles GSE28829, GSE41571, and GSE43292 were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The GSE28829 expression profile consists of 16 advanced and 13 early-stage plaques, profiled using the Affymetrix Human Genome U133 Plus 2.0 Array. The GSE41571 expression profile consists of six advanced and five early-stage plaques, profiled using the Affymetrix Human Genome U133 Plus 2.0 Array. The GSE43292 dataset consists of 32 advanced and 32 early-stage carotid atherosclerotic plaques, profiled using the Affymetrix Human Gene 1.0 ST Array. Carotid atherosclerotic plaque in the advanced stage is defined as a plaque with a fibrous cap, whereas carotid atherosclerotic plaque in the early stage is identified by intimal thickening and intimal xanthoma [10].

Data pre-processing
To obtain the critical information from the three gene expression matrices, the raw data were merged and pre-processed using a Perl script and the sva package [11] in R (R-project.org). Then, the probe ID for every gene was converted to a gene symbol using a Perl script. If a gene symbol corresponded to multiple probe IDs, the mean expression level of the probes was taken as the representative expression level of that gene.
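The probe-to-gene step described above can be illustrated with a minimal sketch. It is written in Python rather than the Perl and R used in the study, the function name `collapse_probes` and all probe IDs and values are hypothetical, and the batch adjustment performed with sva is not shown.

```python
from collections import defaultdict


def collapse_probes(expression, probe_to_symbol):
    """Average the expression of all probes that map to the same gene symbol.

    expression: dict of probe ID -> list of per-sample expression values.
    probe_to_symbol: dict of probe ID -> gene symbol (unmapped probes are dropped).
    """
    grouped = defaultdict(list)
    for probe_id, values in expression.items():
        symbol = probe_to_symbol.get(probe_id)
        if symbol is None:
            continue  # probe has no gene annotation
        grouped[symbol].append(values)

    # zip(*rows) walks the samples column by column across the probes of one gene.
    return {
        symbol: [sum(column) / len(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }


# Hypothetical probe IDs and expression values, for illustration only.
expression = {
    "probe_1": [2.0, 4.0],
    "probe_2": [4.0, 8.0],
    "probe_3": [1.0, 1.0],
    "probe_4": [9.0, 9.0],  # not annotated, so it is dropped
}
probe_to_symbol = {"probe_1": "FOXP3", "probe_2": "FOXP3", "probe_3": "CD4"}
gene_level = collapse_probes(expression, probe_to_symbol)
```

Averaging is only one way to resolve multiple probes per gene; it is shown here because it is the rule stated in the text.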

CIBERSORT
Immune cell infiltration in carotid plaques was evaluated using the CIBERSORT algorithm [12]. CIBERSORT is an analysis tool that estimates the cell composition of complex tissues from pre-processed gene expression profiles. The LM22 signature gene file, obtained from the CIBERSORT web portal (http://CIBERSORT.stanford.edu/), was used to define 22 immune cell subsets and to analyse the early-stage and advanced-stage plaque tissue data.
As described in a previous study, p values and root mean squared errors were calculated for each expression file in CIBERSORT. The default signature matrix was used with 100 permutations. Only samples with a CIBERSORT p value < .05 were retained for the following analysis. The output was then integrated to generate a complete matrix of immune cell fractions. The results from CIBERSORT were visualised using the R packages corrplot, vioplot, and ggplot2 [13].
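The filtering rule above (retain samples with a CIBERSORT p value < .05, then work with the matrix of fractions) can be sketched as follows. This is an illustration only: the record layout, the function name `filter_and_summarise`, and every number are assumptions rather than CIBERSORT output from the study.

```python
def filter_and_summarise(samples, p_cutoff=0.05):
    """Keep samples whose deconvolution p value is below the cutoff and
    return them together with the mean fraction of each cell type per group."""
    kept = [s for s in samples if s["p_value"] < p_cutoff]

    sums, counts = {}, {}
    for sample in kept:
        group = sample["group"]
        counts[group] = counts.get(group, 0) + 1
        group_sums = sums.setdefault(group, {})
        for cell_type, fraction in sample["fractions"].items():
            group_sums[cell_type] = group_sums.get(cell_type, 0.0) + fraction

    means = {
        group: {cell: total / counts[group] for cell, total in cell_sums.items()}
        for group, cell_sums in sums.items()
    }
    return kept, means


# Made-up records for illustration; these are not values from the study.
samples = [
    {"id": "s1", "group": "early", "p_value": 0.01,
     "fractions": {"CD4 memory resting": 0.20, "CD4 memory activated": 0.02}},
    {"id": "s2", "group": "early", "p_value": 0.20,  # fails the p value filter
     "fractions": {"CD4 memory resting": 0.25, "CD4 memory activated": 0.01}},
    {"id": "s3", "group": "advanced", "p_value": 0.03,
     "fractions": {"CD4 memory resting": 0.10, "CD4 memory activated": 0.08}},
    {"id": "s4", "group": "advanced", "p_value": 0.00,
     "fractions": {"CD4 memory resting": 0.12, "CD4 memory activated": 0.06}},
]
kept, means = filter_and_summarise(samples)
```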

Gene set enrichment analysis (GSEA)
After data pre-processing, enrichment analyses for Gene Ontology (GO) terms, KEGG pathways, and immunologic signature gene sets were performed with the GSEA software [14], to which the merged gene matrices were uploaded. The number of permutations was set to 1000, and the permutation type was set as "phenotype". The gene set databases used in this analysis were downloaded from the Molecular Signatures Database (http://software.broadinstitute.org/gsea/msigdb/index.jsp) [15].
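For readers unfamiliar with GSEA, the weighted running-sum enrichment score at its core can be sketched in a few lines. This is a simplified illustration, not the GSEA software used in the study: the function name `enrichment_score`, the gene names, and the ranking scores are hypothetical, and the 1000 phenotype permutations used to assess significance are not reproduced.

```python
def enrichment_score(ranked_genes, gene_set, weight=1.0):
    """Weighted running-sum enrichment score.

    ranked_genes: list of (gene, score) pairs sorted by decreasing score.
    gene_set: set of gene names to test.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hit_weights = [
        abs(score) ** weight if gene in gene_set else 0.0
        for gene, score in ranked_genes
    ]
    total_hit_weight = sum(hit_weights)
    n_miss = sum(1 for gene, _ in ranked_genes if gene not in gene_set)
    if total_hit_weight == 0.0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")

    running, best = 0.0, 0.0
    for (gene, _), hit_weight in zip(ranked_genes, hit_weights):
        if gene in gene_set:
            running += hit_weight / total_hit_weight  # step up at a set member
        else:
            running -= 1.0 / n_miss  # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking (for example by an early vs. advanced statistic).
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es_top = enrichment_score(ranked, {"A", "B"})     # set concentrated at the top
es_bottom = enrichment_score(ranked, {"E", "F"})  # set concentrated at the bottom
```

A positive score indicates that the set is concentrated at the top of the ranking and a negative score that it is concentrated at the bottom.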

Detection and functional enrichment analysis of DEGs
The DEGs between advanced-stage and early-stage samples were determined using the limma package in R [16]. The thresholds were log2(fold change) > 1 and adjusted p value < .05. The DEGs were visualised as a heatmap using the pheatmap package [17].
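The thresholding step can be sketched as below. The function name `select_degs`, the gene names, and the statistics are hypothetical. The text states the cutoff as log2(fold change) > 1 without saying whether the absolute value was used, so the `use_absolute` switch is an assumption added to show both readings.

```python
def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05, use_absolute=False):
    """Return the sorted gene names that pass both thresholds.

    results: dict of gene -> (log2 fold change, adjusted p value).
    use_absolute: if True, test |log2 fold change| so that down-regulated
    genes can pass as well (an assumption, see the note above).
    """
    selected = []
    for gene, (log2_fc, adj_p) in results.items():
        effect = abs(log2_fc) if use_absolute else log2_fc
        if effect > lfc_cutoff and adj_p < padj_cutoff:
            selected.append(gene)
    return sorted(selected)


# Made-up statistics for illustration only.
results = {
    "GENE_A": (1.8, 0.001),   # up-regulated and significant
    "GENE_B": (-1.5, 0.010),  # down-regulated and significant
    "GENE_C": (0.4, 0.0001),  # significant but small effect
    "GENE_D": (2.2, 0.200),   # large effect but not significant
}
literal_reading = select_degs(results)
two_sided_reading = select_degs(results, use_absolute=True)
```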
The clusterProfiler R package [18] was used to perform functional enrichment analysis of the DEGs for GO terms and KEGG pathways. The p values were corrected for multiple testing using the Benjamini method (false discovery rate, FDR). The threshold was set at p < .05. The results were visualised using the ggplot2 package [13].
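As an illustration of FDR correction, the sketch below implements the Benjamini-Hochberg step-up adjustment. That this is the exact variant applied in the study is an assumption, since the text names only the "Benjamini method"; the function name and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up p values for four hypothetical pathways.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
```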

Principal component analysis (PCA) of infiltrated immune cells
PCA is a multivariate dimensionality-reduction technique, which we used to characterise the differences in infiltrated immune cells between the two groups [19,20]. The PCA graph was rendered using the ggplot2 package in the R software [13]. The PCA treated the infiltrating immune cell fractions as variables and described the difference between carotid atherosclerotic plaque samples in the early and advanced stages.
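The idea of projecting samples onto a principal component computed from immune cell fractions can be sketched as follows. Power iteration on the covariance matrix is used here only to keep the example free of dependencies; it is a stand-in, not the routine used in the study, and the function name and all fractions are made up.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (loadings, scores) for the first principal component.

    rows: list of samples, each a list of variables (e.g. cell fractions).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Start from a unit vector along the first variable. An all-ones start would
    # fail for fractions that sum to one, because it lies in the null space
    # of the covariance matrix.
    direction = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("no variance along the start vector")
        direction = [x / norm for x in product]
    scores = [sum(c[j] * direction[j] for j in range(d)) for c in centred]
    return direction, scores


# Made-up fractions: [activated CD4 memory, resting CD4 memory] per sample.
early = [[0.02, 0.20], [0.03, 0.18], [0.01, 0.22]]
advanced = [[0.08, 0.10], [0.09, 0.08], [0.07, 0.12]]
loadings, scores = first_principal_component(early + advanced)
early_scores, advanced_scores = scores[:3], scores[3:]
```

In this toy example the two groups fall on opposite sides of the first component, which is the kind of separation a PCA plot is meant to reveal.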

Composition of infiltrated immune cells in early and advanced plaque tissues from the GEO expression array dataset
The fraction of infiltrated immune cells was first compared between plaque tissues in the early and advanced stages using the GEO expression array data. From a total of 104 samples, 47 early and 53 advanced samples were eligible for CIBERSORT analysis (p < .05). As shown in Figure 1, the fraction of immune cells varied significantly among the samples and groups. M2 macrophages, M0 macrophages, naive B cells, CD8 T cells, and resting CD4 memory T cells represented the five highest infiltrating fractions in carotid atherosclerotic plaques. In contrast, naive CD4 T cells and eosinophils were present in lower quantities. Compared with early plaque tissues, advanced tissues generally contained a higher fraction of activated CD4 memory T cells and M0 macrophages (p value < .05), and a lower fraction of CD8 T cells, resting CD4 memory T cells, regulatory T cells (Tregs), activated natural killer (NK) cells, monocytes, and resting dendritic cells (p value < .05) (Figure 2).
Additionally, we conducted a correlation analysis of infiltrated immune cells in plaques and found multiple pairs of positively and negatively correlated immune cells (Figure 3). The score represents the degree of correlation. Activated mast cells and follicular helper T cells showed the strongest positive (synergistic) correlation, whereas CD8 T cells and M0 macrophages showed the strongest negative (competitive) correlation.
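A pairwise correlation analysis of this kind can be sketched as below. The text does not state which correlation coefficient was used, so Pearson correlation is an assumption; the function names are hypothetical and the fractions are made-up numbers chosen only to show how the most positively and most negatively correlated pairs are located.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def extreme_pairs(fractions):
    """Return (most positive pair, most negative pair, all pairwise scores)."""
    scores = {
        (a, b): pearson(fractions[a], fractions[b])
        for a, b in combinations(sorted(fractions), 2)
    }
    return max(scores, key=scores.get), min(scores, key=scores.get), scores


# Made-up fractions across four hypothetical samples; not results from the study.
fractions = {
    "CD8 T cells": [0.10, 0.08, 0.05, 0.03],
    "M0 macrophages": [0.05, 0.10, 0.15, 0.20],
    "Mast cells activated": [0.01, 0.02, 0.03, 0.05],
    "T cells follicular helper": [0.02, 0.04, 0.06, 0.10],
}
most_positive, most_negative, scores = extreme_pairs(fractions)
```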
The PCA results showed that the fractions of immune infiltration could help us to distinguish advanced plaques from early plaques (Figure 4).

KEGG pathway and GO analysis showed high enrichment of immunological pathways and gene sets
To explore the possible pathways and gene sets associated with immune functions, all expression data were divided into early-stage and advanced-stage plaques and then subjected to GSEA. As expected, the exported results showed a high correlation with immune cells and immune-related functions.
Compared with early-stage plaques, the top 20 KEGG pathways in advanced-stage plaques were mainly enriched in three groups: primary immune-related pathways, such as the Toll-like receptor signalling pathway, antigen processing and presentation, natural killer cell-mediated cytotoxicity, and primary immunodeficiency; pathways associated with immune-related diseases, such as autoimmune thyroid disease, asthma, graft versus host disease, and systemic lupus erythematosus; and pathways associated with cytokine activation, including the chemokine signalling pathway and cytokine-cytokine receptor interaction (Table 1). Ten representative immune-related pathways are visualised in Figure 5.
To explore immune-related gene function in the progression of carotid plaques, GO analysis was conducted using GSEA. The top 20 results of the GO analysis showed that variations in gene function between early and advanced plaques were mainly enriched in interferon-associated gene sets, including response to type I interferon and response to interferon gamma. Gene sets related to the functions of inflammatory cells, including regulation of leukocyte migration and positive regulation of leukocyte migration, as well as other immune response-related gene sets, were also enriched (Table 2). Ten representative immune-related GO terms are shown in Figure 5.
Moreover, the GO and KEGG analyses based on DEGs between plaques in the early and advanced stages showed results similar to those of GSEA (Figures 6 and 7). GO terms were mainly enriched in the regulation of inflammatory response, regulation of immune effector process, regulation of leukocyte degranulation, regulation of myeloid leukocyte-mediated immunity, leukocyte migration, and positive regulation of response to external stimuli. KEGG pathways were mainly enriched in pertussis, Staphylococcus aureus infection, the PPAR signalling pathway, complement and coagulation cascades, phagosome, and viral protein interaction with cytokines and cytokine receptors.
The immunological signature enrichment analysis by GSEA shows that maturation of FOXP3+ Treg cells participates in the progression of atherosclerotic plaque
To characterise the in-depth signature of immunological function associated with the progression of atherosclerotic plaque, GSEA was also used to obtain the biological processes enriched in immunologic signature gene sets. All expression data were imported into GSEA. In total, 1886 functional gene sets were enriched (FDR < 25%; nominal p value < 1%). The gene sets in CD8 T cells, peripheral blood mononuclear cells (PBMCs), FOXP3+ Treg cells, dendritic cells (DCs), and monocytes were mainly enriched. The top 20 enriched gene sets are listed in Table 3.
Notably, gene sets associated with Foxp3+ Treg subsets of different maturity were significantly enriched, including GSE42021 CD24hi vs. CD24low Tconv thymus down; GSE42021 Treg PLN vs. CD24int Treg Thymus down; GSE42021 Treg PLN vs. CD24lo Treg thymus down and GSE42021 Treg PLN vs. CD24lo Treg thymus down, which indicates that the maturation of FOXP3+ Treg cells could promote the deterioration of atherosclerotic plaque (Figure 8).

Discussion
Carotid atherosclerotic plaques are critical for the progression of cerebrovascular disease. Carotid atherosclerotic plaques at an early stage can become advanced and unstable and provoke a cerebrovascular event within several years [21]. Although the exact mechanism of plaque progression is not known, it involves immune cells and immune-related pathways [22]. Hence, understanding the molecular mechanism of carotid artery atherosclerosis development is critically important for the diagnosis and therapy of cerebrovascular disease.
Since their advent, microarrays have provided thousands of gene expression datasets and have been widely used to predict the underlying targets for the prevention and therapy of carotid atherosclerotic plaques. In this study, infiltrating immune cells and their immunological gene functions and pathways in carotid artery plaque tissues were analysed using the gene matrices GSE28829, GSE41571, and GSE43292 obtained from the GEO database. Only inflammation-related atherosclerotic carotid plaque datasets were included. According to the pathological morphological staging method of the American College of Cardiology, early-stage lesions are characterised by scattered macrophage foam cells, fatty streaks, intimal thickness, and resident stability. In contrast, advanced plaques are characterised by deep ulceration, disruptions of the lesion surface, haematoma or haemorrhage, and thrombotic deposits [23]. The results from CIBERSORT showed that the infiltrated immune cell subsets were similar in early and advanced plaques. M2 macrophages, M0 macrophages, naive B cells, CD8 T cells, and resting CD4 memory T cells represented the top five infiltrating fractions in carotid artery atherosclerosis. In contrast, there were fewer naive CD4 T cells and eosinophils. The composition of carotid atherosclerotic plaques estimated by CIBERSORT is consistent with previous studies that used flow cytometry and single-cell sequencing [24][25][26][27]. Single-cell sequencing of human carotid atherosclerotic plaques and of a mouse atherosclerosis model showed that T cells and macrophages dominate the immune landscape of atherosclerotic plaques.
Atherosclerosis is an inflammatory disease. Plaque progression is primarily mediated by cells of the monocyte/macrophage lineage. However, the contribution of other immune cells is still unknown. The differences between early and advanced stage plaques could reveal the immune cells participating in the progression of carotid atherosclerotic plaques. The present analysis shows that a higher fraction of activated CD4 memory T cells and a lower fraction of CD8 T cells, resting CD4 memory T cells, Tregs, activated NK cells, monocytes, and resting dendritic cells are the main changes in immune cell composition that occur during the progression of carotid plaques. Multiple studies have shown that the monocyte-macrophage system contributes to atherosclerosis development, thereby uncovering important molecular mechanisms [28,29]. However, previous studies have rarely focussed on the changes in the number and contribution of other immune cells such as memory CD4 T cells and NK cells. Several experimental studies have shown that T cells play a critical role in the immune response observed during the process of atherogenesis [30,31]. In the artery, cholesterol accumulation followed by vascular inflammation promotes an adaptive T cell response [32] and results in the differentiation of naive CD4+ T cells into effector or memory cells of specialised T cell subsets in secondary lymphoid organs and in the chemokine-driven recruitment of specific T cell and monocyte subsets into atherosclerotic plaques [33][34][35]. During atherogenesis, memory CD4 T cells in particular undergo expansion as antigen-experienced T cells, which could result from antigen re-exposure and further elicit a stronger and more sustained immune response [36]. However, clinical confirmation of these findings in humans remains limited. Our study provides clinical evidence of the potential role of memory CD4 T cell activation in the progression of plaques.
Previous studies have shown that immature dendritic cells already exist under the endothelium in healthy arteries. However, the majority of dendritic cells in advanced plaques seem to be activated and rapidly expand during the progression of atherosclerotic lesions. In the sub-endothelial space of the aorta, dendritic cells can efficiently accumulate lipids and therefore contribute to the initiation and further progression of the disease [37,38]. Moreover, activated dendritic cells can produce a number of pro-inflammatory cytokines [39]. As previous studies have shown, CD8 T cells play a similar role in atherogenesis: their numbers increase significantly as human lesions progress and become vulnerable to rupture, but decrease in healed plaque ruptures and fibrotic calcified plaques [40]. However, our results on the percentages of dendritic cells and CD8 T cells are inconsistent with these previous studies, which may be explained as follows: a more significant change in other immune cells, such as M0 macrophages, decreased the relative percentage of these cell types.
NK cells are a critical part of the innate immune system. NK cells reside in peripheral lymphoid organs and develop independently of the thymus. NK cells may be activated through stimulation by lipid antigens presented by the MHC-I-like molecule CD1d [41]. Previous studies have suggested that the number of circulating NK cells in patients is related to severe atherosclerosis [42]. In addition, upregulation of NKG2C+ NK cells in peripheral blood is related to a higher risk of plaque rupture in patients with cytomegalovirus infections [43]. However, studies exploring the precise function of NK cells in atherosclerosis have been complicated by the lack of suitable animal models. LDLR−/− mice with defective NK cells, generated by transferring bone marrow from transgenic mice overexpressing Ly49A under the control of a granzyme A promoter, showed reduced atherosclerosis, indicating a pathogenic role of NK cells in atherosclerosis [44]. Through the use of NK cell-depleting antibodies, NK cells were shown to aggravate atherosclerosis. Experiments transferring NK cells deficient in perforin or granzyme B into lymphocyte-deficient ApoE−/− mice suggested that perforin and granzyme B could mediate their main cytolytic mechanism [45]. Unexpectedly, the fraction of activated NK cells was decreased in advanced plaques, which may indicate that NK cells play a different role in advanced atherosclerotic plaques. In addition, our results (Figure 3) also show a synergistic correlation between activated mast cells and follicular helper T cells as well as a competitive correlation between CD8 T cells and M0 macrophages. However, few studies have reported a possible mechanism for these relationships. We speculate that activated mast cells and follicular helper T cells can mutually enhance their effects in this process. However, further investigation into the correlation between these immune cells is needed.
Functional annotation demonstrated that the DEGs were mainly involved in immune-related functions. The KEGG pathways from all expression array data were mostly enriched in immune and inflammatory response-related pathways, including primary immune-related, immune disease-related, and cytokine activation-associated pathways. Similarly, the enrichment of GO terms showed that variations in gene sets between the early and advanced plaques were mainly enriched in interferon-associated gene sets, such as cellular response to interferon gamma, as well as in functions of inflammatory cells and other immune-related gene sets. Moreover, the functional annotation based on DEGs between the advanced and early-stage plaques showed results similar to those of GSEA.
A series of studies have reported that the activation of primary immune cells and the production and release of large quantities of cytokines and complement components, including interferon gamma, play critical roles in all stages of atherosclerotic plaques. However, other pathways and biological processes, such as pathways associated with immune-related diseases, have not been reported frequently in the progression of plaques. For these reasons, we speculate that these pathways could have an underlying critical role in the development of carotid atherosclerotic plaques.
Unlike CIBERSORT, immunological signature enrichment analysis was used to explore the function rather than the composition of immune cells in the development of carotid atherosclerotic plaques. The immunological signature enrichment analysis conducted by GSEA showed that gene sets in CD8 T cells, PBMCs, Foxp3+ Treg cells, DCs, and monocytes were mainly enriched, which may indicate that the functions of these cell types changed significantly and may regulate the progression of plaques. More importantly, four gene sets associated with Foxp3+ Treg subsets of different maturity were significantly enriched, which suggests that these subsets could play different roles in the progression of carotid atherosclerotic plaque. The forkhead box transcription factor FOXP3 has been identified as the key lineage marker and master switch in the regulation of Treg development and function [46]. FOXP3 is now accepted as the "gold standard" for defining thymic-derived Tregs [47]. Previous studies have shown that a low percentage of Foxp3+ regulatory T cells is detectable in all progressive stages of human atherosclerotic lesions and have demonstrated their protective effects in atherosclerotic lesions. However, the exact Foxp3+ Treg subsets involved should be researched in greater depth. Our study provides a new perspective on Foxp3+ Treg subsets of different maturity.

Conclusions
To conclude, we described the immune landscape in detail, revealing the underlying immune infiltration patterns of carotid atherosclerotic plaques at different stages. Our work advances the understanding of the immune response and provides valuable insight into the immune mechanism of the progression of carotid atherosclerotic plaque.

Disclosure statement
No potential conflict of interest was reported by the author(s).