Integrative Analyses of Mendelian Randomization and Transcriptomic Data Reveal No Association between Leptin and Chronic Obstructive Pulmonary Disease

Abstract
As a key adipokine, leptin has been extensively investigated for its potential role in the pathogenesis of chronic obstructive pulmonary disease (COPD). However, no consistent conclusion has been reached. In this study, we investigated the relationship between leptin and COPD using an integrative analysis that combined a Mendelian randomization (MR) study with transcriptomic data analysis. The MR analysis was performed on the online platform MR-Base, and the bioinformatics analyses were performed with R Bioconductor packages. The integrative analysis found no evidence to support an association between leptin and COPD. All methods detected a null causal effect of leptin on COPD in the MR analysis. In particular, when the genetically predicted leptin level increased by one unit, the risk of developing COPD was estimated as 0.999 (p = 0.943), 0.920 (p = 0.516), 1.002 (p = 0.885), and 1.002 (p = 0.906) by the Inverse Variance Weighted (IVW), MR-Egger, weighted median, and weighted mode methods, respectively. Furthermore, only one leptin-associated gene was identified as differentially expressed between COPD cases and controls in the bioinformatics analysis. The association between leptin and COPD observed in previous observational studies may be attributable to unmeasured confounding effects or reverse causation.


Introduction
Chronic obstructive pulmonary disease (COPD) is a progressive, complex, and heterogeneous inflammatory lung disease [1]. It is characterized by chronic bronchitis, small airway dysfunction, and emphysema that may result in airflow limitation, according to the Global Initiative for Chronic Obstructive Pulmonary Disease 2023 report (https://goldcopd.org/2023-gold-report-2/). Another evident feature of COPD is systemic inflammatory processes in the respiratory tract, which may be accompanied by non-marginal changes in the synthesis of adipokines, peptides that participate in immune processes [2] and have been shown to correlate with numerous complex diseases such as COVID-19 [3]. For COPD specifically, a recent study [4] found an elevated level of adiponectin (one major adipokine) in patients with COPD compared with healthy controls. Therefore, understanding the systemic effects that adipokines have on COPD may not only give valuable insights into its pathogenesis but also provide clues for its prevention and intervention.
Leptin, a key adipokine, has been extensively investigated for its potential role in the pathogenesis of COPD. Although dozens of observational studies and experiments have been conducted to examine the relationship between leptin and COPD [5][6][7][8][9], no consistent conclusion has been reached. Some studies suggested that an elevated concentration of circulating leptin is positively related to the risk of COPD, while others demonstrated no association between leptin and COPD. For example, a study based on data from the National Health and Nutrition Examination Survey III [8] found no significant difference in serum leptin levels between subjects with COPD and controls. Notably, both unmeasured confounding and reverse causation are typically present in conventional epidemiological studies.
In contrast, Mendelian randomization (MR), a commonly used genetic epidemiological method, can address these two issues effectively. Genetic variants are used as proxies, or instrumental variables (IVs), in MR analyses to unravel the direction of causality between exposure and disease [10]. MR can be used to disentangle the effect that an exposure has on the risk of developing a certain disease from the reverse effect, where the development and progression of a disease may change the exposure. Additionally, MR can minimize the effect of unobserved confounders because genetic variants are determined at conception. However, MR has certain limitations. One major limitation is weak instrument bias, which arises when the genetic variants explain little of the variation in the exposure. Another is horizontal pleiotropy, which refers to the possibility that a single-nucleotide polymorphism (SNP) used as an IV influences phenotypes other than the exposure. The presence of horizontal pleiotropy invalidates the IV assumptions, resulting in an inappropriate inference of causality.
Analysis of transcriptomic data, such as microarray or RNA-Seq data, can reveal the role of specific genes in complex diseases. Complementing MR with transcriptomic data analysis may provide synergistic insight into the mechanisms underlying disease onset and progression, thus helping to pinpoint potential therapeutic regimens. To the best of our knowledge, studies that integrate transcriptomic data analysis and MR analysis are currently scarce. To fill this gap, we performed both MR and transcriptomic data analyses in this study. We anticipate that this combination allows the relationship between leptin and COPD to be investigated comprehensively.

MR analysis
A two-sample MR analysis using the online platform MR-Base (https://app.mrbase.org) [11] was undertaken to explore whether leptin has a causal impact on the risk of developing COPD, wherein the concentration of leptin and COPD were regarded as the exposure and outcome, respectively. Multiple SNPs in low linkage disequilibrium (R² < 0.001, and thus regarded as independent of each other) that were associated with leptin at genome-wide significance (p < 5 × 10⁻⁸) in the NHGRI-EBI GWAS Catalog [12] were selected. The leptin GWAS included 35,292 individuals of European ancestry. For the outcome, the summary data came from the GWAS pipeline using PHESANT-derived variables from the UK Biobank, which was provided by the Medical Research Council Integrative Epidemiology Unit (MRC-IEU), University of Bristol. This dataset, referred to as the MRC-IEU consortium, comprised 1,658 COPD cases and 112,583 controls.
The Inverse-Variance Weighted (IVW) method was employed in the main analyses to evaluate the causal relationship between leptin and COPD. IVW is a weighted regression of SNP-outcome associations on SNP-exposure associations, with the regression intercept constrained to zero. Additionally, alternative MR methods, including weighted median [13], MR-Egger regression [14], and weighted mode [15], were employed as part of a sensitivity analysis to further assess the relationship between COPD and leptin. These three alternative methods are commonly believed to be more robust to horizontal pleiotropy.
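The IVW regression described above can be illustrated with a minimal sketch. It assumes the common fixed-effect form, in which each SNP is weighted by the inverse variance of its SNP-outcome association; the function name and all numbers are hypothetical and are not the summary statistics used in this study, and MR-Base may apply a different variant (for example, multiplicative random effects).

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW: weighted regression of SNP-outcome on
    SNP-exposure associations with the intercept fixed at zero."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error

# Hypothetical summary statistics for five SNPs (illustration only).
bx = [0.10, 0.08, 0.12, 0.05, 0.09]        # SNP-exposure effects
by = [0.002, -0.001, 0.001, 0.000, -0.002]  # SNP-outcome effects
se = [0.010, 0.012, 0.011, 0.015, 0.009]    # SEs of SNP-outcome effects

est, est_se = ivw_estimate(bx, by, se)
print(f"IVW estimate: {est:.4f} (SE {est_se:.4f})")
```

If the SNP-outcome effects are on the log-odds scale (an assumption here), exponentiating the estimate gives an odds ratio per unit increase in the exposure.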

Experimental data
Pre-processed expression data of the COPD cohort were downloaded from the Gene Expression Omnibus (GEO: https://www.ncbi.nlm.nih.gov/geo/) repository (accession number: GSE76925). This microarray experiment was conducted on the Illumina HumanHT-12 V4.0 expression BeadChip platform and comprised 111 patients with COPD and 40 healthy smokers [16].

Statistical methods
Moderated t-tests were performed to identify differentially expressed genes (DEGs) using the R limma package. Moreover, a Venn diagram was employed to investigate the overlap of the identified DEGs across three different comparisons. Two-dimensional scatterplots based on the Uniform Manifold Approximation and Projection (UMAP) method [17] were drawn to visually inspect the discriminative ability of leptin-associated genes. UMAP can embed high-dimensional features into nonlinear representations of a manageable scale and is preferable to conventional feature extraction methods such as principal component analysis (PCA). The protein-protein interaction network of the identified genes was constructed, followed by Gene Ontology (GO) [18] functional analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) [19] pathway enrichment analysis, both performed in the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) software [20].
The biological relevance of the identified genes and their enriched pathways was investigated by surveying the GeneCards knowledgebase [21] and literature mining in PubMed. All bioinformatics analyses, unless described specifically, were performed in the R software, version 4.1.0 (http://www.r-project.org/).

Results of MR analyses
The formal MR analysis found neither pleiotropic effects of the five leptin-associated SNPs (the intercept of MR-Egger regression was 0.003, p = 0.518) nor obvious heterogeneity among them (Cochran's Q = 4.323, p = 0.229 for the MR-Egger method, and Q = 5.092, p = 0.278 for the IVW method). All MR analyses demonstrated a concordant null effect of leptin on COPD, whether using IVW or the alternative methods. Specifically, the risk of developing COPD when the genetically predicted leptin level increased by one unit was estimated as 0.999 (p = 0.943), 0.920 (p = 0.516), 1.002 (p = 0.885), and 1.002 (p = 0.906) by the IVW, MR-Egger, weighted median, and weighted mode methods, respectively (Figure 1a). According to the leave-one-out sensitivity analysis and the funnel plot, no extreme SNPs or outliers were observed (Figures 1b and 1c).
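As a consistency check on the heterogeneity statistics above, the p-values can be recomputed from the reported Q values. The sketch below assumes that Cochran's Q is referred to a chi-square distribution with n − 1 degrees of freedom for IVW and n − 2 for MR-Egger, where n = 5 is the number of SNPs; the helper chi2_sf is a hypothetical name written here with the standard library only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even degrees of freedom.
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Closed form for odd degrees of freedom.
    m = (df - 1) // 2
    terms = sum(half ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms

n_snps = 5
p_ivw = chi2_sf(5.092, n_snps - 1)    # IVW: df = n - 1 (assumed)
p_egger = chi2_sf(4.323, n_snps - 2)  # MR-Egger: df = n - 2 (assumed)
print(f"IVW p = {p_ivw:.3f}, MR-Egger p = {p_egger:.3f}")
```

Under these assumptions, the recomputed values should agree with the reported p = 0.278 and p = 0.229 to about three decimal places.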
Considering that impaired lung function measures such as forced expiratory volume in one second (FEV1) and forced vital capacity (FVC) are evident in COPD, separate MR analyses with FEV1 and FVC as outcomes were also conducted. The outcome GWAS summary data for both FEV1 and FVC, which included 307,638 participants, came from the Neale Lab Consortium (essentially the UK Biobank data) in 2017. Neither MR analysis found a causal effect of leptin levels on the respective outcome; however, both revealed heterogeneity among SNPs. In both analyses, one SNP, rs780093, was identified as an outlier. After removing this outlier, a negative causal effect of leptin on FVC was found by two MR methods, namely, IVW and weighted median. The negative direction of the association is consistent with what is expected (data not shown). In contrast, for FEV1, only the IVW method suggested a negative association (p = 0.023); however, it was not significant after adjustment using the Bonferroni method (the significance threshold decreased to 0.05/3 = 0.017 because three outcomes, COPD, FEV1, and FVC, were considered). To summarize, neither COPD nor impaired lung function measures were found to be causally influenced by circulating leptin levels.
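The Bonferroni adjustment mentioned above amounts to the following arithmetic, shown here as a short sketch using only the values reported in the text (the variable names are illustrative).

```python
alpha = 0.05
outcomes = ["COPD", "FEV1", "FVC"]

# Bonferroni: divide the nominal level by the number of outcomes tested.
threshold = alpha / len(outcomes)

p_fev1_ivw = 0.023  # IVW p-value for FEV1 reported in the text
significant = p_fev1_ivw < threshold

print(f"threshold = {threshold:.3f}, FEV1 significant: {significant}")
```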

Results of bioinformatics analyses
The nearby genes for the five SNPs were retrieved from the dbSNP database [22], including GCKR, COBLL1, SNORA70F, FTO, and LINC02029.Three genes in the neighborhood of leptin-associated loci, namely LEP, SLC32A1, and CCNL1, identified by a relevant genome-wide meta-analysis [23], were also added to this list.
In addition to LEP, the GeneCards knowledgebase indicated that both CCNL1 and FTO are directly linked to COPD. For FTO specifically, a study [24] replicated the significant association between BMI and rs8050136, which is located in the first intron of the FTO gene. Moreover, that study demonstrated that FEV1 varied significantly by FTO genotype and suggested a potential role of the FTO locus in the determination of anthropometric measures associated with COPD. In contrast, the confidence score for CCNL1 on GeneCards (representing the confidence in the underlying association; the higher the score, the stronger the association) was extremely low, and the PubMed mining returned no additional literature to support its association with COPD.
With the cutoff values for fold change (FC) and false discovery rate (FDR) set at 1.5 and 0.05, respectively, moderated t-tests identified 259 up-regulated genes and 1053 down-regulated genes in the case group compared to the controls. Of the eight leptin-associated genes, only CCNL1 was determined to be a DEG (under-expressed). Of note, only CCNL1, FTO, COBLL1, and LEP were annotated by the microarray platform.
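The thresholding step above can be sketched as follows. This is an illustration under stated assumptions, not the limma workflow itself: it assumes that the FDR is the Benjamini-Hochberg adjusted p-value and that an FC cutoff of 1.5 is applied symmetrically on the log2 scale. The gene names, fold changes, and p-values are hypothetical.

```python
import math

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for offset, idx in enumerate(reversed(order)):
        rank = n - offset
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def call_degs(genes, log2fc, pvals, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Split genes into up- and down-regulated sets using FC and FDR cutoffs."""
    fdr = bh_adjust(pvals)
    min_lfc = math.log2(fc_cutoff)
    up = [g for g, l, q in zip(genes, log2fc, fdr)
          if q < fdr_cutoff and l >= min_lfc]
    down = [g for g, l, q in zip(genes, log2fc, fdr)
            if q < fdr_cutoff and l <= -min_lfc]
    return up, down

# Hypothetical moderated t-test results (illustration only).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2fc = [1.2, -0.9, 0.3, -0.7]
pvals = [0.001, 0.004, 0.0005, 0.20]

up, down = call_degs(genes, log2fc, pvals)
print("up:", up, "down:", down)
```

In this toy input, GENE_C passes the FDR cutoff but not the FC cutoff, and GENE_D passes the FC cutoff but not the FDR cutoff, so neither is called a DEG.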
Moreover, the STRING software [20] found no enriched KEGG pathways or GO terms for these eight genes. To explore the association between leptin and COPD more extensively, we performed Weighted Gene Co-expression Network Analysis (WGCNA) [25] on the union of the DEGs and the four annotated leptin-associated genes. The hyper-parameter β in WGCNA was set to six, under which the resulting network is scale-free (Figure 2a). In addition, the minimal module size was reduced to 15 to obtain multiple modules for a meaningful comparison. According to the results of the WGCNA analysis (Figure 2b), CCNL1, FTO, and COBLL1 belonged to the same module (colored in brown), which was the largest module and comprised 1,153 genes, while LEP could not be assigned to any module. Nevertheless, the correlations of either FEV1 or the group status with the eigengene of the brown module were not strong compared with the other four modules. Moreover, these three genes are not hub genes in the brown module.
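To make the role of β concrete, the sketch below shows the soft-thresholding step of an unsigned co-expression network, in which the adjacency between two genes is the absolute Pearson correlation raised to the power β. Whether the network here was unsigned is an assumption, and the function names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for each gene pair."""
    genes = list(expr)
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g in genes for h in genes if g != h}

# Hypothetical expression values for three genes across three samples.
expr = {
    "gene_1": [1.0, 2.0, 3.0],
    "gene_2": [1.0, 3.0, 2.0],
    "gene_3": [3.0, 2.0, 1.0],
}
adjacency = soft_threshold_adjacency(expr, beta=6)
for pair, value in sorted(adjacency.items()):
    print(pair, round(value, 6))
```

Raising correlations to a power above one suppresses weak correlations much more than strong ones, which is why a moderate correlation of 0.5 contributes very little to the network at β = 6.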
Finally, in the two-dimensional UMAP scatterplot (Figure 3), COPD cases and controls are mixed together, indicating that the leptin-associated genes have no discriminative ability to distinguish between the two groups. Overall, the bioinformatics analysis results provided solid evidence for a null association between leptin and the risk of developing COPD.

Discussion
Observational studies have shown that adipokines correlate with numerous complex diseases, including COVID-19 [3], type 2 diabetes mellitus [26], and COPD [27]. Nevertheless, no in-depth investigation of whether this association is causal (i.e., whether a certain adipokine such as leptin is a driver of a certain disease) has been conducted. Considering its significant clinical implications (possibly facilitating a personalized nutrition intervention to leverage those adipokines, which may be simple and easy to implement), causal inference on the association is highly desirable. Therefore, focusing specifically on leptin, this study aimed to investigate the causal association between leptin and COPD using gene expression data mining and MR analysis. No significant association between leptin and COPD was identified in the real-world gene expression data analysis. Moreover, there was no evidence of genetic correlation between them. Our results are consistent with previous studies, such as Ref. [28], that showed no support for a clear relationship between serum levels of leptin and COPD. Nevertheless, these studies are specific to stable COPD, and thus the possible influence of leptin during COPD exacerbations should be further explored in the future.
This study has two major limitations; one is specific to the MR analysis, and the other to the gene expression analysis. First, the issue of weak instrument bias was particularly relevant to the current MR study, considering that patients with COPD were less likely to be recruited to a GWAS, which would result in a conservative estimate, that is, one biased toward the null. In addition, only five leptin-associated SNPs were included in the MR analysis, possibly adding only marginal value over a single SNP. Of note, this MR study may be slightly underpowered, in which case the null effects observed in the current analyses cannot be regarded as conclusive.
The second limitation was specific to the microarray analysis, considering that the experiment was designed as a cross-sectional study. Similar to an observational study, it was also subject to unmeasured confounding effects and reverse causation. Furthermore, the expression level of a certain gene, e.g., LEP, in the lung or airway may not correlate well with the circulating concentration of its product. Moreover, the sample size of the microarray study was only moderate, and half of the leptin-associated genes were not annotated by the microarray platform.

Conclusions
As one of the first studies to assess the association between leptin and COPD through an integrative analysis combining both MR analysis and transcriptomic data mining, this study found no evidence to support causality. Consequently, the observed association between leptin and COPD in previous observational studies may be attributable to unmeasured confounding effects or reverse causation.

Figure 1. Mendelian randomization analysis exploring the effect of circulating leptin level on the risk of developing COPD. (a) Forest plot showing the effects of individual SNPs and the overall estimates. (b) Leave-one-out sensitivity analysis testing heterogeneity among SNPs. (c) Funnel plot detecting potential outliers.

Figure 2. Dendrogram showing how the differentially expressed genes plus the leptin-associated genes were divided into various modules by the WGCNA method. (a) Selection of the hyper-parameter making the resulting network scale-free. (b) WGCNA analysis results.

Figure 3. Two-dimensional UMAP scatterplot showing that leptin-associated genes have no or minimal discriminative ability to distinguish patients with COPD from controls.