Discovery of novel serum metabolic biomarkers in patients with polycystic ovarian syndrome and premature ovarian failure

ABSTRACT Several widely recognized metabolites play a role in regulating the pathophysiological processes of various disorders. Nonetheless, the lack of effective biomarkers for the early diagnosis of polycystic ovarian syndrome (PCOS) and premature ovarian failure (POF) has led to the discovery of serum-based metabolic biomarkers for these disorders. We aimed to identify various differentially expressed metabolites (DEMs) through serum-based metabolic profiling in patients with PCOS and POF and in healthy individuals by using liquid chromatography–mass spectrometry analysis. Furthermore, heatmap clustering, correlation, and Z-score analyses were performed to identify the top DEMs. Kyoto Encyclopedia of Genes and Genomes enriched pathways of DEMs were determined using metabolite-based databases. Moreover, the clinical significance of these DEMs was evaluated on the basis of area under the receiver operating characteristic curve. Significantly dysregulated expressions of several metabolites were observed in the intergroup comparisons of the PCOS, POF, and healthy control groups. Furthermore, 6 DEMs were most frequently observed among the three groups. The expressions of these DEMs were not only directly correlated but also exhibited potential significance in patients with PCOS and POF. Novel metabolites with up/downregulated expressions can be discovered in patients with PCOS and POF using serum-based metabolomics; these metabolites show good diagnostic performance and can act as effective biomarkers for the early detection of PCOS and POF. Furthermore, these metabolites might be involved in the pathophysiological mechanisms of PCOS and POF via interplay with corresponding genes.


Introduction
Polycystic ovary syndrome (PCOS) is a common endocrine disorder that is generally diagnosed in women of childbearing age [1]. PCOS is characterized by the presence of small, round cysts in the ovary; however, it is much more complex than the mere presence of cysts [2]. PCOS-related metabolic irregularities include anovulation, infertility, hyperandrogenism, insulin resistance, hyperinsulinemia, and abnormal hair growth [3]. Long-term PCOS increases the risk of type 2 diabetes, cardiovascular disease, and metabolic syndrome [4]. Patients with PCOS require regular reproductive support; nevertheless, during gestation, they are at risk of developing complications that might compromise fetal outcomes, such as pre-eclampsia and gestational diabetes [4,5]. Early diagnosis of PCOS is challenging because of its variable nature and the different diagnostic criteria [6].
Premature ovarian failure (POF), also known as premature ovarian insufficiency, is a condition characterized by loss of ovarian function, premature follicular depletion or absence of menarche, and cessation of menstruation and folliculogenesis before the age of 40 years [7]. Metabolic disorders such as galactosemia; autoimmune adrenal and thyroid diseases; genetic factors such as chromosomal abnormalities; infectious diseases such as mumps; oxidative stress; iatrogenic factors such as chemotherapy and radiotherapy; type 2 diabetes; and ovarian granulosa cell apoptosis are commonly implicated in the pathophysiology of POF [8][9][10].
Detection of circulating biomarkers can facilitate the screening of cancer or other diseases, comprehension of disease biology, and early detection of recurrence accompanied by minimum invasion [11]. Recently, circulating biomarker-based studies have gained tremendous attention because of the discovery of many serum-based molecules, such as miRNAs, metabolites, and proteins [12]. Liquid chromatography-mass spectrometry (LC-MS) is a powerful tool with various applications, and LC-MS-based metabolomics analyses have been performed to identify circulating biomarkers for multiple disorders [13][14][15]. Different studies have been conducted to determine single or multiple biomarkers derived from tissue samples or body fluids, such as serum, plasma, urine, and saliva, that can be utilized to detect and diagnose diseases [16,17]. Several metabolic biomarkers might be implicated in the etiology of PCOS and POF, playing crucial roles in the occurrence and progression of these diseases [18,19].
Various metabolites have been widely recognized and found to play a role in regulating the pathophysiological processes of various disorders. Nonetheless, the lack of effective biomarkers for early diagnosis of PCOS and POF has led to the discovery of serum-based metabolic biomarkers in patients with PCOS and POF.

Collection and pre-treatment of serum samples
Serum samples were collected, stored at −80°C, and thawed at 4°C for subsequent analysis. Each sample (100 µL) was transferred into 2 mL centrifuge tubes, following which 400 µL methanol (−20°C) was added and the mixture vortexed for 60 s. Afterward, the sample tubes were centrifuged at 12,000 rpm for 10 min at 4°C, and the supernatant was transferred to new centrifuge tubes. The samples were concentrated to dryness in a vacuum. Following this, the samples were dissolved in 150 µL 2-chlorobenzalanine (4 ppm) methanol (80%) solution, and the supernatant was filtered through a 0.22 µm membrane to obtain the initial samples for LC-MS analysis. Before the analysis, 20 µL of each sample was used for quality control (QC) [20][21][22].
Electrospray ionization (ESI)-tandem mass spectrometry analyses were performed using the Thermo Q Exactive HF-X mass spectrometer (Thermo Fisher Scientific Inc., Massachusetts, USA) with a spray voltage of 3.5 kV in positive mode and −2.5 kV in negative mode. The capillary temperature was 325°C. The analyzer was scanned over a mass range of m/z 81-1000 for a full scan with a mass resolution of 60,000. Data-dependent acquisition MS/MS analyses were performed via higher energy collisional dissociation scan. Dynamic exclusion was performed in the MS/MS spectra to remove unnecessary information [20][21][22][23].

Data analysis
R (version 3.3; Boston, MA, USA) and SPSS (20.0 version; SPSS Inc., Chicago, IL, USA) software were used for all bioinformatics and statistical analyses, and data were expressed as mean ± standard deviation. Bioinformatics tools were used to analyze clustering heatmaps, bar plots, correlation matrix, and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment pathways related to the top DEMs. The top DEMs were identified using fold change, p values, variable importance in projection scores, and one-way and two-way analysis of variance (ANOVA)-based t-tests. One-way ANOVA was performed for between-group or multiple-group comparisons. Pearson's correlation was used for correlation analysis. The clinical significance of the top DEMs in serum samples was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). Significance was set at P < 0.05.
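To make the ROC-based evaluation concrete, the following is a minimal Python sketch of how an AUC and a sensitivity/specificity pair can be computed for a single metabolite. It is an illustration only: the study used R and SPSS, and the function names, the cutoff, and the intensity values below are hypothetical and are not study data. The AUC is computed here through its rank interpretation (the probability that a randomly chosen case has a higher intensity than a randomly chosen control, with ties counted as one half).

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of case/control pairs in which the case ranks higher.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation of the AUC.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def sensitivity_specificity(case_values, control_values, cutoff):
    """Call a sample positive when intensity >= cutoff; return (sensitivity, specificity)."""
    true_positives = sum(1 for v in case_values if v >= cutoff)
    true_negatives = sum(1 for v in control_values if v < cutoff)
    return true_positives / len(case_values), true_negatives / len(control_values)


# Hypothetical relative intensities for one metabolite (NOT study data).
cases = [5.1, 6.3, 4.8, 7.0, 5.9]
controls = [4.2, 5.0, 3.9, 4.8, 4.1]

auc = roc_auc(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, cutoff=5.05)
print(f"AUC={auc:.3f}, sensitivity={sens:.1%}, specificity={spec:.1%}")
```

For a metabolite that is downregulated in cases, the same functions apply after swapping the two groups or negating the intensities.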

Results
This study aimed to discover novel serum metabolic biomarkers using LC-MS-based metabolomics and to assess their clinical significance in patients with PCOS or POF relative to healthy controls. Herein, we retrospectively evaluated 100 participants across the three groups and performed hypothesis testing.

Profiling of metabolites
The general extracted ion chromatograms from the two ESI (positive and negative) modes are shown in the base peak chromatogram in Supplementary Figure S1 (A,B). Additionally, the QC samples in positive and negative modes were clustered together in the principal component analysis (PCA) score plots that verified the quality of the samples (Supplementary Figure S2 (A,B)). Relative standard deviation peaks with a coefficient of variation showed that data obtained from QC samples were robust and reproducible, as shown in Supplementary Figure S2 (C,D). Moreover, three different methods of multivariate analysis, i.e., PCA, partial least squares discriminant analysis, and orthogonal partial least squares discriminant analysis, were used to identify the top significant DEMs among the PCOS, POF, and healthy control (CTRL) groups on the basis of certain threshold criteria, such as 1) p-value ≤ 0.05 and variable importance in projection (VIP) ≥ 1; 2) p-value ≤ 0.05 and fold change ≥ 1.5 or ≤ 0.667; 3) p-value ≤ 0.05 (multiple groups) (Figure 1

PCOS vs CTRL metabolites
The top 10 significant DEMs identified in the PCOS vs CTRL group comprised those with upregulated expressions including D-glucuronic acid, alpha-ketoisovaleric acid, 11-dehydrocorticosterone, hepatonic acid, and picrotin and those with downregulated expressions including beta-guanidinopropionic acid, L-cystine, all-trans retinoic acid, folic acid, and 3beta,5beta-ketodiol (Table 1). Moreover, these DEMs were identified via a clustering heatmap (Figure 3a) and further analyzed using Z-score (Figure 3b); additionally, positive and negative correlations were determined using correlation coefficient analysis (Figure 3c). Furthermore, the top 5 significantly enriched KEGG pathways for PCOS vs CTRL group metabolites included cyclic adenosine monophosphate signaling pathways; cancer pathways, such as those in prostate cancer cells; synaptic vesicle cycle pathway; central carbon metabolism pathways in cancer; and protein digestion and secretion pathways. These pathways were identified via the KEGG pathway-based MetPA tool (Figure 3d and Table 2).
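The Z-score and correlation coefficient analyses mentioned above can be illustrated with a short Python sketch. This is a generic illustration under stated assumptions rather than the authors' pipeline: it assumes each metabolite is standardized across all samples using the sample standard deviation, it uses Pearson's correlation as named in the Data analysis section, and the intensity vectors are hypothetical placeholders rather than study data.

```python
import math
import statistics


def z_scores(values):
    """Standardize intensities: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length intensity vectors."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical intensities of three metabolites across four samples (NOT study data).
metabolite_a = [2.0, 4.0, 6.0, 8.0]
metabolite_b = [1.0, 3.0, 5.0, 7.0]  # rises with metabolite_a
metabolite_c = [8.0, 6.0, 4.0, 2.0]  # falls as metabolite_a rises

print([round(z, 3) for z in z_scores(metabolite_a)])
print(pearson_r(metabolite_a, metabolite_b))  # positive correlation
print(pearson_r(metabolite_a, metabolite_c))  # negative correlation
```

A coefficient near +1 corresponds to a positively correlated pair in the correlation matrix and a coefficient near -1 to a negatively correlated pair.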

POF vs CTRL metabolites
The top 10 significant DEMs identified in the POF vs CTRL group comprised those with upregulated expressions including 18-hydroxycorticosterone, 2-arachidonoylglycerol, rimantadine, pentostatin, mirtazapine, and taurocyamine and those with downregulated expressions including L-4-hydroxyphenylglycine, hydroquinone, retinol, and dicyclomine (Table 1). Additionally, these DEMs were verified via clustering heatmaps (Figure 4a) and further analyzed using Z-score (Figure 4b). The positive and negative correlations of these DEMs were determined using correlation coefficient analysis (Figure 4c). Moreover, the top 5 significantly enriched KEGG pathways for POF vs CTRL group metabolites included ABC transporter-dependent pathways, protein digestion and absorption pathways, central carbon metabolism pathways in cancer, mineral absorption pathways, and cortisol synthesis and secretion pathways; these pathways were identified using the MetPA tool (Figure 4d and Table 2).

PCOS vs POF metabolites
The top 10 significant DEMs identified in the PCOS vs POF group comprised those with upregulated expressions including L-4-hydroxyphenylglycine, dicyclomine, uracil-5-carboxylate, myo-inositol, and retinol and those with downregulated expressions including ethylmethylacetic acid, 2-arachidonoylglycerol, D-fructose, 18-hydroxycorticosterone, and gamma-L-glutamyl-L-2-aminobutyrate (Table 1). Furthermore, these DEMs were verified via clustering heatmaps and subsequently analyzed using Z-score. The positive and negative correlations of these DEMs were determined using correlation coefficient analysis (Figure 5(a-c)). In addition, the top 5 significantly enriched KEGG pathways for PCOS vs POF group metabolites included protein digestion and absorption pathways, ABC transporter-dependent pathways, central carbon metabolism pathways in cancer, aminoacyl-tRNA biosynthesis pathway, and prostate cancer pathways; these pathways were identified using the MetPA tool (Figure 5d and Table 2).

Discovery of the most common DEMs among the three groups
On the basis of threshold criteria, the top 44, 163, and 181 DEMs were identified in the comparisons of the PCOS vs CTRL, POF vs CTRL, and PCOS vs POF groups, respectively. Six DEMs were most frequently identified among the three groups using a Venn diagram; these included monomethyl sulfate, riboflavin, oxoglutaric acid, 4-hydroxybenzoic acid, N-acetyldemethylphosphinothricin, and L-cysteine (Figure 6a). The intensity of each of these DEMs was measured across the three groups via bar plots (data not shown). The ROC of monomethyl sulfate yielded a significantly high AUC range of 0.77-0.957 (P < 0.05), with a sensitivity of 77.4%-90.7% and specificity of 76.9%-88.5%, whereas the ROC of riboflavin showed an AUC range of 0.702-0.932 (P < 0.05), with a sensitivity of 58.1%-96.8% and specificity of 84.5%-92.3% in the comparison among the three groups (Figure 6(b,c), Supplementary Figure S4 and Figure S5).
The ROC of oxoglutaric acid yielded an AUC range of 0.676-0.878 (P < 0.05), with a sensitivity of 74.2%-83.7% and specificity of 61.5%-80.8%, whereas the ROC of 4-hydroxybenzoic acid yielded an AUC range of 0.658-0.905 (P < 0.05), with a sensitivity of 71%-96.8% and specificity of 57.7%-92.3% (Figure 7(a,b), Supplementary Figure S4 and Figure S5). The ROC of N-acetyldemethylphosphinothricin showed an AUC range of 0.663-0.935 (P < 0.05), with a sensitivity of 73.1%-93.5% and specificity of 61.5%-96.2%, whereas the ROC of L-cysteine yielded an AUC range of 0.678-0.828 (P < 0.05), with a sensitivity of 67.4%-80.6% and specificity of 61.5%-73.1% (Figure 7(c,d), Supplementary Figure S4 and Figure S5). Monomethyl sulfate, riboflavin, 4-hydroxybenzoic acid, and N-acetyldemethylphosphinothricin expressions were most upregulated in the patients with PCOS and downregulated in those with POF. Furthermore, oxoglutaric acid and L-cysteine expressions were the most upregulated in the patients with POF and downregulated in those with PCOS. These results demonstrate the efficacy of these 6 metabolites in accurately distinguishing between patients with PCOS and POF using serum samples, showing the significant potential of these metabolites in the diagnosis of PCOS and POF.
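The selection of DEMs shared by the three comparisons can be sketched in Python as a threshold filter followed by a set intersection, which is what the overlapping region of a three-way Venn diagram represents. The sketch uses the cutoffs reported in this section (p ≤ 0.05 with VIP ≥ 1, or p ≤ 0.05 with fold change ≥ 1.5 or ≤ 0.667). How the criteria were combined is not stated, so treating a metabolite as a DEM when it meets either criterion is an assumption, and the class name, function names, and all values are hypothetical placeholders, not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaboliteStat:
    name: str
    p_value: float
    vip: float
    fold_change: float


def passes_thresholds(m):
    """True when p <= 0.05 and (VIP >= 1 or fold change >= 1.5 or <= 0.667).

    Combining the VIP and fold-change criteria with OR is an assumption.
    """
    if m.p_value > 0.05:
        return False
    return m.vip >= 1.0 or m.fold_change >= 1.5 or m.fold_change <= 0.667


def select_dems(stats):
    """Names of the metabolites in one comparison that pass the thresholds."""
    return {m.name for m in stats if passes_thresholds(m)}


def shared_dems(*dem_sets):
    """Metabolites present in every comparison (the central Venn region)."""
    return sorted(set.intersection(*dem_sets))


# Hypothetical per-comparison statistics (NOT study data).
pcos_vs_ctrl = [
    MetaboliteStat("met_A", 0.01, 1.8, 2.1),
    MetaboliteStat("met_B", 0.03, 0.6, 0.5),
    MetaboliteStat("met_C", 0.20, 2.0, 3.0),  # fails on p-value
]
pof_vs_ctrl = [
    MetaboliteStat("met_A", 0.02, 1.1, 1.2),
    MetaboliteStat("met_B", 0.04, 1.3, 0.9),
    MetaboliteStat("met_D", 0.01, 1.5, 1.6),
]
pcos_vs_pof = [
    MetaboliteStat("met_A", 0.001, 0.4, 1.7),
    MetaboliteStat("met_B", 0.30, 1.9, 2.5),  # fails on p-value
    MetaboliteStat("met_D", 0.02, 1.2, 1.0),
]

common = shared_dems(
    select_dems(pcos_vs_ctrl), select_dems(pof_vs_ctrl), select_dems(pcos_vs_pof)
)
print(common)
```

In this toy example only met_A passes the thresholds in all three comparisons, which mirrors how a small shared set can emerge from much larger per-comparison DEM lists.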

Discussion
PCOS is a complex endocrinopathy and a leading cause of infertility due to anovulation or oligoovulation [20]. Conversely, POF is a highly heterogeneous disorder and mostly occurs after treatments for autoimmune and neoplastic diseases [21]. Despite the soaring incidence of these disorders, their underlying pathophysiological mechanisms remain unclear [20,21]. Thus, the treatment of PCOS or POF is multidimensional and takes into account aspects such as genetics, symptoms of infertility and hyperandrogenism, insulin resistance, and their metabolic reactions [22]. Therefore, it is crucial to ascertain the underlying pathophysiological mechanisms of these syndromes by determining significant biomarkers using noninvasive, next-generation technology-based methods. Currently, the application of metabolomics, which is an emerging but powerful tool, represents one such method [23,24]. The discovery of novel significant serum biomarkers for the screening and diagnosis of PCOS and POF, especially in early stages, has recently become a critical goal. Nevertheless, not many biomarker candidates have been clinically applied because of inadequate study cohorts/participants or diagnostic efficacies. In our study, a total of 100 participants constituting the PCOS, POF, and healthy CTRL groups were enrolled from a single center. We used LC-MS-based metabolomics to identify the biomarkers. A few undiscovered metabolites might act as biomarkers; therefore, these metabolites need to be determined. In the present study, we focused on discovering numerous metabolites. On the basis of threshold criteria and univariate and multivariate analysis results, six biomarkers including monomethyl sulfate, riboflavin, oxoglutaric acid, 4-hydroxybenzoic acid, N-acetyldemethylphosphinothricin, and L-cysteine were discovered and their up/downregulated expressions verified in the PCOS, POF, and CTRL groups.
Furthermore, these serum-based metabolic biomarkers could significantly distinguish between PCOS and POF with an AUC of 0.65-0.95, at a sensitivity of 58.8%-96.8% and specificity of 57.9%-96.2%. These six novel biomarkers exhibited high diagnostic efficacy and accuracy and marked complementarity to differentiate between PCOS and POF. PCOS and POF are accompanied by various systemic metabolic alterations, such as dysregulated glucose metabolism and insulin resistance, that might affect ovarian follicles; furthermore, these metabolic abnormalities might lead to alterations in the composition of body fluids such as follicular fluid and serum or plasma [25,26].
Reportedly, dysregulated glucose metabolism and insulin resistance affect multiple energy pathways, manifesting as altered follicular fluid concentrations of different biomolecules such as amino acids, lipids, and ketone bodies [27,28]. Moreover, concentrations of certain free fatty acids in the follicular fluid and serum are altered in patients with PCOS [29,30]. Studies have reported that insufficient availability of methyl groups may induce critical hypothalamic-pituitary-ovarian axis-related gene regulatory mechanisms implicated in PCOS progression, and metabolism regulates methyl group transfer, which is critical for homocysteine homeostasis [31,32]. However, homocysteinemia is positively associated with PCOS and other diseases [33]. The imbalance in methyl group metabolism could be the main pathophysiological mechanism underlying the occurrence and progression of PCOS. Patients with POF exhibit high homocysteine concentrations, which are, in turn, related to elevated follicle-stimulating hormone and low serum estradiol levels [34]. L-cysteine, N-acetyldemethylphosphinothricin, and oxoglutaric acid, discovered as biomarkers in the present study, are types of amino acids that are involved in the metabolic pathways of amino acids [35][36][37]. Women with PCOS are deficient in riboflavin (vitamin B2); furthermore, vitamins (water-soluble) play important roles in the therapy of women with PCOS and POF by reducing the antioxidative stress and low-intensity inflammation caused by several factors, in addition to chronic infection [38]. Monomethyl sulfate and 4-hydroxybenzoic acid are chemicals that may act as metabolites [37,39]. In the present study, we discovered six metabolites with up/downregulated expressions in serum samples from the PCOS, POF, and healthy CTRL groups, similar to previous metabolomics and proteomics studies identifying novel biomarkers for PCOS, POF, and other neoplastic diseases [18,19,25,40-44].
Taken together, these 6 biomarkers may not only affect the pathogenesis of PCOS/POF but also accurately differentiate patients with PCOS or POF from healthy individuals. Therefore, their clinical application can be considered after validation in larger cohorts, which may also guide future studies on this subject.
However, this study has a few limitations. First, the metabolic profiling of the discovered metabolites was performed in a single center-based cohort, which might represent biased samples or findings. Second, the validation of these common metabolites was not performed; therefore, multicenter studies with a larger cohort are warranted for validating these metabolites. Third, the six different metabolites were evaluated and compared using only serum samples, and plasma or other body fluid-based samples were not used. Therefore, further studies are needed to detect these metabolites in other fluid samples and evaluate their association with various corresponding genes, which may play a role in the occurrence and progression of PCOS and POF.

Conclusion
Novel metabolites with up/downregulated expressions can be discovered in patients with PCOS and POF using serum-based metabolomics; these metabolites show good diagnostic performance and can act as effective biomarkers for the early detection of PCOS and POF. Furthermore, these metabolites might be involved in the pathophysiological mechanisms underlying the occurrence and progression of PCOS and POF via interplay with corresponding genes.

Research highlights
1. Discovered six novel metabolites in patients with PCOS and POF
2. Representing good diagnostic performances as serum metabolites
3. May take part in the pathophysiological mechanism of PCOS and POF

Disclosure statement