Identification of spinal tuberculosis subphenotypes using routine clinical data: a study based on unsupervised machine learning

Abstract
Objective: The identification of spinal tuberculosis subphenotypes is an integral component of precision medicine. However, suitable study models for identifying subphenotypes in patients with spinal tuberculosis are lacking. Here, we identified possible subphenotypes of spinal tuberculosis and compared their clinical outcomes. Methods: A total of 422 patients with spinal tuberculosis who received surgical treatment were enrolled. Clustering analysis was performed using the K-means clustering algorithm and routinely available clinical data collected from patients within 24 h after admission. The differences in clinical characteristics, surgical efficacy, and postoperative complications among the subphenotypes were then compared. Results: Two subphenotypes of spinal tuberculosis were identified. Laboratory examination results revealed that the levels of several inflammatory indices in cluster 2 were higher than those in cluster 1. In terms of disease severity, cluster 2 showed a higher Oswestry Disability Index (ODI), a higher visual analog scale (VAS) score, and a lower Japanese Orthopedic Association (JOA) score. In terms of postoperative outcomes, cluster 2 patients were more prone to complications, especially wound infections, and had a longer hospital stay. Conclusion: K-means clustering analysis based on routinely available clinical data can rapidly identify two subtypes of spinal tuberculosis with different clinical outcomes. We believe this finding will help clinicians to rapidly and easily identify the subtypes of spinal tuberculosis at the bedside and could become the cornerstone of individualized treatment strategies.


Introduction
Tuberculosis is still a global public health problem and poses a serious threat to human health [1]. Spinal tuberculosis accounts for around 50% of all bone and joint tuberculosis and is one of the most severe and common forms of extrapulmonary tuberculosis. As the disease progresses, bones are often severely damaged, causing scoliosis, affecting neural function, and severely affecting the quality of life of patients [2,3]. Unfortunately, a one-size-fits-all management and treatment approach is still implemented in clinical practice, which ignores the heterogeneity of spinal tuberculosis patients [4]. Inadequate treatment and management is one of the reasons for poor prognosis [5]. In addition, phenotypic heterogeneity is a major obstacle to tuberculosis management and personalized treatment. A thorough understanding of the inherent heterogeneity of tuberculosis is essential to formulate efficient intervention strategies [6]. Because the pathogenesis of spinal tuberculosis has not been elucidated, it is difficult to explain and predict the characteristics of patients with spinal tuberculosis.
Machine learning algorithms are widely used in clinical practice [7,8], especially in the diagnosis of tuberculosis [9]. Machine learning has shown strong effectiveness, as evidenced by the study conducted by Orjuela-Cañón et al., which indicates that ML algorithms can serve as effective diagnostic tools for tuberculosis, especially in settings with limited healthcare infrastructure [10]. Aguiar FS et al. have also developed models based on artificial neural networks for classifying hospitalized patients and allocating risk in environments with high tuberculosis prevalence [11]. Cluster analysis is a typical unsupervised machine learning method, which can identify phenotypic heterogeneity according to the characteristics of patients' diseases and classify heterogeneous cohorts [12]. Among clustering methods, K-means clustering is widely used in clinical practice [13,14]. For instance, Koo et al. successfully identified five phenotypes of pulmonary tuberculosis through K-means cluster analysis. Patients with these five phenotypes had significant differences in their symptoms and in their microbiological and radiological examination results. Thus, this analysis provides a stratified medical approach and has become the cornerstone of individualized treatment strategies [15]. In addition, K-means clustering analysis has been successfully applied to identify the subphenotypes of spinal tumors, sepsis, and other diseases [16,17]. However, no useful classification tool has been developed to identify the heterogeneity of spinal tuberculosis. Therefore, we proposed a K-means clustering method that uses only the routinely available clinical data collected from patients within 24 h after admission to identify subphenotypes of spinal tuberculosis.
Finally, we compared the differences between the clusters in terms of clinical characteristics, surgical efficacy, and postoperative complications, and verified the accuracy of clustering.

Patient
We reviewed and analyzed the perioperative clinical data of patients who received surgical treatment for spinal tuberculosis at the First Affiliated Hospital of Guangxi Medical University from June 2012 to June 2021. The inclusion criteria were: (1) clinical symptoms consistent with spinal tuberculosis, such as chronic back pain, progressive spinal deformity, weight loss, fatigue, and night sweats; (2) radiological manifestations consistent with spinal tuberculosis, such as vertebral body osteolysis and abscess formation; (3) lesions confirmed through percutaneous biopsy or postoperative pathological examination, showing pathological features of spinal tuberculosis such as caseous necrosis and granuloma, and further validated through culture to establish the presence of Mycobacterium tuberculosis; (4) complete clinical data; and (5) no surgical history affecting the spine. The exclusion criteria were: (1) unclear postoperative pathological diagnosis; (2) concurrent tumour or other immune-related diseases; (3) incomplete clinical information; and (4) a history of surgery affecting the spine. A total of 422 patients were included in the study (253 males and 169 females). The general information of patients, preoperative laboratory examination results, surgical conditions, postoperative complications, etc., were collected from the electronic medical record system. The study was approved by the ethics committee of the First Affiliated Hospital of Guangxi Medical University.

Data collection
The general information collected for each patient included age, gender, body mass index (BMI), Oswestry Disability Index (ODI), Japanese Orthopedic Association (JOA) score, and visual analog scale (VAS) score. Using the clinical data of patients, the ODI, JOA, and VAS scores were jointly evaluated by two senior specialists for each patient. The laboratory test results collected included C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), white blood cells (WBC), haemoglobin, platelets, neutrophils, lymphocytes, monocytes, total protein (TP), albumin, monocyte count to lymphocyte count ratio (MLR), platelet count to monocyte count ratio (PMR), platelet count to lymphocyte count ratio (PLR), neutrophil count to lymphocyte count ratio (NLR), platelet count to neutrophil count ratio (PNR), C-reactive protein to albumin ratio (CAR), and systemic immune-inflammation index (SII). SII was calculated using the following formula: (neutrophil count × platelet count)/lymphocyte count [18]. The surgical data collected included operation time (OT), bleeding volume (BV), blood transfusion, postoperative drainage volume (PDV), length of hospital stay (LOS), and postoperative complications. Postoperative complications were defined as surgical wound infections or systemic infections, internal fixation failures, thrombosis, respiratory failure, cerebrospinal fluid leakage, and other surgery-related diseases.
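As an illustration of how the derived indices listed above follow from a routine blood panel, the ratios and SII can be computed as in the minimal Python sketch below. The BloodPanel container, its field names, the units, and the example values are assumptions made for illustration and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    """Hypothetical container; field names and units are assumptions."""
    neutrophils: float  # 10^9 cells/L
    lymphocytes: float  # 10^9 cells/L
    monocytes: float    # 10^9 cells/L
    platelets: float    # 10^9 cells/L
    crp: float          # mg/L
    albumin: float      # g/L


def derived_indices(p: BloodPanel) -> dict:
    """Compute the count ratios and SII as defined in the text."""
    return {
        "MLR": p.monocytes / p.lymphocytes,
        "PMR": p.platelets / p.monocytes,
        "PLR": p.platelets / p.lymphocytes,
        "NLR": p.neutrophils / p.lymphocytes,
        "PNR": p.platelets / p.neutrophils,
        "CAR": p.crp / p.albumin,
        # SII = (neutrophil count x platelet count) / lymphocyte count
        "SII": p.neutrophils * p.platelets / p.lymphocytes,
    }


# Made-up example values, not patient data.
panel = BloodPanel(neutrophils=4.0, lymphocytes=2.0, monocytes=0.5,
                   platelets=300.0, crp=20.0, albumin=40.0)
indices = derived_indices(panel)
```

Because every ratio shares a count with at least one other ratio, these derived variables are expected to be correlated with one another and with the raw counts.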

Cluster analysis
We performed K-means cluster analysis based on the preoperative age, gender, BMI, WBC, haemoglobin, platelets, neutrophils, lymphocytes, monocytes, TP, albumin, ESR, MLR, PMR, PLR, NLR, PNR, SII, and CAR of patients with spinal tuberculosis. K-means clustering can classify unlabeled data into different groups according to data characteristics. It is a partition-based clustering algorithm, where each group of data is called a "cluster," and the center point of each cluster is called a "centroid." Sample points close to a cluster centroid are assigned to the same cluster by calculating the Euclidean distance between each sample point and the cluster centroid [19]. The similarity between two samples is measured by the Euclidean distance between them: as the distance between the two samples increases, their similarity decreases [20]. Firstly, we standardized the data using the scale function and calculated the Hopkins statistic using the get_clust_tendency function of the "factoextra" package [21] to evaluate the clustering tendency of the dataset. We then performed K-means clustering analysis with the following specific steps: (1) K initial centroids are randomly selected, the distance from each sample point to the initial centroids is calculated, and each point is assigned to the nearest centroid, which generates K clusters; (2) for each cluster, the mean of all sample points assigned to that cluster is calculated as the new centroid; (3) this process is repeated until the centroid positions remain unchanged. Finally, the silhouette coefficient (SC) was used to find the optimal number of clusters (K value) [19,20]. The SC of sample point i is calculated as $s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$. In this formula, a(i) represents the average distance between the sample point and all other points in the same cluster, while b(i) represents the average distance between the sample point and all points in the next nearest cluster.
The K-means clustering algorithm seeks clusters for which the intra-cluster difference is small and the inter-cluster difference is large, and SC is the key indicator describing these intra-cluster and inter-cluster differences. From the formula, we can see that the value of SC lies between −1 and 1. The closer SC is to 1, the better the clustering effect; the closer it is to −1, the worse the clustering effect [22]. This process was carried out using the "fpc" package. All analyses were performed using R software (version 4.2.1).
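The procedure described above (standardization, the three K-means steps, and selection of K by the mean silhouette coefficient) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the R implementation used in the study; the function names, the random seed, and the toy data are assumptions, and the sketch assumes no constant columns and no duplicate points.

```python
import math
import random


def standardize(rows):
    """Z-score each column (mean 0, sample SD 1), similar in spirit to R's scale()."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means: assign to nearest centroid, recompute means, repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # step 1: random initial centroids
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:           # step 3: stop when assignments are stable
            break
        labels = new_labels
        for j in range(k):                 # step 2: new centroid = mean of members
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


def mean_silhouette(points, labels):
    """Average of s(i) = (b(i) - a(i)) / max(a(i), b(i)); needs >= 2 clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:                  # convention: a singleton scores 0
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Made-up two-feature toy data (not study data): two visibly separated groups.
data = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.1], [8.0, 9.0], [8.2, 9.1], [7.9, 8.8]]
scaled = standardize(data)
labels, centroids = kmeans(scaled, k=2)
scores = {k: mean_silhouette(scaled, kmeans(scaled, k)[0]) for k in (2, 3)}
best_k = max(scores, key=scores.get)       # K with the highest mean silhouette
```

Because the initial centroids are random, repeated runs with different seeds can converge to different partitions; running several initializations and keeping the best result is a common safeguard.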

Statistical analysis
SPSS (IBM, version 26.0) and R statistical software (version 4.2.1) were used for statistical analysis. A t-test or Mann-Whitney U test was used for continuous variables, and the chi-square test or Fisher's exact test was used for categorical variables. Pearson's test was used for correlation analysis of normally distributed data, whereas Spearman's test was used for non-normally distributed data. Normally distributed continuous variables are expressed as mean ± standard deviation (SD). Non-normally distributed continuous variables are expressed as the median (percentiles). A p value < 0.05 was considered statistically significant.
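The distinction between the two correlation tests can be made concrete: Spearman's coefficient is Pearson's coefficient computed on the ranks of the data, which is why it suits skewed or otherwise non-normal variables. The following Python sketch is an illustration only; the function names and the toy values are assumptions, and it computes the coefficients without the accompanying p values.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1        # sorted positions i..j share this rank
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up, monotonic but non-linear toy values (not study data).
plr = [80.0, 120.0, 150.0, 210.0, 400.0]
odi = [10.0, 12.0, 20.0, 45.0, 50.0]
rho = spearman(plr, odi)   # depends only on the ordering of the values
r = pearson(plr, odi)      # also sensitive to the spacing of the values
```

In this toy example the ordering of the two variables agrees perfectly, so the rank-based coefficient reaches its maximum, while the Pearson coefficient is pulled below 1 by the single large value.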

Cluster analysis results
To understand the relationships between variables, a correlation matrix was built (Figure 1(A)), which indicated that most variables were correlated with one another. The Hopkins statistic (0.815) and the ordered dissimilarity matrix indicated that the dataset was significantly clusterable (Figure 1(B)). SC is a key indicator describing the differences within and between clusters. By comparison, we found that clustering with K = 2 had the highest silhouette score (0.24) and the best clustering effect (Figure 1(C)). Therefore, the 422 patients with spinal tuberculosis were finally grouped into clusters 1 and 2 (Figure 1(D)).
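For readers unfamiliar with the Hopkins statistic, one common formulation compares nearest-neighbour distances of uniformly random probe points with those of real observations; values near 0.5 suggest randomly distributed data and values near 1 suggest clusterable data. The Python sketch below follows that formulation as an illustration only. The function name, sample size m, seeds, and synthetic data are assumptions, and software packages may use variants (for example, distances raised to a power), so values need not match those of any specific package.

```python
import math
import random


def hopkins(points, m=None, seed=0):
    """Hopkins statistic H = sum(u) / (sum(u) + sum(w)).

    u: distance from a uniformly random probe point (drawn inside the
       bounding box of the data) to its nearest observation.
    w: distance from a sampled observation to its nearest other observation.
    """
    rng = random.Random(seed)
    n, dims = len(points), len(points[0])
    m = m or max(1, n // 10)
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    u_total = 0.0
    for _ in range(m):
        probe = [rng.uniform(lows[d], highs[d]) for d in range(dims)]
        u_total += min(math.dist(probe, p) for p in points)

    w_total = 0.0
    for i in rng.sample(range(n), m):
        w_total += min(math.dist(points[i], q)
                       for j, q in enumerate(points) if j != i)

    return u_total / (u_total + w_total)


# Synthetic data (not study data): two tight, well-separated blobs.
gen = random.Random(42)
blobs = ([[gen.gauss(0, 0.1), gen.gauss(0, 0.1)] for _ in range(50)] +
         [[gen.gauss(10, 0.1), gen.gauss(10, 0.1)] for _ in range(50)])
h = hopkins(blobs, m=10)
```

For strongly clustered data such as these blobs, the probe points usually fall far from any observation while real observations have very close neighbours, which pushes the statistic towards 1.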

Studying patients' characteristics by K-means clustering
A comparative analysis of preoperative variables between the clusters revealed that patients in cluster 1 were younger than those in cluster 2 (p = 0.001). Haemoglobin, lymphocytes, albumin, PMR, and PNR in cluster 1 were higher than those in cluster 2 (all p < 0.01). However, CRP, WBC, platelets, neutrophils, monocytes, and ESR in cluster 2 were higher than those in cluster 1 (all p < 0.001). In addition, MLR, PLR, NLR, CAR, and SII in cluster 2 were higher than those in cluster 1 (all p < 0.001). There was no significant difference in gender, BMI, or TP between the two clusters (all p > 0.05) (Table 1). The differences in preoperative variables between the two clusters are displayed on the radar chart (Figure 2).

Comparison of disease severity among clusters
A comparison of the ODI, JOA, and VAS scores between the clusters revealed that the ODI and VAS scores in cluster 2 were significantly higher than those in cluster 1 (p < 0.001 and p < 0.05, respectively), whereas the JOA scores in cluster 1 were significantly higher than those in cluster 2 (p < 0.001) (Figure 3). This showed that cluster 2 had a higher disease severity. Correlation analysis revealed that multiple indicators were related to the severity of the disease. Among them, PLR, MLR, age, and SII had a strong positive correlation with ODI and VAS scores and a strong negative correlation with JOA scores (all p < 0.05), indicating that age, PLR, MLR, and SII were positively related to the severity of the disease (Figure 4).

Comparison of surgical and postoperative variables among clusters
A comparison of surgical and postoperative variables between the clusters revealed that the incidence of postoperative complications in cluster 2 was higher than that in cluster 1 (p < 0.05). Further analysis revealed that cluster 2 had a higher incidence of surgical wound infections than cluster 1 (p < 0.05). In addition, the length of hospital stay in cluster 2 was longer than that in cluster 1 (p < 0.05) (Table 2). The other operative and postoperative variables, such as operation time, bleeding volume, blood transfusion, drainage volume, pulmonary infection, pleural effusion, gastrointestinal reaction, thrombosis, respiratory failure, and other complications, were similar between the clusters (p > 0.05) (Table 3). The radar chart shows the differences in surgical and postoperative variables between the clusters (Figure 5).

Discussion
Identification of different subphenotypes is a key component of personalized medicine. Identification of different subphenotypes of spinal tuberculosis will lead to better risk stratification and treatment decisions. However, one of the biggest challenges of subphenotype identification is how to translate research into clinical practice [14]. Therefore, we used only the patient's age, gender, BMI, and 16 routinely available preoperative laboratory examination results as factors to ensure that the study remained close to clinical practice and had higher clinical significance. In addition, we could accurately identify the subphenotypes of spinal tuberculosis through K-means cluster analysis.
Figure 2. The radar chart of preoperative variables of spinal tuberculosis patients in the two clusters. Normalized preoperative variables were compared between the two clusters identified by the K-means clustering algorithm. Spoke lengths represent the average of each normalized variable. Significance levels are presented with asterisks. **p-value < 0.01, ***p-value < 0.001.
Clustering analysis is a typical unsupervised learning method, which can reveal the inherent properties of samples and the laws of their relationships. It is widely used in different fields, including clinical medicine and bioinformatics, where one of its uses is disease classification [23]. Among the available clustering methods, K-means clustering is one of the most commonly used algorithms [24] because it can maximize the separation of clusters and provide the largest range [25] for identifying different groups of patients. It has been successfully used to identify subtypes of sepsis [26], pulmonary tuberculosis [15], and cervical spondylotic myelopathy [27].
Therefore, we selected K-means cluster analysis and successfully identified two phenotypes based on the routinely available natural characteristics of patients with spinal tuberculosis rather than on prior knowledge, which enabled us to further study these characteristics and highlight those related to medical research hypotheses. This method provides a more meaningful description of, and distinction between, patient groups in the cohort [28]. Comparative analysis revealed that cluster 2 had higher disease severity. In terms of postoperative outcomes, the incidence of complications, especially wound infections, was significantly higher in cluster 2 than in cluster 1, and the hospital stay was longer. This finding could therefore serve as a useful reference for the prognostic stratification of patients with spinal tuberculosis in clinical practice.
ESR and CRP are commonly used indicators for evaluating the degree of infection in inflammatory diseases [29]. A multicenter retrospective cohort study reported that advanced age and increased ESR after treatment were the key factors for poor surgical prognosis in patients with spinal tuberculosis [30]. One study reported MLR as an inflammatory marker of tuberculosis that is related to its severity [31]. Similarly, Chen et al. showed that MLR is an independent factor for the severity of spinal tuberculosis [32]. Monocytes can promote the release of inflammatory mediators after pathogen invasion and transform into macrophages to participate in immune responses [33]. Research has shown that a low lymphocyte count is closely related to inflammation [34], which could cause an MLR imbalance in inflammatory diseases. In addition, PLR and SII are important markers of inflammation that are significantly elevated in several diseases and are closely related to disease prognosis [35][36][37]. Albumin and haemoglobin are important nutritional indicators for the human body [38]. Chen et al. reported that albumin is an important predictor of surgical site infection in patients with spinal tuberculosis, with a lower albumin value related to a higher risk of surgical site infection [39]. In the two subphenotypes identified, age, CRP, ESR, monocytes, MLR, PLR, and SII in cluster 2 were significantly higher than those in cluster 1, whereas haemoglobin, lymphocytes, and albumin were significantly lower than those in cluster 1. To summarize, patients in cluster 2 had more severe disease and a worse prognosis than those in cluster 1.
We used a classification method based on routinely available clinical data to further understand the subphenotypes of spinal tuberculosis. This method can be used to evaluate the severity of spinal tuberculosis and the differences in prognosis or treatment in clinical practice. However, this method for classifying patients with spinal tuberculosis requires additional external validation before its clinical implementation.
This study had several limitations. Firstly, although K-means is a widely used algorithm in different fields, it has some disadvantages, such as sensitivity to outliers, difficulty in handling categorical variables, dependence on initialization, and the need to select the number of clusters. Secondly, although we strove to minimize the potential impact of collinearity by standardizing the data, collinearity remains an issue that cannot be ignored; it may affect the distance measurement between samples and thereby the clustering results of the K-means algorithm. Thirdly, the sample size of this study was small and it was a single-center, retrospective study, which could have resulted in inevitable selection bias. In the future, the sample size should be increased and the findings further verified by a multicenter, prospective study. In addition, the surgeon's preferences and experience could have affected the results of the study.

Conclusion
K-means clustering analysis based on routinely available clinical data can rapidly identify two subtypes of spinal tuberculosis with different clinical outcomes. We believe this finding will help clinicians rapidly and easily identify the subtypes of spinal tuberculosis at the bedside. Thus, it has the potential to become the cornerstone of individualized treatment strategies.

Acknowledgment
We are grateful to Dr. Xinli Zhan (Spine and Osteopathy Ward, the First Affiliated Hospital of Guangxi Medical University) for his kind assistance in all stages of the present study.

Ethics approval
This study was approved by the ethics committee of the First Affiliated Hospital of Guangxi Medical University.

Author contribution
SW, YY, and XZ designed the study. CH, SF, CZ, and JZ analyzed the data. BZ, LL, SW, and ZM processed the digital visualization. SW wrote and revised the manuscript. CL and XZ revised the manuscript. All co-authors participated in the laboratory operation. All authors read and approved the final manuscript.

Consent form
Informed consent was obtained from all participants and/or their legal guardians.

Disclosure statement
The authors declare that they have no conflicts of interest.

Funding
This work was supported by grants from the National Natural Science Foundation of China (81560359 and 81860393).

Data availability statement
The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.