Raman spectroscopy of fibroblast cells from a Huntington's disease patient

ABSTRACT We present a Raman spectroscopy case study of living fibroblast (skin) cells from a patient who developed Huntington’s disease, with fibroblasts from a healthy volunteer as a control. Spectra were processed to remove cosmic rays, had a spectrum of the quartz substrate subtracted, and were flattened to remove cellular autofluorescence. We achieved an accuracy of 95% in discriminating individual cells, and assign spectral differences to (i) the reduction of cholesterol, (ii) the reduction of lipids, and (iii) an increase in beta-sheet proteins for fibroblasts with Huntington’s disease. All these biochemical changes have been previously measured by other methods. Averages over all the cells in this study yield a difference which is extremely statistically significant [p < 0.0001].


Introduction
Huntington's disease (HD) is a genetic neurodegenerative disease which affects 5-10 people per 100,000 population, [1] and onset of the disease can occur at any age. The initial symptoms include involuntary movements and incoordination, which are followed by cognitive dysfunction. The combination of motor and cognitive impairment gradually worsens, [2] with the typical time between diagnosis and death reported as 20 years. [3] Huntington's Disease is caused by a mutation in the huntingtin (HTT) gene, [4] which in turn causes changes in the expressed huntingtin protein. [5] The protein from the mutated gene forms aggregates in neurons, which are the basis for the disease. [6] This mutated protein is also expressed throughout the body, [7] so can in principle be sensed in skin cells (fibroblasts).
Genetic testing is available, mainly for patients whose relatives have developed the disease. However, fewer than 5% of those who may have inherited the HTT gene choose such a genetic test. [2] This may in part be due to the imprecise diagnosis resulting from the genetic test: the test only reveals the number of cytosine-adenine-guanine (CAG) repeats. More than 40 repeats tends to produce full penetrance of the disease, and 35 or fewer repeats tends not to lead to the disease. [8] However, the predictive nature of the test is not exact, and the number of CAG repeats is a poor indicator of the age of onset.
Diagnosis of the onset of disease involves testing reflexes and behavior. [2,9,10,11] However, the onset of disease is only clear after a collection of symptoms becomes apparent and unambiguous, so diagnosis is delayed with these physical tests. A single behavioral change is not necessarily related to the onset of HD. [12] There is therefore a clear clinical need for a quantitative and accurate test of the onset of the disease, which would provide earlier diagnosis.
An earlier diagnosis of the onset of disease could be beneficial to the patient for two main reasons. Firstly, earlier diagnosis tends to translate into a better outcome for the patient, in this case due to the earlier treatment of symptoms and better management of the disease. [13] Secondly, possible future therapies [14,15] would most likely benefit patients when employed sufficiently early.
Raman spectroscopy [16][17][18][19] is a label-free, non-destructive technique, which uses laser light to excite vibrations in molecules. It is sensitive to the presence of molecular bonds, and records a biochemical 'fingerprint' within the laser focal spot. This spectral fingerprint can be used to (i) discriminate between cells of different phenotype, and (ii) to reveal biochemical differences between cells. We applied Raman spectroscopy to living cells with Huntington's disease, and to control cells from a healthy patient, to determine the accuracy of discrimination of the cell type. This is the first step in revealing whether Raman spectroscopy can be used as a diagnostic test. As Raman spectroscopy is insensitive to changes in the genetic sequence, it was thought that it could offer a diagnosis for the phenotypic change brought about by the production of huntingtin aggregates, to complement the existing genetic tests and physical tests of reflexes.
The shortcomings and inaccuracies of physical and genetic tests suggest that if a sensitive and quantitative test for the onset of the disease were available, it could greatly benefit patients. Raman spectroscopy can be used for disease diagnosis, [20,21] but such optical techniques are limited in terms of penetration depth into the body, especially through the skull. Investigation of brain tissues is possible, but a fiber-based endoscopy system would have to access the brain, which would require surgery. A far better idea is to avoid the brain and try to develop a skin test, given that the CAG repeats and expression of the huntingtin protein occur in all cells within the body. A Raman spectroscopy skin test would be rapid, give instant results, be relatively inexpensive, and be painless. The use of Raman spectroscopy to measure a different biochemistry in patients with the disease would act as a label-free biomarker for the disease (and possibly open the door for other neurodegenerative diseases to be diagnosed in such a way). For example, Alzheimer's disease has been successfully diagnosed in platelets in blood from mice, [22] and in blood serum from humans [23] with a 95% accuracy. In separate Raman studies of rat brain hippocampus [24] and human brain tissue, [25] an increase of beta-sheet proteins (relating to amyloid-beta) was observed in the spectra in both cases.
Raman spectroscopy diagnosis of diseases in tissues has mostly been directed at cancer. Nonmelanoma skin cancer was diagnosed with an accuracy of 95% (for a sensitivity of 100% and a specificity of 91%). [26] Colorectal cancer was diagnosed with an accuracy of 94% (for a sensitivity of 93% and a specificity of 95%) by one group [27] and with an accuracy of 99.8% (for a sensitivity of 100% and a specificity of 99.7%) by another group. [21] There are no published Raman spectra on Huntington's disease cells or tissues, so this study is novel in that respect.
Raman spectroscopy has previously been applied in vitro to human fibroblasts and related cells. One study compared living human metastatic melanoma cells (SK-Mel-2) and normal skin fibroblast cells (BJ), which were identified with an accuracy of around 95%. [28] A second study allowed Raman spectroscopy-based separation of primary human skin fibroblasts, keratinocytes, and melanocytes, as well as immortalized keratinocytes (HaCaT cells). [29] Raman spectroscopy has been applied to human skin in vivo, where it is able to measure small biochemical changes and diagnose diseases. [30,31] The experiments performed in this initial study are on fibroblast cells from a patient with HD, and control fibroblast cells from a healthy patient. The aim of the study is to determine whether there is a measurable difference between cell types, and whether this difference relates to biochemical changes known to occur as a result of the disease. This study will test the following hypothesis: can chemical changes in Huntington's brain tissue also be measured in skin cells?

Materials and methods
Primary fibroblasts with Huntington's disease (GM04281, 20-year-old female, onset at 14 years old) were acquired from the Coriell Cell repository, and control fibroblasts (90011801, 1BR3) were acquired from the European Collection of Cell Cultures (ECACC). Cell culture conditions were identical for both cell types, and are described elsewhere. [32] Briefly, cells were incubated at 37°C with 5% CO2, in Eagle's Minimum Essential Medium (EMEM) with Earle's salts and non-essential amino acids (NEAA) with 15% Fetal Bovine Serum (FBS, Perbio Science UK Ltd.) and 2 mM l-glutamine. Cells were grown onto quartz discs (SPI supplies, PA) which had previously been incubated with 1% poly-l-lysine to aid adhesion.
Raman spectroscopy was performed on individual living cells cultured on quartz, with a Renishaw inVia Raman microspectrometer (785 nm, 40 mW at the sample). We under-filled a phase contrast objective lens (Leica HCX PL FL 40x/0.75 PH2) with a narrow diameter beam, so that it illuminated with a numerical aperture of 0.25. This produced a large focal spot of 2 µm (FWHM) laterally, and 20 µm axially, to reduce photodamage and to sample from a more representative volume within the cell. Spectra were acquired by positioning the focal spot at the boundary between cytoplasm and nucleus (clearly defined by a phase contrast image) to achieve a representative spectrum from an individual cell. This is more representative than picking a random position within the cell, which could be entirely within the nucleus or cytoplasm and thus give rise to an unwanted variation between spectra. The spectral resolution of the Raman system was measured as 7 cm⁻¹ (FWHM). The spot was focused so that its center was around 2 µm above the quartz substrate, so much of the 20 µm deep spot acquired a signal from the quartz substrate. For this reason, the acquisition time had to be greatly increased, from a matter of seconds for thick tissue to 300 seconds, averaged 8 times, in order to maintain a high signal-to-noise ratio. We acquired phase contrast images before and after Raman spectroscopy, and observed no visible photodamage to the cells. A previous study with laser light of more than 50 times higher calculated intensity at a wavelength of 800 nm observed a halving of cloning efficiency after 10 minutes of exposure, [33] so we anticipate a far smaller level of photodamage in our measurements. A background spectrum was acquired by displacing the sample laterally to a nearby region of bare substrate with no cells, and this spectrum was then subtracted from the spectrum acquired from the cell, to remove the presence of quartz from the spectrum.
The spectral range acquired is 603-1718 cm⁻¹, which covers the most important part of the vibrational spectrum for cells and tissues, commonly known as the fingerprint region.
Raman spectra were processed as follows. Firstly, the quartz background spectrum was subtracted from that of the cell. The background spectrum was acquired at the same focal distance above the substrate, but laterally displaced from any cells. This spectrum contains contributions from the quartz substrate and the cell media. Resulting example spectra are presented in Fig. 1, clearly showing that more processing is required, because artifacts like cosmic rays and variations in baselines would dominate over small changes in peak heights from changes in chemical composition due to the disease. Secondly, the resulting spectra were flattened to remove the effect of cellular autofluorescence using a small-window moving average automated baseline correction (SWiMA) procedure. [34] The MATLAB code was kindly provided by the author, Schulze. This procedure first applies a window of three points along the wavenumber (x) axis, starting at the left end. A low-pass filter is applied to this raw data, with a Savitzky-Golay (polynomial) fit to the window, which is moved along the x-axis of the spectrum one pixel at a time. This filtered spectrum is then compared with the raw data to find the lowest data (y-axis) values for all x-axis values. The lowest values are used in the output spectrum, which replaces the original raw data. The process is repeated as the window is increased by two pixels. This iterative procedure removes the peaks from the spectrum to reveal just the baseline. This baseline is in turn subtracted from the raw spectrum to reveal a flattened spectrum. For improved flattening, we adapted this SWiMA method by first cutting the spectrum into six equal sections, and applying the SWiMA process to each of these smaller spectra separately. The resulting six flattened spectra were then simply stitched back together to form one full spectrum.
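The iterative idea behind the baseline estimate described above can be illustrated with a much-simplified sketch. This is not the SWiMA code of ref. [34]: it assumes a plain moving average in place of the Savitzky-Golay filter and a fixed maximum window in place of SWiMA's automated stopping criteria. The function names (`moving_average`, `estimate_baseline`, `flatten_in_sections`), the maximum window of 51 points, and the synthetic spectrum are all illustrative assumptions.

```python
import numpy as np


def moving_average(y, window):
    """Centered moving average with edge padding; `window` must be odd."""
    half = window // 2
    padded = np.pad(y, half, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


def estimate_baseline(y, max_window=51):
    """Smooth, keep the pointwise minimum, widen the window, and repeat."""
    baseline = np.asarray(y, dtype=float).copy()
    for window in range(3, max_window + 1, 2):  # start at 3 points, grow by 2
        smoothed = moving_average(baseline, window)
        baseline = np.minimum(baseline, smoothed)  # lowest value at each x
    return baseline


def flatten_in_sections(y, n_sections=6, max_window=51):
    """Subtract a separately estimated baseline from each section, then stitch."""
    y = np.asarray(y, dtype=float)
    pieces = np.array_split(y, n_sections)
    flattened = [p - estimate_baseline(p, max_window) for p in pieces]
    return np.concatenate(flattened)


# Synthetic example (not measured data): a sloping background plus one peak.
wavenumber = np.linspace(603, 1718, 600)
raw = 1 + 0.002 * (wavenumber - 603) + 5 * np.exp(-0.5 * ((wavenumber - 1000) / 5) ** 2)
flat = flatten_in_sections(raw)
```

Because each pass keeps the lower of the current estimate and its smoothed version, narrow peaks are progressively clipped while a slowly varying background is left largely intact. Processing the sections independently keeps each baseline estimate local, at the cost of possible small steps where the sections are stitched.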
Thirdly, cosmic ray spikes were removed if present, using another automated algorithm by Schulze. [35] The resulting spectra were flattened once more, in the same way as previously described, in case the first flattening procedure was performed on a spectrum with a cosmic ray. Fourthly, all 40 spectra were normalized to a 'zscore' distribution, where the mean value of the Raman signal (the y-axis in Fig. 2) is offset to zero, and the standard deviation is scaled to 1. Normalizing all y-values in this way means that variations between y-values from all 40 cells are equal along the x-axis. This type of normalization of Raman spectra before performing PCA avoids several types of artifact, [36,37] and emphasizes the most significant differences between groups rather than between members of the group. Without it, variations in molecular concentration or laser power would be highlighted and might be wrongly used to identify differences between different cellular compositions. Furthermore, it assumes no previous knowledge of the data. Finally, principal component analysis (PCA) [38,39] was performed on an equal number of cells (20 for each group). PCA is a technique which applies an orthogonal linear transformation to convert a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components. The first principal component loading, PC1, is a spectrum which captures the most variance within the set of all Raman spectra, and subsequent component spectra (PC2, PC3 …) capture ever-decreasing variance.
Each spectrum can be expressed as a linear combination of the principal component loadings, a·PC1 + b·PC2 + c·PC3 + d·PC4 + … . All the individual spectra from different cells can be compared by plotting pairs of the coefficients a, b, c, d … against each other on 2D scatter plots, for example a vs b [PC1 vs PC2] or c vs e [PC3 vs PC5]. From all the scatter plots produced, we chose the one which gave the best separation of the two groups of cells, HD and FB, which was a vs d. To calculate the accuracy of the technique, a straight line is drawn on the scatter plot between the two groups in a way which achieves the best separation of the groups.
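The normalization and decomposition steps described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the spectra are stored as rows of a matrix, that the z-score is taken at each wavenumber across cells (`axis=0`; per-spectrum scaling would be `axis=1`), and that PCA is computed from a singular value decomposition. The function names and the random stand-in data are hypothetical.

```python
import numpy as np


def zscore(spectra, axis=0):
    """Offset the mean to 0 and scale the standard deviation to 1 along `axis`."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=axis, keepdims=True)
    std = spectra.std(axis=axis, ddof=1, keepdims=True)
    return (spectra - mean) / std


def pca(spectra, n_components=5):
    """Return (scores, loadings) from an SVD of the mean-centered data."""
    centered = spectra - spectra.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components]                      # rows: PC1, PC2, ... spectra
    scores = u[:, :n_components] * s[:n_components]   # coefficients a, b, c, d, ...
    return scores, loadings


# Random stand-in for 40 spectra (rows) sampled at 200 wavenumbers (columns).
rng = np.random.default_rng(0)
spectra = rng.normal(size=(40, 200))
normalized = zscore(spectra)
scores, loadings = pca(normalized, n_components=5)

# Coefficients of PC1 and PC4 for every cell, i.e. the a vs d scatter plot axes.
a, d = scores[:, 0], scores[:, 3]
```

Each row of `scores` holds the coefficients a, b, c, d, … for one cell, so multiplying the scores by the loadings and adding back the mean spectrum reconstructs that cell's normalized spectrum, exactly so when all components are kept.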

Results
Average spectra for the group of 20 HD fibroblasts and for the group of 20 control fibroblasts are shown in Fig. 2. Both spectra show a high similarity to those of a previous study on fibroblast cells, [28] which assigned spectral peaks to specific biomolecules. Significant differences are visible between the average spectrum of HD fibroblasts and the average spectrum of the control fibroblasts in Fig. 2. The difference spectrum (control fibroblast spectrum minus HD spectrum) is plotted in Fig. 3 to reveal the biochemical origin of the spectral differences between the two cell types. The biochemical assignment of most of these peaks at specific vibrational frequencies is achieved by comparison with a well-cited database specific to classes of biomolecules. [40] These are listed in Table 1.
PCA [38,39] was performed on the spectra from the two groups of cells, HD fibroblasts and control fibroblasts, to remove similarities between spectra from individual cells and highlight the differences. Graphs of all possible combinations of two components (PC1 vs PC2, PC1 vs PC3, etc.) revealed that the best separation between both cell types occurs for PC1 vs PC4. Values of a and d are plotted for every individual cell on a scatter plot (Fig. 4).
Figure 3. PC4 loading, which is the major spectral difference between the groups of fibroblast control (FB) cells and Huntington's disease (HD) cells. Also plotted is the simple difference between average Raman spectra for fibroblast controls (FB) and Huntington's disease fibroblasts (HD) from Figure 1, whose magnitude has been scaled by a factor of 20 for comparison with the other curve. The upper curve (PC4) has been offset by 5 for clarity. The y-axis for the difference spectrum is the normalized Raman signal, and for PC4 is the principal component scores. Annotations 'B', 'C' and 'P' refer to beta-sheet proteins, cholesterol and phospholipids respectively, using the peak assignments in Table 1.
Discrimination of Huntington's disease cells from control fibroblasts was achieved with an accuracy of 95% (with a sensitivity of 100% and specificity of 90%) by considering which measurements are on the correct side of the dividing line in Fig. 4. Alternatively, an automated classifier can be created from the data. By calculating the mean (x,y) value of each group of cells, and plotting a line equidistant from these mean values, an accuracy of 92.5% is achieved (with a sensitivity of 95% and a specificity of 90%).
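A line equidistant from the two group means is equivalent to assigning each cell to the nearer mean, which can be sketched as below. This is an illustrative sketch rather than the analysis code of the study; the function names are hypothetical. The worked example does not use the measured scores: it only reproduces the arithmetic implied by the quoted percentages for 20 cells per group (19 of 20 HD cells and 18 of 20 control cells correctly assigned).

```python
import numpy as np


def nearest_mean_predict(points, mean_hd, mean_fb):
    """Label a point as HD (True) if it lies closer to the HD mean than the FB mean."""
    points = np.asarray(points, dtype=float)
    dist_hd = np.linalg.norm(points - np.asarray(mean_hd, dtype=float), axis=1)
    dist_fb = np.linalg.norm(points - np.asarray(mean_fb, dtype=float), axis=1)
    return dist_hd < dist_fb


def confusion_metrics(predicted_hd, is_hd):
    """Return (sensitivity, specificity, accuracy) with HD as the positive class."""
    predicted_hd = np.asarray(predicted_hd, dtype=bool)
    is_hd = np.asarray(is_hd, dtype=bool)
    true_pos = np.sum(predicted_hd & is_hd)
    true_neg = np.sum(~predicted_hd & ~is_hd)
    sensitivity = true_pos / np.sum(is_hd)
    specificity = true_neg / np.sum(~is_hd)
    accuracy = (true_pos + true_neg) / is_hd.size
    return sensitivity, specificity, accuracy


# Counts implied by the quoted figures: 20 HD cells followed by 20 control cells.
is_hd = np.array([True] * 20 + [False] * 20)
predicted = is_hd.copy()
predicted[0] = False          # one HD cell assigned to the control group
predicted[[20, 21]] = True    # two control cells assigned to the HD group
sensitivity, specificity, accuracy = confusion_metrics(predicted, is_hd)
# sensitivity = 19/20, specificity = 18/20, accuracy = 37/40
```

With equal group sizes the accuracy is simply the average of sensitivity and specificity, which is why 95% and 90% give 92.5%, and 100% and 90% give 95%.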
The PC4 loading (spectrum) was plotted in Fig. 3 alongside the simple difference spectrum, showing a very high degree of similarity. The two groups of cells (HD and control) are indeed different, as demonstrated by Welch's unpaired t-test, which showed that the mean values of d within each group are different to an extremely statistically significant level [p < 0.0001], with p = 2.3 × 10⁻⁶. The following biomolecular assignments are only included where there is a peak in both the simple difference spectrum and the PC4 loading spectrum. The component PC4 contains the vast majority of the variance between the two groups. This is confirmed by inspection of the PCA scatter plots for all combinations of principal components up to PC5, which are shown in Fig. 5.
The biochemical differences revealed by the Raman spectrum show that more beta-sheet proteins are present in HD cells, at 1220 cm⁻¹. [41] This increase in beta-sheet protein was expected because the aggregation of the protein from the mutated huntingtin gene is known to take a beta-sheet form. [42] Two other peaks are assigned to beta-sheet proteins: one at 980 cm⁻¹ [43] (this mode is very weak in the spectrum, so a difference would be hard to identify), and one within the general broad protein peak centered close to 1660 cm⁻¹ [43] (so it is also hard to identify). There are a variety of other biochemical differences assigned in Table 1, notably a reduction of the amount of lipids in HD cells (peaks at 717-19, 1302, and 1437-51 cm⁻¹). However, the most significant difference is a large reduction in cholesterol for HD cells: all six peaks relating to it in the database are listed in Table 1 and are visible in Fig. 3.
This significant reduction in cholesterol levels was also measured in areas of the brain affected by Huntington's disease, [44][45][46] and the ability to synthesize cholesterol is reduced in human fibroblasts taken from HD patients compared with controls. [47] Fatty acid synthesis was also measured to be reduced in HD cells, [48] hence lipid peaks should be lower in HD cells. Finally, in Huntington's disease the huntingtin protein is known to have a beta-sheet structure. [48]

Discussion
This first Raman spectroscopy study of Huntington's disease cells or tissue addresses the following hypothesis: can chemical differences known to occur in Huntington's disease brain tissue be measured in other cells in the body? We suspected that, as protein aggregation occurs in all cells, it should also be measurable in skin cells, as should other chemical changes resulting from the disease. The mutated protein has a large effect on brain function as a part of the disease, but a smaller effect on other organs. However, in order to diagnose the onset of disease, it would be highly advantageous to be able to perform a Raman spectroscopic diagnosis in a far more accessible part of the body than the brain. Skin is the most easily accessed part of the body for such a test, given the sub-millimeter penetration depth of visible or near-infrared light in tissue. Although skin contains far more components than fibroblast cells, an initial study on fibroblasts is appropriate before a full patient tissue study is undertaken.
The differences between Raman spectra in this case study may relate solely to the expression of Huntington's disease, but we cannot exclude the possibility that other factors may be responsible for the biochemical differences measured by Raman spectroscopy. The patient with HD is of a different gender, and may be of a different age, from the control patient. Although cell culture conditions and experimental methods were designed to be identical, this cannot be completely guaranteed either. However, the correct assignments of reduced cholesterol, increased beta-sheet proteins and reduced lipids suggest that there is reason to be optimistic that the measured differences are indeed due to the development of the disease. To confirm this, a clinical study on skin from a number of patients is required. Raman spectra from skin [49] show a large amount of collagen, [50] in addition to components only found in cells, such as DNA.
This in vitro study was envisaged as the precursor to a Raman-based skin test for the disease. A Raman-based skin test could potentially produce a quantitative phenotypic diagnosis to replace the current qualitative physical assessment of the patient. This would also be far more convenient than characterizing brain tissues, which are not easily accessible for optical spectroscopy or other tests. The total Raman acquisition time would in future be reduced to a matter of seconds without a quartz substrate, and with a more powerful laser. The added benefits would be the non-invasive and instant nature of the diagnosis, but it may also offer a more subtle diagnosis than the rarely used genetic tests which only measure whether the patient has the HTT gene but do not measure the extent of the disease.
The accuracy we achieved for individual cells (95%) would be further improved by averaging over a larger area and hence more cells. This is confirmed by the p-value of the full sets of 20 cells from each patient [p < 0.0001]. Accuracy can also be improved by reducing the parts of the spectrum used to just those which relate to the major biochemical differences. Partial least squares fitting of spectra can produce quantitative concentrations of these biochemical components within the Raman spectrum. [51] Raman spectroscopy is also well suited to in vitro studies, so provided that the biochemical differences are indeed down solely to the onset of Huntington's disease, this technique would be applicable to in vitro studies of potential therapies, to aid in the refinement, reduction, and replacement of animals in medical research. If a therapy is able to revert the cellular changes back to the levels from healthy control cells, then it would be a suitable candidate for human trials. Raman spectroscopy can also be applied to stem-cell derived tissue, and in vivo with mouse models of Huntington's.
Although Raman spectroscopy can detect subtle changes in cells and tissue, Raman imaging (the acquisition of a Raman spectrum at each imaging pixel) is slow. However, high speed versions, notably coherent anti-Stokes Raman scattering (CARS) microscopy [52,53] and stimulated Raman scattering (SRS) microscopy, [54,55] are able to image the intensity of a Raman peak, hence the amount of a chemical substance such as beta-sheet proteins or cholesterol, both in vitro and in vivo.

Table 1. Assignment of vibrational modes [40] in the difference Raman spectrum in Fig. 3. Known biochemical changes in Huntington's disease (a reduction in lipids, a reduction in cholesterol, and an increase in beta-sheet proteins) are highlighted in bold.

Raman frequency (wavenumbers, cm⁻¹)    Biomolecules more abundant in control fibroblasts
608                                    Cholesterol [56]
621                                    Phenylalanine (proteins) [57]
640, 645                               Proteins [43,56,58]
702                                    Cholesterol [56]
…
1669-1674                              Cholesterol [56,66]

Raman frequency (wavenumbers, cm⁻¹)    Biomolecules more abundant in HD fibroblasts
788                                    DNA [43]
1220                                   Beta-sheet proteins [41]

Conclusions
This study sought to determine whether the known chemical changes associated with Huntington's disease in brain tissue are detectable with Raman spectroscopy in skin (fibroblast) cells, with a future goal of being able to diagnose the onset of the disease by a Raman spectroscopy skin test. No previous study of Raman spectroscopy had been performed on Huntington's disease cells or tissue. We demonstrated that individual living fibroblast cells from a Huntington's disease patient can be distinguished from control fibroblast cells from a patient without the disease, with a high accuracy, using PCA. Comparison of the average Raman spectra for each patient correctly identified the three reported biochemical differences due to the disease: an increase in beta-sheet proteins for HD cells, and a reduction in both lipids and cholesterol.