Gene expression of bacterial collagenolytic proteases in root caries

ABSTRACT Objective: It is unknown whether bacteria play a role in the collagen matrix degradation that occurs during caries progression. Our aim was to characterize the expression level of genes encoding bacterial collagenolytic proteases in root biofilms with and without caries. Method: We collected samples from active cavitated root caries lesions (RC, n = 30) and from sound root surfaces (SRS, n = 10). Total microbial RNA was isolated, and cDNA was sequenced on the Illumina HiSeq2500. Reads were mapped to 162 oral bacterial reference genomes. Genes encoding putative bacterial collagenolytic proteases were identified. Normalization and differential expression analysis were performed on all metatranscriptomes (FDR < 10⁻³). Result: Genes encoding collagenases were identified in 113 bacterial species; the majority were peptidase U32. In RC, Streptococcus mutans and Veillonella parvula expressed the most collagenases. Organisms that overexpressed collagenolytic protease genes in RC (Log2FoldChange > 8) but none in SRS were Pseudoramibacter alactolyticus [HMPREF0721_RS02020; HMPREF0721_RS04640], Scardovia inopinata [SCIP_RS02440] and Olsenella uli DSM7084 [OLSU_RS02990]. Conclusion: Our findings suggest that the U32 proteases may be related to carious dentine. The contribution of a small number of species to dentine degradation should be further investigated. These proteases may have potential in future biotechnological and medical applications, serving as targets for the development of therapeutic agents.


Introduction
Root hard tissues (cementum and dentine) become vulnerable to demineralization once root surfaces are exposed. These tissues are less mineralized than enamel and are composed of high proportions of organic materials such as collagen [1,2]. From a clinical point of view, the development of caries in root hard tissues may be considered a two-stage process: the first stage is characterized by mineral dissolution and the second by the degradation of the organic matrix of the root surface [3]. Microbial invasion of cementum and dentine tissues has been reported even in the first stage of the caries process, whereas in enamel caries, dentine is invaded only once enamel is destroyed [4,5]. This fact has an impact on the microbiome associated with the caries process in root hard tissues.
The function of bacteria in the demineralization stage of caries development is well known. Root hard tissue demineralization may develop in the presence of a rich and diverse microbiota, and the acidification of the microenvironment selects some species that are able to survive at low pH and produce high amounts of organic acids [6]. Root dentine biofilms are composed of a variety of saccharolytic, aciduric, and acidogenic organisms, as well as proteolytic bacteria, which can produce acids or ammonia from the catabolism of nitrogenous substrates that are available exogenously or from the dentine organic matrix [3,7]; thus, they can affect the biofilm pH in several ways. In addition to demineralization, bacteria may be involved in matrix degradation. Collagen is resistant to most common proteases and can be degraded by only a few types of proteases from mammals or bacteria [8], including some metalloproteases and serine proteases. It has been suggested that host collagenases from dentine are associated with collagen matrix degradation during caries progression [9,10], representing a response of the host tissues to caries attack under acidic conditions. These proteases, which include matrix metalloproteinases (MMP-2, 3, 8, 9, and 20) and cysteine cathepsins (B and K), are present in the dentinal organic matrix and become activated once the cementum is degraded [3,9–13].
Recently, a tissue-dependent hypothesis for dental caries suggested that some bacteria could promote dentine degradation and caries development [14]. This hypothesis is based on the discovery of overexpression of genes related to proteolytic activity, as well as bacterial collagenases, in dentinal caries from coronal lesions [14,15]. These studies showed for the first time that microbial proteolytic activity might contribute to dentinal protein degradation. Microbial collagenolytic activity has been demonstrated in a few oral bacteria [3]; however, whether bacteria actually contribute to the degradation of the organic part of root dentine remains uncertain. Protease PrtC from Porphyromonas gingivalis ATCC 53977 is one of the most frequently reported microbial collagenolytic proteases produced by oral bacteria. It belongs to the U32 protease family, and its gene contains 1,002 bp encoding the 333-residue PrtC protein. It can degrade soluble and reconstituted fibrillar type I collagen (the most common in root hard tissues) at body temperature or below [8,16]. Due to the relationship between the periodontal biofilm and the biofilms that cause root caries (RC), this protein could be involved in root dentine degradation.
The collagenase activity-dependent ability to degrade the dentinal collagen matrix could be an important virulence trait of plaque biofilms. In this study, we evaluated bacterial collagenolytic protease gene expression within natural biofilms from RC compared with supragingival biofilms of RC-free individuals by RNA-seq data analysis. The terminology 'bacterial collagenolytic proteases' was used to refer to all proteases that can degrade at least one type of collagen according to Zhang et al., including true collagenases and other proteases with collagenolytic activity [8]. These data may help clarify the role of bacteria in collagen matrix degradation in RC.

Materials and methods
Sample collection was carried out as described by Damé-Teixeira et al. [17]. Briefly, 10 volunteers with an exposed root surface and no RC lesion were included in the sound root surface group (SRS). Supragingival biofilms were collected from all exposed root surfaces. All participants recruited for the RC group had one primary cavitated root lesion in need of restorative treatment. All lesions presented characteristics of activity (soft and yellow dentine). Biofilm and carious dentine samples (soft and infected tissue) were collected from 30 patients during the restorative treatment.
Upon collection, samples were placed in a nuclease-free microtube containing 1 mL of RNAprotect reagent (Qiagen Inc.). Total RNA was extracted from all samples using the UltraClean® Microbial RNA Isolation kit (Mo-bio, San Diego, CA, USA) with on-column DNase digestion (Qiagen, Inc.). The extracted RNA was quantified using the Quant-iT™ RiboGreen® RNA Assay Kit (Invitrogen), and samples with total RNA concentration <30 ng/RNA were pooled, leading to a final sample count of 10 SRS and 9 RC. The Ribo-Zero™ Meta-Bacteria Kit (Epicentre, Illumina) was used for mRNA enrichment, and Illumina® TruSeq™ library prep protocols (Illumina, SD) were used for library preparation and paired-end sequencing on the Illumina HiSeq2500.
Read sequences for each sample were quality trimmed using cutadapt and imported into the CLC Genomics Workbench v8 software (CLC Bio, Qiagen). The genomes of 162 bacteria and their associated information were downloaded from the DNA Data Bank of Japan, NCBI, the Broad Institute, and the HOMD database, and the short-read sequences were mapped against them (for the list of genomes, see [17]). The data produced are available from the National Center for Biotechnology Information (NCBI) Sequence Read Archive, under the accession numbers SRS779973 and SRS796739. Read count data for all potential collagenases were manually extracted from the 162 genomes, with particular focus on the U32 family proteases [8] because of this family's abundance and its implication as a virulence factor in oral bacteria. Peptidolytic and gelatinolytic proteases were not included in this study's analysis.
Genes with no detected expression were defined as those with 'number of reads = 0'. The relative median expression level for genes encoding bacterial collagenolytic proteases was calculated for each of the sample groups, as described previously [18], within the R package 'DESeq' [19], and considered as the 'gene expression value'. Graphs were generated within the R package 'plotly' [20].
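The 'gene expression value' described above can be illustrated with a minimal sketch. It assumes the median-of-ratios size-factor normalization that DESeq is known for, followed by a per-group median of the normalized counts; the exact procedure of [18] is not reproduced here. The function names, gene IDs, and counts are hypothetical and serve only as an illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factor for each sample.

    counts maps a gene ID to a list of raw read counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean per gene; genes with a zero in any sample are skipped
    # because their geometric mean is zero and the ratio is undefined.
    log_geo_mean = {
        gene: sum(math.log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    return [
        math.exp(median(math.log(counts[g][j]) - lgm
                        for g, lgm in log_geo_mean.items()))
        for j in range(n_samples)
    ]


def median_expression(counts, factors, sample_idx):
    """Median normalized count per gene over the samples of one group."""
    return {
        gene: median(row[j] / factors[j] for j in sample_idx)
        for gene, row in counts.items()
    }


# Hypothetical toy counts: columns 0-1 stand in for SRS, columns 2-3 for RC.
toy_counts = {
    "gene_a": [10, 20, 40, 80],
    "gene_b": [5, 10, 20, 40],
    "gene_c": [0, 4, 8, 300],
    "gene_d": [100, 200, 400, 800],
}
factors = size_factors(toy_counts)
srs_median = median_expression(toy_counts, factors, [0, 1])
rc_median = median_expression(toy_counts, factors, [2, 3])
```

In this toy data set, gene_a differs between samples only through sequencing depth, so its normalized median is the same in both groups, whereas gene_c remains higher in the RC columns after normalization.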
Statistical analysis for inferring differential gene expression between sample groups was also carried out using the R package DESeq2 [21]. A gene was designated as differentially expressed when its transcript level changed by at least 1 log2 fold (a two-fold difference; negative values = up-regulated in SRS and down-regulated in RC; positive values = up-regulated in RC and down-regulated in SRS) and its Benjamini-Hochberg adjusted p-value (padj) was less than 10⁻³ [22]. This stringent cut-off was chosen to limit false-positive results and retain only robust differences.
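The decision rule above can be sketched as follows. This is only an illustration of the Benjamini-Hochberg adjustment and the two cut-offs, not of the DESeq2 statistical model itself. It assumes that positive log2 fold changes denote higher expression in RC, as implied by the abstract's 'Log2FoldChange>8' for genes overexpressed in RC. The function names, gene IDs, and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_differential(results, lfc_cutoff=1.0, padj_cutoff=1e-3):
    """results: iterable of (gene, log2_fold_change, padj) tuples."""
    calls = {}
    for gene, lfc, padj in results:
        if padj < padj_cutoff and abs(lfc) >= lfc_cutoff:
            # Assumed sign convention: positive = higher in RC.
            calls[gene] = "up in RC" if lfc > 0 else "up in SRS"
    return calls


# Hypothetical genes, fold changes, and raw p-values for illustration only.
genes = ["gene_w", "gene_x", "gene_y", "gene_z"]
log2_fc = [8.5, 0.3, -2.1, 3.0]
raw_p = [1e-6, 0.5, 2e-4, 0.04]

padj = benjamini_hochberg(raw_p)
calls = call_differential(zip(genes, log2_fc, padj))
```

With these toy values, gene_z has a large fold change but is not called because its adjusted p-value exceeds the 10⁻³ cut-off, which shows how the stringent threshold trades sensitivity for fewer false positives.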
This study was approved by the ethics committee of the Federal University of Rio Grande do Sul (process n°427.168) and by the Yorkshire & The Humber - Leeds West National Research Ethics Service Committee (protocol n°2012002DD). Study volunteers were patients who attended dental clinics for any dental treatment at two centres: the Faculty of Dentistry, Federal University of Rio Grande do Sul, Porto Alegre, Brazil; and the School of Dentistry, Dental Translational Research Unit, University of Leeds, Leeds, UK. All volunteers consented to participate and donate samples after receiving information about the study.

Results
A total of 201 genes coding for bacterial collagenolytic proteases were identified in 113 bacterial species, including 24 from Prevotella spp. and 20 from Streptococcus spp. Table 1 lists the genes encoding bacterial collagenolytic proteases identified in the metatranscriptome analysis of root biofilms; the majority belong to the peptidase U32 family (mainly protease PrtC).
Overall, bacterial collagenolytic proteases showed low levels of expression. The highest proportion of reads assigned to bacterial collagenolytic proteases was 0.1% of total reads (RC_7). The other samples had an average proportion of reads assigned to bacterial collagenolytic proteases of 0.04% for SRS and 0.05% for the RC group, and no statistically significant difference was found (t test; p = 0.2) (Figure 1(a,b)). However, the number of collagenase genes with no expression (number of reads = 0) was SRS = 73.1 ± 9.6 (36.4%) and RC = 109.1 ± 23.7 (54.3%) (t test; p < 0.001). Thus, despite a similar proportion of reads in RC and SRS, fewer collagenase-encoding genes were expressed in RC than in SRS.
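The percentages reported above follow from dividing the mean number of unexpressed genes by the 201 collagenolytic protease genes identified. The short sketch below reproduces that arithmetic and shows how unexpressed genes could be counted per sample. The function names and the toy read counts are hypothetical; only the values 201, 73.1, and 109.1 come from the text.

```python
from statistics import mean, stdev


def silent_gene_summary(samples):
    """samples: list of per-sample read-count lists, one entry per gene.

    Returns the per-sample number of genes with zero reads, their mean,
    their sample standard deviation, and the mean as a percentage of genes.
    """
    silent = [sum(1 for reads in sample if reads == 0) for sample in samples]
    n_genes = len(samples[0])
    avg = mean(silent)
    return silent, avg, stdev(silent), 100 * avg / n_genes


def percent_of_genes(n_silent, total_genes=201):
    """Share of the identified genes that had zero reads, in percent."""
    return round(100 * n_silent / total_genes, 1)


# Hypothetical toy data: three samples, four genes each.
toy_samples = [[0, 3, 0, 7], [1, 0, 0, 0], [0, 0, 5, 2]]
silent, avg, sd, pct = silent_gene_summary(toy_samples)

# Arithmetic check of the reported percentages.
srs_pct = percent_of_genes(73.1)   # mean unexpressed genes in SRS
rc_pct = percent_of_genes(109.1)   # mean unexpressed genes in RC
```

Here srs_pct and rc_pct evaluate to 36.4 and 54.3, matching the percentages given in the text.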
The heatmap showing the distances between the samples (n = 19) is presented in Figure 2. It takes into account the expression level of the genes that code for bacterial collagenolytic proteases within each sample of each group. There was less sample-to-sample variation among the SRS samples than among the RC samples (RC_8, RC_D and RC_E differ from the other RC samples). The diversity of gene expression patterns in the RC samples could reflect differences in lesion characteristics, such as caries stage and lesion size. Figure 3 shows the median expression value of collagenolytic proteases in RC lesions, i.e. the median of the normalized read numbers. Eight collagenolytic proteases had a median expression value higher than 100, including those from S. mutans, Veillonella parvula, V. dispar, Leptotrichia buccalis, Olsenella uli, and Scardovia inopinata. It is important to point out that in two RC samples, S. inopinata had the highest collagenolytic protease expression value (RC_A = 14,838.83 and RC_B = 3,305.65), although its median was lower than that of other species. Three collagenolytic proteases had expression values higher than 200, meaning that these were very highly expressed in RC: SMU_761 and SMU_759 from S. mutans and RS05935 from V. parvula. S. mutans possessed the collagenolytic proteases with the highest gene expression in RC, while L. buccalis possessed those with the highest gene expression in SRS.

Discussion
The current understanding of microbial functions in RC and dentine caries remains limited compared with enamel caries. In a recent review of caries ecological hypotheses, it was proposed that bacteria play a role in the degradation of the organic components of teeth [3]. Although many bacteria have been found to secrete collagenolytic proteases, their roles and the mechanisms involved in cariogenic processes are still largely unknown [8]. This is the first study showing bacterial collagenolytic protease gene expression within the metatranscriptome of clinical dental biofilms with and without RC. Our findings show that a few species were responsible for high expression of genes that code for bacterial collagenolytic proteases in RC, namely S. mutans, V. parvula, V. dispar, and S. inopinata.
[Table 1 footnote: The microorganism associated with each gene annotation is indicated in the first column along with the corresponding reference when available. Taxonomical and protein assignments were identified in the metatranscriptome analysis of root biofilms.]
The progression of caries lesions involves the degradation of the collagen matrix in the root hard tissues. The collagen protein family is characterized by the presence of the proline-rich tripeptide 'Gly-X-Y', forming a triple helix of polypeptide chains in which the glycine residue is positioned in the centre [41]. Collagen type I, the most common in dentine, has a heterotrimer structure. The collagen structure contributes to the molecular stabilization and mechanical properties of dentine. Only a specific group of proteases, the collagenases, are able to degrade collagen. They interrupt the triple helix by cleaving it at a 'Gly-Leu' bond three-quarters of the way from the amino terminus. This may cause intramolecular flexibility and allow specific proteolytic cleavage [41]. Bacterial collagenolytic proteases include some metalloproteases of the M9 family, some serine proteases distributed in the S1, S8, and S53 families, and some members of the U32 family, mainly from pathogenic bacteria [8]. In this study, protease PrtC showed relatively low gene expression levels. Other protease families (i.e. the M9, S8, and S53 families) were not detected in the genome annotations and remain to be investigated. Dental caries occurs not by continuous demineralization but by alternating demineralization and remineralization. According to a recent theory proposed by Takahashi and Nyvad (2016), the exposed collagen is broken down and the collagen content may be denatured during a second stage of RC.
The theory suggests that collagen matrix degradation could only be possible after demineralization because the substrate is not accessible by collagenases in the mineralized tissue. Some endogenous collagenases have been shown to be involved in this process [9,10,42]. MMPs, zinc-dependent endopeptidases, are able to cleave denatured collagen. They function in tissue development and repair and in pathological processes as well [43]. It has been found that bacterial collagenases have no activity during demineralization in an acid environment (pH 4.3) [43,44], and it was shown that collagenase works during the remineralizing phase and predominantly attacks the organic matrix of the root after demineralization [44]. However, collagen degradation products are known to be released from dentine when treated with lactic acid and bacterial collagenase or trypsin [45]. Therefore, acids from bacterial metabolism may render dentinal collagen more susceptible to host and microbial proteases such as those of the U32 family.
It has been reported that S. mutans is not associated with collagen matrix degradation in cavitated RC [46,47]. However, in this study, we detected high expression of genes SMU_761 and SMU_759 (S. mutans UA159). Both genes encode a collagenase-like protease of the PrtC family (peptidase U32 family) [48]. SMU_761 codes for a 428 aa protein, while SMU_759 encodes a 308 aa protein. S. mutans is widely known as an important aetiological agent of dental caries, due to its involvement in biofilm formation and its aciduricity and acidogenicity. Furthermore, most culture-based studies have shown a strong relationship between RC and these bacteria, which have higher isolation frequencies and/or higher proportions on carious root surfaces [49–53]. Our results suggest that collagenase activity could also be an important virulence factor of S. mutans in RC. These proteases were also elevated under conditions of glucose excess in another in vitro transcriptome study [54].
Along with S. mutans, two species of Veillonella (V. parvula and V. dispar) showed high collagenase gene expression levels in RC. These species have been implicated in dentinal caries due to their overexpressed functions in caries lesions, suggesting a role in disease [18]. Other species such as P. alactolyticus, S. inopinata, and O. uli had high differential expression in RC when compared to SRS. These species have been included in the complex microbial community of coronal caries [15] and RC [53,55–58], but their roles and functions have been underexplored.
A higher level of gene expression of some bacterial collagenases was observed in samples from the control group of this study (supragingival biofilm, SRS). Periodontopathogens, such as Prevotella intermedia, showed high differential expression in SRS. The SRS group included patients in preventive periodic maintenance for periodontal disease, and the U32 proteases explored here have been previously related to periodontal disease [16]. This result could therefore be linked to collagen degradation of periodontal tissues.
It is important to acknowledge that we cannot state that bacterial collagenolytic proteases are active in the degradation of dentine, because our data are based on gene expression and the enzymes could be inactive in vivo. It is also important to note that other organisms not included as reference genomes in this analysis could be expressing collagenases, as the analysis presented here relies on the current reference databases, and other not yet identified collagenases (for example, those currently annotated as hypothetical proteins) may play an important role in collagen degradation. This work represents a preliminary screening of transcripts coding for collagenases using clinical data, and validation is planned in further investigations. However, it is important to point out that the level of protease transcripts observed in this study may indicate the importance of this function within the RC biofilm communities, considering that the transcription of irrelevant genes would be a waste of energy for the microorganisms.
The results suggest that the U32 proteases could be related to RC lesions (carious dentine). The contribution of some species in dentine degradation should be further investigated, such as S. mutans, V. parvula, and V. dispar (high gene expression level in RC), as well as P. alactolyticus, S. inopinata, and O. uli (high differential expression in RC when compared to SRS). Our results provide novel insights into the collagenase activity of some bacterial species in RC. These studies lay the foundations for further investigations involving the use of proteomic tools, to better understand the aetiology of RC, and microbial metabolic activities leading to disease progression. These proteases may have potential for future biotechnological and medical applications serving as targets for the development of therapeutic agents.