Metal-backed versus all-polyethylene tibial components in primary total knee arthroplasty


Background and purpose The choice of either all-polyethylene (AP) tibial components or metal-backed (MB) tibial components in total knee arthroplasty (TKA) remains controversial. We therefore performed a meta-analysis and systematic review of randomized controlled trials that have evaluated MB and AP tibial components in primary TKA.
Methods The search strategy included a computerized literature search (Medline, EMBASE, Scopus, and the Cochrane Central Register of Controlled Trials) and a manual search of major orthopedic journals. A meta-analysis and systematic review of randomized or quasi-randomized trials that compared the performance of tibial components in primary TKA was performed using a fixed or random effects model. We assessed the methodological quality of studies using Detsky quality scale.
Results 9 randomized controlled trials (RCTs) published between 2000 and 2009 met the inclusion quality standards for the systematic review. The mean standardized Detsky score was 14 (SD 3). We found that the frequency of radiolucent lines in the MB group was significantly higher than that in the AP group. There were no statistically significant differences between the MB and AP tibial components regarding component positioning, knee score, knee range of motion, quality of life, and postoperative complications.
Interpretation Based on evidence obtained from this study, the AP tibial component was comparable with or better than the MB tibial component in TKA. However, high-quality RCTs are required to validate the results.

The design of the tibial component is an important factor for implant failure in total knee arthroplasty (TKA) (Pagnano et al. 1999, Forster 2003, Gioe et al. 2007b, Willie et al. 2008, Garcia et al. 2009, KAT Trial Group 2009). The metal-backed (MB) design of tibial component has become predominant in TKA because it is thought to perform better than the all-polyethylene (AP) design (Muller et al. 2006, Gioe et al. 2006, 2007a). In theory, the MB tibial component reduces bending strains in the stem, reduces compressive stresses in the cement and cancellous bone beneath the baseplate (especially during asymmetric loading), and distributes load more evenly across the interface (Bartel et al. 1982, 1985, Taylor et al. 1998). However, critics of the MB tibial component point to higher implant costs, reduced polyethylene thickness for the same amount of bone resection, backside wear, and increased tensile stresses at the interface during eccentric loading (Bartel et al. 1982, 1985, Pomeroy et al. 2000, Rodriguez et al. 2001, Li et al. 2002, Muller et al. 2006, Blumenfeld and Scott 2010, Gioe and Maheshwari 2010). In the past decade, several randomized controlled trials (RCTs) have been performed to assess the effectiveness of the MB tibial component (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Hyldahl et al. 2005a, b, Muller et al. 2006, Gioe et al. 2007, Bettinson et al. 2009, KAT Trial Group 2009). However, the data have not been formally and systematically analyzed using quantitative methods to determine whether the MB tibial component is indeed optimal for patients in TKA.
In this study, we wanted (1) to determine the scientific quality of published RCTs comparing the AP and MB tibial components in TKA using Detsky score (Detsky et al. 1992) and (2) to conduct a meta-analysis and systematic review of all published RCTs that have compared the effects of AP and MB tibial components on the radiographic and clinical outcomes of TKA.

Methods
Our study conformed to the PRISMA guidelines for reporting of meta-analyses and systematic reviews (Moher et al. 2009).
We searched PubMed (1985 to February 2009), EMBASE (1988 to February 2009), Scopus (1982 to February 2009), and the Cochrane Central Register of Controlled Trials (Issue 2, 2009). We used the key words all-polyethylene, metal-backed, total knee arthroplasty, total knee replacement, TKA, and TKR to search the electronic databases for RCTs that had evaluated and compared the performance of the AP and MB tibial components in primary TKA. We did not set any restrictions on language or on the duration of follow-up. However, we excluded all observational studies and case series. Furthermore, we manually searched the following 7 major orthopedic journals for the years 1990-2009: Journal of Bone and Joint Surgery (American and British), Clinical Orthopaedics and Related Research, Acta Orthopaedica, The Knee, Knee Surgery Sports Traumatology Arthroscopy, and The Journal of Arthroplasty. Two reviewers (TC and GZ) independently screened the titles and abstracts of identified papers, and full-text copies of all potentially relevant studies were obtained. The reference lists of the retrieved articles were also screened for any available information.
Two reviewers (TC and GZ) independently assessed the methodological quality of each RCT using the 21-point Detsky study quality assessment score (Detsky et al. 1992), a 14-item scoring system that covers the following domains: eligibility criteria, adequacy of randomization, description of therapies, assessment of outcomes, and statistical analysis. Discrepancies regarding selection of studies were resolved by discussion with the senior author (XZ).
The following variables were reviewed in all comparative studies, and statistically significant differences between treatment groups in the studies were noted: radiographic outcomes (alignment of the lower limb, implant placement, radiolucent line), and clinical outcomes (knee score, knee range of motion, quality of life, postoperative complications).

Statistics
For dichotomous outcomes, risk ratios (RRs) and 95% confidence intervals (CIs) were calculated. Any p-values of less than 0.05 were considered statistically significant. The I² test for heterogeneity was conducted on the pooled results of the studies. Data from comparable studies were collated using a fixed-effects model unless there was evidence of heterogeneity across studies. If the mean and standard deviation/standard error data were insufficient and meta-analysis was not possible, a systematic review was performed. Publication bias among the included studies was assessed graphically using funnel plots. The meta-analysis was conducted by one investigator (GZ) using SPSS software version 13.0 (SPSS Inc., Chicago, Illinois, USA) and RevMan software version 5.0 (Nordic Cochrane Center, Copenhagen, Denmark).
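The pooling procedure described above can be illustrated with a minimal Python sketch. It is not the RevMan computation used in this study: it assumes inverse-variance weighting of log risk ratios (RevMan offers other weighting methods, such as Mantel-Haenszel), a DerSimonian-Laird random-effects estimate, a hypothetical I² threshold of 50% for switching from the fixed-effects to the random-effects model, and non-zero event counts in every group. The function names and the event counts in the usage lines are invented for illustration and are not data from the included trials.

```python
import math


def log_rr(events_1, total_1, events_2, total_2):
    """Return the log risk ratio (group 1 vs group 2) and its variance.

    Assumes both event counts are non-zero (no continuity correction).
    """
    rr = (events_1 / total_1) / (events_2 / total_2)
    var = 1 / events_1 - 1 / total_1 + 1 / events_2 - 1 / total_2
    return math.log(rr), var


def pool_risk_ratios(studies, i2_threshold=50.0):
    """Pool 2x2 study results given as (events_1, total_1, events_2, total_2).

    The 50% threshold is an assumption of this sketch, not a rule from the paper.
    """
    effects = [log_rr(*s) for s in studies]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]

    # Fixed-effects (inverse-variance) estimate and Cochran's Q.
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # I^2: percentage of total variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    model = "fixed"
    if i2 > i2_threshold:
        model = "random"
        w = [1 / (vi + tau2) for vi in v]

    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return {
        "model": model,
        "rr": math.exp(est),
        "ci_low": math.exp(est - 1.96 * se),
        "ci_high": math.exp(est + 1.96 * se),
        "i2": i2,
    }


# Hypothetical counts: (events in MB, n in MB, events in AP, n in AP).
example = [(10, 100, 5, 100), (20, 200, 10, 200)]
result = pool_risk_ratios(example)
print(result["model"], round(result["rr"], 2), round(result["i2"], 1))
```

A funnel plot of the kind used to assess publication bias would plot each study's log risk ratio against its standard error (the square root of the variance returned by `log_rr`).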

Results
In the initial search we identified 364 potentially relevant studies. After reviewing titles and abstracts and applying the inclusion and exclusion criteria, only 10 articles (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Hyldahl et al. 2005a, b, Muller et al. 2006, Gioe et al. 2007, Bettinson et al. 2009, KAT Trial Group 2009) were included in the systematic review and meta-analysis (Figure 1 and Table 1). 2 of them (Gioe and Bowman 2000, Gioe et al. 2007) were reports on the same cohort at different follow-up periods. The randomization process was described and was appropriate for 5 studies (Hyldahl et al. 2005a, b, Muller et al. 2006, Bettinson et al. 2009, KAT Trial Group 2009). The authors of 4 studies mentioned randomization allocation but lacked a description of the randomization method (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Gioe et al. 2007). With respect to allocation concealment, 5 studies (Adalberth et al. 2000, Norgren et al. 2004, Muller et al. 2006, Bettinson et al. 2009) were adequate and 4 (Gioe and Bowman 2000, Hyldahl et al. 2005a, b, Gioe et al. 2007, KAT Trial Group 2009) were unclear. Blinding of surgeons and patients was impossible, as showing patients their radiographs was part of routine care. The study population, inclusion/exclusion criteria, treatment interventions, follow-up time frame, and reported results were extracted and tabulated (Table 1).
The sample sizes ranged from 23 to 566, with 407 men and 998 women, a total of 1,405 subjects. Within each study, there were no differences between the treatment groups in terms of age, sex, number of subjects, or any other preoperative demographic information. The duration of follow-up ranged from 2 to 10 years. The raw Detsky score for the included trials ranged from 11 to 18 points. The mean standardized score for the overall quality of the 9 studies was 14 (SD 3). Funnel plot calculation showed substantial evidence of publication bias for the complication rate (Figure 2).

Figure 1. Flow diagram of study selection (identification, screening, eligibility, included). Additional records identified through other sources, n = 6; records after duplicates removed, n = 352; records screened, n = 352; records excluded, n = 268; full-text articles assessed for eligibility, n = 84; full-text articles excluded, n = 84; studies included in qualitative synthesis, n = 9; studies included in quantitative synthesis (meta-analysis), n = 9. Reasons for exclusion: review article, retrospective study, case series, outcome of interest not presented, revision total knee arthroplasty, cadaver procedures, animal studies, duplicates.

7 studies (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Hyldahl et al. 2005a, b, Muller et al. 2006, Gioe et al. 2007) used conventional radiographs to compare the radiographic outcomes (the alignment of the lower limb and that of the components) between the two groups. There was no statistically significant difference between the groups with regard to the femoral mechanical axis (Gioe and Bowman 2000, Gioe et al. 2007) and hip-knee-ankle angle (Norgren et al. 2004, Hyldahl et al. 2005a). The authors of 2 studies (Adalberth et al. 2000, 2001) reported that there was no statistically significant difference between the two groups with regard to the anatomic axis of the lower limb (coronal tibiofemoral angle). 5 studies (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Muller et al. 2006, Gioe et al. 2007) found that the frontal alignment of the tibial component was not significantly different between the groups. However, the findings on the alignment of the tibial component in the sagittal plane were inconsistent across the included studies. 3 studies (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Gioe et al. 2007a) found no significant difference between the groups, whereas Adalberth et al. (2001) found that the AP components were positioned with a slightly more posterior tilt than the MB components. In addition, 2 studies (Gioe and Bowman 2000, Gioe et al. 2007a) evaluated femoral coronal position, change in joint line, and patellar height. The authors reported no statistically significant difference between the groups at the latest follow-up.

We pooled the results from 4 studies and found that [...] (Muller et al. 2006, KAT Trial Group 2009), Knee Society knee score (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Gioe et al. 2007), or Hospital for Special Surgery (HSS) score (Hyldahl et al. 2005a, b). Knee range of motion (ROM) as an outcome measure was documented in 5 studies (Adalberth et al. 2000, Gioe and Bowman 2000, Norgren et al. 2004, Muller et al. 2006, Gioe et al. 2007). All studies found that these functional outcomes were not significantly different between the groups at any follow-up time point. Quality of life was measured using 3 methods: Short Form-12, Short Form-36, or EuroQol-5D. 3 studies used Short Form-12 scores (Muller et al. 2006, KAT Trial Group 2009) or Short Form-36 (Gioe and Bowman 2000, Gioe et al. 2007), whereas only 1 study (KAT Trial Group 2009) used EuroQol-5D. These studies found no statistically significant difference in the quality of life scores between AP and MB tibial components.

Discussion
To our knowledge, this is the first systematic review and meta-analysis of RCTs comparing AP and MB tibial implants in primary TKA. Our findings show that the AP and MB tibial components gave similar radiographic and clinical results.
In our study, the frequency of radiolucent lines observed in the MB group was statistically significantly higher than that observed in the AP group. Tibial radiolucent lines may be more clearly delineated around MB components than around AP components, so radiolucencies around AP tibial components may be underestimated (Gioe and Bowman 2000). The higher incidence of radiolucent lines in the MB group may reflect this phenomenon. Non-progressive radiolucencies of 2 mm or less appear to have little clinical significance (Ritter et al. 1981, Scuderi et al. 1989, Bach et al. 2009), which corresponds to our finding that radiolucent lines had no effect on knee scores, ROM, quality of life, or postoperative complications. These results are broadly consistent with evidence from previous studies (Apel et al. 1991, Ritter et al. 1994a, b, Shen et al. 2009). In a 15-year survivorship study, Ranawat et al. (1993) reported a high incidence of tibial radiolucencies (72%), but only 2 of tibial components were loose. Although there is no direct correlation between non-progressive radiolucencies and subsequent implant loosening (Ranawat et al. 1993, Gioe et al. 2007), progressive radiolucent lines are commonly associated with early failure (Apel et al. 1991, Ritter et al. 1994a). Metal backing of the tibial component has given lower strain and better load distribution in the proximal tibia in in vitro biomechanical studies (Bartel et al. 1982, 1985, Small et al. 2010), which should theoretically reduce aseptic loosening and provide higher long-term survivorship of the implant. However, evidence from matched-pair or retrospective studies suggests that there is no difference in survival between AP and MB tibial components with medium-term or long-term follow-up (Apel et al. 1991, L'Insalata et al. 1992, Rand et al. 1993, Rodriguez et al. 2001, Udomkiat et al. 2001, Najibi et al. 2003, Dojcinovic et al. 2007).
The findings of nonrandomized cohort studies are, by nature, limited, and they are often biased due to the presence of confounding factors, including the surgeon's learning curve and patient selection. Recently, two RCTs (Gioe et al. 2007, Bettinson et al. 2009) found that survivorship, with revision for any reason as the endpoint, was similar between the two designs. Bettinson et al. (2009) reported that there was no statistically significant difference between the two designs when 10-year survivorship with aseptic failure was used as the endpoint. Gioe et al. (2007) reported that the 10-year survivorship was marginally greater in the AP component group than in the MB component group. In fact, the overall revision rate in both groups was very low. Both groups achieved good or excellent survivorship rates for revision and reoperation after TKA during the long-term follow-up.
Although our conclusions are strengthened by the standard procedures for retrieval, assessment of relevance, and statistical processing in this systematic review and meta-analysis, a number of potential limitations should be taken into account. Firstly, there were a number of methodological limitations in the literature, including poorly described randomization in group allocation and infrequent blinding of assessors or patients to the group allocation. Secondly, differences in patient population, surgical technique, outcome evaluation tool, and follow-up time may account for the clinical and statistical heterogeneity of these studies. Accordingly, the conclusions made in this review should be treated with caution. Thirdly, some variables studied in the systematic review did not shed any light on the relative role of MB or AP components, but provided a comparison between the patient groups in which the two implants were used in the studies included. For example, early systemic complications, on which implant design has little or no influence, are associated with surgical technique and patient factors. In order to provide evidence for making the optimal choice between the two components, one should concentrate on wear rates, loosening, revision, and survivorship analysis. Finally, the cost of the two tibial components could not be addressed in our analysis because none of the authors of the included studies reported on this subject. Although costs vary according to the brand of implant and may be determined by volume and domain contracts, AP tibial components may give cost savings (Pomeroy et al. 2000). Furthermore, none of the studies analyzed showed superiority of the MB tibial design over the AP tibial design, so we encourage use of the AP tibial component due to its low cost and excellent clinical outcomes.
In conclusion, we found similar results in the two groups in terms of knee scores, ROM, quality of life, implant alignment, and postoperative complications. Although the frequency of radiolucent lines observed in the MB group was statistically significantly higher than in the AP group, we could not prove that this corresponded to a clinically important increase in implant failure. Thus, this evidence-based literature review does not support the idea that the MB tibial component may be superior to the AP tibial component.
TC and GZ both participated in planning of the study, performed the statistical analyses, and contributed equally to all parts of the manuscript. XZ initiated the review and supervised the study as head of the department. All authors read and approved the final manuscript. This work was supported by the Shanghai Municipal Health Bureau Science Fund for Young Scholars (2010QJ036A).