Comparison of bone-anchored prostheses and socket prostheses for patients with a lower extremity amputation: a systematic review.

Abstract

Purpose: This study aimed to provide an overview of (a) the measurement instruments used in studies evaluating effects on quality of life (QoL), function, activity and participation level in patients with a lower extremity amputation using bone-anchored prostheses compared to socket prostheses and (b) the effects themselves.
Method: A systematic literature search was conducted in MEDLINE, Cochrane, EMBASE, CINAHL and Web of Science. Included studies compared QoL, function, activity and/or participation level in patients with bone-anchored or socket prostheses. A best-evidence synthesis was performed.
Results: Out of 226 studies, five cohort and two cross-sectional studies were eligible for inclusion; all had methodological shortcomings. These studies used 10 different measurement instruments and two separate questions to assess outcome. Bone-anchored prostheses were associated with better condition-specific QoL and better outcomes on several of the physical QoL subscales; outcomes on the physical bodily pain subscale were inconclusive. Outcomes on function and activity level increased; no change was found at participation level. The level of evidence was limited.
Conclusions: There is a need for a standard set of instruments. There was limited evidence that bone-anchored prostheses resulted in higher QoL, function and activity levels than socket prostheses in patients with socket-related problems.

Implications for Rehabilitation
Use of bone-anchored prostheses in combination with intensive outpatient rehabilitation may improve QoL, function and activity level compared with socket prosthesis use in patients with a transfemoral amputation and socket-related problems.
All clinicians and researchers involved with bone-anchored prostheses should use and publish data on QoL, function, activity and participation level.
There needs to be an agreement on a standard set of instruments so that interventions for patients with a lower extremity amputation are assessed consistently.


Introduction
In 2005, one in 190 people in the United States was living with limb loss. [1] The incidence of lower extremity amputation in the Netherlands is 18-20 per 100 000; 90-94% of these amputations are due to vascular disease, 3% to trauma and 3% to tumour resection. [2] The incidence of non-vascular lower extremity amputations is low, but the prevalence is high because this kind of amputation occurs most frequently in adolescents and adults below the age of 45 years. [1] There is a lack of recent data on the incidence and prevalence of lower extremity amputation.
Approximately 86% of patients with a lower extremity amputation are fitted with a socket prosthesis; [2] 34-63% of these patients have chronic skin problems and pain associated with the socket. [3][4][5][6][7][8] These problems often have a severe impact on quality of life (QoL) through problems at the level of body function or structures (e.g., limited prosthesis use) and activity (mobility restrictions), with limitations in participation as a consequence. [3,5,9-11] One approach to treat socket-related problems is to attach the prosthesis to the skeleton transcutaneously by osseointegration using an intramedullary implant. [12] Bone-anchored prostheses are an option when the cause of amputation was trauma or cancer and the patient is experiencing socket-related problems. Because of the risk of infection, bone-anchored prostheses are not currently recommended for patients whose amputation had a vascular cause. [11][12][13][14] In the future, however, bone-anchored prostheses might also be an option for patients with stable vascular disease [15] and for patients without socket-related problems who wish to increase their prosthesis use. [11] At present, two implants for bone-anchored prostheses are used in humans: [16] the Osseointegrated Prosthesis for the Rehabilitation of Amputees (OPRA) [17] and the Endo-Exo Femoral Prosthesis (EEFP). [11,18] Both are intramedullary implants, but OPRA is a titanium screw [12] while EEFP is a cobalt-chrome press-fit fixation. [18] The two-step surgical technique used with both implants has been well described. When OPRA [14,17,19,20] is used, the two operations are separated by a minimum of six months; with EEFP [11,13,18,21,22] the inter-surgery interval is six to eight weeks. Rehabilitation with OPRA has been described extensively; [14,20] EEFP rehabilitation has been described in less detail. [11,18] An important difference in rehabilitation protocols is that with EEFP the load on the prosthesis can be increased more rapidly, so the rehabilitation programme can be shorter.
Since the introduction of bone-anchored prostheses for patients with a lower extremity amputation using osseointegration in 1990, [12] a number of studies have investigated the incidence of infection and the survival of the implants. [13,18,20,22-24] These studies showed frequent problems around the skin-penetrating stoma region, but a low (1.8-10.4%) frequency of disability or implant removal as a result of infections or other problems such as periprosthetic fractures. The majority of treatment failures occurred in the early years of bone-anchored prosthesis surgery. [20] Recent research has shown that the mean annual costs associated with socket prostheses and bone-anchored prostheses are comparable. [25] Although bone-anchored prosthesis users were provided with more advanced, costly prosthetic components, they made significantly fewer visits to the prosthetist than socket prosthesis users, resulting in lower post-surgery costs. [25] One qualitative, phenomenological study has been performed. All the participants described living with a bone-anchored prosthesis as a positive, revolutionary change that went beyond the functional improvements, integrating the existential implications in the concept of QoL. [26] Osseoperception, the ability to perceive pressure, load, position and balance, [27] is one of the benefits attributed to bone-anchored prostheses. [12,27,28] In socket prosthesis users, low hip abductor strength is correlated with asymmetrical gait, [29,30] which may explain back pain and pain in other regions such as the residual limb, sound side, buttocks, hips and neck or shoulder. [31] Two studies [32,33] compared muscle function of the residual limb in bone-anchored prosthesis users with muscle function in healthy subjects.
These studies showed that muscle activity patterns retained aspects of activity patterns found in healthy subjects during gait; [33] however, bone-anchored prosthesis users were unable to maintain a maximum voluntary contraction of constant amplitude. [32] There has been no research comparing muscle function in users of bone-anchored prostheses and socket prostheses. A critical issue for clinicians and funding bodies is the extent to which bone-anchored prostheses improve outcomes with respect to QoL, body function or structures (hereafter referred to as function), activity and participation level relative to socket prostheses. [34][35][36] It is important for patients considering bone-anchored prosthesis surgery to have access to this information, so that they are able to make an informed choice. At present there is no systematic analysis of the instruments used to evaluate functional outcomes after bone-anchored prosthesis surgery and only one review [37] concerning the effects of bone-anchored prostheses compared to socket prostheses. Insight into measurement instruments is important to allow professionals to evaluate bone-anchored prosthesis surgery and the effectiveness of rehabilitation programmes.
The two aims of this study were to (a) give an overview of the instruments used to evaluate outcomes in terms of QoL, function, activity and participation level in studies comparing bone-anchored prostheses and socket prostheses for patients with a lower extremity amputation and (b) provide a systematic review of the literature comparing outcomes measured in terms of QoL, function, activity and participation level for patients with a lower extremity amputation using bone-anchored prosthesis relative to socket prosthesis.

Design
This systematic review of published, peer-reviewed articles with original data followed the guidelines of the PRISMA statement. [38]

Information sources
We searched the following electronic databases (last search 16 April 2016): MEDLINE (accessed via PubMed), Cochrane Central Register of Controlled Trials, EMBASE (accessed via EBSCO), CINAHL and Web of Science.

Search strategy
A search strategy based on the PICO strategy was developed by the first author (RL) with the support of a medical librarian. In order to obtain a broad range of records, the comparison and outcome elements were not included in the search string but were used at the article selection stage. There were no language or publication date criteria within the search.

Inclusion criteria
We (RL and GvH) systematically screened studies and considered them eligible for inclusion if they: (a) were conducted in humans with a lower extremity bone-anchored prosthesis; (b) presented original data comparing bone-anchored prostheses with socket prostheses; (c) evaluated outcomes in terms of QoL, function, activity or participation level (instruments were classified in line with the review by Samuelsson et al. [39]); (d) were published in English, Dutch or German. Articles were excluded if they: (a) were not quantitative studies; (b) were a conference abstract, letter to the editor, textbook, article for educational purposes, case report or case series; (c) presented only implant survival or infection data. The same criteria were used to include individual studies embedded in systematic reviews, if not already detected by the search string.

Study selection
Two authors (RL and GvH) independently reviewed the retrieved studies to determine eligibility. Initially, studies were screened on the basis of title and abstract; full texts were obtained for studies considered potentially eligible. Disagreements between the reviewers were resolved by discussion. We also tracked the references of all included articles (Figure 1). Reasons for exclusion based on assessment of the title and abstract or full text were noted.

Data extraction
The first reviewer (RL) extracted data using a standard extraction form; the results were checked by GvH. Data extracted from the included articles were: (a) authors, publication year and study location; (b) study design; (c) participants; (d) intervention; (e) measurement instruments; (f) results.

Methodological quality
Two reviewers (RL and GvH) independently assessed the methodological quality of the included articles; they were not blinded with respect to either authorship or journal. The methodological quality (risk of bias) was scored using the Effective Public Health Practice Project (EPHPP) Quality Assessment Tool for Quantitative Studies. [40,41] This critical appraisal tool was chosen because we expected to encounter different types of non-randomised studies. The EPHPP tool assesses six aspects of methodology: (a) likelihood that the study participants are representative of the target population; (b) study design; (c) control of confounding variables; (d) blinding of participants and investigators; (e) validity and reliability of the data collection tools; and (f) proportion of withdrawals and drop-outs. This tool was applied separately to all studies; studies were rated "strong", "moderate" or "weak" with respect to each aspect of methodology using standard criteria set out in the EPHPP dictionary. [40,41] Overall quality ratings (global ratings) were derived from the domain ratings as follows. Studies were classed as having "strong" methodology when none of the aspects was rated weak, "moderate" when there was one weak aspect and "weak" if two or more aspects of methodology were rated weak. [41,42] The rating of the data collection tools was based on the EPHPP dictionary, [41] but if a gold standard tool (e.g., gait laboratory) was used this aspect of methodology was also rated as strong. The EPHPP tool has been shown to be a reliable and valid tool for critically appraising study methodology. [42,43] Disagreements between the two raters were resolved by discussion; cases of persistent disagreement were referred to a third rater (JBS). Inter-rater agreement on the EPHPP tool domain ratings was measured with the linear weighted Cohen's κ coefficient [44] as proposed by Byrt. [45] Values were classified as follows: 0.41-0.60: fair agreement; 0.61-0.80: good agreement; 0.81-0.92: very good agreement; 0.93-1.00: excellent agreement.
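The agreement statistic described above can be illustrated with a short sketch. It is not the authors' analysis code: the function names, the ordering of the rating categories and the handling of values below 0.41 are assumptions made for illustration, and the example ratings are hypothetical. The sketch computes a linear weighted Cohen's kappa for two raters on the three-level EPHPP scale and maps the result to the agreement bands listed in the text.

```python
from collections import Counter

# Ordinal EPHPP domain ratings, ordered from best to worst (assumed ordering).
CATEGORIES = ["strong", "moderate", "weak"]


def linear_weighted_kappa(rater_a, rater_b, categories=CATEGORIES):
    """Linear weighted Cohen's kappa for two raters on an ordinal scale."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    def weight(i, j):
        # Disagreement weight grows linearly with the distance between categories.
        return abs(i - j) / (k - 1)

    # Mean observed disagreement over all rated items.
    observed = sum(weight(index[a], index[b]) for a, b in zip(rater_a, rater_b)) / n

    # Disagreement expected by chance, from each rater's marginal counts.
    count_a = Counter(index[a] for a in rater_a)
    count_b = Counter(index[b] for b in rater_b)
    expected = sum(
        weight(i, j) * count_a[i] * count_b[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)

    if expected == 0:
        # Both raters used one and the same category throughout: kappa is undefined.
        return float("nan")
    return 1.0 - observed / expected


def classify_agreement(kappa):
    """Agreement bands as listed in the text; lower values are not labelled there."""
    if kappa >= 0.93:
        return "excellent"
    if kappa >= 0.81:
        return "very good"
    if kappa >= 0.61:
        return "good"
    if kappa >= 0.41:
        return "fair"
    return "not classified"


if __name__ == "__main__":
    # Hypothetical domain ratings from two raters (not data from the review).
    ratings_a = ["strong", "moderate", "weak", "weak"]
    ratings_b = ["strong", "moderate", "weak", "moderate"]
    kappa = linear_weighted_kappa(ratings_a, ratings_b)
    print(round(kappa, 2), classify_agreement(kappa))
```

With linear weights, a disagreement between "strong" and "weak" counts twice as much as a disagreement between adjacent categories, which suits an ordinal rating scale.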
We planned to conduct a meta-analysis where three or more studies used the same outcome measures. In cases of clinical heterogeneity (e.g., diversity of follow-up time points) we performed a best-evidence synthesis of outcomes related to QoL, function, activity and participation level that were used in two or more studies. As we anticipated that included studies would be non-randomised, we ranked the levels of evidence per comparison and for each outcome using a method described by Yusuf et al., [46] which is a modification of the guidelines on systematic review of the Cochrane Collaboration Back Review Group. [47] This modified method was also used by Veenhof et al. [48] and ranks the evidence in five levels (Table 1).

Selected studies
We retrieved 226 studies. Thirty-six were classified as potentially eligible on preliminary screening. Assessment of the full texts of these studies resulted in seven studies [11,17,19,34,49-51] being rated eligible for inclusion. A flowchart of the selection process, including reasons for exclusion of the articles reviewed in full text, is presented in Figure 1. Reasons for exclusion of the articles screened out on the basis of the title and abstract are presented as supplementary material. Table 2 provides a profile of the included studies. Seven studies were included in the systematic review, of which five were longitudinal (before and after) cohort studies [11,17,19,50,51] and two were cross-sectional studies. [34,49] In one cohort study, [51] the intervention group was compared to a control group of healthy subjects. Our search did not retrieve any randomised controlled trials (RCTs) or controlled clinical trials (CCTs) comparing bone-anchored prostheses with socket prostheses. The cohort studies assessed a grand total of 110 patients with a lower extremity amputation and the cross-sectional studies assessed a total of 185 socket prosthesis users and 32 bone-anchored prosthesis users. The age at inclusion for the cohort studies ranged from 20 to 70 years; in the cross-sectional studies the age at inclusion ranged from 28 to 70 years for socket prosthesis users and 26 to 67 years for bone-anchored prosthesis users. Time from primary amputation to inclusion in the cohort studies ranged from 10 months to 45 years. In the cross-sectional studies, time from primary amputation to inclusion varied from 2 to 56 years for socket prosthesis users and from 6 to 46 years for bone-anchored prosthesis users. All patients in the included studies had a transfemoral amputation and the most common cause of primary amputation was trauma. The recruitment periods of the studies spanned 2005 to 2014.
The studies were performed in two countries, Sweden (n = 6) [17,19,34,49-51] and the Netherlands (n = 1). [11] In Sweden bone-anchored prostheses were placed with OPRA and in the Netherlands with EEFP. The sample size in the cohort studies ranged from 18 to 51 participants; in the cross-sectional studies the number of socket prosthesis users ranged from 43 to 142 participants and the number of bone-anchored prosthesis users from 12 to 20. The study of Hagberg et al. [50] is not included in the summary above concerning the grand total of assessed patients because the sample was also analysed by Branemark et al. [19] However, we decided not to exclude the study entirely because the majority of the outcome measures were different from those used by Branemark et al. [19] To our knowledge, there is no other instance of overlap between participants in the included studies.

Methodological quality assessment
There was 93% inter-rater agreement between the two reviewers on the ratings of individual domains of methodological quality.
The estimated mean linear weighted Kappa value was 0.86 ± 0.07. The most common shortcomings were lack of adjustment for confounding variables and failure to blind assessors and participants. The few disagreements about domain ratings were due to errors of comprehension or differences in interpretation of the methodological quality criteria; in no case did they affect the global score. All disagreements were resolved by discussion. The global rating of methodological quality was weak for all seven studies. Global EPHPP scores and scores for the six domains of methodological quality are presented in Table 3.
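The rule used to derive a global EPHPP rating from the six domain ratings can be written as a small function. This is a minimal sketch under stated assumptions: the domain keys are shorthand labels chosen here for the six aspects named in the methods, not identifiers from the EPHPP tool, and the example ratings are hypothetical rather than taken from Table 3.

```python
# Shorthand labels (assumed) for the six EPHPP aspects described in the methods.
EPHPP_DOMAINS = (
    "selection_bias",
    "study_design",
    "confounders",
    "blinding",
    "data_collection",
    "withdrawals_dropouts",
)
VALID_RATINGS = {"strong", "moderate", "weak"}


def global_rating(domain_ratings):
    """Derive the global rating from the number of domains rated weak."""
    missing = [d for d in EPHPP_DOMAINS if d not in domain_ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    invalid = {d: r for d, r in domain_ratings.items() if r not in VALID_RATINGS}
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")

    weak_count = sum(1 for d in EPHPP_DOMAINS if domain_ratings[d] == "weak")
    if weak_count == 0:
        return "strong"      # no weak domains
    if weak_count == 1:
        return "moderate"    # exactly one weak domain
    return "weak"            # two or more weak domains


if __name__ == "__main__":
    # Hypothetical study with weak ratings for confounders and blinding.
    example = {
        "selection_bias": "moderate",
        "study_design": "moderate",
        "confounders": "weak",
        "blinding": "weak",
        "data_collection": "strong",
        "withdrawals_dropouts": "strong",
    }
    print(global_rating(example))
```

Under this rule, a study with weak ratings on both confounding and blinding, the two most common shortcomings reported above, receives a weak global rating regardless of how the other four domains are rated.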

Measurement instruments used in the studies
Ten different measurement instruments were used across the seven included studies, and two studies also used separate questions about sitting comfort [49] and working status [17] (Table 4). Instruments were classified as measuring QoL, function, activity or participation level outcomes in line with the review by Samuelsson et al. [39] The most frequently used instrument was the Questionnaire for persons with Transfemoral Amputation (Q-TFA). Three different questionnaires were used in the evaluation of QoL: the Q-TFA (problem score [17,19,50] and global score [11,17,19,50]), the 36-item Short-Form health survey (SF-36) [17,19,50] and the revised 36-item short-form health survey, which assesses six dimensions of health status (SF-6D). [50] The Q-TFA was also used to assess outcome on function level (prosthetic use [11,17,19,50]) and activity level (prosthetic mobility [17,19,50]). Range of hip motion (assessed using a goniometer [49]) and biomechanical gait characteristics (assessed using a transducer [34] and laboratory measurements [51]) were also used as indicators of function level. Four physical performance measures were used to evaluate the activity level. The 6-Minute Walk Test (6MWT) [11] and Timed Up & Go (TUG) [11] were used to assess walking ability in terms of distance covered in 6 min and time needed to get up from a chair, walk 3 m up and down a walkway and sit again. The energetic cost of walking was assessed clinically using the Physiological Cost Index (PCI) [50] or with laboratory measurements. [11]

Synthesis of results/meta-analysis
Owing to the diversity of instruments, follow-up time points and study designs, meta-analysis of the data would not have been meaningful. The Q-TFA was the most frequently used measure of outcome, but scores were analysed differently in the various studies. For example, the Swedish studies [17,19,50] calculated the prosthetic use score in points and the Dutch study [11] in hours per week.
Furthermore, two articles presented overlapping results extracted from the Q-TFA and SF-36 because they assessed an overlapping sample. [19,50] Because of this overlap, we excluded the Hagberg et al. [50] Q-TFA data (with exception of the subscales of the mobility score) and data from SF-36. A best-evidence synthesis was carried out for all outcomes used in at least two studies; these data are presented in Table 5.

Table 1. Level of evidence. [46]
Strong evidence: Generally consistent findings in multiple high-quality cohort studies.
Moderate evidence: Generally consistent findings in one high-quality cohort study and ≥2 high-quality case-control studies, or in ≥3 high-quality case-control studies.
Limited evidence: (Generally consistent) findings in a single cohort study, or in maximum two case-control studies, or in multiple cross-sectional studies.
Conflicting evidence: Less than 75% of the studies reported consistent findings.
Insufficient evidence: Less than two low-quality studies available.
No evidence: Provided when no studies could be found.

Based on the same population; Hagberg et al. [50] excluded the patients with a bilateral transfemoral amputation and the patients who were lost to follow-up (n

Table 6 presents a comparison of the outcome of socket prostheses use compared to bone-anchored prostheses use.

Quality of life
Two low-quality cohort studies used the Q-TFA problem score as a measure of condition-specific QoL [17,19] and three low-quality cohort studies used the Q-TFA global score. [11,17,19] All of these studies reported that condition-specific QoL improved significantly with use of a bone-anchored prosthesis rather than a socket prosthesis. This constitutes limited evidence for an improvement in condition-specific QoL in the first [11,19] and in the second [17,19] years after bone-anchored prosthesis surgery relative to socket prosthesis use. Two low-quality cohort studies used the SF-36 to assess general QoL. [17,19] They reported that the physical functioning score, [17,19] role physical functioning score [17,19] and physical component summary [17,19] improved significantly with use of a bone-anchored prosthesis rather than a socket prosthesis. Hagberg et al. [17] reported that the SF-36 bodily pain score improved significantly with a bone-anchored prosthesis compared to a socket prosthesis; however, Branemark et al. [19] reported no change. Scores on other subscales of the SF-36 [17,19] did not change significantly as a result of replacing socket prostheses with bone-anchored prostheses. In conclusion, there is limited evidence that scores on the SF-36 measures of physical health (physical functioning score, [17,19] role physical functioning score [17,19] and physical component summary [17,19]) improved in the first [19] and in the second [17,19] years after bone-anchored prosthesis surgery relative to socket prosthesis use. There was limited evidence that the physical bodily pain subscale score did not change in the first [19] year after surgery to fit a bone-anchored prosthesis and conflicting evidence on change in the second post-surgery year. [17,19] Furthermore, there was limited evidence that the physical general health subscale and all mental health subscales (measured with SF-36) did not change in the first [19] and in the second [17,19] years after bone-anchored prosthesis surgery relative to socket prosthesis use. One low-quality study [50] used a utility instrument, [52] namely SF-6D; this study reported an improvement in general health status in the second year after bone-anchored prosthesis surgery relative to socket prosthesis use.

Function level
Three low-quality cohort studies [11,17,19] used the Q-TFA prosthetic use score to assess prosthesis wearing time. All the studies found that wearing time improved significantly with use of a bone-anchored prosthesis relative to use of a socket prosthesis.
There was limited evidence that wearing time improved in the first [11,19] and in the second [17,19] year after bone-anchored prosthesis surgery relative to socket prosthesis use. One low-quality cross-sectional study [49] compared range of hip motion with and without the prosthesis in users of bone-anchored prostheses and socket prostheses. This study showed that range of hip motion was lower when using the prosthesis for socket prosthesis users but not for bone-anchored prosthesis users; the bone-anchored prosthesis users were assessed between two and 10 years after bone-anchored prosthesis surgery. [49] The same study found that users of bone-anchored prostheses had a significantly larger range of hip motion than socket prosthesis users while wearing the prosthesis. [49] The other low-quality cross-sectional study [34] assessed temporal gait variables (cadence, duration of gait cycle and duration of support phase) in bone-anchored prosthesis users who were at least one year post-surgery, socket prosthesis users and healthy subjects. This study found that bone-anchored prosthesis users had a gait more similar to that of healthy subjects than did socket prosthesis users, except with respect to swing phase duration. [34] One low-quality cohort study [51] assessed gait kinematics in the sagittal plane. This study found that during the stance phase use of a bone-anchored prosthesis increased hip extension and decreased anterior pelvic tilt relative to use of a socket prosthesis. [51] These kinematics with a bone-anchored prosthesis are more similar to those of healthy subjects than the kinematics with a socket prosthesis, but they still differ significantly from those of healthy subjects. [51]

Quality of life: SF-36 [17,19,50] - Physical functioning (PF) [17,19,50] - Role physical functioning (RP) [17,19] - Bodily pain (BP) [17,19] - General health (GH) [17,19] - Vitality (VT) [17,19] - Social functioning (SF) [17,19] - Role emotional (RE) [17,19] - Mental health (MH) [ [17,19,50] - Global score (GS) [11,17,19,50]
Impairments in body function or structure: Goniometer [49]; range of hip motion. Qualisys mcu 240 [51]; kinematic data hip and pelvis during gait in sagittal plane. Transducer [34]; temporal gait characteristics (cadence, duration gait cycle, swing phase and stand phase). Q-TFA [11,17,19,50] prosthetic use score (PUS) [11,17,19,50]
Activity limitations: Question sitting comfort [49]. 6-Minute Walk Test (6MWT) [11]. Timed Up & Go (TUG) [11]. Oxygen consumption laboratory measurement during walking at self-preferred walking speed [11]. Physiological Cost Index (PCI) [19]. Q-TFA [11,17,19,50] - Prosthetic mobility score (PMS) [17,19,50]
Participation restrictions: Question work status [17]

Activity level
Two low-quality cohort studies used the Q-TFA mobility score as an indicator of mobility level. [17,19] Two low-quality cohort studies analysed the walking aid, capability and walking habit subscores of the Q-TFA mobility score. [17,50] All studies found that using a bone-anchored prosthesis resulted in significant improvements in overall mobility score, capability subscore and walking habit subscore compared to socket prosthesis use, but there was no change in walking aid subscore. [17,50] In conclusion, there is limited evidence that mobility level improved in the first [19] and second [17,19] years after bone-anchored prosthesis surgery relative to mobility with a socket prosthesis. There is limited evidence that bone-anchored prosthesis surgery did not change walking aid use in the first two years after surgery relative to use with a socket prosthesis. [17,50] One low-quality cross-sectional study [49] assessed discomfort when sitting in bone-anchored prosthesis users and socket prosthesis users, and found that using a bone-anchored prosthesis was associated with less discomfort when sitting than use of a socket prosthesis. [49] One low-quality cohort study [11] assessed walking ability and found that it improved significantly in the first year after bone-anchored prosthesis surgery relative to use of a socket prosthesis, both in terms of distance covered in 6 min and time needed to get up from a chair, walk 3 m up and down a walkway and sit again. [11] Two low-quality cohort studies [11,50] assessed energy cost of walking; both studies found that use of a bone-anchored prosthesis reduced the energetic cost of walking significantly compared with use of a socket prosthesis. In conclusion, limited evidence was found that in the first [11] and the second [50] years after surgery using a bone-anchored prosthesis reduced energy costs of walking relative to a socket prosthesis.

Participation level
One low-quality cohort study [17] assessed work situation. This study found no evidence that work situation two years after bone-anchored prosthesis surgery was different from before this surgery. [17]

Discussion
This review has demonstrated that studies comparing bone-anchored prostheses with socket prostheses have used a wide variety of instruments to evaluate outcome. There was consensus within the included studies with respect to indicators of QoL, with the Q-TFA [11,17,19,50] being used to assess condition-specific QoL and the SF-36 [17,19,50] for general QoL. The Q-TFA was also used to assess function [11,17,19,50] and activity level. [17,19,50] However, the Q-TFA prosthetic use score was analysed differently in the various studies. The Q-TFA was the only indicator of function, activity and participation level used in more than one study. We also concluded that, relative to use of a socket prosthesis, use of a bone-anchored prosthesis resulted in better condition-specific QoL [11,17,19,50] and better general physical health QoL in terms of most indicators. [17,19,50] The evidence on the effects on physical bodily pain of replacing a socket prosthesis with a bone-anchored prosthesis was inconclusive. [17,19] We also concluded that, relative to use of a socket prosthesis, use of a bone-anchored prosthesis resulted in better function [11,17,19,34,49-51] and activity level, [11,17,19,49,50] but had no effect on participation level. [17] We also noted that in 25 years of bone-anchored prosthesis surgery, [12] only seven studies have compared the outcomes of bone-anchored prosthesis use relative to socket prosthesis use in terms of QoL, function, activity and participation level, and all these studies were conducted in the last 10 years. Participation level after bone-anchored prosthesis surgery relative to socket prosthesis use is rarely assessed. Three articles [53][54][55] were excluded because of their study design but are worth discussing.
Khemka et al. [54] described four patients with a transtibial amputation who underwent a total knee replacement combined with an osseointegrated implant. Khemka et al. [53] described three patients with a transfemoral amputation who underwent a total hip replacement combined with an osseointegrated implant. Both studies by Khemka et al. used measurement instruments included in this review, namely the Q-TFA to assess condition-specific QoL, the SF-36 (physical and mental component summary) for general QoL and the 6-Minute Walk Test and Timed Up & Go to assess walking ability. Furthermore, they introduced two additional measurement instruments: K-levels to assess mobility level and a dual-axis accelerometer to assess physical activity level in daily life. It is noteworthy that the Q-TFA was used differently from the studies included in this review. Khemka et al. used a summary score (0-100) of all four subscales of the Q-TFA to assess condition-specific QoL instead of only the problem and global scores. Both studies reported an improvement in all outcome measures at follow-up relative to before surgery, with the exception of one patient with a transtibial amputation who remained at the same K-level as before surgery and three patients with a transtibial amputation who had a stable mental component summary (SF-36). Follow-up periods ranged from 12 to 30 months. The study of Schalk et al. [55] included one patient with a bilateral transfemoral amputation with an eight-month follow-up after bilateral press-fit bone-anchored prosthesis surgery. In this study, two patient-reported measurement instruments, the Lower Extremity Functional Scale (LEFS) and the life habits questionnaire (LIFE-H), were used to assess activity and participation level, respectively. The activity level of this patient increased after bone-anchored prosthesis surgery relative to baseline (two years before surgery).
Participation level increased in the majority of categories, such as recreation, community life, mobility, housing and fitness, but decreased in the categories work and relationships. The increased activity level is in line with the results of our review. Within participation level, work is the only category comparable with our review, and it showed a different trend.
The systematic review of Van Eck et al. [37] had one research question, on functional outcomes, that overlapped with this review; however, they searched fewer databases and reported their results in less detail and without a clear structure such as the International Classification of Functioning, Disability and Health (ICF). In addition, Van Eck et al. reported the complications and costs of bone-anchored prostheses. As a result, Van Eck et al. included six more studies than we did; however, three relevant studies [34,50,51] concerning functional outcomes of bone-anchored prostheses that are part of our review are missing from theirs. For the overlapping studies, [11,17,19,49] Van Eck et al. concluded, as we did in our critical appraisal, that the methodological quality of the individual studies had flaws. Their conclusions concerning the functional outcomes of these studies were similar to ours; however, Van Eck et al. did not report the change over time in the before-after cohort study of Hagberg et al. [17] or the functional outcomes of the cross-sectional study of Hagberg et al. [49] No best-evidence synthesis was performed by Van Eck et al.

Strengths and limitations
It is important to note that very likely all the patients included in this review used a bone-anchored prosthesis after a period of socket prosthesis use. This is definitely the case for the patients enrolled in the longitudinal cohort studies and is very likely for the patients in the cross-sectional studies, as the introductions to the reports of both these studies state that a bone-anchored prosthesis is an alternative option for patients with socket-related problems. We decided not to use the ICF categories to categorize the various aspects of QoL. In this we followed Samuelsson et al., [39] who also classified the SF-36, a utility instrument (e.g., EQ-5D) and the Q-TFA problem score and global score as indicators of QoL rather than assigning them to ICF categories. We found a different utility instrument (SF-6D [50]), which is based on the SF-36; we therefore classified this outcome as an indicator of QoL. It may be more correct to categorize the various aspects of QoL into the ICF categories; however, the ICF linking rules would then have had to be used to interpret the QoL instruments. [56] We decided that this approach was beyond the scope of this review. We followed Samuelsson et al. [39] in treating the Q-TFA prosthetic mobility score as an indicator of activity level. We decided to classify the Q-TFA prosthetic use score as an indicator of function level because it assesses prosthesis wearing time rather than use in specific activities.
A point for discussion regarding the critical appraisal tool used is its appraisal of study design. [41] In our opinion a longitudinal cohort (before-and-after design) is a good and ethical way to compare bone-anchored prostheses and socket prostheses in clinical practice; however, the EPHPP tool assigns such a design low ratings for handling of confounding variables and for blinding, which means that overall methodological quality is rated "weak". The EPHPP tool also tends to assign cross-sectional designs lower ratings than longitudinal designs. [57] A consequence of the research designs used was that the differences found between bone-anchored prostheses and socket prostheses were rated as "limited evidence" in the best-evidence synthesis. In their review Sanderson et al. [58] found that there was no obvious candidate tool for assessing the quality of observational epidemiological studies; this highlights the difficulty of evaluating the methodological quality of observational studies. Were bone-anchored prostheses to be used as a primary prosthesis, it would be possible to use quasi-experimental designs or propensity score-matched cohorts, which are more powerful methods of investigating the differences between bone-anchored prostheses and socket prostheses.
The clinimetric properties of some of the instruments used, for example the goniometer [49] and physiological cost index, [50,59] were poor or untested in patients with a lower extremity amputation. The most frequently used instrument was the Q-TFA. This instrument was developed for use in patients with a lower extremity amputation, but in our opinion the problem subscale score is not appropriate for bone-anchored prosthesis users because some questions relate specifically to socket prostheses.
In four [11,17,19,50] of the five cohort studies, the results for several outcome indicators were based on asymmetrical pre- and post-surgery samples. Because the differences in participant numbers were small, we consider it unlikely that this biased the results. The study of Hagberg et al. [50] may have suffered from selection bias. This study used the same sample as Branemark et al., [19] but without the patients with a bilateral transfemoral amputation (n = 6) and the patients who were lost to follow-up (n = 6). Given the very small differences between these two studies [19,50] with respect to SF-36 and Q-TFA scores, we consider it unlikely that this biased our findings.
A number of factors decreased the generalisability of the findings of this review. Firstly, socket-related problems are the main reason for receipt of a bone-anchored prosthesis, so it is possible that the socket prosthesis patients in the cohort studies were not representative of the general population of socket prosthesis users. It is possible that the socket-related problems experienced by these patients meant that their outcomes at baseline (before bone-anchored prosthesis surgery) were inferior to those of socket prosthesis users in general, which may have resulted in an overestimation of the influence of bone-anchored prostheses on the outcomes discussed in our review, especially concerning QoL level, prosthesis wearing time, mobility level and walking ability. Secondly, only two groups of researchers from two different countries (Sweden and the Netherlands) were represented in the studies included in this review, although bone-anchored prosthesis surgery is performed in several countries (e.g., Sweden, Germany, Australia and the Netherlands). However, the studies of Khemka et al. [53,54] and Schalk et al. [55] discussed above came from additional independent groups of researchers in the Netherlands and Australia. This resulted in a broader perspective on the measurement instruments used in the evaluation of BAP surgery. Condition-specific QoL was assessed with the Q-TFA by three groups of researchers, although it was used in slightly different ways. General QoL was assessed with the SF-36 by two groups of researchers. Walking ability was assessed with the 6-Minute Walk Test and Timed Up & Go by two groups of researchers. Lastly, none of the included studies involved bone-anchored prosthesis users with a transtibial amputation, although bone-anchored prosthesis surgery is also performed in this population. [21,60] Khemka et al. [54] discussed a very specific population of bone-anchored prosthesis users with a transtibial amputation and an additional total knee replacement, but published peer-reviewed studies of bone-anchored prostheses with functional outcomes in the general population with a transtibial amputation are lacking.
A strong point of this review is that it is the first systematic review to provide an overview of the measurement instruments used to evaluate outcomes in terms of QoL, function, activity and participation level in studies comparing bone-anchored prostheses and socket prostheses for patients with a lower extremity amputation. Furthermore, the use of the ICF to structure the measurement instruments used and the outcomes of bone-anchored prosthesis surgery is a strength. The high level of agreement between the two reviewers about ratings of methodological quality is a further strength; there was complete agreement about global EPHPP scores and a high level of agreement about domain ratings. This review has revealed the lack of consensus about choice of instruments for evaluating interventions for patients with a lower extremity amputation. Samuelsson et al. [39] reached a similar conclusion, noting that 14 different measurement instruments were used to evaluate QoL, activity level and participation level across eight studies of the effectiveness of socket prostheses. Deathe et al. [61] identified 17 instruments that had been used to assess the outcome of lower extremity amputation rehabilitation in socket prosthesis patients in terms of function, activity and participation level, and further concluded that there was a lack of good evidence demonstrating their sensitivity. The overlap in instruments considered in these reviews [39,61] and this review is very small, being limited to the SF-36, [39] Q-TFA [39,61] and Timed Up & Go. [61] This shows that there is little agreement about how to evaluate the outcome of rehabilitation from lower extremity amputation, except with respect to QoL indicators. This hinders comparison of studies and reduces the generalisability of findings.

Recommendation for future research
First of all, it is important to develop a standard set of evaluation instruments based on the ICF in order to provide a common method of assessing interventions for patients with a lower extremity amputation. [62] The first steps towards this have already been taken by Kohler et al. [63] In our opinion the core set should cover QoL (condition-specific and general); various aspects of function level, namely pain (e.g., residual limb and phantom limb), range of motion, muscle strength (e.g., hip abductor) and gait quality (e.g., use of compensation strategies in the coronal plane); various aspects of activity level, namely walking ability (e.g., 6-Minute Walk Test and Timed Up & Go), physical activity level in daily life (e.g., accelerometer) and energy costs (e.g., physiological cost index); and indicators of participation level (e.g., return to work). Secondly, there is a need for further research into the clinimetric properties of the instruments currently in use. [61] Moreover, new instruments should be developed specifically for use in bone-anchored prosthesis users, covering aspects of outcome such as stoma-related problems, residual limb pain as a result of reactivating muscles, and sensation during terminal impact in the swing phase of gait as a result of osseoperception. There is also a need for consensus about the time points at which follow-up assessments should be carried out so that future data can be subjected to meta-analysis. We suggest a pre-operative assessment and post-operative follow-ups at one, two, five, 10 and 20 years to capture short- and long-term results. There is also a need for greater understanding of factors which predict improvements in QoL, function and activity level following bone-anchored prosthesis surgery to allow medical professionals and health insurance companies to predict which patients are likely to benefit from the procedure. 
A cost-effectiveness study should be carried out to explore whether the combination of QoL improvement presented in this review and post-surgery costs presented by Haggstrom et al. [25] makes bone-anchored prostheses a more cost-effective intervention than socket prostheses for patients with a transfemoral amputation. The numbers lost to follow-up in the studies included in this review were low, despite the fact that in 6-11% of participants amputation was due to causes other than trauma and cancer, for example arterial embolus and infection. This suggests that the eligibility criteria for receipt of a bone-anchored prosthesis could be broadened, but a systematic review of evidence on implant survival and infection rates would be needed to confirm this.
To provide a comprehensive picture of the differences in outcome with bone-anchored prostheses and socket prostheses, it would be necessary for all clinicians and researchers involved with bone-anchored prosthesis surgery and rehabilitation to publish not only their infection and survival data, but also data on outcome in terms of QoL, function, activity and participation level at both short- and long-term follow-ups. It is also important to publish data on transtibial bone-anchored prosthesis users to facilitate evaluation of the value of bone-anchored prosthesis surgery for transtibial socket prosthesis users who suffer from socket-related problems or limited prosthetic use.

Conclusion
This systematic comparison of outcomes with bone-anchored prostheses and socket prostheses revealed that there is consensus about how to evaluate QoL; however, there is little consistency in the instruments used to evaluate function, activity and participation level. We found limited evidence that, in patients with a transfemoral amputation, use of a bone-anchored prosthesis increased condition-specific and general physical QoL compared with use of a socket prosthesis, and was associated with higher function and activity levels.
The findings are of clinical relevance to patients with a lower extremity amputation who are considering bone-anchored prosthesis surgery, professionals in the surgery and rehabilitation field and health insurance companies; they should help these groups to make well-informed choices. The review should also help professionals to choose instruments for the comparative evaluation of bone-anchored prostheses and socket prostheses and facilitate comparison of OPRA, EEFP and future implants. Furthermore, we hope that this review will lead to more research into bone-anchored prostheses use.