A good practice guide for translating and adapting hearing-related questionnaires for different languages and cultures.

Abstract Objectives: To raise awareness and propose a good practice guide for translating and adapting any hearing-related questionnaire to be used for comparisons across populations divided by language or culture, and to encourage investigators to publish detailed steps. Design: From a synthesis of existing guidelines, we propose important considerations for getting started, followed by six early steps: (1) Preparation, (2, 3) Translation steps, (4) Committee Review, (5) Field testing and (6) Reviewing and finalising the translation. Study sample: Not applicable. Results: Across these six steps, 22 different items are specified for creating a questionnaire that promotes equivalence to the original by accounting for any cultural differences. Published examples illustrate how these steps have been implemented and reported, with shared experiences from the authors, members of the International Collegium of Rehabilitative Audiology and TINnitus research NETwork. Conclusions: A checklist of the preferred reporting items is included to help researchers and clinicians make informed choices about conducting or omitting any items. We also recommend using the checklist to document these decisions in any resulting report or publication. Following this step-by-step guide would promote quality assurance in multinational trials and outcome evaluations but, to confirm functional equivalence, large-scale evaluation of psychometric properties should follow.


Introduction
Patient-reported measurement instruments often refer to questionnaires that are used in a clinical setting or a clinical trial, where the responses are reported directly by the patient (or proxy) and concern some personal aspect of health, quality of life or functional status. These can be for diagnostic assessment or for evaluating the clinical efficacy of an intervention.
In the field of audiology, there is broad diversity in questionnaires used for measuring change in hearing-related problems (Granberg et al. 2014; Barker et al. 2015; Hall et al. 2016; Plein et al. 2016). Those same reviews indicate that the majority of questionnaires have been developed in English-speaking countries, namely the United States (US), United Kingdom (UK) and Australia. Therefore, when selecting patient-reported measures for diagnosis, therapeutic evaluation or audit, a first choice for researchers and clinicians in non-English-language-speaking countries is to modify an existing instrument and confirm its psychometric properties. In the rest of the article, we use the term "investigators" to encompass any clinicians and researchers who use, or might wish to use, a modified instrument to target groups other than those intended by the original developer. We use the term "cross-cultural adaptation" to describe the process that considers both language issues (translation) and cultural adaptation (idiom, cultural context and lifestyle) when modifying an existing questionnaire for another geographical setting or for people in a country that has diversity in languages and cultures (Epstein, Santo, and Guillemin 2015).
Cross-cultural adaptation has clear advantages over creating a new instrument (see Beaton et al. 2000; Guillemin, Bombardier, and Beaton 1993; Wild et al. 2009). First, a multiplicity of questionnaire instruments already exists and is ready to use. Second, many of these instruments were developed using a well-established framework. Cross-cultural adaptation is not just about translation, but also about considering the conceptual, item, semantic and operational equivalences between the source- and target-language versions. This is essential for enabling international research, cross-cultural comparison and meta-analysis (Herdman, Fox-Rushby, and Badia 1998). Conceptual equivalence refers to the degree to which a concept underlying the instrument items exists in both cultures with the same meaning (Herdman, Fox-Rushby, and Badia 1998). For example, "family" may be thought of as a nuclear unit in one culture (parents and offspring only) and extended (with other members) in another. Item equivalence refers to the relevance of questionnaire items as measures of a particular domain. For example, an item about mowing the lawn will not be appropriate in cultures where a large proportion of the population do not own a house with a garden. Item equivalence also considers the acceptability of those questions, especially whether they are offensive or taboo. Semantic equivalence is concerned with the sentence structure, colloquialisms or idioms that ensure preservation of meaning. An important aspect of semantic equivalence is to ensure that the level of language used is appropriate to the end users. For accessibility, the translation should use the most widely used language variant for the country. Operational equivalence refers to the similarity of format, instructions and administration.
Poor attention to these matters may compromise the overall functional equivalence, meaning that the instrument does not do "what it is supposed to do equally well in two or more cultures" (Herdman, Fox-Rushby, and Badia 1998, p. 331).
Best practice in cross-cultural adaptation is still a developing field, and numerous guidelines have been published. A systematic review identified 31 guidelines for cross-cultural research describing a similar multi-step process that aimed to promote high-quality modification of existing questionnaires, to improve the efficiency with which they are produced, and to meet regulatory body requirements (Epstein, Santo, and Guillemin 2015). Conclusions from this review highlight that guidelines share many common elements, although there is neither universal consensus among investigators on what is essential and what is optional, nor strong empirical evidence of the superiority of one method over another that might otherwise lead to a "gold standard". Some of the 31 guidelines draw on expert recommendation by influential working parties (e.g. Guillemin, Bombardier, and Beaton 1993; Beaton et al. 2000; Wild et al. 2005; Acquadro et al. 2008; Wild et al. 2009) with authors representing specialised organisations (e.g. www.mapitrust.org/, the patient-centred research company, and www.ispor.org/, the International Society for Pharmacoeconomics and Outcomes Research, ISPOR).
The International Collegium of Rehabilitative Audiology (ICRA) and TINnitus research NETwork (TINNET) represent international opinion leaders, many of whom have been involved in the development of hearing-related questionnaires and subsequent cross-cultural adaptations (e.g. the International Outcome Inventory for Hearing Aids, Cox, Stephens, and Kramer 2002). Discussions over the years indicate that many of our colleagues remain unaware of, or uncertain about, what constitutes a "good" cross-cultural adaptation. Hence, this synthesis of existing recommendations is recognised to have value in promoting aspects of good practice in an original form that is accessible to the Audiology community.
The objectives of this methodological article are two-fold: first, to raise awareness and propose a good practice guide for the early steps of translating and adapting any questionnaire to be used for comparisons across populations divided by language or culture, and second, to encourage publishing those details, perhaps in combination with a psychometric evaluation (not described in this guide). Our recommendations are based on common elements among well-known guidelines that have drawn in the past on expert working party recommendations for clinical trials (namely, Acquadro et al. 2008; Beaton et al. 2000; Guillemin, Bombardier, and Beaton 1993; Wild et al. 2005; Wild et al. 2009), supplemented by our own collective professional expertise. They are particularly applicable to modifications from any language to another culture or language where findings are to be interpreted or compared across countries or cultures.
Selection of the precise method will ultimately depend on the competences, resources and timelines of the project, but our guidelines indicate minimum standards for any application domain, including clinical audits. For every step, we provide a short description of what is involved, with minimum standards where possible, and we illustrate with examples (Supplemental File 1). Tables provide "risk indicators" to support informed decisions about the potential consequences of omitting certain steps when resources or expertise are limited. In addition, a set of editable documents is provided to guide, facilitate and boost good translation practices for future work in this field (Supplemental Files 2-5). These can be modified by end users, as required.

Scenarios requiring cross-cultural adaptation
Being faithful to an original measure means performing not a "word for word" translation but a "world for world" translation. Guillemin, Bombardier, and Beaton (1993) suggest two contextual scenarios in which attention should be paid to cross-cultural adaptation: another country speaking another language, and new immigrants in the source country who cannot speak the source language. The first scenario is the most common in clinical research, especially in multinational trials, when the patient-reported measure needs to be adapted into one or several languages for countries other than the one where it was created. The second scenario (e.g. Spanish-speaking new immigrants arriving in the United States) also requires different language versions of the patient-reported measure to be developed, but this time for use within the same country.
Other scenarios can be envisaged where the same steps are required for in-country usage. One is where there are established subpopulations living in the same country or geographical area but speaking different languages. This is the case in many Asian countries such as India and China, and also in Belgium and Canada, where subpopulations can be defined by cultural practices and linguistic dialects (e.g. Thammaiah et al. 2016). Another situation exists in countries where several languages officially co-exist, with individual linguistic competencies in each official language differing, mainly for historical reasons (e.g. Welsh and English spoken in Wales, United Kingdom, or Catalan and Spanish spoken in Catalonia, Spain).
Scoping out the selection criteria for identifying a source questionnaire
The diversity of existing instruments for measuring the impact of hearing loss and tinnitus means that investigators can choose from a number of different patient-reported measures to assess the construct of interest. Several online resources are available for searching established data systems. A good example is "HealthMeasures" (www.healthmeasures.net/), a bank of measurement instruments for assessing global, physical, mental and social health in people living with a chronic condition [see the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative]. To guide the initial selection process, the following questions can help investigators to decide on the use of one existing questionnaire instrument over another. Questions address the (i) purpose; (ii) hearing-related constructs of interest; (iii) sampling of those hearing-related constructs; (iv) psychometric properties and (v) feasibility.
(i) For what purpose will the patient-reported measure be used? Just because a questionnaire is popular, does not necessarily mean that it is the most appropriate. For example, a questionnaire designed primarily to discriminate between patients (diagnosis or patient stratification) will likely contain items that have different psychometric properties than one designed to evaluate changes over time (monitoring treatment outcome) (Kirshner and Guyatt, 1985). Any questionnaire to be used for evaluating intervention-related effects should have supporting evidence that it is responsive to change. An example of a questionnaire primarily developed for diagnostic use is the Tinnitus Handicap Inventory (THI; Newman, Jacobson, and Spitzer 1996), while the Tinnitus Functional Index was developed with measuring the effectiveness of interventions as its main goal (TFI; Meikle et al. 2012).
(ii) What kind of hearing-related constructs are the focus of interest? The US Food and Drug Administration (FDA) recommends that investigators first determine whether an adequate patient-reported measure exists to assess and measure the construct of interest (US Department of Health and Human Services FDA Center for Drug Evaluation and Research, 2006). An investigator might be interested in assessing and measuring a broad concept such as hearing disability. But disability is related to a number of discrete aspects of hearing problems such as impact on listening to speech in noise, impact on listening enjoyment, impact on social participation, etc. So, an investigator might be justified in assessing and measuring a single-domain concept instead. Few questionnaires in Audiology seem to focus on measuring a single-domain concept. Instead, most have a multidimensional structure with items assessing and measuring different concepts and combining item scores to provide a global composite score. One example is the Speech, Spatial and Qualities of Hearing Scale (SSQ, Gatehouse and Noble, 2004) which assesses three domains of hearing disability: speech communication, spatial hearing and qualities of listening. Investigators would be advised to consider the FDA guidance that a complex multidimensional claim about the clinical efficacy of an intervention cannot be substantiated by questionnaires that do not cover component domain concepts.
(iii) Are those constructs and how they are sampled comparable across source and target countries? Investigators should be reassured that the concept of interest (and any associated subscale domains) is both conceptually relevant to and equivalent across source and target countries where the questionnaire will be used. This scenario is likely to be true for hearing-related conditions, but investigators should remain sensitive to the fact that the actual domains of hearing loss impact can differ across cultures. When considering whether a questionnaire should be chosen or whether any of its items need to be culturally adapted, investigators should first compare the lifestyle and listening environments between the target and the source populations (e.g. degree of urbanisation, population density, common leisure activities, religious activities, household composition, type and level of noise, etc.). If substantial proportions of the source-language questionnaire contain subscales or items which are not relevant or acceptable, then a different source questionnaire should probably be identified at the outset. As a general piece of advice, investigators should choose questionnaires that require few item changes and should avoid making excessive claims about the generalisation of a universal version without first testing it out in the field.
(iv) Have adequate psychometric properties been demonstrated? Properties include construct validity (the extent to which the questionnaire measures what it is supposed to measure) and reliability (the degree of measurement precision). Depending on the purpose of the questionnaire, discriminability (the degree to which the questionnaire is able to discriminate between individuals) or responsiveness (the degree to which the questionnaire is sensitive to treatment-related change) are also important. The source-language questionnaire must demonstrate adequate psychometric properties (see Valderas et al. 2008; Mokkink et al. 2010; Prinsen et al. 2016).
(v) Is the questionnaire feasible to apply? Feasibility is an important part of the selection process. As a minimum, feasibility should consider three essential practical aspects of applying the candidate questionnaire: time to complete, cost and comprehensibility. These three criteria originated from an influential set of criteria for determining the applicability of a measurement instrument in rheumatology set by the Outcome Measures in Rheumatology consensus initiative (Boers et al. 1998).
Time to complete a questionnaire is often indicated by the number of items. For example, the SSQ (Gatehouse and Noble, 2004) contains 49 items, which might render it less practical for a busy clinic or a clinical trial than a shorter instrument. Cost could be the licence fee for copyrighted materials, although authors often offer reduced tariffs for non-commercial (e.g. research) use. Some general examples of fee-based tools include the Hospital Anxiety and Depression Scale (owned by GL Assessment, Brentford, UK) and the Health Utilities Index (Health Utilities Inc., Dundas, Ontario, Canada). Cost could also be the site staffing resources required for questionnaire administration and scoring. For example, for the SSQ an audiologist-administered interview is preferable to self-administration, in order to explain the meaning of the questions and to avoid any misunderstanding by the patient (Gatehouse and Noble, 2004).
Comprehensibility (readability) refers to the degree to which an item is readily understood by most people. The linguistic diversity and literacy level of potential respondents should be considered. This preparation stage must scope out the variation in dialects spoken within the target population or the cultural variations across the target region. To help developers create items that could easily be understood by the general public, Guillemin, Bombardier, and Beaton (1993) recommended using simple linguistic structures, for example by avoiding colloquialisms, sentences containing two different verbs that suggest different actions, and sentences containing two different situational contexts. However, we acknowledge that in a language/culture where colloquialisms are often used (e.g. UK English), completely avoiding them could make an instrument seem dull and lifeless. To check the readability level, investigators might make use of formulas, such as the "Simple Measure of Gobbledygook" (SMOG) readability formula or the Flesch Reading Ease formula (see www.readabilityformulas.com/). However, the use of these formulas is untested in translating and adapting questionnaires. Feasibility might also extend to considerations of potential sensory problems or physical limitations that would affect a respondent's ability to read or respond to the questionnaire.
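To make concrete what the two readability formulas named above compute, the following Python sketch evaluates the published Flesch Reading Ease and SMOG grade equations from pre-computed text counts. It is an illustration only: the function names and the example counts are hypothetical, and counting syllables is language-specific and is deliberately left outside the sketch. Both formulas were calibrated on English text, so applying them to a target-language version would be an assumption rather than a validated procedure.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier (English) text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def smog_grade(polysyllables: int, sentences: int) -> float:
    """SMOG grade: approximate years of education needed to read the text.

    polysyllables is the number of words with three or more syllables.
    """
    if sentences <= 0:
        raise ValueError("sentences must be positive")
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


# Hypothetical counts for a short questionnaire (not from any real instrument).
fre = flesch_reading_ease(words=100, sentences=5, syllables=150)
smog = smog_grade(polysyllables=30, sentences=30)
print(f"Flesch Reading Ease: {fre:.1f}")
print(f"SMOG grade: {smog:.1f}")
```

Taking counts as inputs keeps the arithmetic transparent and leaves the choice of word, sentence and syllable counting rules to the investigator.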
Getting adequate resourcing to support the cross-cultural adaptation process
Achieving a good translation for multinational and multicultural research requires collaborative effort between qualified translators, healthcare professionals with experience of the condition and members of the target population (patients, communication partners, etc.). According to our step-by-step guide, a minimal team for a quality cross-cultural adaptation would involve one Translation Lead to manage the resources, procedural steps and documentation, at least three translators with linguistic competence in the source and target languages (two Forward Translators to create the target-language version, and one Back Translator to recreate a source version from that translated target), a linguistic expert (preferably a professional translator) on the target language, a healthcare professional with specific competence in the source and target languages, and the source-language Questionnaire Developer (if possible). If adequately resourced, a full cross-cultural adaptation process would typically take 4-12 weeks, depending on the difficulty of the materials to be translated, the number of review meetings required, and the number of reconciliations needed to reach full conceptual, item, semantic and operational equivalence.

A step-by-step guide
For each selected instrument, titles, introductory text, instructions for the administrator of the test, instructions for participants, questionnaire items, response scale anchors and scoring instructions are all equally crucial for cross-cultural adaptation. The remainder of the article explains each step in six sequential sections (summarised in Figure 1): (1) Preparation, (2) Translating the source language into the target language (forward translation), (3) Translating the target language back into the source language (back translation), (4) Committee Review, (5) Field testing and (6) Reviewing and finalising the translation. The process of cross-cultural adaptation is time-consuming and resource intensive and so, before embarking on any project, it is strongly advised to identify whether there already exists a translation of the questionnaire in the language and culture where it is going to be used (Wild et al. 2009) (Figure 1, Table 1). As a general rule, the copyright holder of the original questionnaire is also the copyright holder of the respective translated versions. If there is any doubt about whether a target-language version already exists, the copyright holder can usually signpost investigators to any translated versions and associated reports that can serve as a useful starting point. Preferably, the procedural steps of each translated version should be published (see Beaton et al. 2000), but this may not always be the case. For example, numerous translated versions of the Abbreviated Profile of Hearing Aid Benefit are instead archived on the institutional website (Hearing Aid Research Lab). Other translations might be in the hands of the companies that sponsored them or of specialised translation agencies, or might be published in PhD theses or journals. There is at least one searchable database dedicated to clinical outcome assessments and their translations (see ePROVIDE™, https://eprovide.mapi-trust.org/).

Section 1 Preparation
If a same-language version already exists, then it is important to ascertain whether that existing version is adequate and, if not, to identify what cross-cultural adaptation steps have been done, and any limitations. In some fields, copyright owners may offer many translations of their staple measures, and for some or all of these there may be no additional information regarding translation details, and almost always no peer-reviewed publication detailing the cross-cultural adaptation process and psychometric exploration of the translated versions. Translated measures without such information should in general be avoided, or taken as a starting point only. It is worth asking if the source-language Questionnaire Developer can provide a description of the process and a copy of a signed and dated certificate documenting the translation process. There is a possibility that such certification may be requested by the Independent Review Board (ethics committee) or a regulatory body (such as the FDA). Practical guidelines about what should be contained in the certificate of translation are given in Supplemental File 2. It should typically include the credentials of the personnel involved, the steps conducted, the source-language document, the final target-language document and the person or organisation responsible for the final translation. If a certificate does not exist, but the investigators' opinion is that the existing target-language version is of good quality, then it is good practice for the investigator to conduct at least one independent back translation to confirm that the items are equivalent to the original version. If there are any concerns about semantic equivalence, then the existing translation could at least serve as one of the forward translations (see Item 2d). In the context of hearing-related questionnaire translations, the reporting of Item 1a has sometimes been unclear (e.g. Wrzosek et al. 2006; Aksoy, Firat, and Alpar 2007; Müller et al. 2016).
Test developers should respect any copyright law and agreements that exist for the original questionnaire. Under certain circumstances (called "fair use") the cross-cultural adaptation of a copyrighted work may not infringe copyright law. Nevertheless, investigators should carefully consider this matter before starting to translate any work without permission and, whatever the circumstance, it is always wise to make determined endeavours to contact the copyright owner. From our experience, we appreciate that it may not be possible to succeed in making contact, but the steps taken to do so should at least be reported as part of the translation process (e.g. requests sent with an acknowledgement of receipt). Wherever possible, written permission/approval should be obtained from the source-language Questionnaire Developer, from his/her institution or from a publisher (if the questionnaire is published in a book or journal, or by a publishing company), depending on who holds the copyright and on the conditions of use. Sometimes the copyright owner might specify certain expectations or requirements. For example, these could specify the minimal steps to be taken when producing translations, or could even refer to these published guidelines. The copyright owner might also stipulate the role they wish to take in the process, and whether they will charge a fee for doing so. Some request to be actively involved in certain key steps (e.g. CORE System Trust, www.coreims.co.uk/), while others do not. Even in the case of copyright-free instruments, it is good practice to seek permission for study-specific use from the source-language Questionnaire Developer. It is not uncommon for the source-language Questionnaire Developer to request at least to be informed about the final version of the translation and to provide approval before its use in research (e.g. International Outcome Inventory for Hearing Aids, http://icra-audiology.org/).

DEVELOPER TO BE INVOLVED
There are distinct advantages to inviting the source-language Questionnaire Developer to be involved in the cross-cultural adaptation process or for him/her to nominate a competent delegate. The source-language Questionnaire Developer can provide the most up-to-date information and materials to the team at the start of the project. This can include the latest existing version or formats of the questionnaire, manuals, training materials or any other useful documentation that would help in describing the concepts that are assessed and measured. Involvement is particularly beneficial at the Committee Review (Section 4) (e.g. Guillemin, Bombardier, and Beaton 1993; Beaton et al. 2000; Acquadro et al. 2008). The source-language Questionnaire Developer can share his/her accumulated knowledge, and can prompt the team to consider dialect variations, literacy levels, gender and culture issues, etc.

FOLLOW, CUSTOMISED TO THE END USERS
A set of self-reflective questions about (i) literacy, (ii) population characteristics and (iii) administration can help define key objectives.
(i) Literacy levels that differ from the original source language population or diversity within the target population. In some countries, populations may have a range of educational opportunities and literacy levels may vary greatly. If this is the case, one objective might be to use purposive sampling in recruiting participants to "pre-test" the translated version and to explicitly ask each of them to rephrase every item in their own way so that the investigators can be certain whether an item has been understood or not (e.g. Weinstein et al. 2015).
(ii) Other characteristics of the target population. Important patient characteristics, such as age and physical disability, can influence the choice of wording to handle stages of language development, format of administration to handle accessibility etc.
(iii) Administration elements. There are many different ways in which a questionnaire can be administered. During development, decisions are made about questionnaire format (written or video), instructions (for two adjacent response boxes "Pick which best. . ." or "Pick which best. . . Do not tick two boxes"), mode of administration (paper-pencil, computer, interaction with intelligent personal assistant etc.) and measurement methods (Visual Analogue Scale or Likert scale). However, a translation can only achieve operational equivalence when any changes in these elements do not affect the results (Herdman, Fox-Rushby, and Badia 1998).
In-country residents specifically refers to residents of the target country.
It is good practice to build a unified document describing all steps taken and decisions made. This information is of value for keeping track in case of any future modifications and if external reviews or audits are performed. Supplemental File 3 is a template "reconciliation report" (as an Excel spreadsheet) that can be modified for use as such a unified document (see also Antunes et al. 2012).
The main goal is to produce a final product that preserves the same meaning, is understood by the target population, and adequately reflects any nuances of the source or target language. There are different positions on the recruitment criteria and minimum number of Forward Translators (reviewed by Acquadro et al. 2008) (Figure 1, Table 2). For example, some guidelines recommend as few as two independent Forward Translators, but insist they are bilingual, with high proficiency in both languages. Other guidelines recommend more translators, but have less stringent restrictions on their fluency in the source language. Common to most guidelines is that the target language is the first language for all Forward Translators and that they should have lived experience of the target country/culture. Ideally, for the minimum number of two translators, one of them should be a professional translator because they have a certified linguistic competency, and one should be a healthcare professional who has experience of the condition of interest. An advantage of this mix in skill sets is that individual biases are reduced, thus promoting a translation which is fit for purpose. The two translations can be compared and any discrepancies can be identified for subsequent discussion and resolution (see Item 2e).
The goal of the translation is to maintain the same interpretation of meaning across cultures, and so this should be clearly explained to the Forward Translators in a single briefing session that includes a description of the health concepts (see Item 1f) and an explanation of how to use these definitions to achieve item-by-item semantic equivalence (see also Item 2c) (Beaton et al. 2000).

ACCEPTABILITY OF WORDING) AND PREFERRED TERMINOLOGY
An accurate translation is not a literal translation; instead, the instructions to the translators should be guided by the key objectives set out in Item 1d. Conceptual, item, semantic and operational equivalences have been discussed already. There may also be a preferred condition-specific terminology that reflects common usage by doctors and patients in the target country but which varies across cultures (such as the term for "tinnitus"). These words and phrases could constitute a glossary of terms that can be kept updated for future reference (Supplemental File 4). It is good practice to provide the same instructions and information to all translators. At this stage, the translators should also be instructed about the priorities for conducting the translation: to maintain conceptual, item and semantic equivalences, and to promote everyday non-technical language. It can also be useful to instruct the translators to rate the difficulty of translation for each unit of the instrument because this information can be referred to when discrepancies are observed, discussed and reconciled. Useful preparatory activities could also include (i) instructing the translators on the condition and the symptoms or everyday complaints that the instrument aims to measure and (ii) providing supplementary materials written in the local language, such as patient leaflets published by health, charity or scientific organisations, especially if that material is bilingual. Supplemental File 1 gives two examples of reporting the instructions given to translators and the adaptations made as a result.
If they do not contain culturally sensitive information, instructions for scoring and for interpreting the scores of the questionnaire could be translated by, at minimum, one Forward Translator and one Back Translator. Response options should be given due consideration. For terminology relating to response options in a Likert scale (such as Not at all, Only a little, A moderate amount, Quite a lot, Very much indeed), terms may already be in common usage in other target-language questionnaires, and these could prove to be a good starting point. If, however, such information is not available, the Translation Lead should carefully assess whether the translated response options have the same interpretation as those used in the original source-language questionnaire. Of particular note, respondents in some cultures are less forthcoming about selecting response options at the extreme ends of a scale, which narrows the response range available for statistical analysis. Furthermore, response options originally intended to be equidistant may not maintain equidistance in a literal translation.

ITEM 2D. WORKING INDEPENDENTLY, EACH TRANSLATOR PRODUCES A WRITTEN RECORD OF FORWARD TRANSLATION
Each translator should work independently to the brief provided in Items 2b/c to create a translation from source to target language, unit-by-unit. An example of reporting independent working can be found in Supplemental File 1. Moreover, relevant parts of the reconciliation report can be completed separately by each translator (see Supplemental File 3, column headings ''Forward TR1 NEW LANGUAGE'' and ''Forward TR2 NEW LANGUAGE''). Although it is optional for each Forward Translator to provide comments, such information can help to highlight any particular items that were difficult to translate or to document the decision taken for future reference. At this stage, Forward Translators could rate the degree of difficulty in translating each unit of the instrument using a Likert scale to inform later review and reconciliation.
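As an illustration only, the two independent forward-translation records and their difficulty ratings could be held in a simple structure so that units needing attention at reconciliation are easy to list. The sketch below is not part of the published templates: the field names, the 1 to 5 difficulty scale and the threshold of 4 are all assumptions made for the example, and the translated texts are placeholders.

```python
from dataclasses import dataclass


@dataclass
class UnitRecord:
    """One translation unit (title, instruction, item or response option)."""
    unit_id: str          # hypothetical label, e.g. "item_1"
    source_text: str      # wording in the source language
    forward_tr1: str      # Forward Translator 1, target language
    forward_tr2: str      # Forward Translator 2, target language
    difficulty_tr1: int   # assumed scale: 1 (easy) to 5 (very difficult)
    difficulty_tr2: int


def needs_discussion(rec: UnitRecord, difficulty_threshold: int = 4) -> bool:
    """Flag a unit if the two translations differ or either translator found it hard."""
    differs = rec.forward_tr1.strip().lower() != rec.forward_tr2.strip().lower()
    hard = max(rec.difficulty_tr1, rec.difficulty_tr2) >= difficulty_threshold
    return differs or hard


def units_to_reconcile(records, difficulty_threshold: int = 4):
    """Return the ids of units to raise at reconciliation, in document order."""
    return [r.unit_id for r in records if needs_discussion(r, difficulty_threshold)]


# Placeholder data: "wording A"/"wording B" stand in for real target-language text.
example_records = [
    UnitRecord("item_1", "source wording 1", "wording A", "wording A", 1, 2),
    UnitRecord("item_2", "source wording 2", "wording A", "wording B", 1, 1),
    UnitRecord("item_3", "source wording 3", "wording C", "wording c ", 5, 1),
]
flagged = units_to_reconcile(example_records)
```

A plain string comparison only detects that two translations differ; judging whether the difference matters remains the task of the Reviewer and the translators.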

ITEM 2E. RECONCILE THE FORWARD TRANSLATIONS TO CREATE A SINGLE FORWARD TRANSLATION; PRODUCING ONE WRITTEN RECORD WITH COMMENTS, WHEN NEEDED
The aim of this step is to harmonise the forward translations. If there are any discrepancies between the two Forward Translators, then these need to be resolved by a Reviewer who makes an independent decision, in consultation with the translators when needed. When a literal translation of the word/phrase is not possible, attempts should be made to find the closest possible meaning, using the concept definition as a guide. If a consensus is not reached, then the Translation Lead could decide the final version based on input from the source-language Questionnaire Developer, or by consulting a new translator, or by consulting other people who do not necessarily speak the source language but who can nevertheless comment on any differences between the forward translations. The Translation Lead should not suggest new translation options because this would compromise the process. Upon review of each translation, the person in charge of the reconciliation should highlight, unit-by-unit in the forward translation, each section of the text that is discrepant.

Section 3 Translating the target language back into the source language (back translation)
Back translation is a commonly used quality assessment tool, but it is not without controversy and there is no compelling evidence that this step enhances the target-language version (Epstein, Santo, and Guillemin 2015) (Figure 1, Table 3). Committee review and field testing may be sufficient, if done well (Epstein, Santo, and Guillemin 2015; Colina et al. 2017).
The main goal is the same as for the forward translations: to produce a translation that reflects the same level of language as the original. One of the minimal criteria for recruitment into the role of Back Translator should be that the person is a bilingual speaker with lived experience of the target culture, even if the translator is not currently an in-country resident. Ideally this person should be a professional translator because they have the appropriate linguistic expertise.
Some of the guidelines recommend two Back Translators (Acquadro et al. 2008) but not all do, and two is not so common in the commercial sector. This is why we have specified one as a minimum standard.

ITEM 3B. WORKING INDEPENDENTLY, EACH TRANSLATOR PRODUCES A WRITTEN RECORD OF BACK TRANSLATION
The reconciled version in the target language should be back translated at least once into the source language, with the translator working to the brief provided in Item 2b. Again, the translation should be done on a unit-by-unit basis, for all parts of the instrument. The relevant part of our template should be completed in a blinded way by the translator, who is given only the reconciled target-language version (see Supplemental File 3, column heading ''Back Translation''). Although it is optional for the translator to provide comments, such information can help to highlight any particular items that were difficult to translate or to document the decision taken for future reference. See Supplemental File 1 for an example of describing the translation brief.

LANGUAGE
The person in charge of this step should highlight, unit-by-unit, each section of the back translation text that is discrepant to the source.
To help with the Committee Review, any discrepancies can be classified using an A-D scheme (e.g. Badia and Alonso 1994; Sanchez-Moreno et al. 2008). According to this scheme, A = items which show perfect semantic equivalence and good literal and semantic parallels between the back translated and source version; B = items which show satisfactory semantic equivalence, but have used one or two different words; C = items which preserve the meaning of the original, but without a satisfactory semantic equivalence; and D = items which have no agreement. An example of category C is ''. . . you had much more energy than usual?'' versus ''. . . you had more energy than usual?'' (Sanchez-Moreno et al. 2008). Items classified as ''D'' are certainly ones requiring further action. Supplemental File 1 gives a reporting example of how the back translation was reviewed.

This step involves the appointment of an expert, multi-disciplinary Committee to compare and confirm the congruence between the forward and back translations against the source-language questionnaire and to resolve any discrepancies (Figure 1, Table 4). The Committee should preferably include members with local language expertise, in-depth knowledge of the field, and expertise with the research methodology and translation process. Hence, it is advisable to have a linguistic expert, a healthcare professional with knowledge of the content area (preferably independent from the project team to avoid conflict of interests) and all Forward and Back Translators. The Translation Lead should be in close contact with the Committee during this time. The source-language Questionnaire Developer, if proficient in the target language, can also be invited to participate in the Committee Review or at least (s)he can be requested to help in clarifying differences observed (if any arise) between the source and the target versions (e.g. Wild et al. 2005).
The task of the Committee is to examine whether all the translation units are accurate and whether they map to the original intent of the source-language Questionnaire Developer. It is easy to see how the written records (during all the substages) are crucial to making this Committee Review meeting efficient (Thammaiah et al. 2016). The endpoint is to reach consensus on the first final version of the questionnaire in the target language (Acquadro et al. 2008) before going to subsequent steps. This process can sometimes highlight a problem in the source questionnaire which was not previously acknowledged, such as identifying an item that is simply not culturally transferable. This is a good case where having the source-language Questionnaire Developer on-side and supportive can help to resolve the issue. To go further, any change or edit introduced to the target-language version at this stage of harmonisation needs to be back translated again, with final confirmation of conceptual, item, semantic and operational equivalences. A variety of processes have been used to achieve harmonisation (Wild et al. 2005) and so we provide three different illustrative reporting examples in Supplemental File 1.
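The A-D classification of back-translation discrepancies described earlier lends itself to a simple tally that can be tabled at the Committee Review. The following is a minimal sketch under stated assumptions: the unit labels are hypothetical, and listing category C alongside category D for discussion is our illustrative choice (the scheme itself only states that ''D'' items certainly require further action).

```python
from collections import Counter

# Category definitions paraphrased from the A-D scheme described in the text.
CATEGORIES = {
    "A": "perfect semantic equivalence",
    "B": "satisfactory semantic equivalence, one or two different words",
    "C": "meaning preserved, but semantic equivalence not satisfactory",
    "D": "no agreement",
}


def summarise_ratings(ratings):
    """Summarise A-D ratings given as a dict of unit id -> category letter.

    Returns (counts per category, unit ids to discuss), with D units listed
    before C units. Including C units is an assumption of this sketch.
    """
    unknown = {unit: cat for unit, cat in ratings.items() if cat not in CATEGORIES}
    if unknown:
        raise ValueError(f"Unrecognised categories: {unknown}")
    counts = Counter(ratings.values())
    to_discuss = sorted(
        (unit for unit, cat in ratings.items() if cat in ("C", "D")),
        key=lambda unit: (ratings[unit] != "D", unit),  # D first, then by id
    )
    return counts, to_discuss


# Hypothetical ratings for five units of an instrument.
example_ratings = {
    "item_1": "A",
    "item_2": "D",
    "item_3": "C",
    "item_4": "B",
    "item_5": "D",
}
counts, to_discuss = summarise_ratings(example_ratings)
```

Such a summary does not replace the written record of each discrepancy; it only helps the Committee to order its agenda.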

Section 5 Field Testing
Examining feasibility is the last stage of the cross-cultural adaptation process before producing the final version of the translated questionnaire (Figure 1, Table 5). Field testing can also be used to investigate any translation alternatives where no consensus was found during the Committee review.

ITEM 5A. RECRUIT A SMALL SAMPLE OF PATIENTS FROM THE TARGET POPULATION
A purposive sampling method should be used for recruitment so that there is adequate representation from across the target population in terms of the severity of the condition of interest (e.g. hearing loss), age, gender, education, regional dialect, socio-economic status and any relevant cultural factors. There is no consensus on the desired sample size in the literature, and the sample size generally varies between 5 and 50. Acquadro et al. (2008) reviewed 17 such guidelines and found that some do not even specify the desired sample size. We recommend that at least eight participants contribute to the pre-testing of the translated version to ensure the original instructions, items and scoring materials are clearly expressed, but where there is regional variation the sample size might need to be as high as n = 20. Whatever the overall sample size, each participating group of interviewees should ideally comprise five to eight people (Antunes et al. 2012). Groups should be conducted independently and therefore at different times. For pilot testing (see Item 5b), if statistical analysis is to be conducted then a larger sample size will be required. We suggest at least n = 50 if internal consistency is to be explored using the average correlation between the questionnaire items (Cronbach's alpha) (Terwee et al. 2007).
Table 4, item 4a: Appoint a multi-disciplinary committee that includes at least one bilingual member (preferably a linguist) whose first language is the target language and who are in-country residents with experience of the target culture, and one healthcare professional; To provide an additional quality control step; Cross-cultural equivalence may not be achieved (or is presumed when it may not be possible).
For (i) cognitive debriefing, methods include a face-to-face semi-structured interview or focus group. The aim is not to elicit numerical scores, but to explore how the participants understand the questions. Patients are often asked to complete the instrument while ''thinking aloud'' and to explain the reason for each of their responses, following which specific questions can be asked by the interviewer (York Health Economics Consortium, 2016). Questions generally cover whether there are any difficult words or phrases, how they would explain the item in their own words, and whether they would suggest any changes to the wording to make it clearer or more acceptable. The second question (i.e. asking participants to paraphrase the item in their own words) is considered the most important part of the cognitive debriefing process because it provides insight into how the interviewees actually understand the items. It is also worth returning to ask about the titles, introductory text, instructions and response scale anchors. The answers will provide clues about how comparable the translation is to the source and may expose issues of comprehension with particular groups (e.g. by dialect or years of education, etc.). Unless the participant clearly finds this difficult or impossible, it can be useful to ask participants not only how they understand the question but also to think of people they know who have had the target problems, and people who have not, and to consider whether those people would (a) answer differently and (b) perhaps even read the item differently.
For (ii) pilot testing, the method tends to be questionnaire completion, to explore how users interact with and complete the instrument. It provides an opportunity to investigate the wording of the instructions/items/response scale, its format, size and length, and to understand the time necessary for the session in the target population. Investigators can also add questions about difficulty in understanding the items or response options by including supplementary Likert scales. Pilot testing is important before proceeding to a wider evaluation of its psychometric properties or before using the translation in real clinical research.
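Where pilot-test data are used to explore internal consistency, Cronbach's alpha can be computed directly from the item scores. The sketch below uses the standard variance-based definition, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. It is offered as an illustration rather than as the analysis used in any cited study, and the function name and data layout (one row of item scores per participant, no missing values) are assumptions.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per participant).

    Assumes complete data: every row has the same number of items and no
    missing values. Uses sample variances (n - 1 denominator) throughout.
    """
    if len(responses) < 2:
        raise ValueError("At least two participants are required")
    k = len(responses[0])
    if k < 2:
        raise ValueError("At least two items are required")
    if any(len(row) != k for row in responses):
        raise ValueError("All rows must contain the same number of items")

    # Variance of each item across participants.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("Total scores do not vary; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Tiny made-up data set, far smaller than the n = 50 suggested in the text,
# used only to show the call.
toy_scores = [[1, 2], [2, 1], [3, 3]]
toy_alpha = cronbach_alpha(toy_scores)
```

With the small groups used for cognitive debriefing the estimate would be very imprecise, which is why a larger sample is suggested when statistical analysis is planned.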

Section 6 Reviewing and finalising the translation

ITEM 6A. REVIEW THE RESULTS OF THE FIELD TESTING AND FINALISE
during field testing (Figure 1, Table 6). Any problems should be taken seriously, unless recruitment included some participants who differ from the target population (e.g. with unusually low literacy levels). Reporting examples of how the translation was finalised are given in Supplemental File 1. There appears to be no consensus about what criteria should be used for deciding whether or not to implement changes at this stage, and who should be responsible for approving those changes. It probably depends on the size and representativeness of the field sample and the coverage of key subpopulations (see Aquino et al. 2011). Major changes should be made only when absolutely necessary, and should then be back translated to confirm semantic equivalence and referred back to the multidisciplinary Committee for review. An example is where items were found to be not relevant to the target culture (see Item 5b example by Weinstein et al. 2015). Our advice is to report these problematic items, so future investigators can be aware of them.

The source-language Questionnaire Developer should receive the final translation, the certification of translation, the reconciliation reports and the concept definitions created for this purpose. A final approval and an acknowledgement of receipt are always desirable, even if not required beforehand. These materials are all useful for the future. Again, we are not aware of an Audiology example in the published literature. However, instructions on how to proceed with final translations are usually included under instructions to investigators for specific questionnaires. Two online examples are EuroQoL and HealthMeasures (see Supplemental File 1).

ITEM 6D. FINALISE AND ARCHIVE A REPORT
A written report creates a permanent record of the procedures followed, the information collected, the interim and final versions of the translation, and the decisions made at each stage. Supplemental File 4 contains a checklist of all recommended archival documentation relating to the process of cross-cultural adaptation of patient-reported questionnaire measures. Many of these documents may not be published, but should be available on request. Wherever possible, it is also advisable to publish a summary of the translation process in a peer-reviewed journal for future reference. Many of the translations of questionnaires in Audiology are published as peer-reviewed journal articles, but often the details of all the different steps are not reported sufficiently well for the reader to follow exactly what was done. The items listed here (Tables 2-7) define preferred reporting standards. Supplemental File 5 itemises the preferred reporting items. We recommend that a completed list is submitted along with the manuscript to help journal reviewers locate which page of the manuscript contains a description of each individual step in the process. Similar lists exist in other areas [see the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2009 checklist for systematic reviews and meta-analyses].
Besides the publication of the process, the questionnaire instrument itself can be included in the journal article. We draw attention to the possibility that this might transfer copyright ownership of that instrument to the publisher depending on the copyright agreement, with a potential unintended consequence of restricting dissemination or future modification or translation of that version of the instrument. Alternatively, Open Access publishing with a Creative Commons public licence promotes the access and re-use of any materials included in that journal article. ''Open Access'' is a term indicating that the relevant work has been licensed by the copyright owner for use in some of the ways that otherwise might require their specific permission. We strongly encourage investigators to publish the translated instrument (with all titles, introductory text, instructions for the administrator of the test, instructions for participants, response scale anchors and scoring instructions) as an integral part of the journal article under a Creative Commons licence with a ''No Derivatives'' condition (i.e. CC BY-ND or CC BY-NC-ND). ''ND'' prevents any modified versions being distributed, while ''NC'' prohibits commercial usage. In contrast, publishing without an ''ND'' condition (i.e. CC BY or CC BY-NC) enables anyone to modify and distribute the questionnaire. We note that publishing without ND appears to be a common practice in those Open Access articles cited in our review (e.g. Caporali et al. 2016; Rogers et al. 2016; Wrzosek et al. 2016).
Putting the translated instrument out in the public domain, such as on a website, yields the same loss of control over usage even if the version is watermarked with ''do not copy'' (e.g. http://harlmemphis.org//index.php?cID=130).
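The licence choices discussed above can be summarised as a small lookup of what each of the four Creative Commons licences permits. This is a simplified sketch for orientation, not legal advice: the dictionary keys and the function name are our own, and only the two conditions discussed in the text (ND and NC) are represented.

```python
# Simplified view of the four licences named in the text.
# "modified_versions": may others distribute adapted versions?
# "commercial_use": may others use the material commercially?
LICENCE_TERMS = {
    "CC BY": {"modified_versions": True, "commercial_use": True},
    "CC BY-NC": {"modified_versions": True, "commercial_use": False},
    "CC BY-ND": {"modified_versions": False, "commercial_use": True},
    "CC BY-NC-ND": {"modified_versions": False, "commercial_use": False},
}


def protects_wording(licence: str) -> bool:
    """True if the licence carries the ND condition, so the published
    wording of the instrument cannot be redistributed in modified form."""
    return not LICENCE_TERMS[licence]["modified_versions"]


# Licences that match the recommendation to publish with an ND condition.
recommended = [name for name in LICENCE_TERMS if protects_wording(name)]
```

All four licences also require attribution (the ''BY'' element), which is why it is not listed as a separate column here.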

Conclusions
We recognise that hearing healthcare professionals need to play a central role in good translation and adaptation of hearing-related questionnaires. Consulting with hearing professionals and members of the target populations helps to ensure that the questionnaire addresses the needs of the target population. This guideline provides step-by-step recommendations for that process. However, these are just the first essential steps because certainty about functional equivalence requires further quantitative steps to examine the psychometric soundness of the translated questionnaire instrument (e.g. Regnault and Herdman 2015). To some readers these standards may seem laborious to follow, but they reflect best practice and would increase expectations that the translated questionnaire instrument performs in the same way as the original. Documenting the process is equally important and we encourage investigators to publish the cross-cultural adaptation. Supplemental File 4 is the checklist of all the preferred reporting items described in this article. We recommend that investigators who are following this step-by-step guide should submit a completed version of the checklist along with the article, noting the page number corresponding to the description of each step (much like the recommended use of the PRISMA 2009 or Consolidated Standards of Reporting Trials 2010 checklists). Whenever there are difficulties complying with any of the recommendations, such as difficulties in finding bilingual speakers for the translation processes, these should be fully explained and their potential risks carefully considered. We hope that this checklist of preferred reporting items will be adopted widely by Otology and Audiology journal editors and researchers alike.

Note
1. OMERACT use the term ''interpretability'' but we have chosen not to use it here because ''interpretability'' has another common meaning which refers to the way in which professionals might interpret the results, through instructions or training.

Disclosure statement
The views expressed are those of the authors and not necessarily those of the societies or institutions they represent.