Measuring mental health burden in humanitarian settings: a critical review of assessment tools

ABSTRACT
Background: The effects of disasters and conflicts are widespread and heavily studied. While attention to disasters’ impacts on mental health is growing, mental health effects are not well understood due to inconsistencies in measurement.
Objective: The purpose of this study is to review mental health assessment tools and their use in populations affected by disasters and conflicts.
Method: Tools that assess posttraumatic stress disorder, depression, substance use disorder, and general mental health were examined. This review began with a search for assessment tools in PubMed, PsycINFO, and Google Scholar. Next, validation studies for the tools were obtained through snowball sampling. A final search was conducted for scientific studies using the selected tools in humanitarian settings to collect the data for analysis. The benefits and limitations described for each tool were compiled into a complete table.
Results: Twelve assessment tools were included, with 86 studies using them. The primary findings indicate that half of the studies used the Impact of Events Scale-Revised. The most common limitation discussed is that self-report tools inaccurately estimate the prevalence of mental health problems. This inaccuracy is further exacerbated by a lack of cultural appropriateness of the tools, as many are developed for Western contexts.
Conclusion: It is recommended that researchers and humanitarian workers reflect on the effectiveness of the mental health assessment tool they use to accurately represent the populations under study in emergency settings. In addition, mental health assessment should be coupled with action.


Background
Disasters and conflicts create humanitarian crises that occur globally and affect millions of people yearly. A humanitarian setting is a setting in which a natural or manmade disaster or civil conflict occurs that exceeds local coping capacity and requires external assistance or humanitarian action [1]. In 2018, 315 natural and technological disasters occurred [2]. The majority are natural, and most disasters from 1998 to 2017 were extreme weather events, such as floods, droughts, and heat waves [3]. Other natural and technological disasters include earthquakes, hurricanes, and large-scale accidents. Interest in their mental health effects has grown due to the potential for trauma. Synthesized research about disaster mental health shows that posttraumatic stress disorder (PTSD), major depressive disorder, and substance use disorder are common outcomes [4]. Other outcomes of interest include generalized anxiety disorder (GAD), prolonged grief, panic disorders, and phobias; however, these outcomes are less frequently studied than PTSD, depression, and substance use [4]. In addition to natural and technological disasters, conflicts and related displacement greatly contribute to the global population in need of humanitarian assistance. Mental health research in humanitarian settings is heavily focused on PTSD and indicates that the prevalence of PTSD and depression in these settings is much higher than in the general population [5].
Though the morbidity and mortality of conflict-affected populations are decreasing due to effective disease control programs, these populations continue to face safety concerns with the prolonged nature of contemporary conflicts [6]. Furthermore, conflict research shows that civilians who experience armed conflict, especially women and children, are at a high risk for persisting mental health effects [7]. Displacement contributes to stress and is associated with loss of a loved one, destruction of the home, and limited access to stable resources [7]. The damage to infrastructure that conflicts bring to communities removes access to mental health resources and exacerbates individuals' stress [7].
Great variability exists among the methods of evaluating mental health in humanitarian settings [4]. The lack of standardization in assessment approaches hinders researchers' and humanitarian organizations' ability to ascertain the true impact of disasters on mental health. For example, a systematic review of literature up to November 2013 on the mental health outcomes of Iraqi refugees in Western countries shows the prevalence of PTSD and depression ranging from 8% to 37% and 28% to 75%, respectively [8]. In-depth diagnostic interviews may be the gold standard for such measures, but research in humanitarian settings warrants more brief and easy-to-use tools that measure only symptoms and thus do not require the presence of a clinician. In addition, rapid screening tools can be useful in decision-making and program planning due to their ability to obtain the burden of mental distress in a time-limited setting. The purpose of this critical review is to evaluate the use of different tools for studying or assessing the mental health effects of disasters and conflicts. The outcomes of interest are PTSD, depression, anxiety, substance use disorder, and general mental health and were chosen due to their high prevalence in disaster and conflict research.

Methods
Three searches were conducted for this review: the first search collected commonly used mental health assessment tools, the second collected their validation studies, and the third collected studies that used these tools in disaster or conflict mental health research.

Assessment tool search
A list of mental health assessment tools was compiled using Google Scholar, PsycINFO, and PubMed search engines. Each tool had to be individual, brief, developed in or after 1990, and non-diagnostic to be included in the study. A combination of the following MeSH keywords was used for this search: 'symptom assessment,' 'standards,' 'emergencies,' 'disasters,' 'humanitarian assistance,' 'mental health,' 'posttraumatic stress disorder,' 'depression,' 'substance use disorders.' We employed snowball sampling to obtain comprehensive information about the tools and ascertain which tools are commonly used, since we had limited initial information regarding the properties of commonly used tools in emergency settings. The length, purpose, and existence of translations for each tool were ascertained. We excluded tools that evaluate community needs, assess lifetime mental illness, or involve in-depth interviews. We selected the most recent version if multiple versions of the tool existed.

Validation study search
We then conducted a search on PubMed and Google Scholar and obtained psychometric properties and validation studies to present consistency in validation and the presence of cross-cultural applications of the tools in the existing literature, regardless of population or setting. Validation studies include studies in which researchers determine if the tool adequately distinguishes between a distressed and a non-distressed person, and the tools are often validated against an existing widely used tool such as the General Health Questionnaire. For this search, we did not employ MeSH search terms; rather, we searched the terms '[assessment tool]' and 'validation study' and recorded the studies that affirm or deny the validity of the tool in specified languages and/or populations.
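As a rough illustration of what such a validation study computes, the sketch below compares a screening score against a reference classification and reports sensitivity and specificity at a chosen cutoff. The function name `screening_accuracy`, the scores, the reference labels, and the cutoff of 20 are all hypothetical and are not taken from any of the reviewed studies.

```python
def screening_accuracy(scores, reference_positive, cutoff):
    """Return (sensitivity, specificity) for the rule `score >= cutoff`.

    scores: screening tool totals, one per participant (hypothetical data).
    reference_positive: True where the reference standard classifies the
        participant as a case (for example, a clinical interview).
    """
    tp = fp = tn = fn = 0
    for score, is_case in zip(scores, reference_positive):
        screened_positive = score >= cutoff
        if screened_positive and is_case:
            tp += 1
        elif screened_positive and not is_case:
            fp += 1
        elif not screened_positive and is_case:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    sensitivity = tp / (tp + fn)  # share of true cases the tool flags
    specificity = tn / (tn + fp)  # share of non-cases the tool clears
    return sensitivity, specificity


# Hypothetical sample of ten participants.
example_scores = [5, 12, 18, 25, 30, 8, 22, 35, 10, 28]
example_reference = [False, False, True, True, True,
                     False, False, True, False, True]

sens, spec = screening_accuracy(example_scores, example_reference, cutoff=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A real validation study would also report confidence intervals and examine several cutoffs; this sketch only shows the basic two-by-two comparison.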

Study search
Finally, we conducted a targeted review of peer-reviewed literature that has used one of the selected assessment tools in humanitarian settings, using both PubMed and Google Scholar. For this final search, a combination of the following MeSH keywords was employed: [assessment tool (not MeSH)] and 'natural disasters,' 'armed conflicts.' An experienced librarian at UCLouvain validated the search methodology. Inclusion and exclusion criteria are summarized in Table 1.
If no studies corresponded with a particular tool, then that tool was dropped from the list, as we could not provide an adequate recommendation without evidence of the tool's utility.
Doubts regarding study or tool eligibility were discussed between AM, MMA, PS, and JvL.
We extracted the benefits and limitations cited in each study regarding the particular tool and its utility in populations affected by disaster or conflict. Based on these observations, we described the main strengths and weaknesses of each tool in assessing the mental health outcomes in these populations.

Assessment tool search results
The assessment tool search resulted in a total of 27 tools for analysis, consisting of nine tools for PTSD, seven tools for general mental health, six tools for depression, three tools for anxiety, and two tools for substance use disorder (Figure 1). Fifteen tools were excluded from the study due to a lack of evidence regarding their use in populations affected by disaster or conflict. Twelve tools remained for analysis: seven tools for PTSD, two tools for general mental health, two tools for depression, and one tool for anxiety. We did not identify any tools that evaluated substance use disorder that matched our eligibility criteria. Three tools, the Posttraumatic Symptom Scale-Self Report (PSS-SR), SPAN, and Davidson Trauma Scale, required payment to view the full tool details but were nevertheless included due to adequate secondary information. Table 2 presents the year published, psychometric properties, and symptom period of the tools. Most tools exhibit high reliability and validity for the populations in which they were originally developed. Tool length ranges from 4 to 33 items, and the tools take between 5 and 10 minutes to complete. The tools also specify that symptoms should last between 1 week and 1 month. Table 3 presents the validated populations and languages for each tool. The tools have been validated across a variety of populations and regions. The PHQ-9 had the most validation studies backing it. Most of the tools have been validated in a language other than English. The PSS-SR and the WASSS are the only tools with no validation studies.

Study search results
Of the 86 studies included in the review (Figure 2), 82 focused on people affected by natural and technological disasters and four focused on people affected by conflict. Thirty-four different disasters were studied. The 2008 Wenchuan earthquake and 2005 Hurricane Katrina were the two most frequently studied disasters, with 17 and nine studies, respectively. Of the four studies that examined the effects of conflict, three focused on people affected by the Georgian conflict and one focused on those living in the Gaza Strip. All tools but the SQD originated in English. The SQD originated in Japanese but was translated into English for validation. The greatest number of tools was available in Nepali, while the greatest number of studies used a Chinese translation of the tools. Other translations may be available for the selected tools but were not identified due to lack of validation.
The main strengths and limitations for each tool are presented in Table 4. The IES-R, measuring PTSD symptoms, is by far the most widely used tool among all of the studies, with 44 of the 86 studies using it. The second most widely used tool among the studies is the CPSS, with 11 studies using it to study the posttraumatic effects of crises on children.
The most common strengths described for the screening tools are convenience and brevity. However, the limitations of the tools comprised the bulk of the information discussed in the studies. The most common limitation described for all tools, cited 64 times, is that a self-report screening tool is not diagnostic and can therefore over- or underestimate the prevalence of the given disorder. However, some studies also list the self-report aspect as a benefit and state that it can provide valuable information about an individual's wellbeing [86]. Another common limitation described is the lack of cultural sensitivity. Most of the tools were developed based on the Diagnostic and Statistical Manual (DSM) criteria, which were established by the American Psychiatric Association. The origins of many tools in this review may result in cultural bias, even if the tool has been validated in a certain population or translated to another language [87][88][89]. A lack of a suggested cutoff point for diagnosis is the third most common limitation among the studies. Some studies using tools such as the IES-R set their own cutoff point depending on the characteristics of the population and follow previous studies in similar settings. This provides versatility; however, it also leads to inconsistency. Comparisons across populations cannot be made if the cutoff differs between studies.
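The effect of study-specific cutoffs on comparability can be seen in a small sketch: the same set of scores yields different prevalence estimates depending on where the cutoff is placed. The function `prevalence_at_cutoff`, the scores, and the two cutoffs below are hypothetical illustrations and are not recommendations for any tool.

```python
def prevalence_at_cutoff(scores, cutoff):
    """Proportion of respondents whose total score meets or exceeds the cutoff."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= cutoff) / len(scores)


# Hypothetical total scores from one surveyed population.
survey_scores = [10, 15, 22, 24, 26, 30, 33, 35, 40, 50]

# Two studies applying different (illustrative) cutoffs to identical data.
lenient = prevalence_at_cutoff(survey_scores, cutoff=24)
strict = prevalence_at_cutoff(survey_scores, cutoff=33)
print(f"cutoff 24: {lenient:.0%} screen positive")
print(f"cutoff 33: {strict:.0%} screen positive")
```

With these made-up numbers the estimate drops from 70% to 40% purely because of the cutoff, which is why prevalence figures from studies using different thresholds are hard to compare.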
The SQD and WASSS, though less frequently used than other tools, were designed particularly for humanitarian settings to briefly identify those in distress after a crisis. The SQD has been used more than the WASSS and is designed for time-limited situations [90].

Discussion
This unprecedented review highlights the high number of existing mental health assessment tools that have been used in the context of disasters and conflict, as well as their benefits and drawbacks. We identified 12 assessment tools for further analysis, most of which have exhibited high reliability and validity in the populations for which they were originally developed. A systematic literature search uncovered 86 studies that assessed mental health in populations affected by disasters and conflict using one of these tools, half of which used the IES-R. Differential use of assessment tools across studies contributes to the fragmentation of knowledge of the burden of mental health issues in humanitarian settings. Each tool has its own levels of sensitivity and specificity, especially those with variable cutoffs. Furthermore, the disorders have different latency periods from exposure to symptom manifestation, as accounted for by the symptom period specified in the tool characteristics. The timing of measurement can greatly affect estimated prevalence. This fragmentation not only impedes synthesis of knowledge of the effects of disasters and conflicts, but also might lead to multiple assessments of the same communities, resulting in increased emotional and time burden for them. In addition, the tools used may not be culturally appropriate for measuring mental health outcomes in these communities.
[Table 2 fragment, tool reference [21]: year published 2012; psychometric properties unknown; 6 items plus a household roster; symptom period 2 weeks; 7-8 minutes.]

[Table 3 fragment, validated populations and languages. SPAN: populations N/A; languages Chinese [84], Korean [71]. General mental health tools: SQD: people affected by earthquake in Japan [20]; Italian [85]. WASSS: N/A; N/A. Abbreviations: HIV, human immunodeficiency virus; MSM, men who have sex with men.]
PTSD was the most studied outcome, and the tool most used to study PTSD was the IES-R. The second most studied outcome was depression, for which most of the studies used the PHQ-9. Of all the tools, the PHQ-9 was the most frequently validated, indicating its wide usage outside of humanitarian research. Anxiety was the third most studied outcome and was measured by the BAI. General mental health, measured using the WASSS and SQD, was the least studied outcome. While studies that measured the mental health effects of natural and technological disasters and conflicts were eligible for inclusion, the vast majority of studies in this review focused on natural disasters. Surprisingly, the only conflict-affected populations studied were those who lived in the Gaza Strip and those who experienced the Georgian conflict, indicating a dearth of mental health research on civilians in conflict. Further, few studies measured the effects of technological disasters on population mental health, which may be due to the generally smaller scale of impact of technological disasters compared to natural disasters.
The primary limitation cited in the studies is that a self-report tool may result in inaccurate estimates of the prevalence of a disorder. Self-report screening tools are inherently not diagnostic, as they are designed to rapidly assess those with the highest likelihood of the outcome of interest. Using screening tools to measure the prevalence of a mental health outcome is problematic, because such tools were not designed to definitively assess an individual. However, the alternative 'gold standard' diagnostic interview is not feasible in humanitarian and emergency settings or for the purposes of medium-scale mental health projects without adequate funding. The benefit of screening tools for these purposes is that they are rapid, while diagnostic interviews are lengthy and require the presence of a clinician.
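To make the direction and size of this inaccuracy concrete, the sketch below uses the standard relationship between the proportion screening positive and true prevalence for a tool with known sensitivity and specificity, together with the Rogan-Gladen correction. This correction is a general epidemiological technique and is not a method used or proposed in this review; the function names and all numbers are hypothetical.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Expected share screening positive: true positives plus false positives."""
    return (sensitivity * true_prevalence
            + (1.0 - specificity) * (1.0 - true_prevalence))


def rogan_gladen(apparent, sensitivity, specificity):
    """Back-calculate true prevalence from the screened-positive share.

    Assumes sensitivity and specificity are known for this population,
    which is often not the case for tools validated elsewhere.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("tool must perform better than chance")
    estimate = (apparent + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))  # keep the estimate within [0, 1]


# Hypothetical tool: 80% sensitivity, 80% specificity, 10% true prevalence.
observed = apparent_prevalence(0.10, sensitivity=0.80, specificity=0.80)
corrected = rogan_gladen(observed, sensitivity=0.80, specificity=0.80)
print(f"screened positive: {observed:.2f}, corrected estimate: {corrected:.2f}")
```

In this made-up scenario about 26% of respondents would screen positive although only 10% are true cases, which illustrates how a screening tool can overestimate prevalence when the condition is relatively uncommon.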
The cultural appropriateness of the tools is an important consideration, especially in a global context. Cultural appropriateness of assessment methodology is one of the guiding principles of the Inter-Agency Standing Committee's (IASC) assessment of mental health in humanitarian emergencies [171]. Only one tool, the SQD, was developed in a non-Western context. The tools in this review that were developed for high-income Western populations and later translated and implemented in low- and middle-income countries could result in culturally insensitive questions, meanings lost in translation, and ultimately inadequate measurement of true effects. Because most assessment tools are based on DSM criteria, they are inherently Western-based and may not produce valid findings in cross-cultural mental health research.
evidence and experts to make an informed decision on where to set the cutoff. Some tools have substantially more evidence of use, which might indicate that they are more suitable than others for mental health assessment. While abundant evidence allows for comparisons between and within populations in research, it does not necessarily mean that the tools accurately measure the prevalence of mental health outcomes. On the other hand, tools that were developed specifically for humanitarian situations may be more accurate than other tools when assessing the mental health of those affected by disasters and conflicts. However, these tools that specifically ask about a traumatic event cannot be used in a control group that has not experienced that event. In addition, tools such as the WASSS and the SQD are fairly new and thus do not allow for ready comparison between populations. The motivations behind the use of the assessment tools will ultimately determine which tool is most appropriate for a particular setting.
The importance of mental health assessment in crisis-affected populations is clear. Knowing these effects can inform preparedness and response to a large-scale trauma. However, individuals using these tools must consider the utility and implications of their use. As emphasized by the IASC, the needs of the crisis-affected populations should be prioritized.

Strengths and limitations
The primary strength of this study is that it is among the first to analyze the benefits and limitations of a variety of tools that assess multiple mental health outcomes in populations affected by disasters. Much of the limited existing literature on this topic revolves around a single tool or mental health outcome or only discusses the psychometric properties of the tools [175,176]. In addition, the findings of this review can be used by both researchers and humanitarian workers since the tools included were designed for use in informal settings without the presence of a clinician. As the tools discussed are screening tools, they can be used to estimate prevalence and the care needs of the population to quickly identify those who are in distress.
Some limitations exist in this review. The search method for assessment tools was not systematic, and thus may have overlooked relevant tools or studies. However, the search was extensive and included a wide range of the literature. In addition, some tools may not have been identified through the snowball sampling method. However, this method allowed for a selection of a variety of tools with limited initial information and a reasonable number of tools have been included. Some tools require payment for access, and we were not able to fully examine them for analysis. Nonetheless, adequate information for these tools was available through secondary sources. Finally, the SQD and WASSS were recently developed, and there was little evidence of their use. This limited the conclusions that could be made about these tools. However, their inclusion in the review provided valuable information, as they were specifically designed for crisis-affected populations.

Conclusion
The assessment of mental health in humanitarian settings is highly fragmented due to the use of a wide range of assessment tools. This review provided a thorough analysis of each of the identified tools. Moving forward, researchers and humanitarian workers must understand the implications of using brief mental health assessment tools in affected populations in order to better mitigate the impacts of future emergencies. This review provides the basis for further research on instruments to measure the mental health of populations affected by disasters and conflicts.
Three prominent gaps exist that must be addressed. First, there is no standard assessment tool for disaster and conflict settings. Second, little is known about assessment tool applicability to conflict settings. Third, these studies lack practical next steps to address the mental health outcomes they measure. Fortunately, greater awareness of mental health effects of mass trauma can motivate key stakeholders to close these gaps.

Author contributions
Ashley Moore carried out the tool and study search and wrote the majority of the paper. Joris Adriaan Frank van Loenhout, Maria Moitinho de Almeida, and Pierre Smith proposed the study idea and heavily edited the final manuscript. These authors, along with Ashley Moore, discussed the methodology for the study at length and were involved in deciding which tools to include or exclude. Debarati Guha-Sapir approved the final manuscript and assisted with submission.

Disclosure statement
No potential conflict of interest was reported by the authors.

Ethics and consent
Not applicable.

Funding information
None.

Paper context
Disasters and conflicts exacerbate and induce psychological symptoms. However, the estimated prevalence of these conditions can vary depending on assessment tool. Little is known about which tools are most effective in measuring mental health in disaster and conflict settings. This paper outlines commonly used tools and provides recommendations based on the tool characteristics discussed by the studies reviewed. Researchers should consider these characteristics and choose the most appropriate tool for the study population.