A new short version of the Posttraumatic Diagnostic Scale: validity among Japanese adults with and without PTSD

ABSTRACT Background: Identifying high-risk groups for posttraumatic stress disorder (PTSD) during evacuation situations requires a valid short screening tool. The re-experiencing symptoms of PTSD are considered helpful for distinguishing those with PTSD from those without, as they are thought to be specific to PTSD, have less ambiguity for respondents, and are representative of all PTSD symptoms. Objective: To develop a new short version of the Posttraumatic Diagnostic Scale (PDS) comprising only re-experiencing symptom items. Method: We used existing data (N = 169) from our previous study on the Japanese version of the PDS and the Clinician-Administered PTSD Scale (CAPS). The sample included both clinical outpatients (n = 106) and university students (n = 63), all of whom reported one or more traumatic experiences. We created candidate 2- and 3-item versions of the PDS and compared their psychometric characteristics against the CAPS. Results: The best candidate (comprising items for ‘intrusive images’, ‘nightmares’, and ‘physiological reactions when reminded of the trauma’) demonstrated an area under the curve of .95, with 94.8% sensitivity and 86.1% specificity at the best cut-off score of three. The candidate scale also showed a strong correlation with the CAPS-evaluated severity score and high internal consistency. Conclusions: The brief re-experiencing PDS had good psychometric properties among Japanese adults with and without PTSD.


Introduction
Posttraumatic stress symptoms are among the most devastating psychological reactions following life-threatening experiences. However, they tend to be under-diagnosed, particularly during extreme circumstances such as natural and manmade disasters, including earthquakes, terrorist attacks, war crimes, or accidents, when resources for evaluation are typically limited (Disaster Psychiatry Outreach, 2008; Katz, 2011). Despite a general tendency for spontaneous recovery, certain individuals will follow a more chronic path and ultimately develop posttraumatic stress disorder (PTSD). Early evaluation and diagnosis are therefore crucial (Galea, Nandi, & Vlahov, 2005; Watson, Gibson, & Ruzek, 2014).
Considering that short scales for screening people during an evacuation situation must consist entirely of clear and essential questions, we assume that items on re-experiencing symptoms could be useful for the following three reasons. First, re-experiencing symptoms are more specific to PTSD than hyperarousal or numbing symptoms. Hyperarousal symptoms, despite being a main feature of PTSD, are often observed in other disorders (e.g. generalized anxiety disorder, caffeine intoxication, tobacco withdrawal). Similarly, numbing symptoms are often confused with negative symptoms of schizophrenia, as well as anhedonia in depression (Pandya, 2011).
Second, questions regarding hyperarousal and avoidance symptoms tend to be more ambiguous for evacuation survivors, limiting their utility. For example, the questions concerning 'hyper-alert' or 'trouble falling asleep' (hyperarousal symptoms) could be easily confused with the anxiety resulting from being in an unfamiliar environment or from real-life problems after the event (e.g. aftershocks of an earthquake, problematic behaviours of the abusers or other family members). In addition, items such as 'trying to avoid activities, situations, or places that remind you of the trauma' or 'psychological isolation' (avoidance symptoms) might also be ambiguous for people during an evacuation situation. These items could be endorsed 'almost all' of the time because evacuees are separated from familiar places and people. Conversely, respondents could respond 'not at all' because, once they have evacuated, avoidance is no longer needed, or because they need to communicate with other evacuees or supporters to receive help or information. Furthermore, certain avoidance items may be more difficult to answer, especially for people deprived of sufficient time or cognitive resources immediately following a trauma. That is, items such as 'trying to avoid thoughts or feelings related to the trauma', 'trying to avoid activities, situations, or places that remind you of the trauma', or 'not being able to remember important parts of the trauma' require respondents to count the 'absence' of an action (not doing something). This may be more difficult than counting the 'existence' of an action (frequency of doing something, e.g. intrusive images). For these reasons, there are doubts surrounding the inclusion of hyperarousal or avoidance symptoms within a self-administered screening scale for survivors of an evacuation scenario.
Third, items on re-experiencing symptoms may be representative of overall severity of PTSD symptoms. Lang and Stein (2005) have shown that the re-experiencing items of the 17-item PCL (Blanchard et al., 1996; Lang & Stein, 2005; Tiet, Schutte, & Leyva, 2013) had the strongest correlations with total PTSD severity score. Furthermore, an abbreviated version of the PCL scale comprising only the two re-experiencing items had good validity for a diagnosis via the PTSD section of the Composite International Diagnostic Interview Version 2.1 (a sensitivity of .96 and a specificity of .58). Similarly, a 2-item re-experiencing PCL scale had good validity comparable to the PC-PTSD according to the areas under the curve (AUCs), which are calculated through a receiver-operating characteristic (ROC) analysis; the former demonstrated AUCs of .77-.88 (Lang & Stein, 2005; Tiet et al., 2013) and the latter AUCs of .80-.88 (Bliese et al., 2008; Calhoun et al., 2010; Tiet et al., 2013). In addition, adding avoidance and hyperarousal symptoms to this re-experiencing scale did not always result in increases of the AUCs: the AUCs for the 3-, 4-, and 6-item versions were .86, .86, and .89 (Lang & Stein, 2005); .84, .85, and .84 (in a sample of patients with substance use disorder; Tiet et al., 2013); and .77, .78, and .77 (in a sample of general mental health patients; Tiet et al., 2013), respectively. In other words, the results suggest that even a short scale comprising only two re-experiencing items could have comparable diagnostic validity to 3-, 4-, or 6-item version scales that include items on avoidance and hyperarousal.
The purpose of this study was to develop a short re-experiencing scale from the items of the PDS that corresponds to a PTSD diagnosis made via the Clinician-Administered PTSD Scale (CAPS). Although previous studies developed abbreviated scales comprising items that were highly correlated with a PTSD severity score, they did not confirm whether these scales were optimal for determining a PTSD diagnosis made via the CAPS. Thus, in this study, we created short scales by systematically choosing a set of items with the highest validity for a PTSD diagnosis and sufficient internal consistency (see Note 1). While PTSD severity is generally related to PTSD diagnosis, there could be symptoms that contribute substantially to the severity score but are less relevant for an accurate diagnosis (e.g. A2 criterion, which was deleted in the DSM-5; Osei-Bonsu et al., 2012). We compared 2- and 3-item versions of the new re-experiencing scales because shorter scales reduce the burden on respondents, whereas longer scales can increase the likelihood of correctly identifying the concept of concern. We used the PDS to develop a short scale because its validity is as high as that of other self-rating scales such as the PCL (Adkins, Weathers, McDevitt-Murphy, & Daniels, 2008), it corresponds to the DSM criteria for PTSD, and it has been validated in Japanese (Itoh et al., 2017).

Participants
We used data from the PDS and CAPS collected in our previous study (Itoh et al., 2017) with clinical and non-clinical samples (N = 225). The data were screened and excluded if (1) the complete item data were not accessible (n = 3) or (2) there was no listed traumatic event in the checklist on Part 1 of the PDS (n = 53, nearly half of the non-clinical sample). In the latter case, further evaluation for PTSD symptoms was stopped because the respondents had no trauma-related symptoms. If one or more traumas were checked on Part 1 of the PDS, the respondents proceeded to the following questions that assessed symptoms. The index trauma reported on the PDS was the same as that obtained from the CAPS interview. We consequently analysed data from a total of 169 participants; 106 were outpatients who visited the psychiatric ward for trauma therapy at a women's clinic at a medical university located in Tokyo, Japan, and the remaining 63 were undergraduates who reported one or more subjective traumatic experiences and consented to participate in a study organized by their university. Ethical clearance was obtained from the ethics committees of our affiliated universities.

Posttraumatic Diagnostic Scale (PDS/DSM-IV)
In the present study, responses to the full version of the PDS were used to examine the short scales. For example, we chose responses to re-experiencing symptoms B1 and B2 from the five re-experiencing items for a 2-item candidate scale. In total, we created 10 patterns of 2-item candidate scales and another 10 patterns of 3-item candidate scales.
The PDS (Foa, 1995; Foa et al., 1997) is a self-rating inventory that corresponds to the diagnostic criteria for adult PTSD of the DSM-IV. The Japanese version (Itoh et al., 2017) was used in this study; its validity is high (AUC of .97 against the CAPS diagnosis, sensitivity of 97.0%, and specificity of 93.6%). The PDS comprises four parts assessing traumatic experiences (Parts 1 and 2), symptom severity over the past month corresponding to Criteria B-D of the DSM-IV (Part 3), and functional disorder (Part 4).
The responses to Part 3 of participants who reported having one or more traumas in Part 1 were used in the present study. We analysed the 4-point scale responses (0 = 'not at all or only one time' to 3 = 'five or more times a week/almost always') to the five re-experiencing symptoms (Criterion B). These symptoms were (B1) intrusive images, (B2) nightmares, (B3) reliving of the trauma, (B4) emotionally upset when reminded of the trauma, and (B5) physical reactions when reminded of the trauma (Table 1).
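The counts of 10 two-item and 10 three-item candidate scales follow from choosing 2 or 3 of the five re-experiencing items. The following is a minimal sketch of that enumeration and of the total-score calculation; the function names and the example respondent are hypothetical and are not taken from the study data or the original analysis scripts.

```python
from itertools import combinations

# The five PDS re-experiencing items (Criterion B), each rated 0-3.
ITEMS = ("B1", "B2", "B3", "B4", "B5")


def candidate_scales(size):
    """Return every subset of the re-experiencing items with the given size."""
    return list(combinations(ITEMS, size))


def total_score(responses, scale):
    """Sum the 0-3 ratings of the items belonging to one candidate scale."""
    return sum(responses[item] for item in scale)


two_item = candidate_scales(2)    # 10 candidates, total score range 0-6
three_item = candidate_scales(3)  # 10 candidates, total score range 0-9

# Hypothetical respondent, for illustration only.
example = {"B1": 2, "B2": 1, "B3": 0, "B4": 3, "B5": 1}
score = total_score(example, ("B1", "B2", "B5"))
```

Each candidate's total score can then be compared against the external criterion in the same way, which is what makes the item selection systematic.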

Clinician-Administered PTSD Scale (CAPS/DSM-IV)
The CAPS (Blake et al., 1995) is a semi-structured interview known as the gold standard for diagnosing and assessing the symptom severity of PTSD. We used the Japanese version of the CAPS (Asukai, Hirohata, Kato, & Konishi, 2003) as the external criterion for evaluating the validity of the 2- or 3-item versions of the PDS. The CAPS was designed to assess the frequency (0 = 'never' to 4 = 'daily or almost every day') and intensity (0 = 'none' to 4 = 'extreme') of each of the 17 DSM-IV symptoms and associated features of PTSD. Blake et al. (1995) recommended that a frequency score of at least '1' and an intensity score of at least '2' be required for a particular symptom to meet the DSM-IV diagnostic criteria for PTSD; we followed this recommendation in the present study. A severity score was calculated by summing the frequency and intensity scores (range: 0-136). All of the CAPS interviewers (clinical psychologists or graduate students majoring in clinical psychology) had received specialist training on administration of the CAPS.
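The symptom rule and the severity range described above can be illustrated with a short sketch. The function names are hypothetical, and the sketch covers only the per-symptom rule and the summed severity score; it does not reproduce the full CAPS diagnostic algorithm.

```python
def symptom_present(frequency, intensity):
    """Per-symptom rule described above: frequency >= 1 and intensity >= 2."""
    return frequency >= 1 and intensity >= 2


def severity_score(ratings):
    """Sum frequency and intensity over all rated symptoms.

    `ratings` is a list of (frequency, intensity) pairs, each value 0-4.
    """
    return sum(frequency + intensity for frequency, intensity in ratings)


# With 17 symptoms rated 0-4 on both dimensions, the maximum is 17 * (4 + 4).
max_severity = severity_score([(4, 4)] * 17)
```

The maximum of 136 matches the severity range stated in the text.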

Data analysis
To obtain a detailed description of the performance of the 2- or 3-item candidate scales in diagnosing PTSD, we compared the AUCs of the candidates. The AUCs were calculated for the total score of each 2- or 3-item candidate scale (range: 0-6 for the 2-item scales and 0-9 for the 3-item scales) against the CAPS diagnostic outcome (i.e. a PTSD diagnosis of 'positive' or 'negative'). We further calculated the sensitivity and specificity at the optimal cut-off score (where the sum of the sensitivity and specificity was maximal). In addition, to evaluate the validity of the candidate scales in assessing PTSD symptom severity, we calculated Pearson's correlation coefficients between the total score of each candidate scale and the CAPS symptom severity score. The internal consistency of the candidate scales was also examined using Cronbach's alpha coefficients. The main analyses were conducted with the entire sample rather than separately for the clinical and non-clinical subsamples. Ideally, the validity of a screening tool would be examined in a sample spanning the spectrum from exceedingly mild to severe symptoms, but our available data mostly comprised people with severe symptoms (i.e. the clinical subsample) and people with mild symptoms (i.e. the non-clinical subsample). If all of the participants in the clinical subsample were truly positive for PTSD, a scale with perfect validity would classify every participant as positive, and specificity could not be meaningfully evaluated because there would be no true negatives. Similarly, if all of the participants in the non-clinical subsample were truly negative for PTSD, a scale with perfect validity would classify every participant as negative, and sensitivity could not be meaningfully evaluated. Thus, separate analyses for each subsample would not provide meaningful results for evaluation and, in the present study, the analysis was conducted on the combined sample to include a more diverse range of people.
All analyses were performed with SPSS Statistics 23 or the Epi package in R version 3.2.4 Revised (plots of the ROC curves and calculation of the optimal cut-off scores; the two programs gave the same results). All parameters were calculated with a 95% confidence interval.
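The ROC-based steps described in this section (AUC of a total score against a binary diagnosis, and the cut-off that maximises sensitivity plus specificity) can be sketched as follows. This is an illustrative pure-Python version, not the SPSS or R Epi procedure used in the study; the function names and the toy data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is a positive screen."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def optimal_cutoff(scores, labels):
    """Observed score at which sensitivity + specificity is largest."""
    return max(sorted(set(scores)), key=lambda c: sum(sens_spec(scores, labels, c)))


# Toy data, for illustration only: 3-item totals (0-9) and diagnosis (1 = PTSD+).
toy_scores = [0, 1, 1, 2, 3, 2, 4, 5, 6, 9]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

toy_auc = auc(toy_scores, toy_labels)
toy_cutoff = optimal_cutoff(toy_scores, toy_labels)
toy_sens, toy_spec = sens_spec(toy_scores, toy_labels, toy_cutoff)
```

Maximising the sum of sensitivity and specificity weights false negatives and false positives equally; a screening programme that prioritises not missing cases might prefer a lower cut-off.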

Characteristics of the study participants
The characteristics of the clinical and non-clinical subsamples are shown in Table 2. The clinical participants were older than the non-clinical participants and were more likely to be female and have CAPS-diagnosed PTSD. For traumatic events, the clinical subsample more often reported nonsexual or sexual assault, whereas the non-clinical subsample more often reported accidents, fires, or other events. The prevalence of PTSD assessed via the CAPS was 57.4% (entire sample). Table 1 shows the five re-experiencing symptoms of the DSM-IV (which are the same as those in the DSM-5). Participants who answered '1' or more to each item are shown separately according to whether they met the CAPS criteria for a PTSD diagnosis (PTSD+) or did not (PTSD-). Except for 'reliving of the trauma (B3)', the re-experiencing symptoms were experienced by over 88% of the PTSD+ participants. In particular, 'emotionally upset when reminded of the trauma (B4)' was experienced by almost all of the PTSD+ participants, as well as nearly half of the PTSD- participants.
Note to Table 2. Index trauma was self-rated via the Posttraumatic Diagnostic Scale and confirmed to be the same as that obtained through the Clinician-Administered PTSD Scale (CAPS) interview. Some participants indicated that more than one event disturbed them the most. PTSD, posttraumatic stress disorder diagnosed via the CAPS. Other traumatic events included sustained domestic violence (n = 6), abuse (n = 2), sexual or power harassment (n = 2), and sexual or nonsexual crime (n = 2) in the clinical subsample; and bullying (n = 3), injury (n = 3), unnatural death of close person or close place (n = 3), painful childhood incidents (divorce, punishment) in the family (n = 2), victim of stalker, molester, or encounter with a stranger with a weapon (n = 3), other setbacks (n = 2), or no answer/unclear (n = 2) in the non-clinical subsample. Most of the non-clinical participants (87%) answered that they experienced the index trauma over a half year ago, but data for the clinical sample were not clear.

ROC analysis
The AUCs of the candidate scales are indicated in Table 3. The values were generally large (AUC = .89-.95) among all candidate scales. As indicated in Figure 1, the best ROC curve for the 3-item scales was that obtained from the candidate scale comprising 'intrusive images (B1)', 'nightmares (B2)', and 'physiological reactions when reminded of the trauma (B5)' (hereafter, this candidate is referred to as PDS B1,B2,B5). For the 2-item scales, the best ROC curve was observed for the candidate scale comprising 'nightmares (B2)' and 'physiological reactions when reminded of the trauma (B5)' (hereafter, this candidate is referred to as PDS B2,B5). Separate analyses for the clinical subsample (n = 106, including 94 with PTSD), university subsample (n = 63, including three with PTSD), and female subsample (n = 137, including 97 with PTSD) revealed that the PDS B1,B2,B5 had the highest, or among the highest, AUCs of the candidate scales (Table 4). The optimal cut-off scores were a total score of ≥3 and ≥2 for the PDS B1,B2,B5 and the PDS B2,B5, respectively. At these cut-off scores, the sensitivity, specificity, and efficiency of the scales were calculated. These values for the PDS B1,B2,B5 and PDS B2,B5 were generally high (>85%). The PDS B1,B2,B5 showed better sensitivity (94.8%) than did the PDS B2,B5 (91.8%), but the PDS B2,B5 had a somewhat better specificity (88.9%) than did the PDS B1,B2,B5 (86.1%). The efficiency value and the AUC indicated that the general diagnostic validity of these two scales was almost the same, although it was slightly higher for the PDS B1,B2,B5 than for the PDS B2,B5. The positive and negative predictive values are not shown in Table 3, as these indices are informative only when the sample is representative of the target population; this was not the case for our sample. Nevertheless, the positive and negative predictive values were both high for the PDS B2,B5 (91.8% and 88.9%, respectively) and PDS B1,B2,B5 (90.2% and 92.5%, respectively).
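The relationship among the reported indices can be checked with a small worked example. The cell counts below are not given in the text; they are back-calculated here from the reported percentages under the assumption of 97 PTSD+ and 72 PTSD- participants (57.4% of 169), and "efficiency" is assumed to mean the overall proportion of correct classifications. The function name is hypothetical.

```python
def screening_indices(tp, fn, tn, fp):
    """Percent sensitivity, specificity, efficiency, PPV, and NPV from a 2x2 table."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "efficiency": 100 * (tp + tn) / total,  # assumed: overall accuracy
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Inferred counts (an assumption, reconstructed from the reported percentages).
pds_b1_b2_b5 = screening_indices(tp=92, fn=5, tn=62, fp=10)  # cut-off >= 3
pds_b2_b5 = screening_indices(tp=89, fn=8, tn=64, fp=8)      # cut-off >= 2

for name, values in (("PDS B1,B2,B5", pds_b1_b2_b5), ("PDS B2,B5", pds_b2_b5)):
    print(name, {key: round(value, 1) for key, value in values.items()})
```

Under these inferred counts, the rounded sensitivity, specificity, and predictive values agree with the percentages reported above. Because predictive values depend on the proportion of PTSD+ cases in the sample, they would differ in a population with a different prevalence.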

Correlational analysis
As shown in Table 3, there were generally strong positive correlations between the total scores of the candidate scales and the CAPS severity score (Pearson's r = .75-.86). The strongest correlations among the 2- and 3-item candidate scales were for the PDS B2,B5 (r = .86, p < .01) and the PDS B1,B2,B5 (r = .86, p < .01), respectively. The PDS B2,B5 and the PDS B1,B2,B5 also demonstrated the highest values for the subsample analyses (Table 4).
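For readers less familiar with the statistic, Pearson's r between a candidate scale's total score and the CAPS severity score can be computed as in the sketch below. The function name and the paired values are hypothetical and do not come from the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical paired observations: short-scale totals (0-9) and CAPS severity (0-136).
short_scale_totals = [0, 1, 3, 5, 7, 9]
caps_severity = [5, 12, 40, 55, 80, 96]

r = pearson_r(short_scale_totals, caps_severity)
```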

Reliability analysis
As shown in Table 3, the internal consistency of the candidate scales was generally sufficient (Cronbach's alpha = .73-.92). The internal consistency was high for the PDS B1,B2,B5 (alpha = .88) and for the PDS B2,B5 (alpha = .82).
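Cronbach's alpha for a short scale can be computed from the item variances and the variance of the total score, alpha = k / (k - 1) * (1 - sum of item variances / variance of totals). The sketch below uses this standard formula with sample variances; the function name and the item ratings are hypothetical and are not the study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; `item_scores` holds one list of ratings per item,
    aligned so that position i in every list is the same respondent."""
    k = len(item_scores)
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 0-3 ratings from six respondents on a 2-item scale.
b2_ratings = [0, 0, 1, 2, 3, 3]
b5_ratings = [0, 1, 1, 2, 2, 3]

alpha = cronbach_alpha([b2_ratings, b5_ratings])
```

With only two or three items, alpha is sensitive to each inter-item correlation, so values for very short scales are best read alongside the validity indices rather than in isolation.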

Discussion
Here, we sought to develop a short self-rating scale for screening PTSD that focuses on re-experiencing symptoms. Before discussing the obtained values, it is important to note the nature of our sample. Generally, in examining scale validity, researchers should ensure that the sample includes individuals with a range from mild to severe symptoms. In contrast, our examination was based on the data of a previous study, combining Japanese female outpatients suffering from traumatic experiences and university students who had subjective traumatic experiences. In short, our sample might have included only those with mild or severe symptoms, with few in the middle of the spectrum, thus resulting in bias. Therefore, the values obtained in the present study may be overestimated because PTSD+ and PTSD- participants were relatively easy to discriminate. Nevertheless, despite this potential limitation, our original interest was in clarifying which symptoms (B1-B5) contribute more to discriminating PTSD+/-, on the understanding that the values may be overestimated for every candidate scale. Therefore, the present results are useful as preliminary evidence toward developing a new short PTSD scale.
Table 3. Discriminative values (95% CI), symptom severity correlation coefficients (Pearson's r), and reliability coefficients (Cronbach's alpha) of different candidates of the PDS 2- or 3-item scale applied to Japanese participants with and without PTSD. Bold numbers indicate values of the best candidate for the 2- or 3-item scale. PDS, Posttraumatic Diagnostic Scale; PTSD, posttraumatic stress disorder; DSM-IV, Diagnostic and Statistical Manual of Mental Disorders, 4th ed.; SN, sensitivity; SP, specificity; AUC, area under the curve.
The present results showed that the 3-item scale comprising 'intrusive images (B1)', 'nightmares (B2)', and 'physiological reactions when reminded of the trauma (B5)', and the 2-item scale of 'nightmares (B2)' and 'physiological reactions when reminded of the trauma (B5)', had the highest AUCs. Their correlations with the PTSD severity score were .86 and their internal consistencies (alpha) were ≥ .82. Notably, the observed values were generally higher than those for previous short scales. In particular, compared with the similar 2-item re-experiencing scales based on the PCL, the PDS B2,B5 and PDS B1,B2,B5 had somewhat lower but similar (Lang & Stein, 2005; Study 2) or higher (Tiet et al., 2013; Sample 2) sensitivity and clearly higher specificity (Lang & Stein, 2005; Tiet et al., 2013). Additionally, in comparison with the PC-PTSD (Prins et al., 2003), which is a 4-item scale widely used in primary care settings, the PDS B2,B5 and PDS B1,B2,B5 showed greater sensitivity despite fewer items, but similar specificity. These results can be explained by the following three reasons. First, our procedure for choosing items was systematic: we created a pool of candidate scales and chose the most appropriate ones in relation to a CAPS diagnosis (rather than PCL-evaluated severity). Second, our new scales were created from items in the PDS, and used a 4-point rating scale (with total ranges of 0-6 and 0-9 for the 2- and 3-item scales, respectively). In contrast, the PC-PTSD uses a binary rating scale (yes/no; total score range of 0-4); as such, the PTSD classification might be more precise in our scales because of the more detailed scoring procedure. Third, it might be easy to discriminate PTSD+/- in our sample. Our clinical subsample comprised people who had visited psychiatric clinics; thus, they might present with more severe symptoms, be less hesitant to disclose their own symptoms, or express their own symptoms more explicitly to obtain others' support when compared with the non-clinical subsample.
Previous studies might have found it more difficult to discriminate PTSD+/- because their samples reflected real clinical populations and included more people with subtle symptoms. Although our scales should not be immediately applied to a clinical population, the chosen re-experiencing scales had adequate validity and generality.
When examined by the specific re-experiencing items chosen, B1 was selected for both our scale and the 2-item version of the PCL (Lang & Stein, 2005). B1 was rated as '1' or more by almost all of the PTSD+ participants in the present study (97%), and it was also frequently experienced by PTSD+ participants in the original PDS validation study (98%; Foa et al., 1997); thus, this symptom is likely common among those with PTSD and is useful for PTSD screening. With respect to B2, fewer PTSD+ people reported this symptom in the present study (89%) and the original PDS validation study (78%; Foa et al., 1997). This item was not chosen for the 2-item version of the PCL but was chosen in this study. The reason for the difference could be that B2 was rarely experienced by the PTSD- group (8% and 32%, in the present study and Foa et al., 1997, respectively) and, despite being helpful for discriminating PTSD+/-, was less helpful for representing total PCL-evaluated severity. B3 was chosen neither for our scale nor for the 2-item version of the PCL (Lang & Stein, 2005). B3 was the least-experienced symptom of the five re-experiencing symptoms in the PTSD+ group (79%), and a lower tendency for PTSD+ participants to report B3 (74%) was also reported by Foa et al. (1997). B3 might not affect the sum score of the 2- or 3-item scale, and thus may not contribute to discriminating PTSD+/-. In contrast, B4 was the most commonly reported symptom by PTSD+ participants (99%) and was chosen for the 2-item version of the PCL; however, it was not chosen for our scale. It was also experienced by nearly half of the PTSD- participants (49%), so it might not be helpful for discriminating PTSD+/-. In Foa et al. (1997), B4 was the symptom most often experienced by PTSD- participants (73%). Finally, B5 was chosen for our scale but not for the 2-item version of the PCL (Lang & Stein, 2005). B5 was experienced by many PTSD+ participants (97%) but fewer PTSD- participants (26%), suggesting that it might be useful for diagnosing PTSD. In summary, B1, B2, and B5 were experienced by about 90% or more of the PTSD+ participants but by 35% or fewer of the PTSD- participants.
With respect to the length of the scale, our results indicated similar values between the 2- and 3-item versions. Although higher values tended to be observed for the 3-item than for the 2-item version, the 95% confidence intervals of the 3-item version included the corresponding values of the 2-item version. There is currently no definitive evidence by which to recommend either the 2- or 3-item version. Interestingly, B2 and B5 were common to the 3- and 2-item versions. Both relate to physiological responses; thus, physiological assessments might be promising methods of screening for PTSD in the future.
Several limitations to this study must be mentioned. First, as noted above, the study sample did not include an appropriate spectrum of people with mild to severe PTSD symptoms, and the results were based on a combination of two distinct subsamples. The analyses with the clinical and university subsamples (Table 4) revealed stability in the results for the 3-item scale but not the 2-item scale. Therefore, validation of this instrument in a larger population (e.g. in an evacuation situation) should be performed in a future study. Second, most of the PTSD+ participants were female patients with interpersonal violence trauma. Although results with the whole sample did not differ from results for the female subsample, at least for the 3-item scale (Tables 3 and 4), gender differences in trauma types, psychological and biological responses, and subsequent outcomes have been suggested (Olff, Langeland, Draijer, & Gersons, 2007), and it is uncertain whether B1, B2, and B5 would be essential elements for a PTSD diagnosis in men. Nevertheless, even if our findings might be specific to a female sample, the present study has clinical value, because women have a higher risk of developing PTSD than men (Brewin, Andrews, & Valentine, 2000). Third, our results were based on a posteriori selection of response data from the original full-length PDS answer dataset, so 'framing' or 'carryover' effects of prior questions might be included. Whether the PDS B1,B2,B5 and PDS B2,B5 would show the same psychometric characteristics when presented by themselves should be considered in a future study.

Conclusion
For identifying high-risk individuals with PTSD during an evacuation situation, two items ('nightmares (B2)' and 'physiological reactions when reminded of the trauma (B5)') or three items ('intrusive images (B1)', 'nightmares (B2)', and 'physiological reactions when reminded of the trauma (B5)') provide adequate validity based on ROC analyses with the CAPS as the external criterion. Therefore, the present study represents an important step towards the development of a new short PTSD screening scale. We expect that further validation of the scales among a wide range of individuals who have experienced various traumas will help provide a higher quality screening tool for PTSD, particularly for evacuation scenarios, and will improve the efficiency of early intervention.

Note
1. We also applied item response theory (IRT) to our data to understand the characteristics of each re-experiencing item. We confirmed that the set of PDS 22, 23, and 26 (3-item version chosen by the ROC analyses) or PDS 23 and 26 (2-item version chosen by the ROC analyses), thought to offer a certain level of information for low to high latent PTSD, served the purpose of screening a wide range of people. For more information, please contact Yoshiharu Kim (kimpds3@gmail.com).