Original Research

Validation of Hospital Anxiety and Depression Scale in an Indonesian population: a scale adaptation study

Abstract

Objective This study aims to adapt the English-language Hospital Anxiety and Depression Scale (HADS) to the Indonesian language and evaluate the validity and reliability of the adapted version (ie, HADS-Indonesia).

Design A cross-sectional study was conducted between June and November 2018. First, a translation and back-translation process was conducted by a committee consisting of the researchers, a psychiatrist, a methodology consultant and two translators. Face validity, convergent validity and test–retest reliability were then evaluated. Test–retest reliability was assessed using the intraclass correlation coefficient (ICC). For convergent validity evidence, Spearman’s rank correlation coefficient was calculated between HADS-Indonesia and Zung’s Self-rating Anxiety Scale (SAS) and Self-rating Depression Scale (SDS). Finally, structural validity was analysed using exploratory factor analysis (EFA), and internal consistency was evaluated using Cronbach’s alpha.

Setting This study was conducted in three villages in Jatinangor subdistrict, Sumedang Regency, West Java province, Indonesia; the villages were chosen based on their profiles.

Participants A total of 200 participants (male: n=91, 45.50% and female: n=109, 54.50%), with a mean (SD) age of 42.41 (14.25) years, were enrolled in this study using a convenience sampling method. The inclusion criteria were age ≥18 years and basic Indonesian language literacy.

Results The overall HADS-Indonesia’s ICC value was 0.98. There was a significant positive correlation between HADS-Indonesia’s anxiety subscale and Zung’s SAS (rs=0.45, p=0.030) and between the depression subscale of HADS-Indonesia and Zung’s SDS (rs=0.58, p<0.001). The Kaiser-Meyer-Olkin statistic (KMO=0.89) and Bartlett’s test of sphericity (χ2(91, N=200)=1052.38, p<0.001) indicated that the data were adequate for EFA. All items’ communalities were >0.40 and the average inter-item correlation was 0.36. EFA yielded a two-factor solution explaining 50.80% (40.40%+10.40%) of the total variance. All items from the original HADS were retained, including its original subscales. The adapted HADS-Anxiety subscale consisted of seven items (alpha=0.85), and the HADS-Depression subscale consisted of seven items (alpha=0.80).

Conclusions HADS-Indonesia is a valid and reliable instrument for use in the general population of Indonesia. However, further studies are warranted to provide more sophisticated validity and reliability evidence.

What is already known on this topic

  • The Hospital Anxiety and Depression Scale (HADS) has been widely used to study anxiety and depression and has been adapted and validated in various cultures. To allow Indonesian researchers to conduct studies in the area, a scale accurately adapted to their language and culture (ie, an adapted HADS) is crucial; however, no such scale has existed until now.

What this study adds

  • This study adapts and validates HADS in the Indonesian language and provides evidence for its validity and reliability.

How this study might affect research, practice or policy

  • Such a scale will provide an accurate tool for Indonesian researchers aiming to study anxiety and depression in this population. Furthermore, a successful adaptation of HADS into the Indonesian language might make it possible for HADS to be validated in, arguably, similar cultures, and thus encourage similar studies and more research concerning anxiety and depression in the area.

Introduction

Anxiety and depression are two major mental health problems, and depression is predicted to become one of the leading causes of the global health burden by 2030.1 Excessive, uncontrollable worries and depressive features can lead to physical symptoms and a decline in social function and the ability to self-care, or even to death by suicide.2 3 Global data on anxiety and depressive disorders (2015) reported that depression and anxiety affect 4.40% and 3.60% of the global population, respectively, with variations between regions.4 Given their impact, the importance of the early detection of anxiety and depressive disorders is evident,5 and the availability of effective, efficient, valid, reliable and easy-to-use instruments is crucial. Among the instruments developed to screen and diagnose anxiety6–9 and depression,6 9–11 the Hospital Anxiety and Depression Scale (HADS) has been widely used for anxiety and depression screening.12 Psychometrically, HADS has been reported to provide consistent validity and reliability evidence. The original study of the development of HADS reported the correlation of items in the anxiety subscale (based on Spearman analysis) as ranging between +0.76 and +0.41, with p<0.001 for all items. As for the depression subscale, all items had a correlation ranging between +0.60 and +0.30 (p<0.020 for each item).6 In addition, HADS has been adapted for use in several languages. A literature review12 of HADS’ validity and reliability studies reported 19 studies which have adapted and validated HADS for use in several languages, mostly in European populations, as well as in Arab and Chinese populations. Most of the studies reported in the review retained the original structure of HADS, which was a two-factor model. Its internal reliability has been reported between 0.76 and 0.90.12

Although HADS has been adapted for use in many languages and specific populations, the instrument has neither been properly adapted for use in the Indonesian language nor sufficiently tested psychometrically in the Indonesian population. Rudi et al13 conducted a study intended to adapt HADS for use in the Indonesian language and tested it on stroke patients in a hospital setting. However, the number of patients enrolled was, arguably, insufficient (n=20), and the only psychometric testing reported concerned the reliability of the adapted scale, with no test of its validity; thus, the psychometric evidence provided might not be sufficient. This study aims to adapt HADS into the Indonesian language and evaluate this adapted version’s validity and reliability in the Indonesian population. The evidence of validity and reliability of the HADS-Indonesia might support the adaptation of HADS for use in similar languages, such as Malay or other languages used in the Southeast Asian region. In addition, the availability of HADS-Indonesia may help Indonesian healthcare workers to administer screening for depression and anxiety in the Indonesian population and, thus, to provide early interventions. Besides, the availability of the scale might support future studies in the area.

Methods

Study design, sample size and sampling procedure

A cross-sectional study was conducted between June and November 2018 to investigate the validity and reliability of the HADS-Indonesia in measuring anxiety and depression in the Indonesian population. This study was designed to measure the construct validity of the scale using factor analysis, convergent validity, face validity, test–retest reliability using the intraclass correlation coefficient (ICC) and internal reliability. A minimum sample size is required for factor analysis to yield a stable factor model. The minimum ratio of participants to items has been reported to be 10:1 (10 participants for one item), although more participants might provide a more robust model.14 Thus, for the 14 items of HADS, we needed 140 or more participants to run a factor analysis and produce a stable model. For test–retest reliability analysis using ICC, a review article deemed a minimum of 28 responses sufficient, and only a small number of responses is usually needed.15
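The sample size reasoning above reduces to a single multiplication. The following is a minimal sketch of that arithmetic; the function name and its default argument are illustrative choices, not part of the study protocol.

```python
def minimum_efa_sample(n_items: int, participants_per_item: int = 10) -> int:
    """Smallest sample size implied by a participants-to-items ratio."""
    if n_items <= 0 or participants_per_item <= 0:
        raise ValueError("both arguments must be positive")
    return n_items * participants_per_item


HADS_ITEMS = 14  # number of items in HADS
required = minimum_efa_sample(HADS_ITEMS)  # 14 items x 10 participants each
print(required)
```

With 14 items and a 10:1 ratio, the result is the 140 participants stated above; the 200 participants enrolled exceed this minimum.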

The study’s inclusion criteria were adults (18 years old and older) with basic Indonesian language literacy, including reading and writing. There were no specific exclusion criteria in this study. Using a convenience sampling method, we invited 200 participants from three villages in Jatinangor subdistrict, Sumedang regency, West Java province, Indonesia, to take part. The villages were chosen based on their profiles (ie, their urban and suburban characteristics).

We provided an incentive to each participant in the form of daily food and other necessities, valued at approximately IDR50 000 (approximately US$3.50). All data were collected and stored by the researchers and were used only for data analysis, without disclosing participants’ personal information.

Hospital Anxiety and Depression Scale

The HADS is copyrighted under R. P. Snaith and A. S. Zigmond, 1983, 1992 and 1994. HADS was initially published in Acta Psychiatrica Scandinavica 67, 361-70, Munksgaard International Publishers, Copenhagen, 1983.6 Permission has been obtained for its translation and use.

Given its simplicity, HADS has been widely used and validated for measuring anxiety and depression cases.12 HADS is a self-completion questionnaire consisting of 14 items comprising depression and anxiety subscales, which can be completed in 2–5 min. In this study, respondents’ answers were based on their feelings within a week of the data collection. Each respondent’s score was the sum of the ratings for all items (each score ranged from 0 to 3 points) in each subscale. The maximum score for each subscale is 21, with 0–7 points meaning no depression/anxiety cases (ie, non-cases), 8–10 points representing borderline or doubtful cases, and ≥11 points suggesting anxiety or depressive cases.6
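The scoring rule described above can be sketched as follows. This is an illustrative sketch only: the function names and the example ratings are hypothetical, and the ratings are assumed to be already coded from 0 to 3 in the direction of greater severity.

```python
def subscale_score(ratings):
    """Sum the seven item ratings (each 0 to 3) of one HADS subscale."""
    if len(ratings) != 7:
        raise ValueError("each HADS subscale has seven items")
    if any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("each item rating must be 0, 1, 2 or 3")
    return sum(ratings)


def classify(score):
    """Map a subscale score (0 to 21) to the categories described above."""
    if not 0 <= score <= 21:
        raise ValueError("subscale scores range from 0 to 21")
    if score <= 7:
        return "non-case"
    if score <= 10:
        return "borderline"
    return "case"


# Hypothetical ratings for one respondent's anxiety items
example_ratings = [1, 2, 0, 3, 1, 2, 1]
score = subscale_score(example_ratings)
print(score, classify(score))
```

The same two functions apply to either subscale, since both have seven items and share the same cut-off points.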

Translation and back translation of HADS

The adaptation process followed the guidelines furnished by Beaton et al.16 The first stage in the adaptation process was the translation of the original English HADS text by two native Indonesian translators with different profiles and backgrounds. Two independent Indonesian language versions of HADS were produced along with the translators’ reports on challenging phrases. The reports and translations were followed up with a discussion between the translators and experts to synthesise the translations into one unified version. Next, the unified version was discussed and evaluated by a committee consisting of the researchers, a psychiatrist, a study methodology consultant and the two translators. The experts agreed on a prefinal version of the instrument at this stage.

A native English translator conducted a back-translation process to ensure the prefinal Indonesian version of HADS reflected the original version. The back-translation version suggested that the Indonesian version of the questionnaire reflected the original version.

Face validity evaluation

The prefinal version of the Indonesian version of HADS was tested on 35 subjects selected from the target population. Each subject completed the questionnaire and underwent an interview with one of the researchers. The interviews aimed to probe each subject’s understanding of each item and the subject’s responses. This process provided several insights which supported the development of the final version of the Indonesian HADS.

Test–retest reliability and convergent validity evaluation

After face validity evaluation was completed, 41 participants were invited to complete the questionnaire twice. At the first data collection stage, they were asked to complete three questionnaires, the final version of the HADS-Indonesia, Zung’s Self-rating Anxiety Scale (SAS)7 and Zung’s Self-rating Depression Scale (SDS).11 SAS has been adapted and validated in the Indonesian population and was found to have a Cronbach’s alpha score of 0.65 with additional evidence of convergent and construct validity.17 SDS has also been validated in the Indonesian population with a Cronbach’s alpha score of 0.88.18 The second data collection stage was conducted 3 months later, and the participants were reinvited to complete the final version of HADS-Indonesia, evaluating the scale’s test–retest reliability. A 3-month interval was chosen for test–retest evaluation to avoid the possibility that the participants might have memorised their previous answers. Also, a longer interval was chosen since the diagnosis of depression and anxiety needs a longer duration (>2 weeks).19

Data collection method

After the second data collection and the analysis of the convergent validity and test–retest reliability data, 30 people from Jatinangor were trained to prepare the questionnaire for data collection, identify potential participants, assess participants’ eligibility and collect the data. They visited the potential participants in their target area as trained and collected the data directly.

Data analysis

Test–retest Reliability

The test–retest (ie, intraobserver) reliability of the scale and each item was examined using ICC for each item, each subscale and the HADS-Indonesia scale as a whole. A two-way random-effect model based on absolute agreement was used to assess the test–retest reliability of HADS-Indonesia. An ICC value less than 0.5 indicated poor reliability, a value between 0.50 and 0.75 suggested moderate reliability, a value between 0.75 and 0.90 demonstrated good reliability, and a value more than 0.90 indicated excellent reliability.20
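As an illustration of the model and the interpretation bands described above, the sketch below computes a single-measurement, absolute-agreement, two-way random-effects ICC from the usual two-way analysis of variance mean squares and labels the result. The function names and the example scores are hypothetical, and the handling of values falling exactly on a boundary (0.50, 0.75, 0.90) is an assumption, as the bands above do not specify it.

```python
def icc_absolute_agreement(scores):
    """Single-measurement, absolute-agreement, two-way random-effects ICC.

    `scores` is a list of rows, one per participant, each holding that
    participant's score at every measurement occasion.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-participant mean square
    ms_cols = ss_cols / (k - 1)                 # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Label an ICC value using the bands described above."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Hypothetical test and retest total scores for five participants
example = [[12, 13], [5, 5], [20, 19], [9, 10], [15, 15]]
value = icc_absolute_agreement(example)
print(round(value, 2), interpret_icc(value))
```

Because absolute agreement is used, a systematic shift between the first and second administration lowers the coefficient, which would not be the case under a consistency definition.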

Convergent Validity

We assessed HADS-Indonesia’s convergent validity using Spearman’s rank correlation coefficient (ρ) for the correlation between the anxiety subscale of the Indonesian version of HADS and Zung’s SAS7 17 and between its depression subscale and Zung’s SDS.11 18 A significant Spearman’s rank correlation coefficient (p<0.050) was taken as evidence of the convergent validity of the Indonesian version of HADS.
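For readers unfamiliar with the statistic, Spearman’s coefficient is the Pearson correlation computed on ranks, with tied values sharing the average of their ranks. The sketch below illustrates the calculation only; the function names and the example scores are hypothetical, and the significance test (p value) used in the study is not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical subscale scores on two instruments for six respondents
hads_a = [3, 7, 7, 10, 12, 15]
sas = [30, 41, 38, 45, 52, 50]
print(round(spearman(hads_a, sas), 2))
```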

Construct validity

Although the factor structure of HADS has been analysed in many studies, the structure proposed by each study and population is different (ranging between two-factor and four-factor models).12 Therefore, in this study, to examine the scale’s structure and to determine whether the adapted scale behaves in a similar way to the original version or whether cultural differences affect the questionnaire’s structure, an exploratory factor analysis (EFA) was conducted.21 We took the following steps to examine the factor structure: first, the assumptions of factor analysis were assessed using Kaiser-Meyer-Olkin (KMO) statistics and Bartlett’s test of sphericity. A KMO score of >0.70 and a p value of <0.050 for Bartlett’s test of sphericity were considered adequate to conduct an EFA. Next, each item’s correlation and communality (h2) and the average inter-item correlation (AIC) were scrutinised. The scale’s AIC was expected to be between 0.20 and 0.40,22 and items with h2<0.4 were deleted. An EFA using principal component analysis (PCA) was conducted. To determine the number of factors to retain, we retained factors with eigenvalue >1 and compared this to the scree plot. Next, both orthogonal and oblique rotations were conducted and the results were compared to help us understand the structure of the adapted scale.23 24 If both rotations yielded similar factor structures, we reported the orthogonal rotation’s version as it generally provides a more straightforward interpretation and maximises the variance across all factors.24 A factor loading of 0.40 was chosen as the threshold.25 In the case of cross-loaded items (an item with a factor loading difference between two or more factors <0.20),26 we decided to put the item into the appropriate factor according to the theory used in the development of the original version of HADS.23
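Two of the quantities above can be illustrated with simple arithmetic. For a scale with p items, Bartlett’s test of sphericity has p(p−1)/2 degrees of freedom, and the AIC is the mean of the off-diagonal entries of the item correlation matrix. The sketch below is illustrative only; the function names and the small correlation matrix are hypothetical.

```python
def bartlett_df(n_items):
    """Degrees of freedom of Bartlett's test of sphericity for n_items items."""
    return n_items * (n_items - 1) // 2


def average_inter_item_correlation(corr):
    """Mean of the off-diagonal entries of a symmetric correlation matrix."""
    p = len(corr)
    pairs = [corr[i][j] for i in range(p) for j in range(i + 1, p)]
    return sum(pairs) / len(pairs)


def aic_in_expected_range(aic, low=0.20, high=0.40):
    """Check the AIC against the expected range of 0.20 to 0.40."""
    return low <= aic <= high


print(bartlett_df(14))  # degrees of freedom for a 14-item scale

# Hypothetical 3-item correlation matrix, for illustration only
example_corr = [
    [1.0, 0.3, 0.4],
    [0.3, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]
aic = average_inter_item_correlation(example_corr)
print(round(aic, 2), aic_in_expected_range(aic))
```

For the 14 items of HADS, the degrees of freedom are 14×13/2=91.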

Internal consistency

Each subscale’s internal consistency was calculated using Cronbach’s alpha score. An alpha score of 0.70 or higher was deemed to indicate adequate reliability.27

Patient and public involvement

Patients or the public were not involved in the design, or conduct, or reporting, or dissemination plans of our research.

Results

Participants and demographic profiles

A total of 200 responses were collected and the response rate was 100%, as all participants invited consented to participate. The mean (SD) age of the participants was 42.41 (14.25) years. Most of the participants were female (n=109, 54.5%) and had graduated from middle school (n=120; 60%). Most of the women in our study were homemakers (n=85, 78% of the total women participants). Table 1 shows the demographic profile of the study’s population.

Table 1
Demographic profiles of the study’s participants (N=200)

Face validity

Similar to the original version of HADS, the HADS-Indonesia consists of 14 items (online supplemental appendix 1). The questionnaire was completed by all participants (n=35) within 2–5 min. Several amendments were made based on the 35 participants’ reviews of the items. Changes were made to item 4 (ie, I can laugh and see the funny side of things), item 5 (ie, worrying thoughts go through my mind), item 8 (ie, I feel as if I am slowed down), item 9 (ie, I get a sort of frightened feeling like ‘butterflies’ in the stomach), item 10 (ie, I have lost interest in my appearance), item 11 (ie, I feel restless as if I have to be on the move), item 13 (ie, I get sudden feelings of panic) and item 14 (ie, I can enjoy a good book or radio or TV programme). The changes to each item were applied to the item statement, to the response options, or both. This evaluation resulted in the final version of HADS-Indonesia used for the validation process which followed.

Convergent validity

A total of 41 participants completed the HADS-Indonesia, SAS and SDS scales. The Spearman correlation result indicated a significant correlation between the Indonesian version of HADS-Anxiety (HADS-A) and SAS (rs=0.45, p=0.030, N=41). As was true for the Indonesian version of HADS-A, there was a statistically significant correlation between HADS-Depression (HADS-D) and SDS (rs=0.58, p<0.001, N=41).

Test–retest reliability

After a 3-month interval, 41 participants completed the final version of HADS-Indonesia for the second time to measure the instrument’s intrarater reliability. ICC estimates and 95% CIs were based on a mean-rating, absolute-agreement, two-way-mixed effects model, which showed an ICC score of 0.98 (excellent reliability) for the whole scale. Each item’s ICC value ranged between 0.72 (moderate reliability) (item 1; ‘I feel tense or wound up’) and 0.96 (excellent reliability) (item 3; ‘I get a sort of frightened feeling as if something awful is about to happen’). Table 2 presents the ICC value of each item, each subscale and the total scale, indicating the test–retest reliability results in this study.

Table 2
Intraclass correlation value for HADS-Indonesia (N=41) using single-rating, absolute-agreement, two-way random effect model

Structural validity

The KMO (KMO=0.89) and Bartlett’s test of sphericity (χ2(91, N=200)=1052.38, p<0.001) analysis indicated sufficient data for EFA. All items had a communality of >0.40, and inter-item correlation showed an AIC of 0.36. A putative factor extraction using PCA yielded three factors with an eigenvalue greater than 1. However, the scree plot supported the thesis of two significant factors, meaning the third factor would explain an insignificant amount of variance and, therefore, was not retained (figure 1).

Figure 1

Scree plot confirmed a two-factor solution for HADS-Indonesia. HADS, Hospital Anxiety and Depression Scale.

An orthogonal (ie, varimax) rotation based on the two-factor model showed that all items loaded onto a factor (factor loading >0.40). However, items A7 (ie, I can sit at ease and feel relaxed) and D8 (ie, I feel as if I am slowed down) were cross-loaded onto both factors (D8: 0.47 to factor 1 and 0.46 to factor 2; A7: 0.47 to factor 1 and 0.47 to factor 2). Therefore, as the protocol of this study was to put cross-loaded items into the theoretically relevant factor, item A7 was put into factor 1 and item D8 into factor 2. After completing the factor analysis, we found that the HADS-Indonesia consisted of 14 items, like the original version. All the anxiety items converged into factor 1 and all the depression items gathered into factor 2; therefore, we labelled factor 1 as ‘Anxiety Subscale/HADS-A’ and factor 2 as ‘Depression Subscale/HADS-D’. Table 3 shows the result of the factor analysis in this study, including each item’s factor loading, communalities and descriptive statistics. The factor model explained 50.80% (40.40%+10.40%) of the variance in the data.
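The handling of the two cross-loaded items can be sketched as a small decision rule. The sketch below uses the loadings reported above for items A7 and D8; the function names are hypothetical, the item labelled A1 with its loadings is invented for contrast, and requiring both loadings to reach 0.40 before an item counts as cross-loaded is an assumption of this sketch.

```python
def is_cross_loaded(loading_1, loading_2, salient=0.40, max_gap=0.20):
    """True when both loadings are salient and differ by less than max_gap."""
    both_salient = min(loading_1, loading_2) >= salient
    return both_salient and abs(loading_1 - loading_2) < max_gap


def assign_factor(item, loading_1, loading_2):
    """Return 1 or 2; cross-loaded items follow their theoretical subscale."""
    if is_cross_loaded(loading_1, loading_2):
        # Items labelled A belong to anxiety (factor 1), D to depression (factor 2)
        return 1 if item.startswith("A") else 2
    return 1 if loading_1 > loading_2 else 2


reported = {"A7": (0.47, 0.47), "D8": (0.47, 0.46)}  # loadings reported above
for item, (f1, f2) in reported.items():
    print(item, is_cross_loaded(f1, f2), assign_factor(item, f1, f2))

# Hypothetical item with a clear primary loading, for contrast
print("A1", is_cross_loaded(0.75, 0.10), assign_factor("A1", 0.75, 0.10))
```

Both reported items are flagged as cross-loaded and are placed by theory: A7 in factor 1 and D8 in factor 2.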

Table 3
Principal component analysis using orthogonal (ie, varimax) rotation of HADS-Indonesia (N=200)

Internal consistency

The HADS-Indonesia consisted of 14 items with a two-factor model. The HADS-A subscale consisted of seven items (alpha=0.85), and the HADS-D subscale consisted of seven items (alpha=0.80). Table 4 presents each factor’s internal consistency and descriptive statistics.

Table 4
Descriptive statistics of HADS-Indonesia’s factors (N=200)

Discussion

Summary of findings

This study aimed to adapt the HADS scale for use in the Indonesian language and to provide evidence of its validity and reliability. It is a validation study of one of the most widely used instruments for measuring anxiety and depression in different settings. A face validity evaluation showed that HADS-Indonesia could be understood by, and was acceptable to, the target population. Next, this study showed an excellent convergent validity for each subscale when they were compared with SAS and SDS, as well as excellent test–retest reliability. PCA conducted on 14 items yielded a two-factor model for HADS-Indonesia while also maintaining the original HADS structure overall. Also, each factor showed good internal consistency.

Validity of HADS-Indonesia

This study evaluated three pieces of validity evidence for HADS-Indonesia: face validity, convergent validity and structural validity. The face validity evaluation of HADS-Indonesia showed that this scale was acceptable to the target population. Several phrases in the original version could not be directly translated into the Indonesian language. One example is item 9 (ie, I get a sort of frightened feeling like ‘butterflies’ in the stomach): the idiomatic phrase ‘butterflies in my stomach’ has no direct equivalent in the Indonesian language and was translated into Indonesian using the word for ‘nauseous’. Problems with translating HADS have been reported in the literature28 and could result in several problems impacting the validity of the adapted version. In our study, however, this problem seemed to be solved by a proper face validity evaluation involving the target population, which resulted in changes to item statements and answer options.

The convergent validity test of the HADS-Indonesia was conducted by comparing each subscale with SAS and SDS using Spearman correlation analysis. Spearman’s analysis showed a good correlation between the anxiety subscale of HADS-Indonesia and SAS, as well as between the depression subscale of HADS-Indonesia and SDS, thus providing HADS-Indonesia with evidence of its convergent validity. Different studies have compared HADS with SAS and SDS. To the best of our knowledge, all studies reported a positive correlation between the anxiety subscale of the HADS and SAS as well as between the depression subscale of the HADS and SDS.29–31 Our study, therefore, adds to the current body of literature, strengthening the evidence of HADS convergent validity.

An EFA using PCA provided construct validity evidence for HADS-Indonesia with a two-factor solution, which explained 50.80% of the total variance. Conceptualisation of the items loaded into each factor led us to label the first factor as the anxiety subscale (ie, reflecting the anxiety subscale of the original version/HADS-A) and the second factor as the depression subscale (ie, reflecting the depression subscale of the original version/HADS-D). In this study, two items were cross-loaded into both factors: item A7 (ie, ‘I can sit at ease and feel relaxed’) and item D8 (ie, ‘I feel as if I am slowed down’). Following the original version, item A7 was put into the anxiety subscale and item D8 into the depression subscale. A similar situation has been reported in several adaptations of the HADS, such as in Cantonese,32 Japanese33 and Canadian French.34 In terms of factor structure, our study’s EFA retained the original subscales developed by Zigmond and Snaith.6 This result adds to the body of literature in which the original subscales were retained, such as the validations conducted in the Italian population,35 the Norwegian and Swedish populations,36 and the Spanish population.37

Reliability of HADS-Indonesia

Two pieces of reliability evidence were evaluated (ie, internal consistency and test–retest reliability). Both factors showed an excellent internal consistency, 0.85 for HADS-A and 0.80 for HADS-D. The AIC also indicated a good correlation between items. To the best of our knowledge, the Cronbach’s alpha score of HADS-A in different populations has ranged from 0.76 (Portuguese translation)38 to 0.86 (Spanish translation).39 HADS-D had an excellent internal consistency, although it was slightly lower than HADS-A, ranging from 0.63 in the Hong Kong translation40 to 0.86 in the Iranian41 and Arab translations.42 In addition, our result was comparable to the findings of studies conducted in the general population or primary healthcare by el-Rufaie and Absood (Cronbach’s alpha of 0.78 for HADS-A and 0.86 for HADS-D)42 and Roberge et al (Cronbach’s alpha of 0.82 for HADS-A and 0.83 for HADS-D),34 and higher than that of a study conducted in Colombia (Cronbach’s alpha of 0.77 for HADS-A and 0.75 for HADS-D).43

The ICC showed excellent test–retest reliability (ICC=0.98). The ICC score showed that HADS-Indonesia was still reliable despite the 3-month evaluation period. In addition, all items showed a good ICC score (ICC=0.72 (item 1) to 0.96 (item 3)). One study by Quintana et al examined the reliability of HADS in the Spanish population.39 They reported a similar test–retest reliability score between 0.85 and 0.91.

Study limitations, implications and future research

To the best of our knowledge, this study is the first to provide multiple validity and reliability evidence of HADS in the Indonesian population. Although the best measurements possible have been conducted, this study has several limitations. First, HADS is a self-reported questionnaire and is therefore prone to social desirability bias. The study participants may have answered the questionnaire according to the values prevailing within their society, which may have prevented us from obtaining objective information. Second, this study was a preliminary study of HADS-Indonesia. Although HADS itself is a sound scale, which has been adapted to different populations and settings, this study is still the first research to evaluate its validity in the Indonesian population. Therefore, the validity and reliability evidence of HADS-Indonesia needs to be studied further.

Our results may have valuable implications for the body of literature on HADS. First, as this study provides the first validity and reliability evidence of HADS in the Indonesian population, it may open the possibility that HADS can also be adapted into other languages and used in relatively similar cultures, such as Malay-speaking and other cultures in the Southeast Asian region. Second, our study provides healthcare workers and researchers working in mental health and other fields with a valid and reliable instrument to measure anxiety and depression, which may assist the early detection of mental ill health conditions and help researchers in the study area. Third, these results open the possibility for further research. New studies using confirmatory factor analysis (CFA) are required to provide more robust validity evidence. Other validity studies, such as those examining discriminant validity, should be undertaken. Finally, as this study was conducted in a predetermined population, further replication studies are required, especially in other areas of Indonesia and in different settings, to evaluate HADS-Indonesia’s reproducibility and validity in these settings.

Conclusion

Our study provides validity and reliability evidence of HADS-Indonesia (online supplemental appendix 1). The face validity of HADS-Indonesia has been presented. Our research has also found convergent validity evidence for HADS-Indonesia, as its anxiety and depression subscales have a positive correlation with SAS and SDS, respectively. The construct validity of HADS-Indonesia has also been tested using EFA, which retained the original structure of HADS in a two-factor model. This study has also provided two forms of reliability evidence for the instrument: (1) test–retest reliability according to ICC analysis and (2) internal reliability based on Cronbach’s alpha score. Despite the results, our study was limited. It was only a preliminary study in which we did not provide other validity evidence, such as predictive validity or discriminant validity. Further reliability evidence is needed. Further studies in different settings are required to ensure HADS-Indonesia’s validity. A CFA and more validity testing and reliability analysis are recommended to test the fit of the HADS-Indonesia factor model and the instrument’s applicability in different settings.