Cause-specific mortality in Africa and Asia: evidence from INDEPTH health and demographic surveillance system sites

Background: Because most deaths in Africa and Asia are not well documented, estimates of mortality are often made using scanty data. The INDEPTH Network works to alleviate this problem by collating detailed individual data from defined Health and Demographic Surveillance sites. By registering all deaths over time and carrying out verbal autopsies to determine cause of death across many such sites, using standardised methods, the Network seeks to generate population-based mortality statistics that are not otherwise available.
Objective: To build a large standardised mortality database from African and Asian sites, detailing the relevant methods, and use it to describe cause-specific mortality patterns.
Design: Individual demographic and verbal autopsy (VA) data from 22 INDEPTH sites were collated into a standardised database. The INDEPTH 2013 population was used for standardisation. The WHO 2012 VA standard and the InterVA-4 model were used for assigning cause of death.
Results: A total of 111,910 deaths occurring over 12,204,043 person-years (accumulated between 1992 and 2012) were registered across the 22 sites, and for 98,429 of these deaths (88.0%) verbal autopsies were successfully completed. There was considerable variation in all-cause mortality between sites, with most of the differences being accounted for by variations in infectious causes as a proportion of all deaths.
Conclusions: This dataset documents individual deaths across Africa and Asia in a standardised way, and on an unprecedented scale. While INDEPTH sites are not constructed to constitute a representative sample, and VA may not be the ideal method of determining cause of death, nevertheless these findings represent detailed mortality patterns for parts of the world that are severely under-served in terms of measuring mortality. Further papers explore details of mortality patterns among children and specifically for NCDs, external causes, pregnancy-related mortality, malaria, and HIV/AIDS. Comparisons will also be made where possible with other findings on mortality in the same regions. Findings presented here and in accompanying papers support the need for continued work towards much wider implementation of universal civil registration of deaths by cause on a worldwide basis.

The vast majority of deaths in sub-Saharan Africa and southern Asia are neither individually registered nor assigned a cause of death. Data from WHO (1) show that, apart from in a few countries, the coverage of routine vital registration including cause of death in Africa and Asia is minimal and thus such cause-specific mortality data that do exist generally come from health facility records and ad hoc surveys. Therefore, when global estimates of cause-specific mortality are made, the data contributed from Africa and Asia are inevitably patchy and outcomes depend heavily on modelling assumptions that create huge uncertainty (2). As a result, very little is accurately known about mortality patterns in these regions, but nonetheless policy, practice, and investment decisions are made that supposedly depend on knowledge of death rates and causes.
The INDEPTH Network (International Network for the Demographic Evaluation of Populations and their Health) is an umbrella organisation for a number of independent centres operating health and demographic surveillance system (HDSS) sites, most of which are located in sub-Saharan Africa and Asia (3). These HDSS operations were started at various times and cover a range of defined rural and urban populations. Basic requirements in all the sites include registering all deaths occurring within the defined populations, and carrying out verbal autopsy (VA) procedures (interviews with relatives, care-givers, and witnesses after deaths have occurred, the results of which are subsequently interpreted into likely causes of death).
Any process of attributing cause of death, ranging from pathologists' post-mortems through hospital cause of death records, physician certificates, and verbal autopsies, involves a combination of expertise and evidence (4). Consequently, all causes of death data also incorporate some degree of uncertainty, which may include both systematic and random variations. Undertaking VA interviews and attributing causes of death are complex processes which need to be standardised as far as is possible. A WHO-led process resulted in new standard procedures for VA in 2012 (5), in terms of defining questions that need to be included in VA interviews and VA cause of death categories corresponding to the International Classification of Diseases 10th Edition (ICD-10) (6). A detailed review of that process, building on previous VA materials, is available (7). A new version of the InterVA model for interpreting cause of death from VA data was also released in 2012 (8), which exactly corresponds to the WHO 2012 VA standard in terms of VA questions and cause of death categories.
Consideration of the absolute validity of any cause of death data is also complex. A number of studies have made comparisons between pathologists' post-mortems and hospital records (9–11); others have compared validity between hospital records and mortality registers (12), with varying degrees of concordance. In some cases, VA findings have been compared with hospital records, but this approach has been hampered by the generally small and unrepresentative proportions of deaths actually occurring in hospitals located in populations where VAs are used (13–15). Many studies have compared the use of automated VA coding models with results from physicians coding the same VA material (often termed physician-coded verbal autopsy (PCVA)), several of which have involved using InterVA models (16). However, there can be difficulties in separating differences that arise from possible systematic errors in models and more random inter- or intra-physician variations (17). For some specific causes of death, there can be absolute standards for comparison, for example, ante-mortem HIV or sickle cell status, but this only applies to a minority of causes (18,19). Many cause of death processes do not allow the attribution of uncertainty at the individual level. Although assigning a death as being 100% due to a particular cause simplifies further analyses, in reality there is a range of certainty associated with individual cause of death assignments, depending on the extent of available information for a particular case, as well as other factors. The probabilistic modelling used by InterVA-4 facilitates the capture of this uncertainty for each individual case, with the possibility of attributing part of a death as being of indeterminate cause (8).
Despite the absence of widespread and reliable cause of death registration across Africa and Asia, much can be learnt about mortality patterns by considering standardised VA findings from sites where such data are routinely collected at the population level on a longitudinal basis. Mortality surveillance within circumscribed populations leads to findings based on every death that occurs. Consequently, cause-specific fractions total 100%, without the difficulties that some modelled estimates have encountered of needing to impose an overall mortality envelope. Furthermore, the advantages of consistency over time and place offered by VA models in assigning cause of death is particularly relevant for large epidemiological studies such as reported here, even though physicians might arguably follow a more nuanced approach in assigning individual causes of death.
The objectives of this introductory paper are to describe a large VA dataset compiled across a range of INDEPTH HDSS sites in Africa and Asia together with details of the overall methods used, as well as to report key findings on overall patterns of mortality and highlight areas of specific interest which have been examined in more detail in accompanying papers. Specifically, childhood mortality (20) and adult non-communicable disease (NCD) mortality (21), plus mortality from external causes (22) and associated with pregnancy (23), have been explored in more detail. Malaria (24) and HIV/AIDS-related (25) mortality, being two highly significant causes that vary considerably between sites, are also documented separately. Overall, findings and ways forward are brought together in a concluding synthesis (26).
The publication of this series of papers coincides with depositing the overall cause of death dataset (27) into the public domain at the INDEPTH Data Repository (28). Half of the HDSSs involved are already part of INDEPTH's iSHARE programme (www.indepth-ishare.org), and already have other individual-level population surveillance data in the public domain. The remaining sites have aggregated population data publicly available as part of the associated INDEPTHStats programme. As agreed by the INDEPTH Network Board, a separate set of anonymised identifiers has been used for the public domain cause of death data as distinct from other individual-level data, in order to guard against identity disclosure risks. Enquiries relating to specific research plans that would need to link the individual cause of death data with other individual-level data can be made to the INDEPTH Secretariat.

Populations and methods
Cause of death data based on VA interviews were contributed by 14 INDEPTH HDSS sites in sub-Saharan Africa and eight sites in Asia, located as shown in Fig. 1. The principles of the Network and its constituent population surveillance sites have been described generically (3), and detailed descriptions of the individual sites involved, including local attributes, are available elsewhere (29–50). Each HDSS site is committed to long-term longitudinal surveillance of circumscribed populations, typically each covering around 50,000–100,000 people. Households are registered and visited regularly by lay field-workers, with a frequency varying from once per year to several times per year. All vital events are registered at each such visit, and any deaths recorded are followed up with VA interviews, usually undertaken by specially trained lay interviewers. A few sites were already operational in the 1990s, but in this dataset 95% of the person-time observed related to the period from 2000 onwards, with 68% from 2006 onwards. Two sites, in Nairobi and Ouagadougou, followed urban populations, while the remainder covered areas that were generally more rural in character, although some included local urban centres. Sites covered entire populations, although the Karonga, Malawi, site only contributed VAs for deaths of people aged 12 years and older. Because the sites were not located or designed in a systematic way to be representative of national or regional populations, it is not meaningful to aggregate results over sites. Therefore, site-specific, cause-specific mortality fractions (CSMFs) and mortality rates (CSMRs) were used as the basis for analyses and comparisons. Since each site encompassed an entire non-sampled population, it was not meaningful to consider confidence intervals around site-specific measurements. Uncertainty around individual assignments of cause of death was however incorporated into the dataset as described below.
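The two site-specific measures named above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used to build the dataset: the function name, the cause labels and the numbers are hypothetical, and death counts are allowed to be fractional because a single death may be shared between causes.

```python
from typing import Dict, Tuple


def csmf_and_csmr(
    deaths_by_cause: Dict[str, float], person_years: float
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (CSMF, CSMR per 1,000 person-years) for a single site."""
    total_deaths = sum(deaths_by_cause.values())
    if total_deaths <= 0 or person_years <= 0:
        raise ValueError("need positive deaths and person-years")
    # CSMF: share of all deaths attributed to each cause (sums to 1).
    csmf = {c: d / total_deaths for c, d in deaths_by_cause.items()}
    # CSMR: deaths from each cause per 1,000 person-years of observation.
    csmr = {c: 1000.0 * d / person_years for c, d in deaths_by_cause.items()}
    return csmf, csmr


# Hypothetical site: fractional deaths and person-years are made-up values.
example_deaths = {"infections": 420.5, "NCDs": 310.0, "indeterminate": 69.5}
example_csmf, example_csmr = csmf_and_csmr(example_deaths, person_years=100_000.0)
```

Because every registered death contributes to exactly one site total, the CSMFs for a site sum to one without any external mortality envelope being imposed.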
Because there are possible inter-site and inter-year variations in the age–sex structure of the populations concerned, it was also important, at least for some comparisons, to be able to adjust CSMFs and CSMRs to a standard population. Due to the different age and sex profiles of many causes of death, if this was not done then observed variations in CSMFs and CSMRs could have been due to (or masked by) differences in population structure between sites and over time. For this purpose, we took the INDEPTH 2013 standard population structure for low- and middle-income countries (LMICs) in Africa and Asia (51), as shown in Table 1. This public-domain standard population has been presented in relation to other global standards such as Segi and WHO, from which it differs in reflecting the higher fertility and younger-age mortality rates commonly seen in LMIC populations (51). As shown in Table 1, this standard has a very similar structure to the aggregated population from which these VA data came. Using the INDEPTH 2013 standard, encompassing lower income populations in both Africa and Asia, meant that the overall effect of standardisation across the whole mortality dataset was minimal, while resulting in important adjustments for certain sub-groups. Thus, calculating a standardised weight for every combination of site, age group, sex, and year made it possible to compare cause-specific mortality without concern for differences in underlying population structures (referred to hereafter as age–sex–time standardisation). Using the same INDEPTH 2013 standard, it will be possible to directly compare future work on these same lines with any mortality data from Africa and Asia.
All causes of death assignments in this dataset were made using the InterVA-4 model version 4.02 (8). InterVA-4 uses probabilistic modelling to arrive at likely cause(s) of death for each VA case, the workings of the model being based on a combination of expert medical opinion and relevant available data. InterVA-4 is the only model currently available that processes VA data according to the WHO 2012 standard and categorises causes of death according to ICD-10. Since the VA data reported here were collected before the WHO 2012 standard was formulated, they were all retrospectively transformed into the WHO 2012 and InterVA-4 input format for processing. The phrase 'successfully completed VA interview' means that a VA interview was undertaken which yielded basic personal characteristics of the deceased (age, sex, etc.) and some symptoms relating to the final illness. The InterVA-4 'high' malaria setting was used for all the West African sites, plus the East African sites (with the exceptions, on the grounds of high altitude, of Nairobi, Kenya, and Kilite-Awlaelo, Ethiopia); other sites used the 'low' setting. The InterVA-4 'high' HIV/AIDS setting was used for sites in Kenya, Malawi, and South Africa; for all other sites the 'low' setting was used. These settings were chosen in line with InterVA recommendations and previous experience, and are discussed further in the specific papers on malaria and HIV/AIDS-related mortality (24,25).
The InterVA-4 model was applied to the data from each site, yielding, for each case, up to three possible causes of death or an indeterminate result. In a minority of cases, for example, where symptoms were vague, contradictory or mutually inconsistent, it was impossible for InterVA-4 to determine a cause of death, and these deaths were attributed as entirely indeterminate. For the remaining cases, one to three likely causes and their likelihoods were assigned by InterVA-4, and if the sum of their likelihoods was less than one, the residual component was then assigned as being indeterminate. This was an important process for capturing uncertainty in cause of death outcome(s) from the model at the individual level, thus avoiding over-interpretation of specific causes. As a consequence there were three sources of unattributed cause of death: deaths registered for which VAs were not successfully completed; VAs completed but where the cause was entirely indeterminate; and residual components of deaths attributed as indeterminate.
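The handling of individual uncertainty described above can be illustrated with a short Python sketch. It is not InterVA-4 itself and does not use its actual output format: the function name, the record layout and the example causes are assumptions made for illustration. It only shows how up to three cause likelihoods plus a residual indeterminate share can be made to sum to one for each death, with a single record for a death whose VA was not completed.

```python
from typing import List, Optional, Tuple

Record = Tuple[str, float]  # (cause label, likelihood)


def death_to_records(
    va_completed: bool,
    causes: Optional[List[Record]] = None,
    tol: float = 1e-9,
) -> List[Record]:
    """Split one registered death into records whose likelihoods sum to 1."""
    if not va_completed:
        # First source of unattributed cause: no successfully completed VA.
        return [("VA not completed", 1.0)]
    causes = causes or []
    if len(causes) > 3:
        raise ValueError("at most three causes per death are expected")
    assigned = sum(p for _, p in causes)
    if assigned > 1.0 + tol:
        raise ValueError("likelihoods sum to more than one")
    records = list(causes)
    residual = 1.0 - assigned
    if residual > tol:
        # Entirely indeterminate (no causes) or a residual indeterminate share.
        records.append(("indeterminate", residual))
    return records


# Hypothetical cases covering the three sources of unattributed cause.
partly_determined = death_to_records(True, [("malaria", 0.6), ("pneumonia", 0.25)])
fully_indeterminate = death_to_records(True, [])
no_va = death_to_records(False)
```

Under this layout each death yields between one and four records, which matches the record structure of the compiled dataset.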
An overall dataset (27) was compiled in which each case had between one and four records, each with its own cause and likelihood. Cases for which VAs were not successfully completed had single records with the cause of death recorded as 'VA not completed' and a likelihood of one. Thus, the overall sum of the likelihoods equated to the total number of deaths. Each record also contained a population weighting factor reflecting the ratio of the population fraction for its site, age group, sex, and year to the corresponding age group and sex fraction in the standard population described in Table 1, for the purposes of standardisation. Then a further factor was calculated for each record as the product of the VA cause likelihood and the population standard weighting (both described above), which could be used as the basis for calculating age–sex–time standardised CSMFs and CSMRs.
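As a rough sketch of how such a per-record factor could feed a standardised CSMF, the Python below multiplies each record's likelihood by a stratum weight and normalises the per-cause sums. Two points are assumptions rather than statements from the text: the weight is taken as the standard-population fraction divided by the observed fraction (the usual direct-standardisation form; the text specifies only that the weight reflects a ratio between the two fractions), and a single site-year is shown, so strata are indexed by age group and sex only. All names and numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple

Stratum = Tuple[str, str]  # (age group, sex)


class CauseRecord(NamedTuple):
    cause: str
    likelihood: float  # VA cause likelihood carried by this record
    stratum: Stratum   # age group and sex of the deceased


def standard_weights(
    observed_fraction: Dict[Stratum, float],
    standard_fraction: Dict[Stratum, float],
) -> Dict[Stratum, float]:
    """ASSUMPTION: weight = standard fraction / observed fraction."""
    return {s: standard_fraction[s] / observed_fraction[s] for s in observed_fraction}


def standardised_csmf(
    records: Iterable[CauseRecord], weights: Dict[Stratum, float]
) -> Dict[str, float]:
    """Sum likelihood x weight per cause, then normalise to fractions."""
    totals: Dict[str, float] = defaultdict(float)
    for r in records:
        totals[r.cause] += r.likelihood * weights[r.stratum]
    grand_total = sum(totals.values())
    return {cause: t / grand_total for cause, t in totals.items()}


# Hypothetical two-stratum population for one site-year.
observed = {("0-4", "F"): 0.2, ("65+", "F"): 0.8}
standard = {("0-4", "F"): 0.5, ("65+", "F"): 0.5}
weights = standard_weights(observed, standard)
records = [
    CauseRecord("malaria", 1.0, ("0-4", "F")),
    CauseRecord("stroke", 0.8, ("65+", "F")),
    CauseRecord("indeterminate", 0.2, ("65+", "F")),
]
csmf = standardised_csmf(records, weights)
```

Dividing the same weighted per-cause sums by person-years, instead of by the weighted total, would give standardised CSMRs.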
These descriptions of methods used to construct this multisite dataset (27) apply to the following series of analytical papers using the dataset (20–25). A standard Box summarising these methods is included in each of these papers.
In this context, all of these data are secondary datasets derived from primary data collected separately by each participating site. In all cases, the primary data collection was covered by site-level ethical approvals relating to on-going health and demographic surveillance in those specific locations. No individual identity or household location data were included in the secondary data and no specific ethical approvals were required for these pooled analyses.

Results
A total of 111,910 deaths occurring over 12,204,043 person-years were registered across the 22 sites. For 98,429 of these deaths (88.0%), VAs were successfully completed. Figure 1 includes the numbers of deaths, completed VAs and person-time observed for each site. Among the 98,429 completed VAs, InterVA-4 was unable to reach any conclusive cause of death (i.e. arrived at 100% indeterminate outcome) in 4,680 (4.8%) of cases. Residual indeterminate fractions totalled 7,545.9 (7.7%) of completed VAs. Thus, out of the total of 111,910 deaths recorded, specific causes were successfully assigned to 86,203 deaths (77.0%). Age–sex–time standardisation made less than 1% difference to the overall dataset (112,653 standardised deaths compared with 111,910 observed deaths) but was particularly important for some sites, for example, the urban slum population in Nairobi, where the population structure differed markedly from the standard. Table 2 shows mortality rates per 1,000 person-years by age group and time period for each of the 22 participating sites. Figure 2 shows age–sex–time standardised mortality rates for each site, by major categories of cause of death (infections, neoplasms, NCDs, maternal/neonatal, trauma, and indeterminate). The indeterminate category includes cases where VAs were not done, as well as indeterminate components reflecting individual uncertainty in cause of death assignment. All-cause age–sex–time standardised mortality rates in individual sites ranged from 18.5 per 1,000 person-years in Kisumu, Kenya, to 3.9 per 1,000 person-years in FilaBavi, Vietnam. A large part of this variation was accounted for by differences in infectious causes of death (10.7 per 1,000 person-years in Kisumu to 0.5 per 1,000 person-years in FilaBavi).
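The headline counts above are linked by simple arithmetic, which the following Python lines retrace. The figures are those reported in the text; the variable names are chosen here purely for illustration.

```python
registered_deaths = 111_910
completed_vas = 98_429
fully_indeterminate = 4_680       # completed VAs with a 100% indeterminate outcome
residual_indeterminate = 7_545.9  # summed indeterminate shares of the other VAs
standardised_deaths = 112_653

# Deaths with a specific cause: completed VAs minus both indeterminate components.
specific_cause_deaths = completed_vas - fully_indeterminate - residual_indeterminate

va_completion_pct = 100 * completed_vas / registered_deaths
fully_indeterminate_pct = 100 * fully_indeterminate / completed_vas
residual_pct = 100 * residual_indeterminate / completed_vas
specific_cause_pct = 100 * specific_cause_deaths / registered_deaths
standardisation_shift_pct = (
    100 * (standardised_deaths - registered_deaths) / registered_deaths
)
```

Rounded to one decimal place these reproduce the reported 88.0%, 4.8%, 7.7% and 77.0%, with about 86,203 deaths assigned specific causes and a standardisation shift of under 1%.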
A number of sites reflected low overall mortality rates as a consequence of being at stages of demographic transition where life expectancy increases as health standards improve, but where population proportions of elderly people remain relatively low.

Discussion
This dataset documents individual deaths across sub-Saharan Africa and southern Asia on a hitherto unprecedented scale. In addition, because the deaths are recorded in the context of longitudinal surveillance operations, it is also possible to determine population-based mortality rates, overall and by year, cause, age group, and sex. The application of the WHO 2012 VA standard (7) and the InterVA-4 (8) model to these data enabled assignment of cause of death in a consistent manner, and, where appropriate, age–sex–time standardisation of rates enabled systematic comparisons between sites.
This dataset provided opportunities for a wide range of more detailed analyses, as reflected in the following series of papers. Apart from looking in more detail at causes of death within obvious population sub-groups such as neonates, infants, and under-5s (20), it is also interesting to consider the detail within some of the cause groups shown in Figure 2. It would appear that population-based rates of NCD mortality are relatively constant across the sites, compared with the variations in overall mortality, which seem to be more strongly driven by the magnitude of infectious causes (21). Although there is much reported on so-called epidemics of NCDs in LMICs, this may partly reflect relatively large proportions of NCD mortality in populations also experiencing historically low levels of overall mortality, particularly in Asia. These low overall rates are demographically driven in populations experiencing rapid life expectancy increases, but not yet having accumulated larger proportions of older people. Nevertheless, concerns over current accumulations of NCD risk factors are very valid in relation to future NCD mortality. These results on NCD mortality may therefore constitute an important baseline measurement against which to judge future developments. External causes of death (intentional and unintentional, self-inflicted and imposed) also constitute an increasingly large component of mortality in various places and age groups, which have been explored further in this dataset (22). This was also a sufficiently large dataset to look in more detail at some specific causes of death, such as pregnancy-related deaths (23) and infectious causes such as malaria (24) and HIV/AIDS (25). Availability of the dataset in the public domain will facilitate further analyses of specific causes and patterns of mortality.
Although HDSS sites such as those contributing data here can be critiqued in terms of how representative they may be of surrounding areas, this consideration has to be offset against the unique opportunities HDSSs have of assessing mortality patterns for complete communities, rather than the more commonly used health facility mortality records. Most deaths in Africa and Asia do not occur in health facilities, and it is by no means evident that facility-based deaths are very representative of mortality in general. Cause of death as determined by VA may also be regarded as less than ideal, but it is the only viable method for large-scale cause of death assignment in Africa and Asia. Here, the WHO 2012 VA standard and the InterVA-4 model have been used to ensure, as far as possible, that there is consistency across the sites and time periods involved, which is not guaranteed with physician interpretation of VA. Many sites also undertake physician interpretation of their VAs, which may differ in detail from these results, and be reported separately; we are not making any comparisons with physician findings here. We acknowledge that it was a compromise to have to transform VA data collected using a range of historical instruments, but that was unavoidable. Most of the instruments used had evolved from earlier WHO and INDEPTH VA versions with much common core content, which were themselves the starting point for the development of the WHO 2012 standard. The application of age–sex–time standardisation, using the INDEPTH 2013 population standard, further enabled comparisons of mortality over time and place to be made objectively.
A Population Health Metrics Research Consortium (PHMRC) study aimed to collect a 'gold standard' VA dataset from selected tertiary institutions, which has been used both to build VA models and evaluate different approaches to VA cause of death assignment (52). Although that study concluded that InterVA-4 was less effective than PHMRC methods in assigning cause of death, that conclusion was reached by comparing the internal validity of PHMRC methods within the 'gold standard' dataset against the external validity of InterVA-4 and physician coding (53). Issues with the quality of the PHMRC VA data, the use of different VA questions and deviations from ICD-10 classifications further compromised those findings, but nevertheless InterVA-4 coding of the PHMRC data still demonstrated an overall concordance correlation of 0.61. Since InterVA-4 is the only available VA model which corresponds to the WHO 2012 VA standard and ICD-10 coded causes of death, it was the preferred choice to use here.
A number of sites contributing data to the overall dataset have undertaken site-specific analyses of their mortality patterns which are reported separately (54–66). In some countries (Bangladesh, Ghana, Burkina Faso, Kenya, South Africa) there were multiple sites involved which present interesting opportunities to consider within-country variations. This also facilitates comparisons with other national sources of data such as Demographic and Health Surveys and Global Burden of Disease outputs. To some extent, it was also possible to look at trends in mortality over time, although that was limited by the different time periods over which individual sites have been operating. Since more sites have reported for recent periods, there may be findings here of interest in terms of trends towards the 2015 deadlines for Millennium Development Goals. These may also serve as baseline figures for post-2015 targets.
Most of the detailed comparisons made between results from this dataset and comparable figures from various other sources of estimates, explored in the accompanying papers, showed a high degree of congruence. Given that the methodologies involved (counting individual deaths at INDEPTH sites and aggregating upwards, contrasted with taking available data sources and constructing global models to derive national estimates) are completely different, this congruence in findings adds plausibility to both approaches. Nevertheless, it must still be recognised that moving towards complete civil registration of deaths, including cause of death, is a critical objective yet to be achieved in Africa and Asia (67).

of the Witwatersrand Rural Knowledge Hub to analyse and draft these results was supported by the European Community Marie Curie Actions IPHTRE project (no. 295168). icddr,b is thankful to the Governments of Australia, Bangladesh, Canada, Sweden and the UK for providing core/unrestricted support. The Ouagadougou site acknowledges the Wellcome Trust for its financial support to the Ouagadougou HDSS (grant number WT081993MA). The Kilite Awlaelo HDSS is supported by the US Centers for Disease Control and Prevention (CDC) and the Ethiopian Public Health Association (EPHA), in accordance with the EPHA-CDC Cooperative Agreement No.5U22/PS022179_10 and Mekelle University, though these findings do not necessarily represent the funders' official views. The