Routine health management information system data in Ethiopia: consistency, trends, and challenges

ABSTRACT Background: Ethiopia is investing in the routine Health Management Information System. Improved routine data are needed for decision-making in the health sector. Objective: To analyse the quality of the routine Health Management Information System data and triangulate with other sources, such as the Demographic and Health Surveys. Methods: We analysed national Health Management Information System data on 19 indicators of maternal health, neonatal survival, immunization, child nutrition, malaria, and tuberculosis over the 2012–2018 time period. The analyses were conducted by 38 analysts from the Ministry of Health, Ethiopia, and two government agencies who participated in the Operational Research and Coaching for Analysts (ORCA) project between June 2018 and June 2020. Using a World Health Organization Data Quality Review toolkit, we assessed indicator definitions, completeness, internal consistency over time and between related indicators, and external consistency compared with other data sources. Results: Several services reported coverage of above 100%. For many indicators, denominators were based on poor-quality population data estimates. Data on individual vaccinations had relatively good internal consistency. In contrast, there was low external consistency for data on fully vaccinated children, with the routine Health Management Information System showing 89% coverage but the Demographic and Health Survey estimate at 39%. Maternal health indicators displayed increasing coverage over time. Indicators on child nutrition, malaria, and tuberculosis were less consistent. Data on neonatal mortality were incomplete and operationalised as mortality on day 0–6. Our comparisons with survey and population projections indicated that one in eight early neonatal deaths were reported in the routine Health Management Information System. Data quality varied between regions. Conclusions: The quality of routine data gathered in the health system needs further attention. 
We suggest regular triangulation with data from other sources. We recommend addressing the denominator issues, reducing the complexity of indicators, and aligning indicators to international definitions.


Background
A routine Health Management Information System (HMIS) ideally provides accurate, disaggregated, and real-time information from all health system levels to enable disease surveillance, activity monitoring, allocation of resources, and policy formation. It can also inform patients and provide feedback to professionals in the health-care system. In the absence of a well-functioning routine HMIS, most low- and middle-income countries rely upon survey data, such as the Ethiopian Demographic and Health Surveys (EDHS) [1,2]. Such studies usually produce valid and reliable information. However, these surveys are costly, retrospective, and intermittent, which makes their results less suitable for guiding current planning and policy formation [3]. In most cases, national surveys do not provide district-level data for health planning.
The Information Revolution was one of the four transformation agendas in Ethiopia's first Health Sector Transformation Plan [4,5]. The information revolution aimed to advance collection, analysis, presentation, and dissemination of information that could influence decision-making. A particular focus was given to the introduction of new information technology, including the computer software District Health Information System (DHIS2), used in over 60 countries [6,7]. However, many health facilities in Ethiopia lack the necessary infrastructure, such as reliable electricity. Although some structural data quality problems can be expected to improve with the increasing use of information technology, other issues may remain [8][9][10].
Health data are often expressed as prevalence or coverage and depend both on valid numerator information and on appropriate definition and assessment of the denominator. Data from the latest Ethiopian census in 2007 frequently serve as a basis for population estimates, which are derived using specific algorithms for each region [11,12, Ministry of Health, Ethiopia, personal communication, 2019].
Extensive routine data are collected in Ethiopia from all health-care levels and outside the healthcare system [13]. The routine HMIS, sometimes referred to as the Routine Health Information System (RHIS), includes any regular data collection conducted in the health system and community with an interval of less than 1 year [14]. In the routine HMIS in Ethiopia, over 1000 data elements are reported monthly and around 400 quarterly. Further data are included on specific diseases such as HIV, and on activities such as quality assurance. Some data elements are compiled into indicators for the routine HMIS at the health centre level and above [11]. In 2017, the Ministry of Health (MOH) increased the number of routine HMIS indicators from 122 to 131 [11] (Supplementary Table S1). Information from the lower levels is also forwarded to the Ethiopian Public Health Institute in the Public Health Emergency Management framework, including the Maternal and Perinatal Death Surveillance and Response, and in the supply and procurement systems to the Ethiopian Pharmaceutical Supply Agency (Supplementary Figure S1). Vital events, such as births and deaths, are reported to the Vital Events Registration Authority that forwards data to the Central Statistical Agency.
Here we report an assessment of the quality of Ethiopian routine HMIS data. We compared 19 indicator definitions with the EDHS definitions; assessed completeness, internal consistency of national routine HMIS data, and external consistency with data from other sources, mainly the EDHS; and identified strengths, challenges, and opportunities for improvements in the routine HMIS data in Ethiopia. The rationale for our study was the pivotal role that a well-functioning routine HMIS can play in allocating resources and planning health care.

Study area, study population, and selected indicators
This study targeted national and regional routine HMIS data from Ethiopia. Ethiopia is the second-most populous country in Africa, with an estimated population of 110 million [15,16]. After re-structuring in 2020, there are 10 administrative regions as well as the two city administrations of Addis Ababa and Dire Dawa. Regions are sub-divided into 98 zones and further into 923 districts (woreda), which in turn are divided into the lowest administrative unit (kebele), each having around 5000 inhabitants [17]. The MOH governs the health system with decentralized power to Regional Health Bureaux, which are responsible for management, coordination, and distribution of technical support to the lower levels. The health system has three levels (tiers): primary level (health posts, health centres, primary hospitals); secondary level (general hospitals); and third-level health care (specialised hospitals) [4]. Data flow starts at the point of service delivery, and data are compiled at the district, zonal, and regional offices before reaching the national level (Supplementary Figure S1).
Thirty-eight analysts from the MOH, the Ethiopian Public Health Institute, and the Ethiopian Pharmaceutical Supply Agency were selected for a two-year on-the-job capacity-development intervention, the Operational Research and Coaching for Analysts (ORCA) project. The ORCA project was initiated by the MOH and implemented by the London School of Hygiene & Tropical Medicine. Workshops, training, and facilitated analytical work took place in parallel to the participants' professional responsibilities. The work was performed in six thematic groups: Maternal Health, Neonatal Survival, Immunization, Child Nutrition, Malaria, and Tuberculosis. The ORCA thematic groups analysed 19 indicators and the data elements that contribute to the indicators. Data sources used are shown in Table 1.

Data collection and analysis
The routine HMIS data from the national level, the nine regions that existed at the time of the study, and from the two city administrations were available from the MOH. Each thematic group identified appropriate source documents for external comparisons, such as the EDHS [1,2] (Table 1). The EDHS 2016 data on neonatal deaths were disaggregated to show early neonatal deaths by age in days. We also used data from the Health Commodities Management Information System (HCMIS) [4]. Data on pharmaceutical drugs and vaccines were expressed as standard person doses. For malaria, we compared the routine HMIS data with the World Health Organization (WHO) annual World Malaria Reports for 2015-2018 [18].
We used the second module of the WHO Data Quality Review: a toolkit for facility data quality assessment, and Excel sheets for data analysis, which were prepared using the same operational definitions as the WHO toolkit (see below) [6,19]. The Ethiopian calendar differs from the Gregorian calendar by 7-8 years, and these differences were considered in all comparisons (Table 1).

Study design and operational definitions
The WHO toolkit [19] provides a method for analysing routine HMIS data using four dimensions of data quality. The first dimension concerns whether data are available (completeness, timeliness). The second looks at the internal consistency of routine HMIS data compared over time and between indicators that could be expected to have a relation, such as the number of women coming for antenatal care visits, deliveries and newborn vaccinations. The third dimension concerns the external consistency when routine HMIS data are compared with data from other sources such as the number of vaccinations compared to the supply of vaccines, or the ratio of routine HMIS performance over coverage in population surveys. The fourth dimension compares population estimates. In this study, we assessed the data quality dimensions of completeness, internal consistency, and external consistency and used population estimates to predict births.
Definitions of indicators. Indicator definitions were analysed using the HMIS Indicators Reference Guide [11] and compared with the indicators used by the EDHS [1].
Data Quality Review dimension 1. Completeness of data. We analysed the completeness of routine HMIS data for the 12 months of 1 year for the 14 indicators of all thematic groups, except malaria and tuberculosis. Completeness of data elements was defined as the presence of the reported aggregated data for the specified month.
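The completeness definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `completeness`, the use of `None` to mark a month without a report, and the example series are hypothetical and are not taken from the WHO toolkit or the Excel sheets used in the study. The point it shows is that a reported zero counts as present data, whereas a missing report does not.

```python
def completeness(monthly_reports):
    """Return the share of months for which an aggregated value was reported.

    Assumption: None marks a month with no report; 0 is a reported value.
    """
    reported = [value for value in monthly_reports if value is not None]
    return len(reported) / len(monthly_reports)


# Hypothetical 12-month series for one indicator (not study data).
example_series = [12, 9, None, 0, 14, 11, None, 10, 13, 0, 8, 15]

share = completeness(example_series)
print(f"Months with reported data: {share:.0%}")
```

In this hypothetical series, 10 of the 12 months carry a reported value, including the two months with a reported zero.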
Data Quality Review dimension 2. Internal consistency of reported data (presence of outliers, consistency over time, and consistency between related indicators). A value within 2 to 3 standard deviations from the mean for the indicator over 12 months was considered a moderate outlier. A value of 3 or more standard deviations from the mean was considered an extreme outlier. The index year's performance was divided by the average of the preceding 3 years to represent the consistency over time for each selected routine HMIS indicator or data element. The quality range was set at ±33%. Consequently, we defined consistency over time as a ratio from 0.66 to 1.33 and refer to ratios in this interval as acceptable, or 'within the quality range.' Using the WHO toolkit and based on the participants' assumptions for each indicator, we considered whether the trend for each indicator was expected to be consistent, decreasing, or increasing. The indicators in the routine HMIS that were expected to have a logical relationship, such as fourth antenatal care visit and skilled birth attendance, or measles vaccinations and fully vaccinated, were used to evaluate the consistency between related indicators. The quality range for consistency between related indicators was set at ±10%.
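A minimal Python sketch of the outlier and consistency-over-time rules described above may make the thresholds concrete. It is not the toolkit's implementation: the function names are hypothetical, the use of the sample standard deviation is an assumption (the text does not state which form was used), and the annual figures in the second example are invented. The monthly counts are the Amhara row of Table 3b.

```python
from statistics import mean, stdev


def classify_outliers(monthly_values):
    """Label each monthly value as 'none', 'moderate', or 'extreme'.

    Moderate: at least 2 but less than 3 standard deviations from the mean.
    Extreme: 3 or more standard deviations from the mean.
    Assumption: the sample standard deviation is used.
    """
    centre = mean(monthly_values)
    spread = stdev(monthly_values)
    labels = []
    for value in monthly_values:
        distance = abs(value - centre) / spread if spread > 0 else 0.0
        if distance >= 3:
            labels.append("extreme")
        elif distance >= 2:
            labels.append("moderate")
        else:
            labels.append("none")
    return labels


def consistency_over_time(index_year, preceding_three_years):
    """Return the ratio to the 3-year average and whether it lies in 0.66-1.33."""
    ratio = index_year / mean(preceding_three_years)
    return ratio, 0.66 <= ratio <= 1.33


# Early neonatal deaths in the community, Amhara region, months 1-12 (Table 3b).
amhara = [8, 4, 34, 19, 10, 13, 14, 17, 10, 13, 8, 42]
outlier_labels = classify_outliers(amhara)

# Hypothetical annual values: index year versus the three preceding years.
ratio, within_range = consistency_over_time(120, [100, 90, 110])
```

Under these assumptions, only month 12 of the Amhara series is flagged, as a moderate outlier, and the hypothetical index year gives a ratio of 1.2, which falls within the quality range.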
Data Quality Review dimension 3. External consistency of reported data. The routine HMIS indicators or data elements were compared with data from other relevant data sources, mainly the EDHS [1,2]. The quality range for external consistency was set at ±33%.
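The external consistency check can be illustrated in the same way. The sketch below assumes that the ±33% quality range is applied to the ratio of the routine HMIS value over the external value; the function name and the `tolerance` parameter are hypothetical. The example uses the coverage of fully vaccinated children given in the abstract (89% in the routine HMIS, 39% in the Demographic and Health Survey).

```python
def external_consistency(routine_value, external_value, tolerance=0.33):
    """Return the routine/external ratio and whether it is within 1 +/- tolerance."""
    ratio = routine_value / external_value
    return ratio, abs(ratio - 1) <= tolerance


# Fully vaccinated children: routine HMIS coverage versus survey estimate (%).
ratio_vaccinated, in_quality_range = external_consistency(89, 39)
print(f"Ratio: {ratio_vaccinated:.2f}, within quality range: {in_quality_range}")
```

With these figures the ratio is about 2.28, far outside the quality range.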

Definitions and alignment of indicators and data elements used
We compared definitions and reporting periods for the 19 routine indicators and data elements between the routine HMIS and the EDHS (Table 2). The four maternal health indicators were well aligned with the corresponding indicators in the latest EDHS. However, the routine HMIS defined a skilled birth attendant as a facility delivery with a skilled attendant who is a nurse, midwife, health officer, or doctor trained in deliveries, but not a health extension worker. In contrast, the EDHS 2016 categorised birth attendants, including the health extension worker, while the place of delivery was described in a separate indicator [1,11].
The definitions of indicators regarding neonatal mortality were poorly aligned. The routine HMIS reported neonatal deaths as deaths on days 0-6 after birth, but did not include deaths from day 7-28. The data element 'total number of births in the same kebele' was used as the denominator to determine the community's early neonatal death rate. The total number of births in the same kebele included all births, whether institutional or at home [11]. In the EDHS, a neonatal death was defined as a death occurring on day 0-28 [1].
The definition of fully vaccinated children was similar in the routine HMIS and the EDHS at the time of the comparisons. Since then, some new vaccines have been added to the HMIS definition of fully vaccinated (Table 2). The EDHS presented figures of all basic vaccinations assessed in the group 12-23 months, and 'vaccinated by appropriate age,' which can be compared to the routine HMIS data for fully vaccinated children <1 year. The denominators used in the routine HMIS for coverage of the third pentavalent vaccination, measles vaccine, and 'fully vaccinated' were the estimated number of surviving infants for all indicators (Ministry of Health, Ethiopia, 2020, personal communication). This algorithm-based estimate was the practice despite guidelines that stated the total number of surviving infants should be used for measles vaccine and 'fully vaccinated', implying actual children who survived their first birthday. The HMIS defined Vitamin A supplementation as two doses in 1 year, whereas the EDHS reported data on one dose of Vitamin A in 6 months. Also, there were minor differences in age groups. The HMIS tuberculosis indicators were numerous [16] and complex, with highly specific numerators and denominators, such as 'latent tuberculosis infection treatment coverage for under 5-years children who are contacts of pulmonary TB cases' (Supplementary Table S1). For malaria and tuberculosis, there were no indicators in the EDHS.

Data quality review dimension 1: completeness of data
Data on immunization in 2017/18 were complete, as were data on child nutrition indicators for 2015/16. Data on skilled birth attendance and first and fourth antenatal care visits in 2017/18 were complete, but postnatal care showed 2 months with missing data, both from the same region (Table 3(a)). Data on early neonatal deaths in the community for 2014/2015 were not reported from one city administration (Table 3(b)).

Data quality review dimension 2: internal consistency
Presence of outliers
Data on postnatal care showed a few outliers (Table 3(a)), including one extreme outlier with a recorded value of zero. Data on births (data not shown) and early neonatal deaths in the community (Table 3(b)) were both prone to outliers. For Vitamin A supplementation, there were both extreme and moderate outliers, and outliers were also present for deworming. Outliers were often seen in month six and month 12, corresponding to reporting periods.

Consistency over time
The routine HMIS indicator data for 1 year were compared to the average of the preceding 3 years for all 19 indicators (Supplementary Table S2). There was consistency over time for maternal health and immunization in most regions and city administrations. In contrast, neonatal health and child nutrition indicators showed consistency over time in less than half of the regions or city administrations (Figure 1). All regions but one showed an expected positive trend in maternal health indicators. However, all remained within the quality range of ±33% of the average of the three preceding years (Supplementary Table S2).
No region or city administration showed consistency over time for all selected indicators. The consistency over time ranged from six to 12 out of the 19 indicators (Figure 2).

Consistency between related indicators
There was a particular inconsistency in denominators: only 0.7 million births in the same kebele were recorded, yet 2.4 million early postnatal care visits and 2.7 million children vaccinated with the third dose of pentavalent vaccine were reported in 2017/2018. The internal consistency between children vaccinated with the third dose of pentavalent vaccine and measles was in the quality range in all regions. The number of treated tuberculosis cases was higher than the number of new and relapse cases in two regions, and the reverse was seen in one region (data not shown).

Data quality review dimension 3: external consistency
The number of women attending four antenatal care visits varied in a non-systematic way. Nevertheless, most regions reported higher numbers in the routine HMIS than were recorded in the EDHS (Figure 3(a)). Similar patterns were noted for the first antenatal care visit and early postnatal care (data not shown). The immunization indicators generally showed good agreement when comparing the routine HMIS with the Ethiopian Demographic and Health Survey (Figure 3(b)). Nevertheless, this external consistency was lower for the indicator 'fully immunized children' (Figure 3(c)) and for 'vaccinated with the third dose of the pentavalent vaccine' and for the number of vaccine doses distributed (Supplementary Figure S2). The child nutrition indicators, such as deworming, showed weak consistency when comparing the routine HMIS with the Ethiopian Demographic and Health Survey 2016 (Figure 3(d)). Indicators suggested a decreasing incidence of malaria over time, but the number of prescribed antimalarials was higher than malaria cases reported in the routine HMIS (data not shown).

Data quality review dimension 4: external comparison of population data
The 0.7 million births reported in the 2017/2018 routine HMIS were compared with the number of births expected from the crude birth rate of 32.6 per 1000 population [15]. With Ethiopia's population of 110 million [15,16], the expected number of births was 3.6 million per year, and hence we estimate that only 19% of births were reported in the routine HMIS.
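The arithmetic behind this estimate can be reproduced with a short Python sketch. The variable names are illustrative; the inputs are the rounded figures quoted in the text, so the resulting share is approximate and lands between 19% and 20% depending on where rounding is applied.

```python
# Rounded inputs as quoted in the text.
population = 110_000_000        # estimated population of Ethiopia
crude_birth_rate = 32.6         # births per 1000 population per year
reported_births = 700_000       # births reported in the routine HMIS, 2017/2018

# Expected annual births from the population and the crude birth rate.
expected_births = population * crude_birth_rate / 1000

# Share of expected births that appear in the routine HMIS.
reported_share = reported_births / expected_births

print(f"Expected births: {expected_births / 1e6:.1f} million")
print(f"Share reported in routine HMIS: {reported_share:.1%}")
```

Using the expected births rounded to 3.6 million, as in the text, gives 0.7 / 3.6, or about 19%.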

Discussion
This study of routine HMIS data quality in Ethiopia showed quality problems for all indicators, especially compared to external information sources. Indicators regarding some aspects of maternal health care and immunization were mostly complete and consistent over time. Indicators on child nutrition, malaria, and tuberculosis were more prone to outliers, less consistent over time, and showed major differences when triangulated with other information sources. Most notably, the indicators on births and neonatal mortality were incomplete and had very low internal and external consistency. We also identified regional differences in the quality of the routine HMIS data.

Indicator definitions and reporting guidelines
The indicators and data elements reflect an ambition to improve health care and a desire to capture new interventions. However, including a large number of complex indicators and data elements may also contribute to the burden of reporting and the risk of errors. The HMIS Indicators Reference Guide for routine HMIS [11] provides information in English on the definition of indicators. Efforts are ongoing to translate the national guidelines into the major languages used in Ethiopia, and this is likely to increase the understanding of indicator definitions at the health system's lower levels.
Several issues on reporting may need to be addressed: double-reporting, if the first visit by a pregnant woman for antenatal care was registered at a health post and again at a health centre, or overreporting, if an antenatal care visit close to the expected date of delivery is recorded as the fourth visit irrespective of the number of visits. Another example of a reporting issue is vitamin A supplementation. There may be a lack of clarity over whether the number reported represents the [...]

Table 3b. Total number of early neonatal deaths at community level by region and month in the routine Health Management Information System, July 2014-June 2015. Internal consistency. Outliers within 2-3 standard deviations from the mean value per region or city administration are underlined.

Area                          1     2     3     4     5     6     7     8     9    10    11    12
Addis Ababa (city)            -     -     -     -     -     -     -     -     -     -     -     -
Afar region                   2     0     0     2     5     2     4     3     4     0     1     0
Amhara region                 8     4    34    19    10    13    14    17    10    13     8    42
Benishangul-Gumuz region      3     1    16     6     5     4     5     6    20     1    11    25
Dire Dawa (city)              0     0     0     0     0     0     0     0     0     0     0     0
Gambela region                0     0     0     6     0     1     0     0     0     1     0     3
Harari region                 0     0     0     0     0     0     0     0     0     0     0     1
Oromia region                58    43    44    82    22    59    28    21    14    28    16    24
Somali region                 -     -     -     0     -     -     0     0     0     1     0     0
SNNP region                1041    43  1701  2008    20     9   160    13    17    24 [...]

[...] [20][21][22]. The WHO toolkit provided a useful method for analysing data. Triangulating routine HMIS data can increase the awareness of quality problems and help go beyond analysing the accuracy of reporting within the routine HMIS [23,24]. To some extent, discrepancies in external consistency may be due to reporting errors that affect indicators to varying degrees in the surveys used for comparison. A woman likely remembers where she last gave birth, but the exact number of antenatal visits may be more challenging to capture [25].
Integrated and continuous surveying by permanent survey teams has been suggested as a model for countries aiming at improving their routine HMIS [26]. An initial step could be to use the already existing surveys for these comparisons, although the long interval between surveys would hamper the analyses. One recent scoping review on child immunization in Ethiopia identified considerable discrepancies in reports from various sources, and used this information to identify research priorities, consulting an expert panel of stakeholders [27].
Some strengths of this study are that we used a WHO data quality review toolkit [18] and that the work was led by analysts familiar with the routine HMIS data. Implementation research is likely more efficient when conducted by the usual implementation agencies, as in this study [28]. We compared results for indicators defined differently by different but compatible sources, and we explored these differences in indicator definitions. Timeliness is one aspect of data quality that was not addressed in this study. The newly introduced vital events registration was not assessed. Also, we did not evaluate data quality at the district and facility levels.

Conclusion
We analysed the consistency of routine HMIS data and identified strengths and challenges in Ethiopia's routine HMIS data. We conclude that the internal consistency varied between indicators and regions. In general, internal consistency in the routine HMIS was better for indicators on maternal health and immunization than for other indicators. Internal consistency was better than external consistency, where routine HMIS data were compared with data from other sources, mostly survey data. The lack of external consistency suggests quality problems in the routine HMIS data that go beyond correct reporting. We also conclude that the uncertainty of population estimates makes a major contribution to discrepancies. We suggest future reality-checks with triangulation of routine HMIS and data from other sources and alignment of routine HMIS indicators with those of the EDHS to increase comparability. Together with the ongoing digitalisation that is part of the Information Revolution brought forward by the MOH, our suggestions may improve the routine HMIS data quality. In Ethiopia and globally, improved routine HMIS data are pivotal to achieving universal health coverage [29].