Historical development of the statistical classification of causes of death and diseases

Abstract: This paper offers an historical overview of international mortality/healthcare classification systems, covering developments from the International List of Causes of Death (ILCD) through to the International Classification of Diseases (ICD). The ICD is a global data system established to classify diseases and mortality causes. The past few decades have seen a dramatic increase in use of the ICD, paralleling its improved efficiency and integration into the health information management (HIM) arena. The ICD, published by the World Health Organization (WHO) since 1948, is the successor to ILCD-5 and assigns codes to every health diagnosis. The 10th revision of the WHO International Statistical Classification of Diseases and Related Health Problems (ICD-10) is the latest version, and the 11th is currently under development. A clinical classification and coding schedule is essential for improving and refining clinical data systems in numerous ways, including treatment selection, cause-of-death reporting, eligibility selection, the facilitation of health insurance claims, data storage, health service evaluation, health policy, the management of epidemiological diseases, resource allocation and the reduction of potential costs. All these contribute to proper development and planning within healthcare services. ICD has become the universal standard.


ABOUT THE AUTHOR
This paper is a small portion of a chapter in a larger project (Factors Influencing the Implementation of Clinical Coding/ICD in Saudi Public Hospitals). Together, the research team for this project has examined the knowledge of large-scale implementations of the ICD across health systems in the developed and developing world. The research group has published the questionnaire design and theoretical framework development used in the project 1,2 . In this paper, the authors provide a general overview of the historical development of disease classification. We review the beginnings of the WHO's ICD in the statistical classification of mortalities, before the emergence of national modifications of ICD classification. This has also been published in a separate paper 3 , based on the literature drawn from several developed nations, and the ICD-implementation literature of Thailand and its neighbouring Asia-Pacific network of developing nations.

PUBLIC INTEREST STATEMENT
The clinical coding process involves translating a physician's clinical documentation relating to the diagnoses and interventions of individual patient cases into codes, according to a basic classification schedule. Thus, it is regarded as an essential tool in the improvement of healthcare in that it provides feedback based on statistical compilations and analyses of, for example, disease occurrence, medication and procedure success, and recovery rates. In addition, it provides background demographic data on a geographical or individual basis. The coding of information transcends language barriers, enabling data to be collected and analysed at a global level. In current practice, such as in hospital settings, the primary purpose of coding relates to the submission of health insurance and medical aid reimbursement claims, whereas use of the data in statistics and research is secondary.

Introduction
In its early forms, the International Classification of Diseases (ICD) was established to classify causes of mortality and to support research. The past few decades have seen a dramatic increase in ICD use in relation to this purpose. This expanded use has contributed to greater efficiency of the classification system in healthcare via its integration in health information management (HIM). This is essential to improving clinical data systems in numerous ways: treatment selection, cause-of-death reporting, eligibility selection, facilitating health insurance claims, data storage, health service evaluation, health policy, resource allocation, potential cost reduction methods and managing epidemiological diseases, as happened when the novel Covid-19 outbreak erupted into a full-scale pandemic (Alharbi et al., 2020). From this perspective, the current paper intends to remedy gaps in the literature by outlining the development history of this classification from the original to the latest version (generically referred to as ICD-10).
The impact of causes-of-death statistics has been to open a new area of medicine, namely public health, together with an understanding of the social causes and consequences related to disease. This approach has spread rapidly from the United Kingdom (UK) and Europe, across the Atlantic to North America, and then to Australia, New Zealand and South America, also reaching the developing world (Moriyama et al., 2011). The clinical statistical classification in its modern iteration involves translating the physician's clinical documentation on the diagnoses and interventions of individual patient cases into codes, according to a basic classification schedule. It is regarded as an essential tool in the improvement of healthcare, providing feedback based on statistical compilations and analyses of, for example, disease occurrence, medication and procedure success, and recovery rates. Additionally, it provides background demographic data on a geographical or individual basis. The coding of information transcends language barriers, enabling the collection and analysis of data globally.
The World Health Organization's (WHO, 1978b) Declaration of Alma-Ata produced a greater awareness of socioeconomic inequalities in health. Developed nations, along with some developing nations, began taking responsibility for global health funding and assisting poorer nations. This included responses to worldwide health crises, such as the human immunodeficiency virus and acquired immune deficiency syndrome (HIV/AIDS). This global epidemic created a significant burden for providers of healthcare worldwide (De Maeseneer et al., 2008). In addition, the United Nations' (UN) Millennium Development Goals (MDG) 4 and 5 targeted the poor state of maternal and child health in many poorer countries, and simultaneously highlighted the benefits of intensified scale-ups based on clinical statistics evidence. The 2015 MDG outcomes showed that since 1990, the worldwide under-five childhood mortality rate had reduced by more than 50% and maternal mortality by 47%; the mortality rates of HIV, malaria, and other diseases reduced by 40%. In many regions, reductions were achieved late in the given period, as health information analytical methods were refined to reveal neglected areas that required intense scale-ups (Way, 2015).
The continuous development of health information power, skills, and statistical analysis and methodology is reflected in the expanded coverage, detail, functionality and potential uses of ICD-10 (Lozano et al., 2011). This expansion includes the practical use of classifications in primary healthcare for the origination and storage of individual data, as well as its transmission to pharmacies and health insurers to expedite prescribed medicines, payments and reimbursements. Using appropriate health information technology (HIT) configured by HIM systems and professionals, ICD-10 has been integrated into all levels of healthcare. This article, as mentioned earlier, provides a brief review that encompasses the historical development of ICD from its inception to its latest version.

The milestone: reviewing the history of International Classification of Diseases development
Unlike political, social and economic history, which are open to debate and contestation between different schools of thought, the history of the classification of diseases and causes of death is factual and universally accepted. The primary documents of the latter stages of this development after World War II (WWII), from ICD-6 to ICD-10, are available from the WHO, national Centres for Disease Control and Prevention, the Australian Consortium for Classification Development, the German Institute of Medical Documentation and Information and many other organisations. Secondary factual sources have generally been used to outline the earlier development of the ICD. The historical literature on the ICD is not "criticised" here, but is presented as background, to enrich readers' understanding of the magnitude of the accumulated knowledge and experience. The researchers summarise the literature in this unique paper.

Methods
The collected publications were thoroughly examined to provide a comprehensive literature review. Publications were found via the PubMed, ProQuest, Embase and Google Scholar databases. Related studies in the English language were extracted, based on title and abstract screening, with no date filter. The review evaluated articles pertaining to ICD in healthcare. The researchers also carried out a general review of the literature by examining extant studies on the classification of causes of death and diseases. We evaluated peer-reviewed articles, reports and articles pertinent to the topic in order to gain a deeper understanding. Other sources include the primary documentation on ICD from the WHO and national healthcare organisations, as well as information from the websites of consultancies, vendors, training organisations and national health information management organisations.

What is classification?
Classification entails the systematic arrangement of items into groups or classes according to certain criteria (Beldiman, 2008). Thompson (2003) contended that a basic form of classification is involved in the survival of all animals: "The ability to classify is common to all animals, for to survive animals must group other organisms into at least three classes: Those to be eaten, those to be avoided and those to associate with, especially members of their own class" (p. 788). Scientific classification goes a step further in that it includes the hierarchical arrangement of elements within each class according to governing criteria. Scientific classification reflects observed reality in a modelled structure based on the nomenclature or terminology of the system. In biology, the taxonomic ranks of species, genus, family, order, class, phylum, kingdom and domain are universally agreed upon as the structural nomenclature of the classification.

A historical overview of classifying deaths and diseases
This section outlines the history of mortality/morbidity classification systems, from the medical science of ancient times through to the statistical approaches that accompanied the emerging public health field in the nineteenth century, together with an understanding of the social causes and consequences related to disease. Thereafter, it traces the development of international approaches, from the first International List of Causes of Death (ILCD) to the ICD-10, which has become the "standard diagnostic tool for epidemiology, health management and clinical purposes" globally (WHO, 2019a, para. 1). Recent decades have seen a dramatic increase in use of the ICD as a multi-functional healthcare information resource, paralleling HIT developments that have enabled online practices of information storage, retrieval, the emergence and sharing of electronic health records, and health information exchange.

Foundations of nosology: from ancient Greece to the Renaissance
In Classical Greece (c. 510-323 BCE), Hippocrates (c. 460 BCE-375 BCE), and the later Roman physician Galen (c. 129 CE-210 CE), produced a lasting classification of diseases based on the effects of external forces on the equilibrium of four bodily humours: blood, yellow bile, black bile and phlegm. This ancient classification of diseases into four basic classes, which persisted until the Renaissance in Europe (c. 14th to 17th centuries CE), constitutes the foundation of nosology, the branch of medical science concerned with disease classification (Kalachanis & Michailidis, 2015).
While this humour-based approach persisted, the first recognised classification of diseases structured according to contemporary principles of scientific empiricism was the Universa Medicina, published in 1554. This was the supreme work of French physician Jean Fernel, acknowledged as the founder of physiology, who classified diseases according to organ (Moriyama et al., 2011). Thomas Sydenham, the "English Hippocrates", published Opera Omnia in 1676 (Moriyama et al., 2011; Pearn, 2011; Poynter, 1973). This was an early classification of interventions. Ancient procedures used to restore balance between the humours, such as bleeding, cupping and leeching, continued to form part of the practice of so-called "barber surgeons" until the end of the nineteenth century (Hart, 2001).
In the eighteenth century, the Swede Carolus Linnaeus (famed for his botanical taxonomy) also classified the animal and mineral kingdoms, and attempted the same for diseases. His contemporaries (physicians) who focused on disease included F. Boissier de la Croix de Sauvages, Jean-Louis Marc Alibert and Erasmus Darwin (Moriyama et al., 2011; Pearn, 2011; Poynter, 1973). In his treatise Nosologia Methodica, Sauvages applied similar principles to Linnaeus' taxa, or units, applicable to all levels from kingdom to subspecies. He developed 10 classes, systematically subdivided into some 300 orders, according to medical symptoms (Poppensiek & Budd, 1966). William Cullen's Synopsis Nosologiae Methodicae was published in 1775, followed in 1817 by John Mason Good's A Physiological System of Nosology, which played an important role in the development of disease nomenclature. These scholars implemented pragmatic changes, instigating a morphological classification system that supported a pathology based on anatomical structure, and facilitating an understanding of epidemic diseases (Moriyama et al., 2011). As Thompson (2003) notes, historically, nomenclature and classification reflect the scientifically observed model that has developed parallel to the discipline.

Nineteenth-century mortality statistics and the emergence of public health
A seventeenth-century forerunner in the emergence of the statistical classification of mortality was John Graunt. Working from the London Bills of Mortality, he estimated that 36% of children died before reaching the age of six. Graunt foreshadowed the nineteenth-century focus on gathering statistics on the causes of mortality, which culminated in the ILCD (Coiera, 2003).
In 1839, William Farr, a physician employed as a British government statistician, compiled a classification of mortalities. This formed part of the First Annual Report of the Registrar-General of Births, Deaths and Marriages. In Farr's eclectic threefold classification, communicable diseases formed the first class, based on their level of risk; sporadic diseases classified by organ comprised the second class; the final class comprised diseases of uncertain origin, which included tumours, unaccountable sudden death and dementia. Farr strove continuously to reflect the broader social determinants of health in his classifications (Farr, 1885; Hare, 1883). His 1837 mortality report included a comment on 63 deaths resulting from "starvation": "Hunger destroys a much higher proportion than is indicated by the registers in this and every country, but its effects, like the effects of excess, are generally manifested indirectly in the production of diseases of various kinds" (Whitehead, 2000, p. 87). Farr's socially dynamic mortality system led to the establishment of public health as a branch of medicine (Atkinson, 1993; Franklin et al., 2008). Farr's statistical approach demonstrated how inferences drawn from health statistics may be used to improve healthcare.
The first authoritative reference on the terminology of diseases was the Nomenclature of Diseases. This represented the culmination of 12 years' work. It was published by the Royal College of Physicians in 1868 and revised frequently until its last edition in 1959. An editorial in the Indian Medical Gazette of 1877 described its universal recognition, stating that there could be no disputing that the Royal College of Physicians of London deserved the gratitude of the noble profession of medicine and the world for publishing the invaluable reference work. This publication marked a turning point in the history of medicine, providing a reference point for medical professionals in various countries to compare and enhance their knowledge (Nomenclature of Diseases, 1877).
Classification systems underwent very little development until Europe was well into the Renaissance; however, the progress achieved prior to Farr's intervention highlights the reciprocal bond between scientific practice, its nomenclature and its classification. While classification is limited by contemporary medical knowledge, it nonetheless dictates medical practice (Jutel, 2011).

The impact on causes of death of the new industrial cities
The Great Exhibition 4 was held in London, England, in 1851. The Crystal Palace-a large exhibition hall-was constructed especially for this event; its glass and iron construction promoted technology as a way to improve the quality of life. Behind the monumental façade of the building's design and engineering genius, many doctors saw social deficiencies in the emerging industrial cities linked to it, and to the exhibition, symbolically. As William Farr had described the industrial city of Manchester in 1846: "In the midst of a population unmatched for its energy, industry, and manufacturing skill, 13 362 children perished in seven years, over and above the mortality natural to mankind" (Rose, 1971, p. 23). The medical profession had grasped the value of a statistical approach to disease; the contrast between urban development and increased death rates provided the impetus that led to the First International Statistical Congress (ISC) in Brussels in 1853. One area advocated for international collaboration was the causes of death. Achille Guillard, recognised as the founder of demography, proposed the standardisation of nomenclature in the fields addressed by the congress; further, William Farr and the Swiss Marc D'Espine were tasked with developing a uniform international classification of mortalities (Jetté et al., 2010). These two statisticians presented separate lists at the second congress, held in Paris two years later. D'Espine produced a list based on symptoms, while Farr persisted in his categories, which he extended to five. The congress eventually accepted a compromise of the two approaches and produced a list that was then continually revised for its biennial assemblies. However, these never received full international acceptance. A notable resolution was passed by the ISC in 1855, requiring that physicians reporting mortalities use the official international nomenclature (Moriyama et al., 2011).
The ISC developed into the International Statistical Institute (ISI). At an ISI meeting, held in Vienna in 1891, French statistician and demographer Jacques Bertillon, Chief of Statistical Services of the City of Paris, was elected to chair a committee tasked with developing a classification of causes of death (Ferenc, 2013; Gersenovic, 1995). The Bertillon Classification of Causes of Death was based on the principle established by Farr of categorising general diseases separately from those relating to specific organs or anatomical sites. Bertillon's main classes moved from general diseases through diseases related to specific organs, to malformations, specific diseases of infancy and, finally, to diseases with external causes and those insufficiently defined (Moriyama et al., 2011). This classification system was adopted by the ISI at its Chicago meeting in 1893, marking the ILCD's inception. It was adopted by the American Public Health Association in 1898 for use in the United States of America (US), Canada and Mexico, with the proviso that it should be revised every 10 years (Elkin, 2012).
The value of international collaboration was demonstrated by the statistical analysis and establishment of the source of a series of cholera outbreaks in France; these outbreaks spread to neighbouring European countries and Britain around the time Bertillon's classification was released (Bowker, 1996). In the late nineteenth century, the cholera bacillus caused a series of epidemics, spread by pilgrims returning from Mecca. Before that, travelling on foot or by sailing ship, pilgrims would succumb to the disease before returning to France; after the advent of more efficient travel modes, such as rail and steamboat, people could return more quickly, bringing infection with them. Increased international communication in the 1890s promoted awareness of this problem; consequently, the need to monitor health at the international level was acknowledged (Bowker, 1996). After the initial ILCD, five further versions were produced, with ILCD-5 released in 1938 (WHO, 1967). At the end of WWII, the UN was established, immediately followed by the formation of its specialised agencies, including the WHO in 1948 (Moriyama et al., 2011). An outline of the development from ILCD to ICD-10 and the periodicity of revisions is shown in Figure 1.

The international classification of diseases
The WHO was mandated to assume responsibility for international medical classifications. The ILCD was superseded by the International Statistical Classification of Diseases and Related Health Problems, conventionally known as the ICD, which included both a "causes of mortality" and a morbidity classification. Thus, ICD-6, adopted in 1948, is the successor to ILCD-5 (WHO, 1967).
Each step of the progression from ILCD to ICD-10 was based on decisions made at an international revisionary conference. Prior to the fourth ILCD conference, the classifications for diseases and causes of death were regarded as separate entities. This separation was challenged at the fifth revision conference, where Canadian delegates presented the Standard Morbidity Code, published in 1936 by the Canadian Dominion Council of Health (Lancaster, 2012; WHO, 2004).

The development from ICD-6 to ICD-10
The US Committee on Joint Causes of Death was established in 1945, with a mandate to establish guidelines on how to confirm the main cause of death in cases in which several causes were listed on a death certificate. The committee ultimately proposed a combined classification for diseases and deaths. At the sixth revision conference, ICD-6 became the first WHO revision, and the first classification to combine diseases and injuries with causes of death. ICD-6 comprised three tabulated lists, classified aetiologically, with 3-character numeric categories and 4-character subcategories that could be accessed through a separate alphabetical index (Moriyama et al., 2011).

Figure 1. Development from ILCD to ICD-10 and the periodicity of revisions. Source of data: WHO.
In 1951, prior to the seventh revision conference, the first WHO Centre for Classification of Diseases was established in the General Register Office of England and Wales, London. The conference maintained the same structure and content for ICD-7, and focused on eliminating initial errors and inaccuracies present in ICD-6. The release of ICD-7 resulted in its broader use, particularly in the US, where it began to be used for the diagnostic indexing of hospital patient clinical records. Israel and Sweden also developed national adaptations, and the Pan American Health Organization developed a Spanish translation of the US ICD-7 adaptation for use in Latin American hospitals (German Institute of Medical Documentation and Information, 2016; Moriyama et al., 2011).
While the basic structure and classificatory principles were maintained, ICD-8 was influenced by the national adaptations of ICD-7. Major adjustments were made to the categories of infective, parasitic, circulatory and perinatal diseases, together with mental disorders, congenital malformations and injuries resulting from accidents, poisoning and violence. The index was mandated to the US National Centre for Health Statistics (Moriyama et al., 2011).
The ninth revision conference received two major recommendations. The first was from specialists expressing the need to retrieve medical records for clinical research. The second was from physicians involved in medical care programs in which emphasis was given to individual patient conditions, rather than an aetiological perspective. As a solution, certain conditions in ICD-9, released in 1977, were classified twice and the "dagger-and-asterisk" system was introduced. This facilitated the classification of diseases of specific organ systems, together with an underlying general disease. For example, tuberculous meningitis is classified under meningitis with a dagger-and-asterisk cross-reference to tuberculosis. A further addition to the WHO classification body was the trial publication of supplementary classifications of "Impairments and Handicaps", and "Procedures in Medicine" (WHO, 1978a, para. 7). The original WHO procedure classification was known as the International Classification of Procedures in Medicine (ICPM).
Specialist adaptations of the basic classification were considered for oncology, dentistry and ophthalmology. As an example of a medical specialisation requiring more detail than the general format, the oncology adaptation structure (known as the ICD-O) includes the topography, morphology and behaviour of neoplasms described by a 4-digit topography code, a 4-digit histology code and a 1-digit code for behaviour (WHO, 1976).
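The three-part structure described above can be illustrated with a short sketch. It assumes the conventional written form of an ICD-O morphology code, in which the 4-digit histology code and the 1-digit behaviour code are separated by a slash (for example, 8140/3); the function name, the class name and the paraphrased behaviour labels are illustrative choices and are not taken from the WHO publication.

```python
import re
from typing import NamedTuple

# Behaviour digits as commonly used in ICD-O morphology codes.
# The labels are paraphrased for illustration, not quoted from the WHO text.
BEHAVIOUR_LABELS = {
    "0": "benign",
    "1": "uncertain whether benign or malignant",
    "2": "in situ",
    "3": "malignant, primary site",
    "6": "malignant, metastatic site",
    "9": "malignant, uncertain whether primary or metastatic",
}

# Assumed written form: four histology digits, a slash, one behaviour digit.
_MORPHOLOGY_SHAPE = re.compile(r"([0-9]{4})/([0-9])")


class Morphology(NamedTuple):
    histology: str        # 4-digit histology code
    behaviour: str        # 1-digit behaviour code
    behaviour_label: str  # human-readable label, if the digit is recognised


def parse_morphology(code: str) -> Morphology:
    """Split a morphology string such as '8140/3' into its two parts."""
    match = _MORPHOLOGY_SHAPE.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not shaped like an ICD-O morphology code: {code!r}")
    histology, behaviour = match.groups()
    label = BEHAVIOUR_LABELS.get(behaviour, "unrecognised behaviour digit")
    return Morphology(histology, behaviour, label)


if __name__ == "__main__":
    print(parse_morphology("8140/3"))
```

The sketch checks only the shape of the code; it does not confirm that a given histology number exists in any published edition of the ICD-O.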
The expert committee investigations into alternative classification structures that preceded the tenth revision conference confirmed that the traditional arrangement required no improvement. Attention was focused on achieving the optimum balance of multiple purposes, and on allowing for future expansion without structural disruption to the existing codes (WHO, 1986). The introduction to Volume 2 of the ICD-10 outlines its remit: "The purpose of the ICD is to permit the systematic recording, analysis, interpretation, and comparison of mortality and morbidity data collected in different countries or areas and at different times" (WHO, 2004, p. 3). Thus, the ICD was intended for statistical purposes, whether at the district, national or global level. The paragraph continues: "The ICD is neither intended nor suitable for indexing of distinct clinical entities. There are also some constraints on the use of the ICD for studies of financial aspects, such as billing or resource allocation" (WHO, 2004, p. 3).
The WHO ICD-10 comprises three volumes: Volume 1 contains the tabulated lists of the 3- and 4-character subcategories, as well as introductory texts. Volume 2 contains a general introduction to ICD-10, an overview of the classification's history, and the rules on how to code mortality and morbidity, with numerous examples. Volume 3 comprises the alphabetical index, a wide collection of encoded diagnoses, and the unwanted effects of drugs and chemical substances, as well as the causes of injuries and poisoning (Jiang et al., 2009). The WHO procedures classification system, ICPM, was not successful because most countries preferred to use their own national procedure codes. No procedure classification accompanied ICD-10; however, in 2012, the WHO began work on the International Classification of Health Interventions. As of 2019, this remains in a beta version (WHO, 2019b).
The development from the ICD-9 format of 3- to 5-character codes to 3- to 7-character codes in ICD-10 presents an exponential development in the number of potential codes (WHO, 2007). This expansion is due to the need for greater clinical specificity. As an example, what may have been described previously as an arm muscle injury is now explained as an injury to the right arm bicep. The revision conference also initiated a mechanism for continuous updating, which has been implemented annually since 1996 (Moriyama et al., 2011). The basic classification in the form of a single list of three alphanumeric character codes, structured by category from A00 to Z99, is used for reporting data to the WHO mortality database to facilitate international compatibility. ICD-10 consists of 21 chapters. The first alphabetical character of the code is a letter linked to a specific chapter (Coiera, 2015).
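The shape of the basic three-character category, and of a four-character subcategory, can be sketched as a simple pattern check. The sketch below assumes a leading letter followed by two digits (A00 to Z99) and, optionally, a single further digit written after a dot; the dot separator, the function name and the sample value A17.0 are illustrative assumptions, and the check does not consult the published tabular list.

```python
import re
from typing import Optional, Tuple

# Assumed shape: one letter, two digits, then an optional ".digit" subcategory.
_ICD10_SHAPE = re.compile(r"([A-Z])([0-9]{2})(?:\.([0-9]))?")


def split_icd10(code: str) -> Tuple[str, str, Optional[str]]:
    """Return (chapter_letter, three_character_category, subcategory_or_None).

    Only the shape is checked; a well-shaped string is not necessarily
    an assigned code in the classification.
    """
    match = _ICD10_SHAPE.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not shaped like an ICD-10 category code: {code!r}")
    letter, digits, subcategory = match.groups()
    return letter, letter + digits, subcategory


if __name__ == "__main__":
    for sample in ("A00", "Z99", "A17.0"):
        print(sample, "->", split_icd10(sample))
```

Keeping the three-character category as a separate field reflects its role, described above, as the common level at which data are reported for international comparison.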

Conclusion
The recording of cause-of-death statistics introduced a new field of medicine, public health, concerned with the social causes and consequences related to disease. It spread rapidly from the UK and Europe to North America, followed by Australia, New Zealand, South America and, eventually, to the developing world. International mortality classification using ILCD was introduced in 1893 to monitor the causes of death, with five revisions implemented between 1900 and 1938 (ILCD-1 to ILCD-5). A new format combining morbidity classifications with the existing mortality lists commenced in 1948 when the WHO began to oversee the system and released the sixth revision. This included a name change to the ICD in the same year. Relatively minor changes were made in the WHO seventh and eighth revisions, while the US, Sweden and Israel made adaptations for indexing hospital diagnostic data. In 1977, the WHO published the ICD-9, which included an expansion into 4-digit level categories, some optional 5-digit level subcategories, and dagger-and-asterisk entries that enhanced the clinical perspective regarding the treatment of individual patients and opportunities for clinical research.
ICD-10, with diagnostic codes of between three and seven characters in length, was released and adopted in 1992 by WHO member nations. This iteration's major strength has been the significant expansion of the classification and corresponding codes to include an unprecedented level of specificity. This has facilitated greater accuracy in billing and costing, as well as in data specificity for research and statistical purposes.
Overall, post-WWII international collaboration underlies the evolution from ICD-6 to ICD-10. The classifications of morbidity and mortality were first combined in a single edition, with access through an alphabetical index, and a user instruction volume was subsequently added. An approach emphasising the importance of ICD data in research followed, resulting in the addition of dagger-and-asterisk cross-referencing on the advice of a panel of specialists, as well as the extension of the basic ICD version for several specialist areas of medicine.