Assessment of corona virus (COVID-19) infection spread pattern in Iraq using GIS and RS techniques

Abstract Iraq and other nations have suffered fatalities as a result of the COVID-19 pandemic, which started in China. In Iraq, the coronavirus outbreak started on February 24, 2020, in Najaf province. Further cases of COVID-19 were then detected in other governorates, and the overall number of confirmed infections in Iraq reached 2,114,313, including 24,267 deaths, as of January 16, 2022. This study aims to identify infection patterns using data mining applications with Remote Sensing (RS) and Geographical Information Systems (GIS) techniques, in order to map the coronavirus spread based on its spatial-temporal distribution and GIS-based spreading pattern processes in Iraq. In addition, it evaluates air quality during the period of the outbreak and lockdown. The assessed data cover the period from the beginning of the spread of the coronavirus until its end. To present the mode of spread at the beginning of the pandemic, we relied on statistical and remotely sensed data from February 24 to June 25, 2020. Along with the GIS spatial distribution maps, we provide a visual view of infection queries and present the results as a spreading pattern map of COVID-19 in Iraq. GIS and remote sensing technologies are thus indispensable for overcoming contagious diseases by monitoring their geographical distribution.


Introduction
Coronavirus is an emerging new strain of virus that is mutated from SARS and belongs to the enveloped, positive-sense RNA family of viruses that infect humans and animals (Ullah, 2021). The disease ranges from severe and advanced cases to cases with very mild symptoms; most patients infected with the virus show symptoms of fever, fatigue, tonsillitis, and sneezing (Nobel et al., 2020). When the virus reaches the lungs, respiratory symptoms such as coughing, difficulty breathing, and acute respiratory infections may appear and may be followed by major complications (Azhar et al., 2020). The elderly appear to be the age group most affected by the infection, in addition to patients with diabetes, chest, heart, and kidney diseases, and immune system disorders (Du et al., 2020). The virus is transferred by direct contact with an infected person (Arons et al., 2020; Gormley et al., 2020). Like other viruses, it undergoes many mutations, which lead to the emergence of new strains of the disease that are more dangerous and spread more rapidly, such as Delta (Perez-Gomez, 2021) and Omicron (Islam et al., 2022).
The coronavirus pandemic spread far and wide around the world, resulting in a major threat locally and globally (Feng et al., 2020). This virus, which started in Wuhan, China, became the principal human challenge (Azarafza et al., 2021; Lu et al., 2020). Economic, social, and political conditions have been adversely impacted by the COVID-19 epidemic (Asfaw et al., 2022; Yoosefi Lebni et al., 2021). Various restrictions were applied (e.g. closing the country's borders and restricting air traffic) to limit the virus spread and reduce infections; nevertheless, the virus spread has been very fast and widespread (Azarafza et al., 2021). In addition, some of these preventive measures negatively affected the international economy, people's livelihoods, and their psychological state, causing many economic crises and increased unemployment and death due to illness or its effects (Atkeson, 2020).
On the other hand, these closures brought significant advantages, improving air quality and decreasing environmental pollution levels (Bherwani et al., 2021). From an environmental standpoint, the near-total cessation of industrial activity and transportation, which are well-known causes of air pollution, produced a notable improvement in air quality and a sudden drop in the concentration of air pollutants (Praveen Kumar et al., 2022). Furthermore, to evaluate the substantial changes in air quality, the National Aeronautics and Space Administration (NASA) and the European Space Agency (ESA) published air pollution data for Asian and European nations (Gautam, 2020). Moreover, using data on air pollution from nine Asian cities and statistical techniques, Gupta et al. (2021) reported a relationship between the death rate in infected cases and air pollution, indicating that air pollution is a significant factor in increasing the disease burden. Some epidemiological studies have estimated the dynamic transmission, resource targeting, and intervention strategy (Azarafza et al., 2020; Pande et al., 2023; Sresto et al., 2022). Remote sensing techniques, GIS, and statistical approaches are used to determine the affected regions, the most susceptible regions, and the spreading pattern of the virus (Badillo-Rivera et al., 2020), and to analyze the level of social resilience (Habibi et al., 2020). The geo-visualization approach integrated with GIS has also been used to analyze and visualize the virus data geographically (Kodge, 2021). In addition, data mining applications based on clustering have mainly been used for preparing virus spread maps (Azarafza et al., 2021).
Clustering is one of the important tools in investigative data analysis (Jain et al., 1999). Hence, the classification and identification of sensitive regions in the spatial-temporal distribution and transmission of COVID-19 can be a significant strategy for preventing the virus's spread (Lu et al., 2020).
Detecting the COVID-19 distribution pattern and its associated hazards in Iraq has enormous ramifications for healthcare systems, public health, and society at large. Early detection of the virus's propagation pattern enables prompt measures to mitigate its impact. Monitoring hazards enables targeted public health initiatives by showing which regions or communities are more vulnerable. Additionally, a thorough understanding of the dispersion pattern aids in resource allocation and planning for healthcare organizations. Knowledge of the virus's mode of transmission also helps in prioritizing immunization efforts, so that priority may be given to high-risk locations. Making informed decisions on lockdowns, travel restrictions, and business closures requires careful observation of the spread pattern, since these choices could have a big impact on the economy.
To identify the coronavirus spread pattern in Iraq, clustering and GIS-based analysis were implemented to detect the COVID-19 spread from the start point (Najaf city) to the other provinces of Iraq. Different types of data were used and analyzed statistically. Environmental changes in air quality caused by the outbreak were also discussed, based on remote sensing data and ArcGIS maps, by evaluating the air quality during lockdown. Based on the information considered and provided in this study, it is possible to recognize the virus spread pattern and to monitor the related risk. By identifying hazards, restrictions can be applied more carefully, potentially reducing economic disruptions. Long-term planning for public health infrastructure and pandemic readiness can be informed by tracking and understanding COVID-19 dissemination patterns.

Data and study area
Two types of data were used in this study: data related to COVID-19 and data related to air quality. The data were downloaded from the World Health Organization (WHO) website, the Iraqi Ministry of Health, Worldometers statistics, and the NASA Worldview application. All data used are listed in Table 1.
Iraq is located in Southwestern Asia and covers a total area of 438,317 km². The largest city is Baghdad, the capital of Iraq. While Iraq's western and southern regions get substantially drier weather, the central and southern regions experience a range of conditions, from continental to dry. The weather is Mediterranean in the mountains in the north and northeast. Wintertime highs typically reach 16 °C. The weather is extremely hot during the summer, with daytime highs of over 50 °C in July and August and overnight lows of 26 °C. Iraq experiences precipitation from December through February ranging between 100 and 180 mm. Rainfall can occasionally last into April in the northern and northeastern regions (Hamed et al., 2021; Jumaah et al., 2023).
Iraq's air is severely polluted by toxins released from a variety of sources. Iraq's metropolitan regions experience severe environmental issues, including air pollution, the effects of climate change, poor rainfall, water scarcity, soil salinity, and water pollution, which worsen the state of the country's important ecosystems. Therefore, it is crucial to continue studies and research on monitoring pollution levels and other serious problems such as the COVID-19 spread pattern.

Methods
To monitor the coronavirus spreading pattern in Iraq, the clustering method and geographical spatial distribution analysis based on ArcGIS 10.3 were applied to identify the COVID-19 spreading risk from the beginning area to the further provinces in Iraq. Modern GIS techniques provide real-time information, facilitate data sharing, and have become very popular in understanding the spread of the coronavirus (Kamel Boulos & Geraghty, 2020). GIS also plays a significant role in incorporating multi-layer spatial data alongside statistical information (Ameen et al., 2021; Jumaah et al., 2018, 2021, 2023). It permits effective processing and investigation of data to interpret and visualize information properly and present useful results (Jumaah et al., 2019, 2022; Kalantar et al., 2020). The main task of cluster analysis is pattern recognition for tracking virus transmission through investigative data mining. Clustering comprises various algorithms, each of which has a different idea of what a cluster is and how to use it most effectively to clarify the task (Azarafza et al., 2021). Clustering is formulated as a multitasking enhancer to perform cognitive detection of event patterns according to an iterative method (Aggarwal & Reddy, 2014).
In Figure 3 we relied on COVID-19 statistics from the provinces of Iraq in order to map the total cases of virus infection and identify spreading patterns. Graduated colors were set in the layer properties, and the provinces were divided into five classes based on infection severity. The second mapping analysis used a clustering tool to determine spread patterns: the grouping analysis tool from the spatial statistics toolbox was used for analyzing patterns, with the Euclidean distance method. The Euclidean distance can be stated as

$$d_{ij} = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$$

where $d_{ij}$ is the Euclidean distance between the $i$th and the $j$th data objects, $p$ is the number of parameters, $x_{ik}$ is the $i$th data object in the $k$th parameter, and $x_{jk}$ is the $j$th data object in the $k$th parameter (Dini & Fauzan, 2020).
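The Euclidean distance defined above can be illustrated with a minimal Python sketch. It is not the ArcGIS grouping analysis tool; the function names and the province feature vectors are hypothetical and serve only to show how the pairwise distances d_ij between data objects are formed.

```python
import math


def euclidean_distance(x_i, x_j):
    """Distance between two data objects described by the same p parameters."""
    if len(x_i) != len(x_j):
        raise ValueError("both objects must have the same number of parameters")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


def distance_matrix(objects):
    """Symmetric matrix of pairwise distances d_ij."""
    n = len(objects)
    return [[euclidean_distance(objects[i], objects[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical feature vectors (NOT data from this study): one entry per
# province, one value per parameter, e.g. case counts in two periods.
provinces = {
    "A": [3.0, 4.0],
    "B": [0.0, 0.0],
    "C": [6.0, 8.0],
}
names = list(provinces)
d = distance_matrix([provinces[name] for name in names])
print(d[0][1])  # distance between provinces A and B
```

Provinces with small mutual distances are candidates for the same cluster; how they are actually grouped depends on the clustering algorithm and any spatial constraints applied.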
The process of dividing a collection of spatial objects into clusters is known as geospatial clustering. While the clusters themselves are as diverse as possible, the objects within them exhibit a high degree of similarity. Furthermore, a spatial analyst tool was used to present a kernel density map of coronavirus infections.
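As a rough illustration of what a kernel density surface expresses, the sketch below evaluates a weighted Gaussian kernel density at a single location. This is an assumption-laden stand-in: the ArcGIS spatial analyst tool applies its own kernel and search radius rules, which are not reproduced here, and the point coordinates, weights, and bandwidth are hypothetical.

```python
import math


def gaussian_kernel_density(x, y, points, bandwidth):
    """Estimate density at (x, y) from weighted points [(px, py, weight), ...]."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total_weight = sum(w for _, _, w in points)
    if total_weight == 0:
        return 0.0
    # Normalizing constant of a 2-D isotropic Gaussian kernel.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    density = 0.0
    for px, py, w in points:
        sq_dist = (x - px) ** 2 + (y - py) ** 2
        density += w * norm * math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return density / total_weight


# Hypothetical province centroids with case counts as weights
# (NOT data from this study).
points = [(0.0, 0.0, 10.0), (1.0, 0.0, 5.0), (5.0, 5.0, 1.0)]
near = gaussian_kernel_density(0.5, 0.0, points, bandwidth=1.0)
far = gaussian_kernel_density(5.0, 0.0, points, bandwidth=1.0)
print(near > far)
```

Evaluating the function over a regular grid of locations would give a raster comparable in spirit to a kernel density map, with the bandwidth controlling how far each province's cases influence the surface.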
Moreover, satellite images for the period of the pandemic were downloaded through the NASA Worldview application. Two images were downloaded and analyzed based on two air quality parameters, Aerosol Optical Depth (AOD) and Nitrogen Dioxide (NO2). Also, spatial distribution maps of AOD and NO2 were prepared. For Feb 24, 2020 to Oct 24, 2021, the maximum number of infections was 586,300, with a standard deviation of 121,585.55 and a mean of 112,036.83. Table 3 presents overall cluster map statistics for the same periods. With spatial constraints of contiguity edges corners, the Euclidean distance method yielded high coefficients, with a minimum R² of 0.96 and a maximum of 0.99.

Results
Figure 6 represents the global monitoring of COVID-19, where Iraq appears in the highly infected class with (500,001-5,000,000,000) cases. The spatial distribution of AOD and NO2 was obtained by geoprocessing analysis. AOD ranged between 0.06 and 0.5, and NO2 ranged from 0.07 to 0.6 µg/m³. AOD was at low levels (less than 0.5) across Iraq except in the southwest. NO2 was within standards throughout the country.

Discussion
Based on Figure 4, the infections are visualized as a spatial distribution across the provinces for the period Feb 24, 2020 to Oct 24, 2021. Blue represents the lowest number of infections, whereas red indicates high infections. High infections were found in Baghdad and Basrah. The high number of cases in these regions can be attributed to two main factors: population density and travel patterns. Travel volume into and out of major cities and transportation hubs like Baghdad and Basrah is frequently higher. This may bring in infections from other areas or nations, which would raise the overall number of cases.
Based on Figure 5, the infections are visualized as a spatial distribution across the provinces in four different periods from Feb 24 to June 25, 2020, at the beginning of the virus spread. The map shows the state of the virus and the consequent spread of infections, starting from Baghdad and continuing to Basrah and Najaf provinces.
Based on Table 3, clustering with the Euclidean distance method yielded high coefficients of 0.96, 0.97, 0.99, 0.99, and 0.98 for the periods 24 February to March, 24 February to April, 24 February to May, 24 February to June 2020, and 24 February 2020 to 24 October 2021, respectively. In Figure 6, Iraq is ranked in the second class for the number of infections on a global scale. Population size, testing capability, healthcare infrastructure, public health initiatives, and pandemic dynamics are some of the factors that influence infection spread. The COVID-19 situation is dynamic and subject to quick change. Various factors, such as the emergence of new variants, changes to public health policies, vaccination efforts, and population behavior, might cause countries to see spikes in cases at different times. Each period was divided into five clusters. From Feb 24 to Mar 25, 2020, the spreading pattern involved cluster 1 (Sulaymaniyah), cluster 2 (Baghdad), cluster 3 (Najaf), cluster 4 (Arbil), and cluster 5 (the remaining provinces).
From Feb 24 to Apr 25, 2020 and from Feb 24 to May 25, 2020, the spreading pattern involved cluster 1 (Basrah), cluster 2 (Baghdad), cluster 3 (Najaf), cluster 4 (Sulaymaniyah and Arbil), and cluster 5 (the remaining provinces). From Feb 24 to Jun 25, 2020, the spreading pattern involved cluster 1 (Sulaymaniyah), cluster 2 (Baghdad), cluster 3 (Wasit, Maysan, Thiqar, and Basrah), cluster 4 (Karbala and Najaf), and cluster 5 (the remaining provinces). ... involved Karbala and Baghdad. The fifth group involved Anbar, Maysan, Diwaniyah, Wasit, Babil, Kirkuk, Thiqar, Sulaymaniyah, Mosul, Salahuddin, and Diala. The effect of the spread appears between the first and second groups, and then in the third group. The fourth and fifth groups appear related to and influenced by the other groups.
Based on Figure 10, air contaminants were reduced significantly during the outbreak. AOD levels decreased compared to previous levels. An AOD of under 0.1 is considered "clean", whereas aerosols become so dense that the sun is obscured as AOD rises from 0.5 to 3. The OMI Nitrogen Dioxide (NO2) tropospheric parameter, which shows the tropospheric component of the total NO2 column, remained at low levels of less than 0.7 across all of Iraq during the lockdown period.
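The AOD thresholds quoted above can be written as a small lookup, sketched here in Python. Only the "clean" limit (below 0.1) and the dense range (0.5 and above) come from the text; the label for values in between is an assumption introduced for illustration, and the function name is hypothetical.

```python
def describe_aod(aod):
    """Map an Aerosol Optical Depth value to a coarse descriptive label."""
    if aod < 0:
        raise ValueError("AOD cannot be negative")
    if aod < 0.1:
        return "clean"          # threshold stated in the text
    if aod < 0.5:
        return "intermediate"   # assumed label; the text does not name this range
    return "dense"              # aerosols increasingly obscure the sun


# The AOD range reported for Iraq in this study was 0.06 to 0.5.
for value in (0.06, 0.3, 0.5):
    print(value, describe_aod(value))
```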

Conclusion
Through time-series monitoring of the coronavirus dissemination pattern in Iraq at the province level, this study attempted to pinpoint clustering mechanisms. The information used was obtained from the World Health Organization (WHO) and national sources (the Iraqi Ministry of Health) for the period February 24, 2020 to January 16, 2022. Based on the clustering method, data mining was applied to examine pattern recognition of infections in Iraq. The evaluation covered different periods to identify spread patterns. The results showed that Najaf and Baghdad provinces are the main points for coronavirus developments, with Baghdad being the province mainly responsible for virus spreading patterns in Iraq. A thorough understanding of the virus's transmission aids public health experts and lawmakers in developing efficient control measures. It enables a more precise assessment of the risk connected to certain activities or places. For instance, regions with high transmission rates might be seen as higher risk, which would necessitate the adoption of stronger regulations. Finally, understanding the coronavirus distribution pattern is crucial for managing the pandemic, safeguarding the public's health, and making wise choices at different levels of government and health care. It is a crucial component of pandemic response and management. Furthermore, a notable decrease in air pollution was detected during the outbreak. During the COVID-19 lockdowns that were imposed in many regions of the world, air quality improved in several areas, according to numerous reports and research. The lockdowns significantly decreased industrial activity, transportation, and other sources of air pollution. Lower emissions of pollutants like nitrogen oxides (NOx) and particulate matter were caused by a slowdown in industrial production, less transportation, and decreased air travel. These improvements were transient and mostly the outcome of exceptional circumstances. They emphasize the possible advantages of implementing more ecologically responsible and sustainable activities in the long run to maintain improved air quality and lessen the effects of pollution on the environment and public health.
The spreading risk was traced from the beginning area to the further provinces in Iraq. The full period of the pandemic was considered in the analysis. Furthermore, the investigation was conducted on data from February 24 to June 25, 2020 to determine the spreading pattern at the beginning of the propagation. Moreover, statistical analyses were used to produce the dendrogram of infection clustering in Iraq. We used two methods: Euclidean distance and Pearson correlation. Creating a dendrogram to cluster infections is a typical task in epidemiology and data analysis. Dendrograms are tree-like diagrams that represent the hierarchical relationships between data points or groupings of data points. In the context of infection clustering, a dendrogram helps in understanding how various cases are related to one another based on specific traits or similarities. Finally, remotely sensed images were used to detect environmental changes in air quality during outbreaks and lockdowns. The study methodology is shown in Figure 3.
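The two dendrogram measures mentioned above can be compared with a minimal Python sketch of agglomerative clustering. Several points are assumptions rather than the study's procedure: single linkage is used because the text does not state a linkage rule, the Pearson measure is converted to a distance as 1 - r, and the province names and case series are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """1 - Pearson r, so perfectly correlated series have distance 0."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("Pearson correlation is undefined for a constant series")
    return 1.0 - cov / math.sqrt(var_a * var_b)


def agglomerate(series, distance):
    """Single-linkage merging; returns the merge history a dendrogram draws."""
    clusters = [(name,) for name in series]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(distance(series[a], series[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history


# Hypothetical cumulative case series (NOT data from this study).
series = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [10.0, 20.0, 30.0, 40.0],
    "P3": [2.0, 2.0, 3.0, 5.0],
}
by_distance = agglomerate(series, euclidean)
by_correlation = agglomerate(series, pearson_distance)
print(by_distance[0][:2], by_correlation[0][:2])
```

In this toy input the two measures pair the provinces differently: Euclidean distance first joins the series with similar magnitudes, while the correlation-based distance first joins the series with the same growth shape regardless of scale. This is one reason the two dendrograms of a study can differ.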
Figure 2. The total cumulative cases of corona infections in Iraq in (a) 2020, (b) 2021, and (c) 2022.

Figure 4 represents the cumulative cases of COVID-19 in Iraq from the beginning to October 2021. Based on Figure 4(a), the cumulative infections reached 1,943,701 cases. The maximum number of cases was within two provinces, Baghdad and Basrah, with 586,300 cases. The cluster map in Figure 4(b) shows the severity of infection in cluster 2, Baghdad province. Figure 5 represents the coronavirus infections in Iraq from Feb 24 to June 25, 2020, based on GIS mapping. The maximum number of infections was 265-9445 cases within 4 months, concentrated in 10 provinces in the north and south of Iraq. Table 2 represents a cluster map summary involving the minimum and maximum infections in five different periods from Feb 24 to June 25, 2020 (4 months evaluation), and Feb 24 to Oct 2021. No infections were reported in Salahuddin province in the beginning period, while a total of 97 infections were recorded in June 2020 as a minimum record within the two provinces Dohuk and Mosul. The maximum record in this period was 125 to 9445 infections in Baghdad and nine other provinces. From Feb 24, 2020 to Oct 24, 2021, the recorded minimum number of infections was 19,074, while the maximum number of infections was 586,300.

Figure 3. The study methodology.
Figure 7 is a cluster map and Figure 8 represents the kernel density map. The dendrogram of infection clustering in Iraq is represented in Figure 9, where Figure 9(a) represents Euclidean distance and Figure 9(b) represents Pearson correlation.

Figure 4. The coronavirus in Iraq: (a) infections, and (b) cluster map.

Figure 7 specifies the spreading pattern based on clustering analysis, as shown in the map. Cluster 2 indicates a high infection risk from February 24 to June 25, 2020. Each period has been divided into five clusters.

Figure 5. The coronavirus infections in Iraq from Feb 24 to June 25, 2020, based on GIS mapping.

Figure 8 clarifies the kernel density estimates. For this map, a non-parametric probability density function was employed for the virus prevalence in Iraq, based on infection development and spatial distribution during the study period of Feb 24 to Jun 25. The map shows the provinces that might have affected each other. Baghdad, Najaf, Basrah, Arbil, and Sulaymaniyah provinces fall within the critical parts of Iraq for virus spreading.

Figure 7. Cluster map of the period Feb 24 to Jun 25, 2020.
