A comprehensive evaluation of explainable Artificial Intelligence techniques in stroke diagnosis: A systematic review

Abstract Stroke presents a formidable global health threat, carrying significant risks and challenges. Timely intervention and improved outcomes hinge on the integration of Explainable Artificial Intelligence (XAI) into medical decision-making. XAI, an evolving field, enhances the transparency of conventional Artificial Intelligence (AI) models. This systematic review addresses key research questions: How is XAI applied in the context of stroke diagnosis? To what extent can XAI elucidate the outputs of machine learning models? Which systematic evaluation methodologies are employed, and what categories of explainable approaches (Model Explanation, Outcome Explanation, Model Inspection) are prevalent? We conducted this review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Our search encompassed five databases: Google Scholar, PubMed, IEEE Xplore, ScienceDirect, and Scopus, spanning studies published between January 1988 and June 2023. Various combinations of search terms, including “stroke,” “explainable,” “interpretable,” “machine learning,” “artificial intelligence,” and “XAI,” were employed. This study identified 17 primary studies employing explainable machine learning techniques for stroke diagnosis. Among these studies, 94.1% incorporated XAI for model visualization, and 47.06% employed model inspection. It is noteworthy that none of the studies employed evaluation metrics such as D, R, F, or S to assess the performance of their XAI systems. Furthermore, none evaluated human confidence in utilizing XAI for stroke diagnosis. Explainable Artificial Intelligence serves as a vital tool in enhancing trust among both patients and healthcare providers in the diagnostic process. The effective implementation of systematic evaluation metrics is crucial for harnessing the potential of XAI in improving stroke diagnosis.


Introduction
Stroke, often referred to as a "brain attack," stands as the second leading cause of death and a major contributor to disability worldwide (Katan & Luft, 2018). It encompasses two primary forms: ischemic, resulting from blood clot-related blockages (constituting 85% of all cases), and hemorrhagic, characterized by the rupture of weak blood vessels supplying the brain. In either case, stroke can lead to severe neurological impairment. Stroke diagnosis involves a comprehensive approach, including a physical examination to assess heart rate and blood pressure, blood tests for cholesterol and diabetes, and brain assessments such as computed tomography (CT) scans and magnetic resonance imaging (MRI) scans. Additional cardiac evaluations, including electrocardiograms (EKG), heart monitoring, and carotid ultrasound, may also be necessary (Patil et al., 2022). Brain imaging, as outlined by the American Heart Association in 2020, can be performed via CT or MRI scans (Malikova & Weichet, 2022). These tests provide crucial insights for identifying the root causes of strokes: computed tomography uses X-rays to capture clear and detailed images of the patient's brain to detect bleeding or damage resulting from a stroke, while magnetic resonance imaging, using magnets and radio waves, complements CT scans in stroke diagnosis. As the aging population grows, more people are at risk of stroke, underscoring the importance of precise and effective prediction systems. Early intervention is crucial for saving lives in the face of this global health challenge, making it imperative for doctors to receive support from Explainable Artificial Intelligence (XAI).
Today, Explainable Artificial Intelligence models are enabling accurate, rapid, and straightforward disease diagnosis. Classical Artificial Intelligence models are often deemed "black boxes" because they lack explanations for their decisions (Brożek et al., 2023; Setzu et al., 2021). Explainable Artificial Intelligence bridges this gap by explaining machine learning outputs and the contributions of features in disease prediction models. It represents an emerging field that aims to help humans comprehend decisions made by Artificial Intelligence systems (Buhrmester et al., 2021). The terms "explainable" and "interpretable" are used interchangeably in this context. There are two primary approaches to achieving explainable artificial intelligence: designing models that are interpretable by design, often referred to as "white-box" models (Holzinger et al., 2020; Samek, 2023), and converting inherently non-interpretable "black-box" models into explainable ones using post-hoc explanations. In recent years, numerous techniques have emerged to explain and understand machine learning models, especially deep neural networks, previously regarded as impenetrable black boxes, and to verify their predictions.
Explainable modeling, referring to the use of models designed to be interpretable by nature, encompasses "white-box" models (Arrieta et al., 2019; Moradi & Samwald, 2021). In these models, such as logistic regression (LR) classifiers and decision trees, the inner workings are directly accessible to users, facilitating effortless comprehension. On the other hand, post-hoc explanations involve the transformation of models not inherently interpretable, commonly referred to as "black-box" models, into explainable entities through post-hoc explainability techniques (Mahya & Fürnkranz, 2023a). These methods focus on enhancing interpretability by employing various approaches, including text explanations, visual explanations, and explanations based on feature importance, among others. Among these, explanations based on feature importance are particularly prevalent in post-hoc explainability models. They rank the explanatory significance of input features regarding model predictions, providing clinicians with valuable insights into which features contribute to predicted outcomes, and aligning these findings with their prior knowledge (Holzinger et al., 2020). Typically, the quantitative assessment of feature importance is transformed into a more human-readable format through visual representations, such as boxplots. Post-hoc explainability approaches can be categorized into three distinct categories, as follows.

Model explanation
Model explanation focuses on comprehending the overall rationale behind black-box models. In this context, model explanation approaches typically aim to construct surrogate models capable of globally replicating the behavior of the black-box model.
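To make this concrete, the following minimal sketch trains a shallow decision tree as a global surrogate of a black-box classifier and reports how faithfully it replicates the black box. The synthetic data and both models are hypothetical illustrations, not taken from any reviewed study.

```python
# Global surrogate sketch: a shallow decision tree is fitted to the
# *predictions* of a black-box model so that its rules approximate the
# black box's global behavior. Data and models are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X, y)
y_black_box = black_box.predict(X)  # labels produced by the black box

# The surrogate learns to mimic the black box, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_black_box)

# Fidelity: how often the surrogate agrees with the black box.
fidelity = accuracy_score(y_black_box, surrogate.predict(X))
print(f"Surrogate fidelity to the black box: {fidelity:.2%}")
```

The design point is that the surrogate is trained on the black box's outputs rather than the true labels, so its (interpretable) rules describe the black box itself.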

Outcome explanation
Outcome explanation, on the other hand, centers on the correlation between specific inputs and their corresponding outputs. It delves into the interpretation of model outputs, such as class predictions, within the context of a particular input instance. This approach primarily examines the local neighborhood surrounding a given input.
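As an illustration, the sketch below produces an outcome explanation for one individual prediction by ranking each feature's contribution to that single output. It assumes the third-party shap package, which several of the reviewed papers used; the data and model are hypothetical.

```python
# Outcome (local) explanation sketch with SHAP: feature contributions
# are computed for one individual prediction. Data are hypothetical.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])  # one instance -> one explanation

# Rank features by the magnitude of their contribution to this prediction.
for i in np.argsort(-np.abs(shap_values[0])):
    print(f"feature_{i}: {shap_values[0][i]:+.3f}")
```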

Model inspection
The model inspection problem revolves around providing a representation of a specific property related to the black box model or its predictions.
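A common instance of model inspection is partial dependence analysis, which exposes how the model's average prediction responds to one feature. The minimal sketch below uses hypothetical synthetic data and scikit-learn's inspection module:

```python
# Model inspection sketch: partial dependence shows how the model's
# average predicted output changes as one feature (e.g., age) varies.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Inspect the model's response to feature 0 over its observed range.
result = partial_dependence(model, X, features=[0])
print(result["average"][0][:5])  # average response at the first grid points
```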
The application of Explainable Artificial Intelligence (XAI) remains an active and evolving research area (Amarasinghe et al., 2023; Samek, 2023). Within the medical literature, there exists a vibrant discussion regarding the utility and necessity of XAI in healthcare (Linwood, n.d.; Longo et al., 2020). Given the global significance of stroke as a healthcare challenge, advanced machine learning support is imperative. This review aims to analyze the utilization of Explainable Artificial Intelligence in the diagnosis of stroke. It offers an overview of the XAI methods currently under development for deployment in stroke diagnosis systems. Furthermore, it evaluates the efforts made to ascertain their utility in clinical practice. Beyond mere accuracy, the authors assess the degree to which Explainable Artificial Intelligence fosters transparency for both doctors and patients.
Explainable Artificial Intelligence stands as one of the youngest and fastest-growing branches in the field of artificial intelligence. Interpretability, as a passive attribute of a model, pertains to the extent to which the internal decision-making processes of the model are comprehensible to human observers (Linardatos et al., 2021; Rojat et al., 2021). In contrast, explainability represents an active characteristic, involving the provision of explanations for the actions or procedures undertaken by the model, intending to elucidate its internal decision-making. While the term "Explainable Artificial Intelligence" (XAI) denotes a model's characteristic, any representation presented to humans, such as input attribution, pertains to the model's interpretability. The concept of the "right to explanation," a term used in the European General Data Protection Regulation (GDPR), is frequently associated with XAI methods (Linardatos et al., 2021). It necessitates that data controllers explain how a decision was reached (Belle & Papantonis, 2021; Longo et al., 2020), particularly pertinent when dealing with systems whose decisions are inscrutable to humans, such as deep neural networks. This regulation aims to prevent discrimination and ethical/financial biases in such systems. The field also encompasses a taxonomy of interpretability (Solutions, 2023).
Stroke diagnosis typically involves a comprehensive assessment of physical condition and brain images obtained through brain scanning (Musuka et al., 2015). Various tests are employed to confirm the diagnosis and determine the underlying cause of the stroke. These assessments may encompass blood tests to evaluate cholesterol levels and blood sugar, checks for irregular heartbeat, and blood pressure measurements (Kleindorfer et al., 2021). Among the pivotal tools for evaluating brain health in individuals with suspected strokes are computed tomography (CT) scans and magnetic resonance imaging (MRI) scans (Birenbaum et al., 2011). While CT scans, akin to X-rays, use multiple images to construct a detailed 3D representation of the brain, MRI scans rely on powerful magnets and low-energy radio waves to generate intricate images of the body's interior (Musuka et al., 2015). Additionally, assessing swallowing function is of paramount importance for stroke patients, as their ability to swallow is often impaired immediately after a stroke (Kleindorfer et al., 2021). Cardiovascular examinations, including ultrasound scans of the carotid artery, aid in identifying any narrowing or blockages in the arteries leading to the brain. Echocardiography, which employs heart images, helps in detecting problems potentially related to stroke.
Over the past decade, the application of machine learning in medicine has undergone rapid evolution. In the realm of stroke care, commercially available machine learning algorithms have been seamlessly integrated into clinical applications, facilitating swift and accurate diagnosis. Leveraging AI technology to assess stroke risk has yielded promising outcomes. Previous research has demonstrated the use of AI algorithms for the early diagnosis of atrial fibrillation from normal sinus rhythm electrocardiographs, enabling timely interventions to mitigate stroke risk (Musuka et al., 2015; Setzu et al., 2021). AI applications employing image analysis have emerged as powerful tools to enhance diagnostic accuracy and treatment outcomes for stroke patients. Given the critical importance of time in stroke management, early detection plays a pivotal role in ensuring patients receive efficient treatment, ultimately improving their overall condition (Miao & Miao, 2023).
The use of artificial intelligence techniques has become pervasive in computing, with applications in training, forecasting, and evaluation (Movassagh et al., 2023). AI provides algorithms that facilitate decision-making and serves as a tool for automated machine learning (Castelli et al., 2022). Novel methods, such as boosted neural network ensemble classification, which combines image and diagnostic parameters, have emerged as efficient tools for doctors in diagnosing patients (ALzubi et al., 2019). Additionally, dynamic programming-based ensemble design algorithms (DPED) have been introduced to reduce ensemble size while promoting diversity to enhance accuracy (Alzubi et al., 2020). Innovative techniques like artificial neural networks and naïve Bayes classifiers, which utilize mathematical analysis, have been instrumental in improving the accuracy and performance of brain tumor detection (ALzubi et al., 2019). AI technologies offer diverse solutions that can lead to cost savings, reduced complexity, and improved productivity, efficiency, and safety (Deif & Vivek, 2022). The integration of AI has the potential to enhance healthcare services, benefiting medical professionals, hospitals, and patients (Chikhaoui et al., 2022). However, many artificial intelligence algorithms, while accurate and reliable, remain black-box systems, unable to provide insights into their decision-making processes. This limitation underscores the need for explainable AI (XAI) (Chikhaoui et al., 2022). Chikhaoui et al. (2022) recommend addressing the ethical and legal challenges associated with AI and propose the implementation of new personal data protection laws. They highlight that nearly 80% of respondents express concerns about the lack of ethical considerations before the use of AI. The existing legal frameworks have not evolved in parallel with technological advancements, leaving gaps in addressing legal and ethical responsibilities related to artificial intelligence use. In addition to informed consent, safety, transparency, algorithmic fairness, and bias, data privacy emerges as a critical concern surrounding AI (Chikhaoui et al., 2022). Balancing the potential benefits of AI for healthcare outcomes with the legal and ethical risks has become a prominent issue in legislative discussions within European and American policymaking circles.
Medical diagnoses made for incorrect reasons have raised substantial ethical and policy concerns (Benois-Pineau, 2023). These concerns can be categorized under the broader themes of "AI Ethics and Values" or "Trustworthy AI." To address these issues comprehensively, our study aims to evaluate the role of explainable artificial intelligence in achieving trustworthy AI in the context of stroke diagnosis. The necessity for both interpretability and fidelity in achieving explainability is emphasized in prior research (Gilpin et al., n.d.). Interpretability pertains to the extent to which an explanation is understandable to humans, while fidelity concerns how accurately an explanation describes model behavior in relation to the task model. Explainable artificial intelligence holds the promise of aligning with "AI Ethics and Values" or "Trustworthy AI." Nonetheless, to ascertain the trustworthiness of such systems and resolve issues related to AI ethics and values, our study embarks on a systematic evaluation of explainable artificial intelligence techniques in the domain of stroke diagnosis.

Search strategy
The search strategy was conducted systematically as follows: a total of 158 papers were sourced from reputable databases, including ScienceDirect, PubMed, Google Scholar, and IEEE Xplore, as illustrated in Figure 1 (number 1). Advanced search mechanisms were employed to ensure comprehensive results.
To retrieve relevant literature, the following search terms were used in various combinations: "explainable," "interpretable," "machine learning," "deep learning," "artificial intelligence," and "stroke." The search query was structured as (((explainable OR interpretable) AND (artificial intelligence OR machine learning OR deep learning)) OR XAI) AND stroke.
Due to the presence of numerous irrelevant papers in the Google Scholar search results, a thorough sorting process was carried out to prioritize relevance. From the initial list, the first 100 relevant papers displayed were subjected to bibliometric analysis, and the 40 most pertinent papers were selected and downloaded for further examination.

Inclusion and exclusion criteria
After applying the search equation, the criteria for inclusion and exclusion were as follows:
• Literature or systematic review articles were excluded.
• All articles focusing specifically on the use of XAI and strategies for stroke diagnosis (practical or theoretical) were included.
• Articles dealing with relevant technologies but used for procedures other than stroke diagnosis were excluded, even if these systems were mentioned elsewhere in the article.
• Articles were not excluded by year of publication, given the novelty of using Explainable Artificial Intelligence for stroke diagnosis.

Study selection
For this systematic review, we followed the evidence-based guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) to ensure methodological rigor. A total of 158 papers were sourced from reputable databases: ScienceDirect (5 papers), PubMed (106 papers), Google Scholar (40 papers), and IEEE Xplore (7 papers), as indicated in Figure 1 (number 3).
To facilitate the study selection process, the downloaded papers were uploaded to the Rayyan platform (A et al., 2014; Ouzzani et al., 2016), an online free platform that enables efficient inclusion and exclusion of papers based on predefined criteria.

Screening
Rayyan for Systematic Review enables users to search imported citation records by record ID, title, abstract, and author. It also highlights and detects articles containing the phrase "systematic review". It works through the following steps: create an account, create projects, import citations, identify duplicates, allow or remove duplicates, include or exclude articles based on the inclusion and exclusion criteria, and finally export the included, excluded, or all documents.
Rayyan also has limitations. The number of articles per project is limited to 1,000, although it may become a universal tool in the future. Rayyan does not currently allow users to delete or change individual citations, so the usefulness of its "search by topic" feature is limited. Figure 1 (number 3) reveals that 24 out of 158 papers were identified as duplicates using the Rayyan platform (A et al., 2014; Ouzzani et al., 2016). Among these duplicates, 12 were deleted, and the remaining 12 were resolved. In this context, "resolved" refers to considering 12 out of the 24 files as original papers and deleting the remaining 12 copies.
After the removal of duplicates, 146 of the initial 158 papers were considered for the subsequent stages of the review process. As depicted in Figure 1 (number 2), based on the predefined inclusion and exclusion criteria, 17 of the 146 papers were included, while the remaining 129 papers were excluded from further analysis.
We focus on the examination of Explainable Artificial Intelligence methods suitable for application as part of diagnostic support tools in day-to-day stroke clinical practice. In stroke diagnosis practice, most tasks that can be supported by Artificial Intelligence are image- and clinical-data-oriented classification tasks. Brain scanning can be done in two ways, computed tomography (CT) and magnetic resonance imaging (MRI), and for both, image classification and image segmentation tasks are common. To recommend which studies are methodologically robust and inclusive, this study focuses on studies done on image data, clinical data, and data classification.
All included studies had to classify at least one stroke disease using explainable or interpretable machine learning. Finally, to keep the classification mechanisms in the selected studies as comparable as possible, we only included articles that used XAI, excluding other machine learning methods due to their low relevance for clinical practice.

Findings
This study identified 17 primary studies using explainable machine learning for stroke diagnosis (Figure 2). In 16 out of 17 (94.1%) studies, XAI was used for the goal of model visualization, and 8 out of 17 (47.06%) used model inspection. No study used the metrics D, R, F, or S to evaluate the outcome of their explainable AI. Additionally, none of them evaluated the confidence of humans in using the XAI system for stroke. A tabular representation of the eligible publications can be found in Table 1. Studies used image data (n = 5), clinical data (n = 9), or both (n = 1). One study (Yao et al., 2022) used both image and clinical data, with variables comprising demographic characteristics, clinical factors, laboratory indices, and radiological data. Five different image dataset types were used (EET, MRS, DWI, T2-weighted scan, and CT scan).
Four papers adopted new algorithms: Prentzas et al. (2019), Williamson et al. (2022), Leonardi et al. (2022), and Moulton et al. (2023) used Gorgias argumentation-based explanation, 3DGradCAM, trace saliency maps, and analysis of attention maps, respectively. One paper (Pamungkas et al., 2022) does not specify how and what type of XAI was used; it simply states that explainable artificial intelligence was employed.
Five studies (Islam et al., 2022; Moulton et al., 2023; Mridha et al., 2023; Williamson et al., 2022; Uddin et al., 2023) addressed specific questions such as bias detection, and four studies (Foroushani et al., 2022; Mridha et al., 2023; Yao et al., 2022; Moulton et al., 2023) examined the impact of explainable artificial intelligence on human-machine interaction. However, none of them evaluated the confidence of humans in using an explainable artificial intelligence system for stroke.

Better metrics for evaluating explainable artificial intelligence
The objective of explainable Artificial Intelligence is not solely accuracy; it is also the extent to which the system is transparent to the user. Rosenfeld (n.d.) argues that many Explainable Artificial Intelligence studies wrongly assume that low-fidelity explanations should be accepted for certain tasks. Furthermore, it argues that user studies may also be subject to confirmation bias in their evaluation of Explainable Artificial Intelligence. To address these concerns, Rosenfeld (n.d.) advocates using four general metrics, namely D, R, F, and S, to quantify the explainability of Explainable Artificial Intelligence. These metrics are based on:
• D: The difference in the agent's performance using models with higher fidelity versus lower fidelity.
• R: The number of rules in the outputted explanation.
• F: The number of features used by the agent to generate the explanation.
• S: The stability of the agent's explanation.
The advantage of these measures is that they make no a priori assumptions about the relative advantage of using an explainable Artificial Intelligence algorithm with higher or lower fidelity. They also facilitate comparison without any potential confirmation bias from user studies. It is hoped that these metrics will be considered in the future for more meaningful evaluations of explainable Artificial Intelligence.
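To illustrate how such metrics could be operationalized, the sketch below gives one possible encoding of D, R, F, and S. The function names and explanation representations are our own hypothetical illustration, not an implementation from Rosenfeld's work.

```python
# Illustrative encoding of the D, R, F, S metrics. The helper functions
# and data structures here are hypothetical, not a published library.
from sklearn.metrics import accuracy_score

def metric_D(y_true, pred_high_fidelity, pred_low_fidelity):
    """D: performance gap between higher- and lower-fidelity models."""
    return (accuracy_score(y_true, pred_high_fidelity)
            - accuracy_score(y_true, pred_low_fidelity))

def metric_R(rules):
    """R: number of rules in the outputted explanation."""
    return len(rules)

def metric_F(rules):
    """F: number of distinct features referenced by the explanation."""
    return len({feature for rule in rules for feature in rule})

def metric_S(repeated_explanations):
    """S: stability -- share of repeated runs yielding the modal explanation."""
    modal = max(set(repeated_explanations), key=repeated_explanations.count)
    return repeated_explanations.count(modal) / len(repeated_explanations)

# Example: an explanation with two rules over three distinct features.
rules = [("age", "hypertension"), ("age", "glucose")]
print(metric_R(rules), metric_F(rules))  # -> 2 3
```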
Stroke is one of the most complex conditions and a leading cause of death and disability, so treating it demands attention, i.e., multiple sources of patient data to inform decisions. Almost all of the papers recommend that the number of datasets be enhanced. Additionally, as shown in Table 2 below, all of the papers use only a single model; however, stroke treatment needs more than three models to treat the disease well.

Table 2. Recommendations and limitations of the reviewed papers

Uddin et al. (2023)
The paper used only the SHAP explanation of the algorithm. It describes CT and MRI imaging of the brain being used, but this is not shown in the results. It also describes both image and clinical physical-examination data but does not explain how they were merged.

Islam et al. (2022)
The results of this research into explainable artificial intelligence will help with treating and rehabilitating people who have had a stroke, as well as making it easier for doctors to explain their diagnoses.

Kokkotis et al. (2022)
A database with easy-to-read, low-cost measurements and potentially complementary data should be developed; utilizing the new database, it is necessary to employ more advanced AI tools, feature selection techniques, and interpretation approaches using graphical algorithms. The implementation of a nested cross-validation strategy is computationally costly. The whole feature set was utilized in the proposed analysis, and there is a lack of an external validation dataset for evaluating the generalization of the best ML model.

Yao et al. (2022)
The study was a single-center, retrospective design, and the sample size was relatively small, which might limit the generalizability of the results. The CT features used in this study were obtained from the first CT scan.

KockWiil et al. (n.d.)
Increased risk of selection bias and significant missing data. Its application to different ethnic groups and non-elderly people requires further investigation and validation.

Prentzas et al. (2019)
Poor explanation; the paper almost simply uses the words "explainable" and "interpretability."

Mridha et al. (2023)
In the future, it may be possible to combine different samples when using the explanation technique, to design experiments based on the most relevant risk factors, and to create an end-to-end smart stroke prediction system. Only one trained model was employed, and the analysis used the entire feature set, which could be seen as a limitation. The lack of an external validation dataset to test the generalization of the best ML model is a restriction.

Explanation modes of the reviewed papers
As presented in Table 3, all of the studies considered in this review except one used model explanation, since most researchers nowadays use post-hoc explainability approaches. However, none of the researchers used model fidelity or outcome explanation. Nine of the seventeen studies used model inspection explanation methods.

Discussion
This systematic review rigorously assessed 17 studies that explored the utilization of explainable Artificial Intelligence (XAI) in the context of stroke diagnosis. The timeframe for this review encompassed research conducted from 1 January 2019 to 30 June 2023. Among these reviewed studies, one out of seventeen did not primarily emphasize XAI but rather utilized it as a validation tool for its classifiers. In contrast, three studies introduced novel XAI methods or substantially enhanced existing ones. Furthermore, two studies conducted in-depth evaluations related to XAI, encompassing meticulous analyses of dataset biases. Additionally, three studies compared XAI-generated explanations with ground-truth segmentations while assessing the interaction dynamics between humans and computer systems. We assessed 11 of the 17 papers (64.7%), which addressed parameters such as gender, age, hypertension, heart disease, ever married, work type, residence type, average glucose level, BMI, and smoking status as risk factors for stroke. All the studies that indicated these parameters as risk factors stated that their methods include explanations such as "if the age of the patient is more than 60, the prediction confidence in a healthy diagnosis decreases" (Mridha et al., 2023). Moreover, this systematic review found that these studies identified their datasets as a limitation, i.e., small dataset size, data imbalance, and overfitting to the training dataset. Some papers, such as Mridha et al. (2023), try to handle data imbalance through upsampling and downsampling methods. All these papers used datasets of fewer than 2,000 samples. For these reasons, it is difficult to translate the XAI studies done so far into real-world applications.

Model fidelity
Model fidelity is a crucial aspect of explainable Artificial Intelligence (XAI), representing how accurately the explanations provided by XAI methods reflect the actual decision-making process of the underlying classifier. Notably, certain XAI methods, such as 3DGradCAM (Williamson et al., 2022), excel in preserving model fidelity, offering more reliable insights into the inner workings of the decision process.

Post-hoc explanations and human interpretation
All the XAI methods reviewed in this study fall under the category of post-hoc explanations, which means they rely on additional techniques to interpret black-box models. However, it remains challenging to precisely determine the extent of the gap between the explanation disclosed by XAI and the actual decision process of the model.
One proposed solution is to sidestep the need for post-hoc explanations altogether by employing interpretable models from the outset, as suggested by Chaddad et al. (2023) and Mahya & Fürnkranz (2023b). Another recommendation is to conduct thorough evaluations before implementing XAI in real-world applications, ensuring that medical professionals do not rely solely on these explanations; this is especially important, though practically challenging, given the frequent changes in hospital data.
Moreover, post-hoc methods, like heatmaps and prototype interpretation, often require human interpretation. This introduces a potential bias, as human appraisers might assume that the model employs reasoning similar to the plausible XAI explanation, even when they cannot know the exact truth. While well-designed maps can partially mitigate this problem by clarifying the nature of image-resolution-based decision-making processes, it is crucial to acknowledge that subtle nuances may persist, necessitating human interpretation.

Representativity of the presented data
When XAI is employed as a sanity check, it frequently involves showcasing only a limited number of explanatory images.This restricted presentation may inadvertently allow for cherry-picking, potentially undermining the expressiveness and reliability of the reported findings.

Limitations
One limitation of this review is that certain studies superficially used XAI but did not explicitly include keywords like "XAI" or "explainability" in their titles or abstracts.To maintain impartiality, specific search terms containing the names of XAI methods were intentionally omitted.Consequently, some articles that only tangentially engaged with XAI may have been inadvertently overlooked, despite our rigorous search process.
This discussion section provides valuable insights into the challenges and considerations associated with the application of XAI in stroke diagnosis.It underscores the significance of model fidelity, human interpretation, and representativity of data, and candidly acknowledges the limitations inherent in the review process.Researchers are encouraged to carefully assess the utility and constraints of XAI when employing it in real-world medical scenarios.

Conclusions
While numerous studies have explored the application of explainable Artificial Intelligence (XAI) methods in stroke diagnosis, a notable gap exists in evaluating their impact on diagnostic accuracy and acceptance by medical professionals.The current body of research provides limited insight into these crucial aspects.Consequently, there is a pressing need for further investigations to establish a more robust understanding of how XAI influences stroke diagnosis practices.Beyond its application in stroke diagnosis, XAI plays a pivotal role in addressing trustworthiness and ethical considerations within the realm of artificial intelligence.To navigate complex AI Ethics and Values issues successfully, the adoption of XAI becomes imperative, ensuring that users can place their trust in AI systems.

Figure
Figure 2. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) method used for including and excluding research in this review.

Table 1. Overview of reviewed articles on stroke data (columns: No; Study; How does it deal with XAI?; Explained model; Classification task; Dataset(s) used). An unabbreviated version of this table can be found in the appendix. Features used include gender, age, hypertension, heart disease, ever married, work type, residence type, average glucose level, BMI, and smoking status.

Table 2. (Continued) Recommendations and limitations of the papers
status, its risk factors. However, ID cannot be a determinant feature. The data are also small, and the model was not tested in the real world.