Artificial intelligence in the diagnosis of glaucoma and neurodegenerative diseases

ABSTRACT Artificial Intelligence is a rapidly expanding field within computer science that encompasses the emulation of human intelligence by machines. Machine learning and deep learning, two primary data-driven pattern analysis approaches under the umbrella of artificial intelligence, have created considerable interest in the last few decades. The evolution of technology has resulted in a substantial amount of artificial intelligence research on ophthalmic and neurodegenerative disease diagnosis using retinal images. Various artificial intelligence-based techniques have been used for diagnostic purposes, including traditional machine learning, deep learning, and their combinations. Presented here is a review of the literature covering the last 10 years on this topic, discussing the use of artificial intelligence in analysing data from different modalities and their combinations for the diagnosis of glaucoma and neurodegenerative diseases. The performance of published artificial intelligence methods varies due to several factors, yet the results suggest that such methods can potentially facilitate clinical diagnosis. Generally, the accuracy of artificial intelligence-assisted diagnosis ranges from 67% to 98%, and the area under the receiver operating characteristic curve (AUC) ranges from 0.71 to 0.98, which at the upper end outperforms typical human performance of 71.5% accuracy and 0.86 area under the curve. This indicates that artificial intelligence-based tools can provide clinicians with useful information that would assist in providing improved diagnosis. The review suggests that there is room for improvement of existing artificial intelligence-based models using retinal imaging modalities before they are incorporated into clinical practice.


Artificial intelligence in action
Artificial intelligence (AI) has brought significant breakthroughs and a paradigm shift in science and technology, leading the world towards the fourth industrial revolution (4IR). 1 It is an inseparable part of applied computer science, where computer algorithms are used to replicate human intelligence. 2 Artificial intelligence systems are widely used in real-life applications, from Netflix suggestions to key information relating to healthcare.
32 Current state-of-the-art artificial intelligence methods widely use clinical data, retinal images from different modalities, visual field assessment, and multi-modal information to diagnose glaucoma and neurodegenerative diseases.
Glaucoma, causing 'glaucomatous optic neuropathy', is defined as a set of optic neuropathies leading to the damage of retinal ganglion cells, and provides an example of primary optic nerve degeneration. It is considered to be the second-most common cause of total blindness in developed nations. 33 Since the damage is irreversible, early detection and treatment of glaucoma is crucial. It is anticipated that the number of glaucoma patients will increase by almost half in the next two decades (from 76 million in 2020 to an estimated 112 million in 2040). 34
Neurodegenerative disorders are a diverse set of diseases that cause deterioration of the structure and functionalities of the central nervous system. 35 Among the neurodegenerative diseases, Alzheimer's disease (AD)/dementia and Parkinson's disease (PD) are regarded as two major progressive neurodegenerative disorders. 36 Specifically, Alzheimer's disease is the main cause of dementia in the elderly. 37 While these diseases are primarily associated with central nervous system involvement, sometimes the retina may also be a site of direct injury in Alzheimer's disease.
Apart from potential direct injury in the retina, central nervous system-associated neurodegenerative diseases contribute to substantial axonal damage with peripapillary retinal nerve fibre layer (RNFL) reduction [38][39][40][41] and retinal ganglion cell degeneration, mostly in the superior and inferior regions in relation to the retinal optic nerve, [42][43][44][45][46] confounding the diagnosis of primary optic nerve degeneration, such as glaucoma, with these retrograde-related changes.
[48][49] One explanation is that the cerebral pathology associated with Alzheimer's disease affects the connections between neurons in the visual pathway, leading to deterioration of the optic nerve and associated retinal layers in a retrograde manner. 49 This degeneration results in thinner retinal neuronal and axonal layers, including the RNFL and ganglion cell-inner plexiform layer. 49 However, the thinning of the peripapillary RNFL alone cannot distinguish between control subjects and Alzheimer's disease patients primarily affected in the parieto-occipital (visual) cortex, and in this variant, retrograde degeneration is expected to occur explicitly. 50
Another explanation is that pathological signs of Alzheimer's (e.g., amyloid-β plaques and neuroinflammation) occur simultaneously in the brain and retina, suggesting a common underlying mechanism, which further contributes to linking changes in the retina and associated neuronal axons. 51,52 In some rare cases of Alzheimer's disease, the RNFL may appear thicker than normal. This could be due to a reaction in the inner part of the retina called reactive gliosis, which is an inflammation that occurs during the early stages of the disease. 53,54 This reactive gliosis might occur before the thinning of the retinal neuronal layer or hide any subtle thinning that can be detected using optical coherence tomography (OCT). 55
To date, many approaches are used to diagnose and test for glaucoma; however, they are manual, time-consuming, expensive, and require expert operators. A visual field defect can be identified using perimetry, corneal thickness can be determined using pachymetry, intraocular pressure can be measured using tonometry, and the optic nerve head can be examined using fundoscopy. [59][60] Recent studies have shown in-vivo retinal imaging to be promising for early-stage diagnosis of neurodegenerative diseases, which is less expensive and faster compared to the neuroimaging modalities. 60,61
This review discusses artificial intelligence techniques for glaucoma and neurodegenerative disease diagnosis using retinal imaging modalities. It considers the main retinal imaging modalities now utilised with artificial intelligence and summarises the utility of these modalities and the extracted features in diagnosis. Based on an analysis of many studies, possible future directions for artificial intelligence-based diagnostic systems are recommended for implementation in the healthcare system to help clinicians differentiate disease entities and improve diagnostic accuracy.
For this narrative review, electronic bibliographic searches were conducted in five different databases (Web of Science, PubMed, EMBASE, Scopus, and IEEE Xplore Digital Library) for articles published in the last 10 years (2012-2022). Keywords were searched using MeSH terms, such as 'glaucoma', 'neurodegenerative', 'Alzheimer's', 'dementia', 'Parkinson's', 'machine learning', 'deep learning', 'artificial intelligence', 'convolutional neural network', and 'deep convolutional neural network'. The paper titles and abstracts found were downloaded to EndNote bibliographic software, and relevance screening was performed on the search results following the PRISMA structure 62 (Figure 1).

Primary versus secondary neurodegeneration
There are characteristic differences between primary optic nerve degeneration caused by conditions such as glaucoma, and retrograde degeneration of retinal ganglion cells due to cell death subsequent to lesions in the central visual pathways. 63 Retrograde degeneration occurs due to damage to the cells with which they are in communication, creating a disruption of the synaptic signal; as such, atrophic neural changes affect dendrites and axons. 64 This process propagates the damage from the original site to neural cells earlier in the visual pathway, referred to as trans-synaptic retrograde degeneration. 65
The study by Zangerl et al., 66 discusses reconciling the RNFL asymmetrical pattern and visual field defects in retrograde degeneration (Figure 2). According to the study, pre- and post-chiasmal RNFL display distinct losses that are eye-specific, and central nervous system (CNS) losses that are disease-specific, respectively. There are some other distinctive differences between the two conditions. First, pre-chiasmal optic degeneration (or primary degeneration) shows incongruous change in ganglion cell analysis that typically obeys the horizontal midline. In contrast, post-chiasmal retrograde degeneration (or secondary degeneration) leads to a congruous change in ganglion cell analysis and typically obeys the vertical midline.
Second, pre-chiasmal optic degeneration shows consistency between visual field (VF)/ganglion cell analysis and RNFL analysis, and the RNFL thinning is often unilateral or in areas of bilateral symmetry. On the other hand, post-chiasmal retrograde degeneration exhibits inconsistency between visual field/ganglion cell analysis and RNFL analysis, and RNFL asymmetry is mostly visible, especially in the supero- and/or infero-nasal areas. Although there are patterns that allow primary optic neuropathy (ON) to be differentiated from retrograde degeneration, both display anatomical changes and careful assessment is required to discriminate between them. These differences imply that algorithms may be able to assist in differentiating primary optic neuropathies from retrograde degenerative changes induced by post-chiasmal lesions.

Ageing population
Age is a known risk factor for glaucoma, with increasing prevalence of the condition over time. 34,67,68 With a worldwide ageing population, the impact of irreversible blindness due to glaucoma will likely increase over time. Concurrently, neurodegenerative conditions such as dementia are also associated with cognitive ageing 69 and are mostly prevalent in the elderly population. Tong et al., 70 developed a location-specific model to identify the changes in macular ganglion cells due to ageing. The findings from the study suggest that the deterioration of the ganglion cell layer (GCL) in the macula starts in the late 30s. Since the eye is a part of the central nervous system, this raises the question of whether similar neurodegenerative changes may also manifest in the brain from this critical age.

Under- and overdiagnosis of glaucoma
Despite advances in diagnostic testing, glaucoma remains challenging to detect in clinical practice. This may result in an irreversible loss of eyesight and finally blindness. According to Burr et al., 71 approximately 50% of patients with glaucoma remain undiagnosed worldwide. The underdiagnosis of glaucoma is multifaceted. Lack of awareness of glaucoma risk, lack of access to specialised eye care, and limited financial resources of patients are potential factors contributing to the underdiagnosis of glaucoma. 72
Apart from underdiagnosis of glaucoma due to different socio-economic factors, overdiagnosis of glaucoma also occurs frequently. Founti et al., 73 performed a cross-sectional study on a mainly Caucasian population (2554 participants) and reported that 60% of the population was overdiagnosed with glaucoma. This means that more than half of the population diagnosed with glaucoma may not have glaucoma. There are several reasons behind misdiagnosis, such as poorly trained eye care practitioners, lack of quality imaging instruments, inconsistencies in the agreement of diagnostic markers, and the confounding effects of neurodegenerative diseases on glaucoma (e.g., other optic neuropathies, dementia and Parkinson's disease may display similar optic nerve characteristics). Therefore, it is crucial to identify the potential reasons for this misdiagnosis and take necessary action to correct them.

Misdiagnosis of other causes of neurodegenerative diseases as glaucoma
Clinically, the main structural loci of change observed in glaucoma include: characteristic optic nerve head changes (neuroretinal rim loss, and widening and deepening of the optic cup); corresponding adjacent RNFL loss; and ganglion cell loss. OCT can measure these parameters and return RNFL and ganglion cell (and other adjacent layers) thickness values to provide inputs to glaucoma diagnosis and disease progression (Figure 3C, E). It is well-understood that other optic nerve head pathologies can manifest with similar structural losses, therefore understanding the patterns of pathology is critical for accurate differential diagnosis (Figure 3D, F, G). One example in the eye is ischaemic RNFL loss, which can present with a wedge-like pattern of structural defect and structure-function concordance (Figure 3G). Since ischaemia is a known pathophysiological pathway to glaucoma, differentiating it from a non-progressive ischaemic optic neuropathy or RNFL loss is critical. 74 The salient differentiating features of primary ischaemic optic neuropathy are the absence of characteristic glaucomatous optic nerve head changes (thinning and cupping) and an abrupt transition between normal and reduced neural tissue. 75
There is emerging evidence that neurodegenerative conditions beyond the eye, such as Alzheimer's disease, Parkinson's disease and Parkinsonism, can manifest with ocular signs 76 (Figure 3A, B, D, F). Given parallels in risk factors (such as age and systemic vascular disease) between cognitive neurodegenerative disorders, glaucoma (a neurodegenerative condition of the eye) and ischaemic optic neuropathy, it is important to recognise that the pathologies may co-exist and be superimposed (Figure 3D).
Two major gaps in the literature remain to be reconciled for distinguishing glaucoma and optic nerve disease from cognitive neurodegenerative retinal signs. First, current clinically available methods for evaluating retinal structure and function, at best, provide non-specific information related to cognitive neurodegeneration. 40,41 Unlike in age-related macular degeneration and glaucoma, where there is a constellation of diagnostic clinical features, disease-specific retinal lesions due to cognitive neurodegeneration remain poorly defined. Second, whilst the natural history of degenerative conditions such as age-related macular degeneration and glaucoma is well-understood, 77 there remain gaps in the knowledge regarding the trajectory of retinal manifestations of cognitive neurodegenerative disease. Thus, co-incident retinal and cognitive disease may continue to confound the management plan.

Prevalence, meta-analyses, and usefulness of artificial intelligence
Most studies that use OCT images for glaucoma diagnosis evaluate changes in RNFL and ganglion cell layer thickness, and these changes overlap with those seen in neurodegenerative diseases such as dementia and Parkinson's disease. Several meta-analyses suggest significant differences in RNFL due to glaucoma and neurodegenerative diseases. From the outcome of the meta-analyses performed on dementia 40 and Parkinson's 41 diagnoses, it is apparent that some studies have shown significant changes in RNFL, whereas others show non-significant or no changes (Figure 4). This implies that the findings are still equivocal, and further research is needed.
As most of the studies included in the meta-analysis for RNFL changes due to glaucoma and neurodegenerative diseases have used statistical analysis (such as ANOVA), it is crucial to apply more advanced analysis, which is possible using artificial intelligence. Most artificial intelligence techniques perform better than traditional statistical analysis and, in some cases, surpass human accuracy in several domains. Therefore, applying artificial intelligence techniques, such as machine learning and deep learning in this area, may compensate for the knowledge gap of clinicians and assist them in decision-making using automated computer algorithms.

Machine learning versus deep learning
Figure 3. ... The OCT GCA deviation map shows concentric thinning of the central macula adjacent to the foveal pit due to anatomical variation (non-pathological). Perimetry using the HFA could not be completed due to the patient's dexterity limitations. B: a 58-year-old male with Alzheimer's disease. The optic nerve head examination reveals no significant structural abnormality, with an intact neuroretinal rim (blue arrows). OCT imaging results also show no significant structural abnormalities. However, the imaging results were confounded by eye movement artefacts, as indicated by the red arrows. Perimetry using the HFA shows an exaggerated pattern of apparent loss, but this was unreliable, with high false positive (39%), high false negative (36%) and high frequency eye movements during the test. C: a 58-year-old male with advanced glaucoma in the left eye. The optic nerve head examination reveals almost complete loss of the neuroretinal rim superiorly and inferiorly in positions typical for glaucoma (yellow arrows). OCT imaging results also highlight the RNFL defects superiorly and inferiorly (yellow arrows), with the tomogram illustrating the widened cup typical in glaucoma (yellow dashed double arrow). The RNFL TSNIT curves show significant RNFL reductions, with measurements reaching the instrument floor (black arrows). The structural loss was asymmetric, hence the perimetry results also show a more profound superior arcuate defect (mean deviation −18.69 dB). D: a 58-year-old male with coincident right eye early glaucoma and Parkinson's disease. The optic nerve head examination reveals a notch at the inferior neuroretinal rim, with corresponding loss of the RNFL (yellow arrows). The vertical tomogram shows a widened and deepened cup expected in glaucoma (yellow dashed arrow). The TSNIT curves highlight the asymmetry in RNFL thickness, with significant loss in the right eye (black arrow). There is demonstrable structure-function concordance with a superior nasal step on the HFA pattern deviation map (mean deviation −2.38 dB). E: a 47-year-old female with early glaucoma in the right eye. The optic nerve head examination reveals a notch at the inferior neuroretinal rim, with corresponding loss of the RNFL (yellow arrows). The vertical tomogram shows a widened and deepened cup expected in glaucoma (yellow dashed arrow). The TSNIT curves highlight the asymmetry in RNFL thickness, with early loss in the right eye (black arrow). Since the width and trajectory of the RNFL loss encompasses the papillomacular bundle, there is also an appreciable, wide arcuate-like loss on the OCT GCA deviation map (navy arrow). There is demonstrable structure-function concordance with a superior nasal step (from the wedge defect on the RNFL deviation map) and paracentral defect (from the papillomacular arcuate defect on the GCA deviation map) on the HFA pattern deviation map (mean deviation −4.57 dB). F: a 44-year-old female with Huntington's disease assessed for suspected glaucoma. The optic nerve head examination reveals no significant structural abnormality, with an intact neuroretinal rim (blue arrows). OCT imaging results also show no significant structural abnormalities. However, the RNFL and GCA imaging results were confounded by eye movement artefacts, as indicated by the red arrows. Perimetry using the HFA shows an exaggerated pattern of apparent loss in all four quadrants, but this was unreliable, with high false positive (28%), high false negative (27%) and high frequency eye movements during the test. G: a 67-year-old female with ischaemic RNFL loss in the left eye. The optic nerve head examination reveals an RNFL defect in the superior and superotemporal aspect (yellow arrows), but an intact neuroretinal rim (blue arrow). The horizontal tomogram shows a shallow cup with no evidence of widening which therefore suggests non-glaucomatous pathology (blue dashed double arrow). The TSNIT curves highlight the asymmetry in RNFL thickness, with marked loss superotemporally in the left eye (black arrow). The depth of neural structural loss extends to the papillomacular bundle, as highlighted by the GCA deviation map. There is demonstrable structure-function concordance with an inferior nasal step (from the wedge defect on the RNFL deviation map) and paracentral defect (from the papillomacular arcuate defect on the GCA deviation map) on the HFA pattern deviation map (mean deviation −1.39 dB).
'Artificial Intelligence' is an umbrella term, first introduced by John McCarthy at a 1956 conference, and refers to the ability of machines to emulate human intelligence. It is a growing field in computer and data science and covers many aspects. 1 There are different approaches to data-driven pattern analysis within artificial intelligence, broadly categorised into machine learning (ML) and deep learning (DL). 78 Machine learning generally refers to automated computer algorithms that can learn how to perform a given task from given input data. 78
Deep learning is a subfield of machine learning which uses deep layers of artificial neurons or simply 'neural networks' for this purpose. 78
The main difference between traditional machine learning and deep learning techniques is in the way features are extracted from input data and fed to the algorithm (Figure 5). In the case of traditional machine learning, several human-defined parameters or features are extracted from the input data and supplied to the machine learning algorithm. In contrast, in the case of deep learning, the input data are directly fed to the algorithm, which uses its own mechanisms to extract features to perform the given task.
This end-to-end approach of deep learning eliminates the time-consuming feature design and extraction part of machine learning (for example, in Figure 5, the extraction of RNFL and other thicknesses from the images). However, the internal algorithm-specific features extracted by deep learning are quite complex to understand. Especially in the deeper layers, the extracted high-level features can be abstract and difficult to interpret by human operators. That is why deep learning is often referred to as a 'black-box' approach. 79 This black-box nature is a roadblock for clinicians to trust deep learning techniques to be implemented directly in the healthcare system. Explainable artificial intelligence (XAI) is a special branch of artificial intelligence that aims to develop ways to interpret and explain trained deep networks using algorithms to unbox the black-box. 80,81 However, most explainable artificial intelligence-based works are still in progress, and there are very limited algorithms in this area to interpret the existing deep learning methods. 82
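As a rough illustration of what 'unboxing' can look like in practice, the sketch below applies occlusion sensitivity, one commonly used explanation technique that is chosen here only as an example and is not attributed to any of the cited studies. Each image patch is blanked in turn and the drop in the model's output is recorded, so that larger drops mark regions the model relies on more. The names `occlusion_map` and `toy_model` are hypothetical, and `toy_model` is a stand-in scoring function rather than a trained network.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Return a heat map of score drops when each patch is replaced by `fill`."""
    base = score_fn(image)
    h, w = len(image), len(image[0])
    heat = [[0.0] * w for _ in range(h)]
    for r0 in range(0, h, patch):
        for c0 in range(0, w, patch):
            rows = range(r0, min(r0 + patch, h))
            cols = range(c0, min(c0 + patch, w))
            occluded = [row[:] for row in image]  # copy, then blank one patch
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_model(img):
    """Stand-in 'black box' (assumption): responds only to the top-left 2x2 region."""
    return sum(img[r][c] for r in range(2) for c in range(2)) / 4.0


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_model)
for row in heat:
    print(row)
```

With this stand-in model, only the top-left patch produces a non-zero drop, which is the behaviour an explanation method should recover.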

Retinal imaging datasets
The studies included in this review have used public and private datasets (Supplementary Table 2). Most public datasets provide fundus images (REFUGE, 118 ACRIMA, 119 ORIGA, 120 RIMONE, 121 DRISHTI-GSI, 122 HRF, 123 BEH, 124 EyePACS-AIROGS, 125 CRFO-v4, 126 DR-HAGIS, 127 FIVES, 128 G1020, 129 JSIEC-1000, 130 LES-AV, 131 OIA-ODIR, 132 ORIGAlight, 120 PAPILA 133), whereas OCT, optical coherence tomography angiography, and other modalities are mostly collected in private studies. A relative comparison of the mean area under the receiver operating characteristic curve of the included studies shows that the performance in studies using only public datasets (area under the curve: 0.92) is higher than those using only private datasets (area under the curve: 0.89) or both (area under the curve: 0.71). Given that the performance of machine learning models depends on the quality and quantity of training data, this is probably due to the larger sample sizes and the availability of pre-processed images in public datasets, and to greater variation in the data distribution of private datasets.
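Because the comparison above rests on the area under the receiver operating characteristic curve, a minimal sketch of how that quantity can be computed may be helpful. It uses the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of diseased/healthy pairs in which the diseased case receives the higher score, with ties counted as half. The function name `roc_auc` and the six example scores are illustrative assumptions, not data from any reviewed study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 0/1 values (1 = disease); scores: model outputs, higher = more suspicious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # diseased eye ranked above healthy eye
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six eyes (1 = glaucoma, 0 = healthy).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(roc_auc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the values quoted in this review are compared on that scale.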

Extracted retinal features
Feature extraction is one of the major steps in machine learning, where certain parameters are extracted from input data and supplied to the machine learning algorithm. For the diagnosis of glaucoma and neurodegenerative diseases, several features have been defined for the different modalities. Mostly, the OCT-based studies extracted the mean RNFL thickness features, 10,17,19,97 with thickness values in four quadrants (temporal, superior, nasal and inferior) and 12 clock hours, with very few studies focusing on macular thickness 32 and ganglion cell-inner plexiform layer thickness. 114 Conversely, the fundus-based studies extracted segmentation-based features such as cup to disc ratio (CDR) and rim area, and image-specific texture features such as Histogram of Oriented Gradients (HOG) features 30 and Grey Level Co-occurrence Matrix (GLCM) features. 32 A summary of the extracted features from different modalities for glaucoma and neurodegenerative diseases is listed in Supplementary Table 3.
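To make the notion of a segmentation-based feature concrete, the following sketch derives a vertical cup to disc ratio from two binary masks. It assumes that the optic cup and optic disc have already been segmented (by hand or by a model) and that the ratio is taken as the number of image rows spanned by the cup divided by the number spanned by the disc; the function names and the toy 6 x 6 masks are hypothetical.

```python
def vertical_extent(mask):
    """Number of rows spanned by the foreground (non-zero) pixels of a binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup to disc ratio from two binary segmentation masks."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height


# Toy masks (assumption): the disc spans 6 rows, the cup spans 3 rows.
disc = [
    [0, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 0],
]
cup = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(vertical_cdr(cup, disc))  # 0.5
```

A single number of this kind is then passed, together with other hand-defined features, to the classifier.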

Machine learning classifiers
The extracted features from several retinal imaging modalities have been used for training and testing different machine learning algorithms for diagnosing glaucoma and neurodegenerative diseases (Supplementary Table 4). The most popular classification algorithm for the diagnosis of neurodegenerative diseases is Support Vector Machines (SVM), which has been used for the diagnosis of dementia, 30 Parkinson's disease 31 and differential diagnosis of dementia and Parkinson's disease. 32 [16][17] 20,24,25 Both models use extracted features as input; however, XGBoost combines multiple decision trees to make the final classification decision, whereas a support vector machine transforms the input data into a higher-dimensional space and finds the best boundary or 'hyperplane' between the classes.
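The difference between the two decision rules can be sketched in a few lines. In the illustration below, the kernel support vector machine scores a sample by a weighted sum of kernel similarities to its support vectors, whereas the boosted ensemble adds up the outputs of small decision trees and converts the sum to a probability. All parameters (support vectors, coefficients, thresholds, leaf values) are set by hand for illustration; in a real model they would be learnt from training data, and the two standardised features (RNFL thickness and cup to disc ratio as z-scores) are assumptions.

```python
import math


def rbf_kernel(a, b, gamma):
    """Radial basis function kernel: similarity in an implicit high-dimensional space."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel SVM score; a positive value places x on the 'disease' side of the boundary."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


def stump(feature, threshold, left, right):
    """Depth-1 decision tree: returns `left` if x[feature] < threshold, else `right`."""
    return lambda x: left if x[feature] < threshold else right


def boosted_probability(x, trees, base_score=0.0):
    """Boosting-style prediction: sum the tree outputs (log-odds), then apply a sigmoid."""
    logit = base_score + sum(tree(x) for tree in trees)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical, hand-set parameters; features are [RNFL z-score, CDR z-score].
support_vectors = [[-1.0, 1.0], [1.0, -1.0]]
dual_coefs = [1.0, -1.0]            # signed weights: +glaucoma, -healthy
trees = [stump(0, 0.0, 1.0, -1.0),  # thin RNFL raises the log-odds
         stump(1, 0.0, -0.5, 0.5)]  # large CDR raises the log-odds

suspect = [-0.8, 0.9]
healthy = [0.8, -0.9]
print(svm_decision(suspect, support_vectors, dual_coefs, 0.0, 0.5) > 0)
print(boosted_probability(suspect, trees) > 0.5)
```

Both rules flag the thin-RNFL, large-CDR sample here, but they arrive at that answer through a geometric boundary in one case and an additive vote of trees in the other.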

Diagnosis of neurodegeneration
Several studies using machine learning models for diagnosing neurodegenerative diseases have used vascular features from fundus images. For example, Tian et al., 29 have performed retinal vasculature analysis for dementia diagnosis on a selected subset from the UK-Biobank dataset and found that the smallest retinal blood vessels hold valuable information. The most informative pixels from the blood vessels were identified by a t-test (p-value = 0.01) on the extracted retinal blood vessel maps from fundus images, which improved the accuracy of their machine learning model (support vector machines) by 14.2%. This indicates that not all parts of retinal blood vessels are affected by disease progression. Similarly, Zhang et al., 30 reported that vascular features extracted from fundus images help diagnose mild cognitive impairment, an early stage of dementia. Diaz et al., 31 presented similar findings for diagnosing Parkinson's disease while using retinal blood vessel maps from another private dataset using a machine learning classifier (support vector machines).
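The idea of keeping only the most informative pixels can be sketched as a per-feature two-sample test. The example below computes Welch's t statistic for each feature (for instance, each pixel of a vessel map) and retains those whose absolute value exceeds a threshold. This is an assumption-laden simplification: the cited study selected on a p-value, which requires the t distribution (available, for example, through `scipy.stats.ttest_ind`), whereas this standard-library sketch thresholds the statistic directly. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of measurements."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def select_features(cases, controls, t_threshold):
    """Indices of features whose |t| exceeds the threshold.

    cases / controls: lists of feature vectors, one vector per subject.
    """
    selected = []
    for j in range(len(cases[0])):
        t = welch_t([row[j] for row in cases], [row[j] for row in controls])
        if abs(t) > t_threshold:
            selected.append(j)
    return selected


# Toy data (assumption): only feature 0 differs between the groups.
cases = [[0.9, 0.5, 0.2], [0.8, 0.4, 0.3], [1.0, 0.6, 0.1]]
controls = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.2], [0.3, 0.4, 0.1]]
print(select_features(cases, controls, t_threshold=4.0))  # [0]
```

Only the selected columns would then be passed to the classifier, which is the step reported to improve accuracy in the study above.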
However, these studies reported the extraction of retinal blood vessels as cumbersome, expensive, and time-consuming, so they leveraged another ophthalmic dataset named the DRIVE dataset, which already includes retinal blood vessels segmented by clinicians. An endeavour to use chromatic pupilloperimetry features, such as pupil light reflex (PLR), for focal red and blue light stimuli in the central and peripheral retina was published by Lustig-Barzelay et al., 134 for predicting dementia family history. Moreover, machine learning applied to retinal images for diagnosing neurodegenerative diseases could potentially use some important OCT features, such as macular thickness and volume. 28

Diagnosis of glaucoma
Studies using machine learning for glaucoma diagnosis mostly utilised thickness values from OCT images, with a narrow focus on fundus features. Among the thickness features, the combination of peripapillary and macular parameters provided better results (area under the curve = 0.84), 3 with a greater influence of macular thickness in the inferotemporal area and ganglion cell layer thickness in the outer-temporal area. 3 Wu et al., 10 used Spectralis OCT parameters and reported that ganglion cell layer thickness parameters had the highest importance for the detection of early glaucoma, while the cRNFL thickness values had the greatest influence over more severe glaucoma. Moreover, the global, temporal, inferior, superotemporal, and inferotemporal thickness were prominently influential in glaucoma detection. Machine learning applied to optical coherence tomography angiography parameters revealed that inferior temporal and inferior hemisphere vessel density and peripapillary RNFL thickness are the key diagnostic parameters for glaucoma diagnosis. 6 On the other hand, machine learning applied to fundus images for glaucoma detection shows the utility of retinal texture features. 8,9
As artificial intelligence-based classification models can improve diagnosis, artificial intelligence tools assisting clinicians have shown promising results. For instance, Gong et al., 7 proposed doctor-assisted artificial intelligence, leveraging the artificial intelligence tool to assist health practitioners in decision-making. Using the 'doctor + artificial intelligence' model, they compared the performance of four doctors performing diagnosis in two approaches: first, using only their clinical expertise, and second, using an artificial intelligence tool (support vector machines) as an assistant. While the four doctors had a mean accuracy of 71.5% using their clinical expertise, three showed a significantly higher accuracy ranging from 87-90% with the help of the artificial intelligence tool. These findings from the 'doctor + artificial intelligence' model show the potential utility of artificial intelligence-assisted tools for improved diagnosis by clinicians.

Diagnosis based on multi-modal data
There have been several attempts at using multi-modal data for diagnosis of neurodegenerative diseases and glaucoma, combining structural data from OCT and functional data from visual field tests, such as standard automated perimetry (SAP). 19,22,24 Within the structural features, the inferior-nasal RNFL was the most important feature, while standard automated perimetry 9°-nasal and 15°-superior points were the best functional features from the visual field test. 22 Similarly, Silva et al., 24 reported that the combination of OCT and standard automated perimetry measurements improved the diagnostic accuracy (area under the curve: 0.77-0.94) significantly compared with OCT data alone (p < 0.05). 24
Furthermore, based on the visual field test, OCT, and intraocular pressure test using a machine learning model, Sharma et al., 4 proposed an integrated glaucoma risk index (I-GRI), with a very low misclassification rate of 0.07 (7%).
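One simple way to combine structural and functional measurements, sketched here as an assumption rather than as the method of any cited study, is feature-level (early) fusion: each eye's OCT features and visual field features are concatenated into a single vector after every feature has been standardised, so that thicknesses in µm and sensitivities in dB share a common scale. The function names and example values are hypothetical.

```python
from statistics import mean, pstdev


def standardise(column):
    """Z-score one feature across all eyes (constant columns map to zeros)."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]


def fuse(oct_rows, sap_rows):
    """Early fusion: concatenate per-eye OCT and SAP features, then z-score each column."""
    rows = [list(o) + list(s) for o, s in zip(oct_rows, sap_rows)]
    columns = [standardise(col) for col in zip(*rows)]
    return [list(r) for r in zip(*columns)]


# Hypothetical values: two RNFL sector thicknesses (µm) and one SAP mean deviation (dB).
oct_rows = [[95.0, 110.0], [70.0, 85.0], [82.0, 100.0]]
sap_rows = [[-0.5], [-6.0], [-2.5]]
fused = fuse(oct_rows, sap_rows)
print(len(fused), len(fused[0]))  # 3 eyes, 3 fused features each
```

The fused vectors can then be supplied to any of the classifiers discussed earlier in place of single-modality features.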

Explainable machine learning and utility for healthcare practitioners
While most of the developed machine learning models are experimental and not easy for clinicians to understand and utilise, there have been some attempts to use explainable machine learning. For example, Oh et al., 11 developed an explainable machine learning system named 'Magellan' using OCT and fundus images to diagnose glaucoma. When results from multiple tests, such as OCT, fundus, visual field and intraocular pressure, are inconsistent or vague, Magellan gives key clues for diagnosis and helps clinicians determine whether the patient may have glaucoma (accuracy: 94.7%). Moreover, Escamez et al., 13 applied interpretable machine learning on OCT features for glaucoma diagnosis. They reported that 7-clock-hour RNFL thicknesses have the greatest importance, while an average thickness of less than 82 µm indicates more likely development of early-stage glaucoma.

State-of-the-art deep learning models
The problem with traditional machine learning is that considerable time and human effort is required to define features without any guarantee they are optimal. Deep learning-based models eliminate this step, as they automatically learn to extract optimal features from input images and provide an end-to-end framework from input to final decision-making. Convolutional neural networks (CNNs) are popular end-to-end frameworks of deep learning, which recognise patterns in images by analysing the relationships between pixels and adjusting their parameters to make accurate predictions. Several studies have leveraged convolutional neural networks for glaucoma diagnosis, focusing on either classification 83,84,86,100,104,112,116,[135][136][137] or segmentation of the optic cup 84,96 and disc. 96
Conversely, very few studies have used deep-learning methods to diagnose neurodegenerative diseases. 114 The most widely used state-of-the-art convolutional neural network models for diagnosis are Inception ResNet, 84 ResNet, 83,84,104,116,135,136 DenseNet, 83,84,116 and VGG-Net 84,112,116,137 for classification and U-Net for segmentation. 84,96 The state-of-the-art classification models are different types of convolutional neural network architectures, mostly trained on a large annotated non-medical dataset named 'ImageNet', specifically designed for computer vision tasks; unlike the ImageNet models, 'U-Net' is instead trained on medical imaging datasets, particularly for the segmentation task.
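The phrase 'analysing the relationships between pixels' refers to the convolution operation at the heart of these networks. The sketch below slides a small filter over an image and sums the element-wise products at each position, followed by the ReLU non-linearity commonly applied afterwards. In a real convolutional neural network the filter weights are learnt during training; here a vertical-edge filter is fixed by hand, and the function names and the toy 4 x 4 image are assumptions.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy image with a dark-to-bright vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_filter = [[-1, 1],
               [-1, 1]]  # hand-set; a trained network would learn such weights
feature_map = relu(conv2d_valid(image, edge_filter))
for row in feature_map:
    print(row)
```

The filter responds only where the edge lies, which is how stacked layers of learnt filters come to encode structures such as the optic disc margin without hand-defined features.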

Diagnosis based on unimodal data
Most of the deep learning models used for diagnosing glaucoma are based on unimodal data. However, due to the black-box nature of the models, most studies simply reported the classification performance without analysing the usefulness of different features. Nevertheless, there are some interesting findings. For example, OCT images were used by Wang et al., 97 leveraging deep learning for diagnosing glaucoma, where they reported the utility of the RNFL thickness map over the entire OCT image. There have been some attempts to use other modalities, such as CIRRUS OCT 3D scans, 138-140 and confocal scanning, e.g., true-colour confocal scanning (TCCS) and ultra-wide field (UWF) fundus images. 141 The findings by Shin et al., 141 suggest that there is no significant difference between the performance of the two confocal scans (i.e., TCCS and UWF) (p-value: 0.135); however, their performance was significantly superior to the OCT-based diagnosis (p-value: 0.005).

Diagnosis based on multi-modal data
Several studies have used multi-modal data with deep learning for better diagnosis of neurodegenerative disease and glaucoma, which revealed some interesting information. Wisely et al., 114 used colour maps from OCT, optical coherence tomography angiography, ultra-wide field, and fundus autofluorescence scans for the diagnosis of dementia using deep learning and found that the most useful single input for the prediction of dementia conditions was the ganglion cell-inner plexiform layer maps. Xiong et al., 115 used multi-modal data combining OCT scans and visual field reports to diagnose glaucoma. They reported that the multi-modal fusion-based deep learning model gives a higher area under the curve (0.95), which surpasses the OCT-based (0.81) and visual field-based (0.87) deep learning models and assessment by two specialists (area under the curve: 0.86).
Similarly, Yi et al., 116 proposed a multi-modal deep learning architecture that combined fundus and visual fields (greyscale images) to diagnose glaucoma severity levels. Their findings suggest that the architecture based on the multi-modal fusion image performs better (area under the curve: 0.98) than the unimodal architectures (area under the curve: 0.95-0.96). Huang et al., 117 performed a similar study, where the performance of multi-modality-based models using fundus and visual fields (area under the curve: 0.94) surpassed the unimodal models using fundus (area under the curve: 0.90) and visual fields (area under the curve: 0.89) alone.
Several studies have been published where combined machine learning-deep learning models were used for multi-modal fusion. 142,143 For example, Mehta et al. used an approach with multi-modal data (colour fundus photos, OCT scans, and health data) collected from the UK Biobank for glaucoma diagnosis. They demonstrated excellent accuracy of a multi-modal model that combined imaging with demographic and clinical characteristics (area under the curve: 0.97). The interpretation of their model emphasises using clinical information known to be connected to the condition, such as age, intraocular pressure, and optic disc morphology. To summarise, the multi-modal studies suggest that the use of multi-modal deep learning models is advantageous in diagnosing neurodegenerative diseases and glaucoma.
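One common way to combine modalities is feature-level fusion, in which feature vectors extracted from each modality are concatenated before classification. The sketch below illustrates that idea under stated assumptions: the function names, the feature values, the weights and the bias are all hypothetical and hand-picked, and the code does not reproduce the fusion strategy of any study cited above.

```python
import math


def fuse_features(*modality_features):
    """Feature-level fusion: concatenate the per-modality feature vectors."""
    fused = []
    for features in modality_features:
        fused.extend(features)
    return fused


def logistic_score(features, weights, bias):
    """Map a fused feature vector to a probability with a logistic function."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, already-normalised features (not taken from any study).
oct_features = [0.2, -1.1]   # e.g. two OCT-derived values
visual_field_features = [0.7]  # e.g. one visual field-derived value

fused = fuse_features(oct_features, visual_field_features)

# Hypothetical weights; in practice these would be learned from labelled data.
weights = [0.5, -0.8, 1.2]
probability = logistic_score(fused, weights, bias=-0.3)
print(fused, round(probability, 3))
```

In practice the classifier on top of the fused vector could equally be a neural network layer or a traditional model such as a random forest; the logistic function is used here only to keep the sketch short.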

Explainable deep learning and real-world applications
Though most studies have used deep learning models as black boxes, a few studies have used explainable artificial intelligence approaches. 103,142,144 Deperlioglu et al., 144 developed a hybrid system for explainable artificial intelligence and further assessed trust concerns regarding black-box models. The system was judged to be reliable and trustworthy by 15 medical professionals. Kim et al., 103 used explainable artificial intelligence techniques to identify the location of a glaucomatous region in a given input image. Huang et al., 117 used visualisations to highlight the imaging biomarkers in fundus images using deep learning. These findings illustrate that deep learning methods applied to medical images may assist clinicians in diagnosing glaucoma more quickly and accurately.
While most of the explainable artificial intelligence work using deep learning is focused on Class Activation Mapping (CAM), 135,145 Berchuck et al., 146 have devised a deep learning algorithm that incorporates a generative technique (generalised variational auto-encoder) to enhance the accuracy of estimating the rates at which visual field loss progresses and predicting the future patterns of such loss in individuals with glaucoma. To advance explainable deep learning in glaucoma detection, Hemelings et al., 147 proposed a methodology that leverages the concept of the vertical cup-disc ratio. The objective of this study was to assess the significance of regions outside the optic nerve head and offer an objective method for explainability in the context of glaucoma detection and vertical cup-disc ratio estimation. They trained and tested their deep learning model on two sets of fundus images with two cropping policies: optic nerve head cropping (cropping policy: crop radius = 10-60% of image size) and inverse crop (periphery cropping policy: inverse of cropping mask with optic nerve head removed). They reported that the original optic nerve head images produce an area under the curve of 0.94 and a coefficient of determination of 77% for vertical cup-disc ratio estimation, while the inversely cropped images in the absence of the optic nerve head produced an area under the curve of 0.88 and a coefficient of determination of 37%. The average saliency map for vertical cup-disc ratio produced by the explainable artificial intelligence in this study reveals a consistent pattern in the infero- and superotemporal regions, which corresponds to the locations of the RNFL that are susceptible to damage caused by glaucoma.
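The two cropping policies described above can be pictured as a circular mask and its inverse. The following is a minimal sketch under explicit assumptions: the mask is taken to be a circle centred on a known optic nerve head position, the radius is expressed as a fraction of the image width, and masked-out pixels are set to zero. None of these implementation details are confirmed by the cited study, and all names are hypothetical.

```python
def circular_mask(height, width, centre, radius):
    """Return a boolean grid that is True inside the circle around `centre`."""
    centre_row, centre_col = centre
    return [
        [(row - centre_row) ** 2 + (col - centre_col) ** 2 <= radius ** 2
         for col in range(width)]
        for row in range(height)
    ]


def apply_crop(image, mask, inverse=False):
    """Keep pixels inside the mask (or outside it when `inverse` is True)."""
    return [
        [pixel if inside != inverse else 0
         for pixel, inside in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Hypothetical 5x5 image of ones with the optic nerve head assumed at the centre.
size = 5
toy_image = [[1] * size for _ in range(size)]
radius_fraction = 0.3  # assumed to lie within the 10-60% range quoted above
mask = circular_mask(size, size, centre=(2, 2), radius=radius_fraction * size)

onh_crop = apply_crop(toy_image, mask)                    # optic nerve head only
inverse_crop = apply_crop(toy_image, mask, inverse=True)  # periphery only

kept_inside = sum(map(sum, onh_crop))
kept_outside = sum(map(sum, inverse_crop))
print(kept_inside, kept_outside)
```

Training one model on each version of the images then allows the contribution of the optic nerve head and of the periphery to be compared, which is the logic behind the two areas under the curve reported above.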
With the rise of smartphone photography and smartphone-based healthcare systems, mobile fundoscopy and deep learning are currently gaining popularity. Using a D-EYE lens attached to an iPhone 6S camera, Neto et al., 148 obtained an area under the curve of 0.82-0.87 for glaucoma diagnosis. Another deep learning method to diagnose glaucoma from fundus photos taken with a smartphone (iPhone 8 + D-EYE lens) was validated by Nakahara et al., 104 which showed promising performance (area under the curve: 0.84) on the images acquired with the smartphone. These studies show the potential of inexpensive mobile fundoscopy and the benefit of adopting less expensive lenses in glaucoma diagnosis.

Overall performance analysis
The studies included in this review allow for various relative comparisons of the diagnostic performance of artificial intelligence-based models on several factors, such as the type of artificial intelligence (machine learning, deep learning, or machine learning + deep learning), type of validation sets to test the performance of the artificial intelligence models (internal, external), different modalities (OCT, 3D OCT, optical coherence tomography angiography, colour fundus, multimodal), and type of dataset (private, public, or a combination of both).
The relative performance comparison of the different types of artificial intelligence (Figure 7A, B) shows that studies using a combination of machine learning and deep learning models have comparatively higher area under the curve than individual machine learning- or deep learning-based models (Figure 7B). On internal validation, the artificial intelligence approaches performed better; however, the performance was significantly reduced when using external data as a test set (Figure 7C). Comparing mean performance for different modalities shows that OCT-based studies perform better than optical coherence tomography angiography, colour fundus-based studies perform better than OCT and optical coherence tomography angiography-based studies, and multi-modal studies outperform the other modalities (Figure 7D). Finally, studies using public datasets show better performance than private ones; however, most of the studies using a combination of public and private datasets have used one dataset for internal validation and another for external validation, resulting in reduced performance over the test data (Figure 7E). Overall, the accuracy of artificial intelligence-assisted diagnosis ranges from 67-98%, 20,28 and the area under the curve ranges from 0.71-0.98, 17,19,23,116,117 which outperforms typical human performance of 71.5% accuracy 7 and 0.86 area under the curve 115 (Table 1).
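Because the area under the curve is the main metric compared throughout this section, a short sketch of how it can be computed may be helpful. The version below uses the rank-based interpretation: the area under the curve equals the probability that a randomly chosen diseased case receives a higher score than a randomly chosen healthy case, with ties counted as one half. The function name and the toy labels and scores are assumptions for illustration and are not data from any included study.

```python
def area_under_curve(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correctly_ranked = 0.0
    for positive_score in positives:
        for negative_score in negatives:
            if positive_score > negative_score:
                correctly_ranked += 1.0
            elif positive_score == negative_score:
                correctly_ranked += 0.5  # ties count as half
    return correctly_ranked / (len(positives) * len(negatives))


# Hypothetical model outputs: 1 = glaucoma, 0 = healthy.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

toy_auc = area_under_curve(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

Unlike accuracy, this quantity does not depend on a single decision threshold, which is one reason it is convenient for comparing studies that report different operating points.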

Artificial intelligence-based diagnostic performance
A comparative analysis of the included studies shows that several factors affect the performance of artificial intelligence-based classifiers. When comparing artificial intelligence techniques, the diagnostic performance of deep learning-based approaches is higher than that of traditional machine learning-based diagnostic models. Interestingly, combining machine learning and deep learning models in a hybrid approach yields better performance than individual machine learning or deep learning models (see Section 'Overall performance analysis'). The main reason behind the boosted performance of combined models is probably the use of feature selection or dimensionality reduction techniques in traditional machine learning, 149 and in some cases, the use of traditional learning approaches (such as random forest and K-nearest neighbours) to combine multiple deep learning models. 142,150 Moreover, irrespective of the artificial intelligence approach, the relative performance comparison for different modalities shows that multi-modalities can be beneficial. Feature fusion plays an important role here, where the extracted features come from multiple imaging modalities, such as OCT and fundus, 143 as well as health data. 142 This signifies the added value of clinical features such as health data 142 and visual field tests 18,22,116,117 to imaging modalities like OCT and fundus images. However, the overlap of characteristic changes in features between glaucoma and neurodegenerative diseases can affect performance (see Section 'Misdiagnosis of other causes of neurodegenerative diseases as glaucoma' for more details).
Apart from the artificial intelligence-based models, there are some non-machine learning-based diagnostic models which have shown promise in diagnosis with a significantly lower number of features. For instance, Fukai and colleagues 151 developed and validated a risk score for real-time population-based glaucoma mass screenings, utilising solely retinal thickness-related values obtained through spectral domain OCT. Their best performing 'Hitachi Risk Score model' demonstrated excellent predictive capabilities, with an area under the curve of 0.97 (95% confidence interval: 0.96-0.98), achieving a sensitivity of 0.93 and specificity of 0.91. In the study, the authors used only six features (variables) in the best performing logistic model, which has a significantly lower computational cost. While this study developed the model within the Japanese population, Guzman et al., 152 applied the best-performing Hitachi Risk Score model (trained on the Japanese population) to a different population to test its generalisability, where the area under the curve fell to 0.88.
Similarly, Brusini 153 proposed a glaucoma staging system solely based on RNFL damage in OCT using a non-linear equation and regression lines, which provides a sensitivity of 95.2% and specificity of 91.9% while discriminating normal from glaucomatous eyes (borderline results were considered as normal).They used two features (variables) derived from OCT (average RNFL superior and RNFL inferior thickness) in their non-linear equation, which significantly reduces the computational complexity of the model, specifically when compared to the deep learning-based approaches.
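The sensitivity and specificity quoted for these low-dimensional models depend on the cut-off applied to the underlying score. The sketch below shows how both quantities follow from a thresholded risk score; it is an illustration only, with a hypothetical function name and made-up labels and scores, and it does not reproduce the Hitachi Risk Score or the Brusini staging equation.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as diseased and return (sensitivity, specificity)."""
    true_pos = false_neg = true_neg = false_pos = 0
    for label, score in zip(labels, scores):
        predicted_diseased = score >= threshold
        if label == 1:
            if predicted_diseased:
                true_pos += 1
            else:
                false_neg += 1
        else:
            if predicted_diseased:
                false_pos += 1
            else:
                true_neg += 1
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("both classes must be present")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical risk scores: 1 = glaucoma, 0 = healthy.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.95, 0.80, 0.65, 0.30, 0.60, 0.20, 0.10, 0.05]

at_half = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
at_quarter = sensitivity_specificity(toy_labels, toy_scores, threshold=0.25)
print(at_half, at_quarter)
```

Lowering the threshold in this toy example raises sensitivity without changing specificity, but in general the two trade off against each other, which is why a screening tool and a confirmatory tool may choose different cut-offs.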

Limitations and inconsistencies
Though the artificial intelligence-based techniques using different modalities have shown promising results, several limitations have been identified, such as significant differences in age between normal and diseased groups while applying artificial intelligence (both machine learning 3,10 and deep learning 91,141 models), and small sample sizes of the diseased groups while using artificial intelligence (both machine learning 3,10,25 and deep learning 113,137 models) for diagnostics.
The criteria used to label glaucoma also varied across studies: assessment by glaucoma specialists, 113 visual field defects, 7,17,154 Hood report and Consensus, 109 cup-to-disc ratio value (values >0.5 labelled as glaucoma, otherwise healthy), 20 guided progression analysis and progressive glaucomatous optic neuropathy, 22 and self-report labels in the UK Biobank dataset 142 have all been considered. Also, some studies included glaucoma suspects in the glaucoma group, 109,154 while several glaucoma suspects were subsequently found to be healthy.
[Table 1 residue: 68-89% 7; 89.5-90% 7; 0.97-0.98 7. Table note: the best performing classifiers only were considered in the case of multiple classification algorithms applied to the same modality. The range of performance reported corresponds to the multiple studies included in this review.]
In terms of computational complexity and real-life applicability, the non-artificial intelligence-based diagnosis systems show promise with a comparatively lower number of variables (features) than the artificial intelligence-based models; however, they are limited to the use of OCT thickness values. Traditional machine learning algorithms such as support vector machines and random forest are computationally less expensive to train than deep learning models, as they require few hyperparameters to be tuned; nevertheless, as most imaging-based works consist of end-to-end deep learning, the number of parameters is a critical issue in terms of computational complexity.
The state-of-the-art deep convolutional neural network models widely used for glaucoma diagnosis have dozens of layers or more, and the number of trainable parameters can reach hundreds of millions. Although transfer learning has been applied by some of the studies to overcome this limitation, the number of parameters nevertheless remains high. For example, Singh et al., 97 experimented with transfer learning using eight different state-of-the-art convolutional neural network models (VGG16, VGG19, Inception, Xception, DenseNet, Inception-ResNet-V2, EfficientNet B0 and EfficientNet B4) on 2110 OCT scans (glaucoma vs normal ratio = 50:50). The models were already pre-trained on an extensive dataset of general images (i.e., ImageNet) and were employed as general low-level feature extractors. Subsequently, the pre-trained convolutional neural networks underwent further training to enhance their performance by extracting more task-specific features. This was achieved by adding additional layers on top of the base models, freezing the base model, and then unfreezing a portion of the model (i.e., fine-tuning) using the private glaucoma dataset.
Although transfer learning was employed to leverage pre-trained weights and reduce the number of trainable parameters, the study reported a range of trainable parameters between 4 and 64 million (lowest number of parameters: EfficientNet B0, highest number of parameters: Inception-ResNet-V2). They reported an area under the curve of 0.95 on the test dataset; however, the computational complexity resulted in a run time of 555-3251 seconds using a Google Colab-based Titan Xp GPU. Considering this, traditional machine learning-based works with a relatively lower number of features might be appropriate and suitable for offline analysis and for deployment in real-world clinical applications.
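The effect of freezing and unfreezing layers on the number of trainable parameters can be made concrete with simple arithmetic. The sketch below counts weights and biases for a deliberately tiny, hypothetical network; the layer names and sizes are assumptions for illustration and do not correspond to any of the architectures named above, which also contain layer types (such as batch normalisation) that are ignored here.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel of a square convolution."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit of a fully connected layer."""
    return (in_features + 1) * out_features


# Hypothetical toy network: three pre-trained convolutional layers and a new
# two-class head (assumes global average pooling, so the head sees 128 inputs).
toy_network = [
    ("base_conv1", conv2d_params(3, 3, 32)),
    ("base_conv2", conv2d_params(3, 32, 64)),
    ("base_conv3", conv2d_params(3, 64, 128)),
    ("new_head", dense_params(128, 2)),
]


def trainable_params(network, frozen_layers):
    """Sum the parameters of every layer that is not frozen."""
    return sum(count for name, count in network if name not in frozen_layers)


all_trainable = trainable_params(toy_network, frozen_layers=set())
head_only = trainable_params(
    toy_network, frozen_layers={"base_conv1", "base_conv2", "base_conv3"}
)
fine_tuned = trainable_params(
    toy_network, frozen_layers={"base_conv1", "base_conv2"}
)
print(all_trainable, head_only, fine_tuned)
```

Even in this toy case, unfreezing only the last convolutional layer restores most of the trainable parameters, which is consistent with the observation above that fine-tuned models can still carry millions of trainable parameters.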

Future research directions
Based on the studies considered here, this review identifies several key research gaps and suggests potential future directions. First, there is room for improvement in the artificial intelligence-based models in terms of generalisability. The studies typically worked with a dataset collected from a specific region and focused on a particular ethnicity, such as Chinese, 7,89,97,116,155 Korean, 5,11,19,103,104,137,141 Japanese, 4,99,143 Taiwanese, 3,10 Indian, 15,135 Bangladeshi 124 and western, 6,17,18,21-23,91,95,108-111,154 whereas clinical studies have shown significant variations in the characteristic patterns (e.g., RNFL) of those diseases among populations.
On the one hand, the prevalence of glaucoma is more than twice as high among Blacks (following the terminology of the original abstract) compared to other racial groups; on the other hand, the performance of deep learning models for Blacks tends to be worse because of limited data availability for minority populations. 156 To address this problem of generalisability and class imbalances, Wang et al., 156 proposed an equitable deep learning model for glaucoma progression forecasting in a population comprising 76.9% White, 14.6% Black and 8.5% Asian individuals. Their model improved the performance for Asians (from 0.68 to 0.75) and Blacks (from 0.82 to 0.84) and did not compromise the performance for Whites (from 0.80 to 0.83) in terms of mean deviation pointwise progression. This type of development of deep learning models has the potential to reduce health disparities in medical artificial intelligence, particularly in relation to glaucoma.
Moreover, most studies have developed their artificial intelligence-based model using retinal image datasets collected with a specific setting (machine) or equipment (such as CIRRUS OCT and Spectralis OCT). However, there are substantial variations in image quality and/or parameters depending on the specific settings and devices used for the same modality. That is why, when using a test dataset from an external source, the classification performance was significantly reduced by 9-20% across several studies. 94,109,139,150 Since most artificial intelligence-based models are machine-specific and are biased towards a specific ethnicity, a model is required that is trained not only on a multi-ethnic dataset but also with different settings for the same modality. On top of that, using a larger sample size, a balanced dataset and quality settings, and employing appropriate feature selection techniques, may help overcome data imbalance and quality issues while employing artificial intelligence for diagnosis.
Second, artificial intelligence-based studies need to focus more on new promising modalities while considering multi-modal analysis. The review shows that only a few studies have used multi-modalities, and more research is required in this area. Recently, a group of biomedical engineers has developed a new approach to imaging called fluorescent hyperspectral imaging (fHSI), which shows promise as a useful tool for the early detection of glaucoma. 157 In addition to OCT, fundus and ultra-wide field images, artificial intelligence-based techniques can facilitate multi-modal feature fusion, aiding clinicians in real-world decision-making.
Third, there is a need to use explainable artificial intelligence to interpret the results, especially for overlapping diseases. Clinical studies have reported several changes in RNFL, ganglion cell layer, and macular thickness that show overlap in the characteristics of the conditions. Most artificial intelligence-based studies on glaucoma diagnosis have not excluded patients with neurodegenerative diseases and, in some cases, other optic nerve diseases, such as ischaemic optic neuropathy, and vice versa. However, these conditions can mimic glaucoma and thus cause potential misdiagnosis. 158 Explainable artificial intelligence-based techniques, such as Shapley additive analysis 159 and partial dependency analysis, 160 can help identify the overlapping regions in these cases. This will allow clinicians to understand the progression of neurodegenerative diseases over glaucoma and vice versa.
Fourth, there is a scarcity of studies on developing user-friendly artificial intelligence apps for real-time diagnosis of glaucoma and neurodegenerative diseases. The review of a significant amount of work in the area suggests that most studies are experimental, using programming tools such as Python and MATLAB. Given that mobile fundoscopy has recently shown promising results, 104,148 development of such apps for healthcare practitioners and lightweight versions for mobile devices is essential. Introducing explainable artificial intelligence tools in the apps would help gain the trust and confidence of healthcare practitioners to use them in clinical decision-making.
Fifth, most artificial intelligence-based methods applied to glaucoma diagnosis have focussed on classifying glaucomatous and normal eyes, except for a few that used multi-stage classification. 3,10 However, the severity levels of glaucoma are important to identify, especially to diagnose glaucoma in the very early stage and to track the progression of the disease, which is difficult to assess based on subjective clinical examination alone. Also, there is a lack of studies on the classification of glaucoma types (such as low-tension and high-tension glaucoma). Leveraging artificial intelligence-based models trained with multiple severity levels and glaucoma types could help improve diagnosis.
Lastly, as private medical data could include personal and sensitive information, and given the rise of smartphone-based artificial intelligence technology in telehealth systems, the legal and ethical issues are important to take into consideration, especially patient consent and data storage. 161,162 Therefore, artificial intelligence-based diagnostic models must ensure privacy preservation. Recently, privacy-preserving machine learning has been used for the Internet of Things and cyber security applications. Domain adaptation is required at this stage, and future research on privacy-preserving approaches 163 in artificial intelligence-based models to diagnose glaucoma and neurodegenerative diseases is necessary.

Conclusion
In this study, most of the quality articles published on artificial intelligence-based diagnosis of glaucoma and neurodegenerative diseases over the last decade have been reviewed. Analysis of the included studies indicates the substantial contribution of artificial intelligence in diagnosing glaucoma and neurodegenerative diseases (accuracy: 67-98%, area under the curve: 0.71-0.98), which surpasses human diagnosis (accuracy: 71.5%, area under the curve: 0.86). Artificial intelligence techniques, especially machine learning and deep learning, have opened a new door that may help clinicians in real-world decision-making and diagnosis and create a pathway to a more precise and accurate diagnostic system. Despite these contributions, there is room for improvement of artificial intelligence-based models in generalisability and explainability, and future research can focus on deploying artificial intelligence-based diagnosis systems in real-world clinical applications.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Figure 1. Flow diagram for the retrieval of studies on artificial intelligence-based glaucoma and neurodegenerative disease diagnosis using retinal imaging (the diagram follows the PRISMA structure 62 ). Other sources include Google Scholar and ScienceDirect.

Figure 2. Primary optic neuropathies and conditions associated with retrograde degeneration (post-chiasmal lesions). Post-chiasmal RNFL will display distinctly different losses that are eye specific and CNS disease specific, respectively. Adapted from Zangerl et al. 66 A: primary optic neuropathies. B: post-chiasmal lesions causing retinal retrograde degeneration.

Figure 3. A: a 76-year-old male with Parkinson's disease assessed for suspected glaucoma. The optic nerve head examination reveals no significant structural abnormality, with an intact neuroretinal rim (blue arrows). The OCT RNFL deviation map also shows no abnormality, except for inaccurate fixation during the test. The OCT GCA deviation map shows concentric thinning of the central macula adjacent to the foveal pit due to anatomical variation (non-pathological). Perimetry using the HFA could not be completed due to the patient's dexterity limitations. B: a 58-year-old male with Alzheimer's disease. The optic nerve head examination reveals no significant structural abnormality, with an intact neuroretinal rim (blue arrows). OCT imaging results also show no significant structural abnormalities. However, the imaging results were confounded by eye movement artefacts, as indicated by the red arrows. Perimetry using the HFA shows an exaggerated pattern of apparent loss, but this was unreliable, with high false positive (39%), high false negative (36%) and high frequency eye movements during the test. C: a 58-year-old male with advanced glaucoma in the left eye. The optic nerve head examination reveals almost complete loss of the neuroretinal rim superiorly and inferiorly in positions typical for glaucoma (yellow arrows). OCT imaging results also highlight the RNFL defects superiorly and inferiorly (yellow arrows), with the tomogram illustrating the widened cup typical in glaucoma (yellow dashed double arrow). The RNFL TSNIT curves show significant RNFL reductions, with measurements reaching the instrument floor (black arrows). The structural loss was asymmetric, hence the perimetry results also show a more profound superior arcuate defect (mean deviation −18.69 dB). D: a 58-year-old male with coincident right eye early glaucoma and Parkinson's disease. The optic nerve head examination reveals a notch at the inferior neuroretinal rim, with corresponding loss of the RNFL (yellow arrows). The vertical tomogram shows a widened and deepened cup expected in glaucoma (yellow dashed arrow). The TSNIT curves highlight the asymmetry in RNFL thickness, with significant loss in the right eye (black arrow). There is demonstrable structure-function concordance with a superior nasal step on the HFA pattern deviation map (mean deviation −2.38 dB). E: a 47-year-old female with early glaucoma in the right eye. The optic nerve head examination reveals a notch at the inferior neuroretinal rim, with corresponding loss of the RNFL (yellow arrows). The vertical tomogram shows a widened and deepened cup expected in glaucoma (yellow dashed arrow). The TSNIT curves highlight the asymmetry in RNFL thickness, with early loss in the right eye (black arrow). Since the width and trajectory of the RNFL loss encompass the papillomacular bundle, there is also an appreciable, wide arcuate-like loss on the OCT GCA deviation map (navy arrow). There is demonstrable structure-function concordance with a superior nasal step (from the wedge defect on the RNFL deviation map) and paracentral defect (from the papillomacular arcuate defect on the GCA deviation map) on the HFA pattern deviation map (mean deviation −4.57 dB). F: a 44-year-old female with Huntington's disease assessed for suspected glaucoma. The optic nerve head examination reveals no significant structural abnormality, with an intact neuroretinal rim (blue arrows). OCT imaging results also show no significant structural abnormalities. However, the RNFL and GCA imaging results were confounded by eye movement artefacts, as indicated by the red arrows. Perimetry using the HFA shows an exaggerated pattern of apparent loss in all four quadrants, but this was unreliable, with high false positive (28%), high false negative (27%) and high frequency eye movements during the test. G: a 67-year-old female with ischaemic RNFL loss in the left eye. The optic nerve head examination reveals an RNFL defect in the superior and superotemporal aspect (yellow arrows), but an intact neuroretinal rim (blue arrow). The horizontal tomogram shows a shallow cup with no evidence of widening, which therefore suggests non-glaucomatous pathology (blue dashed double arrow). The TSNIT curves highlight the asymmetry in RNFL thickness, with marked loss superotemporally in the left eye (black arrow). The depth of neural structural loss extends to the papillomacular bundle, as highlighted by the GCA deviation map. There is demonstrable structure-function concordance with an inferior nasal step (from the wedge defect on the RNFL deviation map) and paracentral defect (from the papillomacular arcuate defect on the GCA deviation map) on the HFA pattern deviation map (mean deviation −1.39 dB).

Figure 6. Count of included studies published by modality in the last 10 years (the missing years represent no publications in those years corresponding to the given modality).

Figure 7. Performance of studies using different artificial intelligence-based approaches and mean overall performance analysis for different factors. A: performance of studies using different artificial intelligence-based approaches (the best performing classifiers only were considered in the case of multiple classification algorithms applied to the same modality). B: performance vs artificial intelligence approaches. C: performance vs artificial intelligence validation sets. D: performance vs data modalities. E: performance vs dataset type.

Table 1. Comparative performance for human (clinicians) vs machine (artificial intelligence-based models) for diagnosing glaucoma and neurodegenerative diseases.