Multi-class Classification of Alzheimer’s Disease Using Deep Learning and Transfer Learning on 3D MRI Images

Battula Srinivasa Rao | Mudiyala Aparna* | Soma Sekhar Kolisetty | Hyma Janapana | Yannam Vasantha Koteswararao

School of Computer and Information Sciences, University of Hyderabad, Telangana 500046, India

Department of Computer Science and Engineering, Tirumala Engineering College (Autonomous), Andhra Pradesh 522601, India

Department of Computer Science and Engineering, University College of Engineering, JNTUK, Narasaraopet 522601, India

Department of Computer Science and Engineering, GST GITAM University, Visakhapatnam 530045, India

Department of Electronics Communications and Engineering, Lendi Institute of Engineering and Technology, Visakhapatnam 530045, India

Corresponding Author Email: mudiyalaaparna7083@gmail.com

Page: 1397-1404 | DOI: https://doi.org/10.18280/ts.410328

Received: 19 June 2023 | Revised: 19 September 2023 | Accepted: 8 November 2023 | Available online: 26 June 2024

© 2024 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

Abstract: 

Alzheimer's disease (AD) poses a significant challenge for neurologists due to its progressive nature and debilitating impact on cognitive function. Recent advancements in neuroimage analysis have paved the way for innovative machine learning techniques, offering the potential for substantial improvements in AD detection, diagnosis, and progression prediction. In this study, we developed a novel deep learning framework to address this critical need. Traditional manual classification methods for AD are time-consuming, labor-intensive, and prone to inconsistencies. Given that the brain is the primary organ affected by AD, a classification system based on brain scans presents a promising avenue for achieving more accurate and reliable results. To effectively capture the spatial information embedded within 3D MRI scans, we extended convolutional techniques to three dimensions. Classification was accomplished by combining features extracted from various layers of the 3D convolutional network, with differential weights assigned to the contributions of each layer. Recognizing the potential of transfer learning to shorten training time and enhance AD detection efficacy, we incorporated this approach into our methodology. Our framework integrated transfer learning with fine-tuning, using brain MRI images from three distinct classes: Alzheimer's disease (AD), mild cognitive impairment (MCI), and normal control (NC). We explored a range of pre-trained deep learning models, including ResNet50V2 and InceptionResNetV2, for AD classification. ResNet50V2 emerged as the frontrunner, demonstrating superior classification accuracy compared to its counterparts: it achieved a training accuracy of 92.15% and a testing accuracy of 91.25%. These results underscore the capabilities of deep learning methods, particularly transfer learning with ResNet50V2, in accurately detecting Alzheimer's disease from 3D MRI brain scans.

Keywords: 

image processing techniques, 3D MRI images, neuroimage analysis, deep learning models, Alzheimer's disease, transfer learning

1. Introduction

Alzheimer's disease (AD) is a form of dementia that causes gradual mental deterioration and memory loss over time. Individuals with the disease suffer permanent brain damage, which ultimately results in death from brain failure [1-3]. It casts a long shadow, not just on individual lives but on society: as a leading cause of death among the aging population, with an estimated nearly 50 million individuals affected globally, its impact is widespread and profound. Magnetic resonance imaging (MRI) and other advanced neuroimaging techniques are currently used to diagnose AD. An MRI scan generates 3D images composed of millions of voxels. Most Alzheimer's disease lesions are visible on MRI scans, and their severity is typically assessed by drawing on the radiologist's training and experience. The brain, soft tissues, and lesions are segmented and measured with the help of digital image processing technology. Computers have allowed clinicians to perform both qualitative and quantitative analyses of lesions and other regions of interest. Helping clinicians make more informed decisions about lesions is one of the many applications of AI in medicine [4], and deep learning is the workhorse of these scientific tools. Convolutional neural networks, made possible by recent developments in deep learning, show significant promise in medical image diagnosis and perform well in the classification of natural images. There have been numerous proposals for the classification and segmentation of Alzheimer's disease using convolutional neural networks (CNNs) [5]. To properly categorize AD, one must first examine its defining characteristics. Figure 1 depicts sample 3D MRI images.

The primary contributions of this research can be summarized as follows:

We conducted a comprehensive review of 20 widely used Deep Neural Network (DNN) models. The purpose of this review was to aid in the selection of the most effective DNN-based classifiers for the classification of Alzheimer's Disease (AD) using 3D MRI images.

Figure 1. Sample MRI images (a) NC (Normal Control) (b) MCI (Mild Cognitive Impairment) (c) AD (Alzheimer's Disease)

We meticulously analyzed the experimental results of these models to categorize AD across its various stages, including Normal Control (NC), Mild Cognitive Impairment (MCI), and AD itself. This analysis provided insights into the performance of these models across a variety of scenarios. We implemented these models with patient age in mind to improve the accuracy of our performance comparisons; recognizing the potential impact of age on AD diagnosis, our approach accounted for age-related variations. ResNet50V2 demonstrated the best classification performance in our experiments. As a practical optimization, we replaced all convolution layers in ResNet50V2 with depth-wise convolution layers, with the goal of keeping accuracy high while reducing computational demands.

2. Literature Survey

Academic interest in AD detection has grown in recent years, with machine learning (ML) and deep learning (DL) frequently cited as methods for automatic detection. A 3D DL model can capture more accurate and detailed spatial and temporal information for AD classification than conventional DL models and radiologists, and its clinical application can further improve the diagnosis of AD.

Klöppel et al. [6] mapped the entire brain's gray matter to a high-dimensional space, where voxels served as coordinates and their values were interpreted as intensity levels. A linear support vector machine (SVM) was then used to categorize the subjects.

Lerch et al. [7] noted that many computer-aided systems have been built to interpret disease states from MR images using a wide variety of machine learning techniques. Features extracted from voxel intensity, tissue density, or shape descriptors were used to train these algorithms to produce the required result.

Zhang et al. [8] employed support vector machines to classify the deformation vectors on the gray matter of the entire brain as an image dissimilarity measure. Due to the high dimensionality of the features, whole-brain approaches can be computationally expensive; hence approaches that focus on regional features typically pick a subset of the brain as relevant to AD or select regions of interest (ROIs) tailored to the cohorts. ROI-based techniques have used shared properties of the hippocampus, parahippocampal gyrus, and entorhinal cortex, such as 3D volume and shape.

Silveira and Marques [9] segmented brain images into 116 anatomical ROIs using the Mask RCNN technique and used boosting classification for labeling.

Using a multi-kernel support vector machine, as suggested by Suk et al. [10], multi-modal features such as tissue volumes estimated from 93 ROIs can be ensembled to improve accuracy.

The first SVM classification study based on hippocampal surface shape invariants was provided by Long and Wyatt [11]. The shape invariants were based on rotationally invariant aspects of spherical harmonics (SPH). Despite the demonstrated efficacy of ROI-based approaches, region segmentation mistakes or feature volatility may impair classification accuracy.

Liu et al. [12] developed a deep learning system based on a 3D convolutional neural network (CNN). By integrating the MRI gray matter density map and PET intensity values with the 3D CNN features, they were able to perform multi-modal AD discrimination. Recent years have seen the widespread incorporation of deep neural networks into the classification procedure.

Gutman et al. [13] trained a 3D convolutional neural network (CNN) model for feature extraction and subsequent classification using sparse autoencoding.

Liu and Shen [14] cropped MRI images based on the predicted locations of AD lesions using the regression forest technique and then fed them into an SVM to make a diagnosis.

Positron emission tomography (PET) and magnetic resonance imaging (MRI) scans were used in a multi-modal diagnostic study by Li et al. [15]. In addition, they used a deep neural network with patch-based inputs to a cascade network to analyze each MRI and PET image.

Using a data-driven approach, Payan and Montana [16] generated numerous patches around discriminative anatomical features in each MRI image. For multi-instance diagnosis learning, these patches were fed into several different classification networks.

Karas et al. [17] used generative adversarial networks (GANs) and image segmentation to generate the missing PET data, which was subsequently used to train a multi-instance neural network.

Multi-modal data was split into multiple patches by Xu et al. [18] before being fed into the CNN pretrained models for fusion diagnosis.

3. Materials and Methods

3.1 Dataset description

The Alzheimer's Disease Neuroimaging Initiative (ADNI) is a longitudinal multicenter study aiming to identify diagnostic biomarkers (clinical, imaging, genetic, and biochemical) for the early detection, screening, and subsequent treatment of Alzheimer's disease [19]. Since ADNI contains more MRI data than any other publicly available source, we used it to test the efficacy of our method. We used 375 samples with Alzheimer's disease, 378 samples with mild cognitive impairment, and 447 cognitively normal samples, for a total of 1200 images from the dataset. Grad Warp (which corrects distorted image geometry caused by the gradient model), B1 Correction (which employs B1 calibration scans to even out image brightness), and N3 (which employs a histogram peak-sharpening method to even out brightness) were all applied as preliminary processing to all samples. The original data dimensions were changed to 113×137×113×3 so that all samples would be uniform in size and shape. As hippocampal volume is a strong indicator for Alzheimer's disease classification, the data reduction was performed by scaling rather than cropping to preserve hippocampal information. Table 1 presents the dataset description.

Table 1. Analysis of dataset for implementing models

Diagnostic Type | Number of Subjects | Number of Samples | Age | Gender (M/F)
AD | 50 | 375 | 76.13 ± 6.14 | 132/148
MCI | 50 | 378 | 75.13 ± 5.23 | 86/76
NC | 50 | 447 | 76.16 ± 6.29 | 144/131
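As a rough illustration of the preprocessing described above, the sketch below rescales one 3D MRI volume to a uniform 113×137×113 grid by scaling rather than cropping, then replicates it to three channels. It assumes volumes stored as NIfTI files loaded with nibabel; the function name and the normalization step are illustrative, not the paper's exact pipeline.

import numpy as np
import nibabel as nib
from scipy.ndimage import zoom

TARGET_SHAPE = (113, 137, 113)

def load_and_resize(path):
    """Load one NIfTI scan and rescale it to the uniform target grid."""
    volume = nib.load(path).get_fdata().astype(np.float32)
    factors = [t / s for t, s in zip(TARGET_SHAPE, volume.shape)]
    resized = zoom(volume, factors, order=1)  # linear interpolation, no cropping
    # Min-max normalize intensities (illustrative; not stated in the paper)
    resized = (resized - resized.min()) / (resized.max() - resized.min() + 1e-8)
    return np.stack([resized] * 3, axis=-1)  # replicate to 3 channels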

3.2 Proposed methodology

MRI scans are collected from a variety of sources and passed to preprocessing, where a pre-processing layer resizes each image. The model then recognizes AD and classifies it into three categories. Leveraging 3D convolutions and connection-wise attention mechanisms, our densely connected CNN architecture tackles the challenge of extracting meaningful features from complex 3D brain MRI scans, providing a robust and efficient approach to analyzing these rich datasets. The proposed deep learning-based system model used MRI data to detect and classify the disorder at an early stage [20-23]. Raw training data, including MRI scans, was collected, and the images were resized to 96×120×96×3 by the pre-processing layer. The study explores applying transfer learning to detect Alzheimer's disease: pre-trained deep learning models are utilized, with their final layers modified to adapt them to this specific task. The proposed paradigm is illustrated in full in Figure 2.

We adopted ResNet (Residual Network) as a solution to the problem of vanishing gradients that occurs when training deep convolutional networks. ResNet introduces skip connections, which use an identity function to bypass nonlinear transformations. This architectural feature allows much deeper networks to be trained with less computational effort: gradients propagate directly from one layer to the next, avoiding the vanishing gradient problem that plagues traditional networks. Furthermore, densely connected convolutional networks, a novel connectivity pattern, were introduced [24]. This pattern enhances inter-layer communication even further, contributing to improved network performance. ResNet capitalizes on the network's potential by reusing features, rather than relying solely on extremely deep or wide architectures to enhance representational power. This approach results in streamlined models that are easier to train and more efficient in terms of parameter usage. It has also been demonstrated that these feature maps effectively integrate information from previous layers, thereby increasing input variation and enhancing overall network efficiency. Consider a convolutional network applied to a single image X0. The network consists of L layers, where l indexes the layers and Hl(·) denotes the non-linear transformation implemented by the lth layer. The feature maps of all preceding layers, X0, X1, …, Xl−1, are input to the lth layer:

$X_l=H_l\left(\left[X_{l-1}, X_{l-2}, \ldots, X_0\right]\right)$            (1)
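As a rough sketch of the dense connectivity in Eq. (1), the Keras snippet below builds a small 3D block in which each layer receives the concatenated feature maps of all preceding layers. The layer count, growth rate, and filter sizes are illustrative assumptions, not the paper's configuration.

import tensorflow as tf
from tensorflow.keras import layers

def dense_block_3d(x, num_layers=4, growth_rate=16):
    # Each layer l receives [X_{l-1}, ..., X_1, X_0] as input, as in Eq. (1)
    features = [x]
    for _ in range(num_layers):
        concat = features[0] if len(features) == 1 else layers.Concatenate()(features)
        h = layers.BatchNormalization()(concat)
        h = layers.Activation("relu")(h)
        h = layers.Conv3D(growth_rate, 3, padding="same")(h)  # H_l(.)
        features.append(h)
    return layers.Concatenate()(features)

inputs = tf.keras.Input(shape=(113, 137, 113, 3))
x = layers.Conv3D(16, 3, padding="same")(inputs)  # stem convolution
model = tf.keras.Model(inputs, dense_block_3d(x))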

This network was built using a time-saving strategy that takes visual attention into account. The work employed a convolutional neural network architecture with a connection-wise attention mechanism, allowing for flexible feature map integration through weighted summation. The weights for this summation were learned automatically during training, enabling the model to optimize the feature representation for the task at hand; by focusing on the most useful information, this makes the network simpler and more effective [25-28]. The lth layer of the convolutional neural network received a weighting coefficient Wl in accordance with Eq. (2), where Wl is an attention vector composed of l−1 elements. In Eq. (3), xj (1 ≤ j ≤ l−1) denotes the feature maps of the jth layer, and Hl is the non-linear transformation through which connection-wise attention shapes the output of the lth layer.

$W_l=\left[w_{l-1, l}, w_{l-2, l}, \ldots, w_{2, l}, w_{1, l}\right]$              (2)

$x_l=H_l\left(w_{l-1, l}\, x_{l-1}+w_{l-2, l}\, x_{l-2}+\cdots+w_{1, l}\, x_1\right)$                    (3)
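The following sketch illustrates one way Eqs. (2)-(3) could be realized as a Keras layer: it learns one scalar weight per incoming connection and combines the earlier feature maps by weighted summation. It assumes the incoming feature maps share a common shape and is an illustration, not the paper's exact implementation.

import tensorflow as tf

class ConnectionWiseAttention(tf.keras.layers.Layer):
    """Weighted summation of earlier feature maps with learned scalar weights."""
    def build(self, input_shape):
        # One learnable scalar per incoming connection, as in Eq. (2)
        self.w = self.add_weight(name="w", shape=(len(input_shape),),
                                 initializer="ones")

    def call(self, feature_maps):
        # Eq. (3): weighted sum of the previous layers' feature maps
        return tf.add_n([self.w[j] * f for j, f in enumerate(feature_maps)])

# Usage on three same-shaped feature maps x1, x2, x3 (hypothetical):
# x4 = ConnectionWiseAttention()([x3, x2, x1])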

Figure 2. Basic architecture of proposed methodology

Figure 3. Sample architecture for implementing models with transfer learning

There were four distinct layers in the network. First, there was the input layer, through which the network received the image patches. The second kind of layer was convolutional; it used the input images and the learned filters to generate a feature map for each filter.

3.3 Transfer learning for classification

Leveraging transfer learning, the proposed approach achieved 3-way AD classification as depicted in Figure 3.

Transfer learning allows a network already trained on a large dataset, from which it has learned all of its parameters, to be used as a starting point for a new task. After training on ImageNet, the ResNet50V2 model was applied to the ADNI dataset. The frozen fully connected layers produced 890 features and outputs for 3 classes, necessitating a transfer learning approach. To accommodate multi-class categorization, the model's architecture was modified: the final layers were replaced with a new fully connected layer, a SoftMax layer, and an output layer specifically designed for multi-class handling. The network was then trained on a dataset of magnetic resonance images with optimized training parameters [29]. After training, the model's accuracy was evaluated to assess its effectiveness in making correct classifications. To measure model performance and guide training, the loss was calculated using the cross-entropy function, which ensures that the model's output dimensions match the number of classes being classified. X denotes the feature space, and P(X) denotes the associated marginal probability [30-34]. In P(X), X={x1, x2, ..., xn}, where n denotes the number of input images. The domain is represented mathematically as:

Domain $=\{\mathrm{X}, \mathrm{P}(\mathrm{X})\}$

Within two distinct domains, the ways in which features were distributed and represented differed significantly. To formalize a specific task within a domain, a set of potential labels (W) and a prediction function (f(.)) were employed.

Task = {W,f(.)}

The model's prediction function (f(.)) was trained on features extracted from the data, enabling it to make predictions on unseen test data. The proposed framework involves two domains: a target domain (Domain_p) and a source domain (Domain_q). Data points in the source domain with label w_si are designated x_si, while those in the target domain with label w_ti are designated x_ti. The target domain and the source domain are formulated as follows:

$Domain_p=\left\{\left(x_{t 1}, w_{t 1}\right),\left(x_{t 2}, w_{t 2}\right), \ldots,\left(x_{t i}, w_{t i}\right)\right\}$               (4)

$Domain_q=\left\{\left(x_{s 1}, w_{s 1}\right),\left(x_{s 2}, w_{s 2}\right), \ldots,\left(x_{s i}, w_{s i}\right)\right\}$              (5)

Transfer learning shines as a powerful technique for building predictive models f(.). It leverages insights gleaned from past tasks and domains (the source tasks and domain) to efficiently train the model and accurately predict labels for new data points x; f(x) is represented mathematically as $f(x)=P(W \mid X)$.
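A minimal Keras sketch of the transfer-learning setup described in this section is given below: ResNet50V2 pre-trained on ImageNet is frozen, and its top is replaced with a new fully connected layer and a SoftMax output for the three classes, trained with cross-entropy. The 2D slice-style input shape and the head width of 256 are illustrative assumptions, not the paper's exact settings.

import tensorflow as tf

base = tf.keras.applications.ResNet50V2(
    include_top=False, weights="imagenet",
    input_shape=(137, 113, 3), pooling="avg")
base.trainable = False  # freeze the pre-trained source-domain weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),   # new fully connected layer
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(3, activation="softmax"),  # AD / MCI / NC
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",  # cross-entropy loss
              metrics=["accuracy"])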

Algorithm:

  • Input

P(Y), Y={y1, y2, ..., yn}: Probability distribution of samples in the dataset, where Y represents the collection of samples.

  • Pre-Training

For each sample in the dataset:

  • Utilizing a pre-trained source domain (Ds) network with embedded knowledge.
  • Preparing target domain (Dt) training and validation sets for model adaptation.
  • Initiating knowledge transfer via training and validation on these sets.

End for

  • Fine-Tuning

For each feature f(y):

  • Perform model customization for the target domain via fine-tuning of designated layers, targeting {Y, P(y)}.
  • Optimize target task performance through further fine-tuning utilizing the training dataset (Dt).
  • Conduct model evaluation on unseen images using the test dataset (Dt) to gauge categorization efficacy.

End For

  • Output

The fine-tuned model achieved a high level of accuracy in the classification of test dataset images.
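The two-phase procedure above might look as follows in code, reusing the model and base from the previous sketch; train_ds and val_ds are hypothetical tf.data pipelines for the target domain, and the number of unfrozen layers is an arbitrary choice for illustration.

# Phase 1 (pre-training / knowledge transfer): train only the new head
model.fit(train_ds, validation_data=val_ds, epochs=10)

# Phase 2 (fine-tuning): unfreeze designated layers of the source network
base.trainable = True
for layer in base.layers[:-30]:   # keep all but the last ~30 layers frozen
    layer.trainable = False
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
    loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=50)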

3.4 Transfer learning for pretrained models

3.4.1 InceptionResNetV2

The Inception-ResNet-V2 framework is built around the Residual Inception Block, and each block is followed by a dimensionality check: after each block, a 1×1 convolution filter-expansion layer ensures the input depth is matched before summation. Batch normalization is selectively applied to the conventional layers only. The network accepts 113×137×113 pixel inputs and comprises 164 layers. At the heart of the design lies the Residual Inception Block (RIB), an ensemble of diverse convolutional filters and residual connections [35]. This architecture takes advantage of residual connections to combat severe network degradation and speed up training. Since there were no tuning parameters in this core design, Max Pooling was used to reduce overfitting in the convolutional structure by increasing the correlation between feature importance and label category [36, 37]; Max Pooling also outperforms the Flatten method in terms of parameter efficiency. To safeguard against overfitting and promote model generalization, a Dropout layer with a rate of 0.8 is inserted.

$\sigma(x)_i=\frac{e^{x_i}}{\sum_{j=1}^K e^{x_j}}$            (6)

The SoftMax activation function, applied to the dense layer, transformed its outputs into probability distributions across the K classes, as specified in Eq. (6), where e ≈ 2.718.

$w^{\prime}=w-\alpha \times \nabla\left(w ; x^{(i)} ; y^{(i)}\right)$               (7)

Throughout backpropagation, we optimized with Stochastic Gradient Descent (SGD), an iterative method. In Eq. (7), w stands for the weight, $\alpha$ for the learning rate, and $\nabla\left(w ; x^{(i)} ; y^{(i)}\right)$ for the gradient with respect to the weight for input $x^{(i)}$ and output/label $y^{(i)}$.
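A small NumPy illustration of Eqs. (6) and (7) follows; the class scores and gradient value are made-up numbers.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())         # subtract the max for numerical stability
    return e / e.sum()              # Eq. (6)

scores = np.array([2.0, 1.0, 0.1])  # made-up class scores (logits)
probs = softmax(scores)             # approx. [0.659, 0.242, 0.099]

w, alpha, grad = 0.5, 1e-4, 0.8     # made-up weight, learning rate, gradient
w_new = w - alpha * grad            # Eq. (7): w' = w - alpha * gradient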

3.4.2 Proposed ResNet50V2 with TL

ResNet50V2 is among the most widely used computer vision models, alongside VGG16, DenseNet121, Xception, and MobileNetV2. Pre-trained on vast and diverse image datasets, these models lend their learned representations to transfer learning algorithms, even when data and computing resources are limited. In this study, we leverage a large medical image dataset and perform transfer learning with ten distinct sets of pre-trained weights derived from ResNet50V2, a Convolutional Neural Network (CNN) with 50 layers that forms the backbone of our exploration. Figure 4 shows its architecture alongside our fine-tuning setup for transfer learning.

The ResNet50V2 architecture features a series of convolutional layers, starting with an initial layer using 64 kernels, a stride of 2, and a 7×7 filter, followed by 3×3 pooling. It then stacks multiple sets of convolutional layers, each set containing three layers: a 1×1 convolution, a 3×3 convolution, and a final 1×1 convolution that increases channel depth. This pattern repeats with progressively larger kernel numbers and more repetitions deeper in the network. Max pooling and hidden layers combining convolutions, batch normalization, and ReLU activations further refine the features. Notably, the original fully connected layer with 1000 out-features is replaced with a group of fully connected layers to enhance the model's performance. To adapt ResNet50V2 for four-class dementia classification (non-demented, very mild, mild, and moderate dementia), the original final layer is replaced with a custom dropout scheme: a first 2048-feature layer with a 50% dropout probability, followed by a ReLU layer and another dropout layer with the same probability. Finally, a fully connected layer with 4 outputs maps the features to the specific dementia classifications. Notably, this study explored transfer learning, utilizing 10 pre-trained ResNet50V2 models from diverse medical image datasets to optimize performance for this specific task.
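The replacement head described above could be sketched in Keras as follows; the output width is set by the number of target classes, which is an assumption here (four in the example above, three for the AD/MCI/NC setting used elsewhere in the paper).

import tensorflow as tf

num_classes = 4  # hypothetical; 3 for the AD/MCI/NC experiments
head = tf.keras.Sequential([
    tf.keras.layers.Dense(2048),   # first 2048-feature fully connected layer
    tf.keras.layers.Dropout(0.5),  # 50% dropout
    tf.keras.layers.ReLU(),
    tf.keras.layers.Dropout(0.5),  # second 50% dropout
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])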

Figure 4. Enhanced architecture of the proposed model: Modified ResNet50V2 with 2PTL

4. Results

Developing a model for classifying MRI scans and detecting Alzheimer's disease often involves transfer learning, leveraging pre-trained weights from a larger model. This project utilized TensorFlow, a popular framework for building and training machine learning models, and the process involved feeding 3256 MRI images into the network. We employed the Stochastic Gradient Descent with Momentum (SGDM) optimizer to fine-tune the model's weights and biases, guiding it towards minimizing the loss function and maximizing accuracy. Training ran for 50 epochs, each passing over the entire dataset once, with a batch size of 512 images. An early stopping criterion with a patience of 4 on the validation set acted as a safeguard against overfitting: the model stops training when it is no longer improving on the validation data, which prevents it from memorizing the training set without generalizing well to unseen data. Choosing the learning rate plays a crucial role in balancing convergence speed and accuracy. Experiments showed that the model achieved its best performance at a learning rate of 1e-4 while still converging quickly, so a learning rate of 1e-4 was used across all models for consistency. Evaluating a classification model goes beyond simple accuracy: to gain deeper insight, a confusion matrix was used to assess precision, recall, and other metrics for each class, providing a more nuanced understanding of the model's strengths and weaknesses. Overall, this project explored six different models, each trained on a balanced dataset of 1200 MRI scans (400 per category) for 50 epochs, with 70% of the data used for training and the remaining 30% reserved for testing. By utilizing transfer learning, optimizing training parameters, and employing advanced evaluation techniques, this approach provides a valuable framework for building and refining Alzheimer's disease detection models.
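The training configuration described above (SGDM, learning rate 1e-4, batch size 512, 50 epochs, a 70/30 split, and early stopping with patience 4) might be wired up as in the sketch below. The data arrays and the one-layer placeholder backbone are hypothetical stand-ins, not the actual model or dataset.

import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the preprocessed scans and integer labels
X = np.random.rand(120, 137, 113, 3).astype("float32")
y = np.random.randint(0, 3, size=120)

# 70% training / 30% testing, as described in the text
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

model = tf.keras.Sequential([
    tf.keras.layers.GlobalAveragePooling2D(input_shape=(137, 113, 3)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),  # SGDM
    loss="sparse_categorical_crossentropy", metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=4, restore_best_weights=True)

model.fit(X_train, y_train, validation_split=0.1, epochs=50,
          batch_size=512, callbacks=[early_stop])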

$sensitivity =\frac{\left(\frac{D_p}{N_p}\right)}{\left(\frac{D_p}{N_p}\right)+\left(\frac{D_n}{N_n}\right)} * 100$                (8)

$specificity=\frac{\left(\frac{D_m}{N_m}\right)}{\left(\frac{D_m}{N_m}\right)+\left(\frac{D_e}{N_e}\right)} * 100$            (9)

$precision=\frac{\left(\frac{D_p}{N_p}\right)}{\left(\frac{D_p}{N_p}\right)+\left(\frac{D_e}{N_e}\right)} * 100$              (10)

$accuracy=\frac{\left(\frac{D_P}{N_p}\right)+\left(\frac{D_m}{N_m}\right)}{p+m} * 100$               (11)

$miss \, rate=\left(1-\frac{\left(\frac{D_P}{N_p}\right)+\left(\frac{D_m}{N_m}\right)}{p+m}\right) * 100$                 (12)

$false \,positive \,rate=\left(1-\frac{\left(\frac{D_m}{N_m}\right)}{\left(\frac{D_m}{N_m}\right)+\left(\frac{D_e}{N_e}\right)}\right) * 100$                (13)

$false \, negative \, rate =\left(1-\frac{\left(\frac{D_p}{N_p}\right)}{\left(\frac{D_p}{N_p}\right)+\left(\frac{D_n}{N_n}\right)}\right) * 100$                 (14)
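Eqs. (8)-(14) can be computed per class from a confusion matrix; the sketch below does this with scikit-learn on made-up labels (0 = NC, 1 = MCI, 2 = AD).

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1, 0, 2])  # made-up ground truth
y_pred = np.array([0, 1, 1, 1, 2, 2, 1, 1, 0, 2])  # made-up predictions

cm = confusion_matrix(y_true, y_pred)
for c in range(cm.shape[0]):
    tp = cm[c, c]
    fn = cm[c, :].sum() - tp
    fp = cm[:, c].sum() - tp
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn) * 100      # Eq. (8)
    specificity = tn / (tn + fp) * 100      # Eq. (9)
    precision = tp / (tp + fp) * 100        # Eq. (10)
    accuracy = (tp + tn) / cm.sum() * 100   # Eq. (11)
    miss_rate = 100 - accuracy              # Eq. (12)
    fpr = 100 - specificity                 # Eq. (13)
    fnr = 100 - sensitivity                 # Eq. (14)
    print(f"class {c}: sens={sensitivity:.1f} spec={specificity:.1f} "
          f"prec={precision:.1f} acc={accuracy:.1f}")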

Table 2. Performance evaluation table for InceptionResnetv2

Model | Classification | Precision | Recall | F1-Score | Accuracy
InceptionResNetV2 | NC vs MCI | 91.23 | 90.35 | 89.51 | 91.85
InceptionResNetV2 | MCI vs AD | 92.25 | 89.51 | 91.86 | 90.33
InceptionResNetV2 | AD vs NC | 91.13 | 89.58 | 88.45 | 90.78

Table 3. Performance evaluation table for Resnet50v2

Model | Classification | Precision | Recall | F1-Score | Accuracy
ResNet50V2 | NC vs MCI | 91.24 | 91.63 | 92.95 | 92.45
ResNet50V2 | MCI vs AD | 90.54 | 90.35 | 89.45 | 90.65
ResNet50V2 | AD vs NC | 92.45 | 91.41 | 91.95 | 91.25

Table 4. Accuracy comparison of different models on Alzheimer’s disease MRI images

Models | Training Accuracy (%) | Testing Accuracy (%)
DenseNet121 [3] | 87.5 | 86.5
MobileNetV2 [30] | 88.4 | 87.3
VGG16 [33] | 83.5 | 84.5
Xception [35] | 86.5 | 83.8
InceptionResNetV2 | 90.9 | 90.7
Proposed Model | 92.15 | 91.25

Both InceptionResNetV2 and ResNet50V2 achieved strong performance on all evaluation metrics, as detailed in Tables 2 and 3. Notably, ResNet50V2 excelled in classification, achieving a testing accuracy of 91.25% and surpassing state-of-the-art models such as VGG16 and Xception (Table 4). With a training accuracy of 92.15% and the top testing accuracy of 91.25%, ResNet50V2 emerged as the clear winner for classifying Alzheimer's disease.

Figure 5 depicts the comparative results graphically. Among all the models tested, ResNet50V2 achieved the highest training and testing accuracy in classifying 3D MRI images, with InceptionResNetV2 securing the second-highest accuracy. Compared with the other contenders, VGG16, DenseNet121, Xception, and MobileNetV2, ResNet50V2 clearly sets the benchmark. This visual comparison reinforces the tabulated data, confirming ResNet50V2's capability for accurate Alzheimer's disease detection through MRI analysis.

Figure 5. Graphical representation for comparative results of Alzheimer’s disease classification

5. Conclusions

Deep learning excels at identifying Alzheimer's disease from MRI scans, as this study demonstrates. Unlike humans, deep neural networks can navigate vast, intricate datasets, offering a powerful, data-driven approach to problem-solving in medical research; their potential lies in automating tasks for neurologists while minimizing human error. Here, we applied transfer learning to classify MRI images into three categories (AD, MCI, and NC) using InceptionResNetV2 and ResNet50V2, both pre-trained on existing datasets. Both models successfully categorized the data, with the proposed model showing the best performance: 92.15% training accuracy and 91.25% testing accuracy, surpassing the other models. These results establish ResNet50V2 with transfer learning as a strong choice for classifying 3D MRI images. Looking ahead, exploring advanced deep learning models and hyperparameter tuning on diverse datasets holds promise for even more accurate Alzheimer's disease detection.

References

[1] Jain, R., Jain, N., Aggarwal, A., Hemanth, D.J. (2019). Convolutional neural network based Alzheimer’s disease classification from magnetic resonance brain images. Cognitive Systems Research, 57: 147-159. https://doi.org/10.1016/j.cogsys.2018.12.015

[2] Cui, R., Liu, M. (2018). Hippocampus analysis by combination of 3-D DenseNet and shapes for Alzheimer's disease diagnosis. IEEE Journal of Biomedical and Health Informatics, 23(5): 2099-2107. https://doi.org/10.1109/JBHI.2018.2882392

[3] Abrol, A., Bhattarai, M., Fedorov, A., Du, Y., Plis, S., Calhoun, V., Alzheimer’s Disease Neuroimaging Initiative. (2020). Deep residual learning for neuroimaging: An application to predict progression to Alzheimer’s disease. Journal of Neuroscience Methods, 339: 108701. https://doi.org/10.1016/j.jneumeth.2020.108701

[4] Huang, Y., Xu, J., Zhou, Y., Tong, T., Zhuang, X., Alzheimer’s Disease Neuroimaging Initiative (ADNI). (2019). Diagnosis of Alzheimer’s disease via multi-modality 3D convolutional neural network. Frontiers in Neuroscience, 13: 448373. https://doi.org/10.3389/fnins.2019.00509

[5] Goceri, E. (2019). Diagnosis of Alzheimer's disease with Sobolev gradient-based optimization and 3D convolutional neural network. International Journal for Numerical Methods in Biomedical Engineering, 35(7): e3225. https://doi.org/10.1002/cnm.3225

[6] Klöppel, S., Stonnington, C.M., Chu, C., Draganski, B., Scahill, R.I., Rohrer, J.D., Fox, N.C., Jack, C.R., Ashburner, J., Frackowiak, R.S. (2008). Automatic classification of MR scans in Alzheimer's disease. Brain, 131(3): 681-689. https://doi.org/10.1093/brain/awm319

[7] Lerch, J.P., Pruessner, J., Zijdenbos, A.P., Collins, D.L., Teipel, S.J., Hampel, H., Evans, A.C. (2008). Automated cortical thickness measurements from MRI can accurately separate Alzheimer's patients from normal elderly controls. Neurobiology of Aging, 29(1): 23-30. https://doi.org/10.1016/j.neurobiolaging.2006.09.013

[8] Zhang, D., Wang, Y., Zhou, L., Yuan, H., Shen, D., Alzheimer's Disease Neuroimaging Initiative. (2011). Multimodal classification of Alzheimer's disease and mild cognitive impairment. Neuroimage, 55(3): 856-867. https://doi.org/10.1016/j.neuroimage.2011.01.008

[9] Silveira, M., Marques, J. (2010). Boosting Alzheimer disease diagnosis using PET images. In 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, pp. 2556-2559. https://doi.org/10.1109/ICPR.2010.626

[10] Suk, H.I., Lee, S.W., Shen, D., Alzheimer’s Disease Neuroimaging Initiative. (2015). Latent feature representation with stacked auto-encoder for AD/MCI diagnosis. Brain Structure and Function, 220: 841-859. https://doi.org/10.1007/s00429-013-0687-3

[11] Long, X., Wyatt, C. (2010). An automatic unsupervised classification of MR images in Alzheimer's disease. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, pp. 2910-2917. https://doi.org/10.1109/CVPR.2010.5540031

[12] Liu, S., Liu, S., Cai, W., Che, H., Pujol, S., Kikinis, R., Feng, D., Fulham, M.J. (2014). Multimodal neuroimaging feature learning for multiclass diagnosis of Alzheimer's disease. IEEE Transactions on Biomedical Engineering, 62(4): 1132-1140. https://doi.org/10.1109/TBME.2014.2372011

[13] Gutman, B., Wang, Y., Morra, J., Toga, A.W., Thompson, P.M. (2009). Disease classification with hippocampal shape invariants. Hippocampus, 19(6): 572-578. https://doi.org/10.1002/hipo.20627

[14] Liu, F., Shen, C. (2014). Learning deep convolutional features for MRI based Alzheimer’s disease classification. Computer Vision and Pattern Recognition. https://doi.org/10.48550/arXiv.1404.3366

[15] Li, R., Zhang, W., Suk, H.I., Wang, L., Li, J., Shen, D., Ji, S. (2014). Deep learning based imaging data completion for improved brain disease diagnosis. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2014: 17th International Conference, Boston, MA, USA, pp. 305-312. https://doi.org/10.1007/978-3-319-10443-0_39

[16] Payan, A., Montana, G. (2015). Predicting Alzheimer's disease: A neuroimaging study with 3D convolutional neural networks. arXiv preprint arXiv:1502.02506. https://doi.org/10.48550/arXiv.1502.02506.

[17] Karas, G.B., Scheltens, P., Rombouts, S.A., Visser, P.J., van Schijndel, R.A., Fox, N.C., Barkhof, F. (2004). Global and local gray matter loss in mild cognitive impairment and Alzheimer's disease. Neuroimage, 23(2): 708-716. https://doi.org/10.1016/j.neuroimage.2004.07.006

[18] Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., Zemel, R., Bengio, Y. (2015). Show, attend and tell: Neural image caption generation with visual attention. Machine Learning, 2048-2057. https://doi.org/10.48550/arXiv.1502.03044

[19] Studholme, C., Drapaca, C., Iordanova, B., Cardenas, V. (2006). Deformation-based mapping of volume change from serial brain MRI in the presence of local tissue contrast change. IEEE Transactions on Medical Imaging, 25(5): 626-639. https://doi.org/10.1109/TMI.2006.872745

[20] Wang, X., Gao, L., Song, J., Shen, H. (2016). Beyond frame-level CNN: Saliency-aware 3-D CNN with LSTM for video action recognition. IEEE Signal Processing Letters, 24(4): 510-514. https://doi.org/10.1109/LSP.2016.2611485

[21] Ahmed, O.B., Benois-Pineau, J., Allard, M., Catheline, G., Amar, C.B., Alzheimer's Disease Neuroimaging Initiative. (2017). Recognition of Alzheimer's disease and Mild Cognitive Impairment with multimodal image-derived biomarkers and Multiple Kernel Learning. Neurocomputing, 220: 98-110. https://doi.org/10.1016/j.neucom.2016.08.041

[22] Zhang, J., Gao, Y., Gao, Y., Munsell, B.C., Shen, D. (2016). Detecting anatomical landmarks for fast Alzheimer’s disease diagnosis. IEEE Transactions on Medical Imaging, 35(12): 2524-2533. https://doi.org/10.1109/TMI.2016.2582386

[23] Liu, M., Cheng, D., Wang, K., Wang, Y., Alzheimer’s Disease Neuroimaging Initiative. (2018). Multi-modality cascaded convolutional neural networks for Alzheimer’s disease diagnosis. Neuroinformatics, 16: 295-308. https://doi.org/10.1007/s12021-018-9370-4

[24] Liu, M., Zhang, J., Adeli, E., Shen, D. (2018). Landmark-based deep multi-instance learning for brain disease diagnosis. Medical Image Analysis, 43: 157-168. https://doi.org/10.1016/j.media.2017.10.005

[25] Suk, H.I., Lee, S.W., Shen, D., Alzheimer’s Disease Neuroimaging Initiative. (2017). Deep ensemble learning of sparse regression models for brain disease diagnosis. Medical Image Analysis, 37: 101-113. https://doi.org/10.1016/j.media.2017.01.008

[26] Suk, H.I., Lee, S.W., Shen, D., Alzheimer's Disease Neuroimaging Initiative. (2014). Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage, 101: 569-582. https://doi.org/10.1016/j.neuroimage.2014.06.077

[27] Lian, C., Liu, M., Zhang, J., Shen, D. (2018). Hierarchical fully convolutional network for joint atrophy localization and Alzheimer's disease diagnosis using structural MRI. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4): 880-893. https://doi.org/10.1109/TPAMI.2018.2889096

[28] Ortiz, A., Munilla, J., Gorriz, J.M., Ramirez, J. (2016). Ensembles of deep learning architectures for the early diagnosis of the Alzheimer’s disease. International Journal of Neural Systems, 26(07): 1650025. https://doi.org/10.1142/S0129065716500258

[29] Oh, K., Chung, Y.C., Kim, K.W., Kim, W.S., Oh, I.S. (2019). Classification and visualization of Alzheimer’s disease using volumetric convolutional neural network and transfer learning. Scientific Reports, 9(1): 18150. https://doi.org/10.1038/s41598-019-54548-6

[30] Goceri, E. (2019). Diagnosis of Alzheimer's disease with Sobolev gradient-based optimization and 3D convolutional neural network. International Journal for Numerical Methods in Biomedical Engineering, 35(7): e3225. https://doi.org/10.1002/cnm.3225

[31] Li, F., Liu, M., Alzheimer's Disease Neuroimaging Initiative. (2018). Alzheimer's disease diagnosis based on multiple cluster dense convolutional networks. Computerized Medical Imaging and Graphics, 70: 101-110. https://doi.org/10.1016/j.compmedimag.2018.09.009

[32] Bi, X., Zhao, X., Huang, H., Chen, D., Ma, Y. (2020). Functional brain network classification for Alzheimer’s disease detection with deep features and extreme learning machine. Cognitive Computation, 12: 513-527. https://doi.org/10.1007/s12559-019-09688-2

[33] Hosseini-Asl, E., Gimel'farb, G., El-Baz, A. (2016). Alzheimer's disease diagnostics by a deeply supervised adaptable 3D convolutional network. arXiv preprint arXiv:1607.00556. https://doi.org/10.48550/arXiv.1607.00556

[34] Rieke, J., Eitel, F., Weygandt, M., Haynes, J.D., Ritter, K. (2018). Visualizing convolutional networks for MRI-based diagnosis of Alzheimer’s disease. In Understanding and Interpreting Machine Learning in Medical Image Computing Applications: First International Workshops, MLCN 2018, DLF 2018, and iMIMIC 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, pp. 24-31. https://doi.org/10.1007/978-3-030-02628-8_3

[35] Huang, Y., Xu, J., Zhou, Y., Tong, T., Zhuang, X., Alzheimer’s Disease Neuroimaging Initiative (ADNI). (2019). Diagnosis of Alzheimer’s disease via multi-modality 3D convolutional neural network. Frontiers in Neuroscience, 13: 448373. https://doi.org/10.3389/fnins.2019.00509

[36] Wen, J., Thibeau-Sutre, E., Diaz-Melo, M., Samper-González, J., Routier, A., Bottani, S., Dormont, D., Durrleman, S., Burgos, N., Colliot, O., Alzheimer's Disease Neuroimaging Initiative. (2020). Convolutional neural networks for classification of Alzheimer's disease: Overview and reproducible evaluation. Medical Image Analysis, 63: 101694. https://doi.org/10.1016/j.media.2020.101694

[37] Abrol, A., Bhattarai, M., Fedorov, A., Du, Y., Plis, S., Calhoun, V., Alzheimer’s Disease Neuroimaging Initiative. (2020). Deep residual learning for neuroimaging: an application to predict progression to Alzheimer’s disease. Journal of Neuroscience Methods, 339: 108701. https://doi.org/10.1016/j.jneumeth.2020.108701