Detecting Alzheimer’s disease (AD) from MRI scans remains a significant challenge in medical imaging due to the subtle, low-contrast anatomical changes associated with neurodegeneration. Real-world scans often suffer from noise, motion artifacts, and scanner variability, all of which can severely degrade deep learning (DL) model performance. In our previous work on MRI-based 4-class brain tumor classification, we developed a robustness methodology that employed Gaussian blur and Gaussian noise as augmentation techniques to counteract the impact of various image degradations, including multiple noise types, blurring methods, and simulated patient motion. In this study, we adapt and apply the same robustness strategy to a 4-class AD classification task using a publicly available Kaggle MRI dataset. To address class imbalance, we employed a Conditional Wasserstein GAN with Gradient Penalty (WGAN-GP) to generate a balanced training set of 2,560 images per class, while retaining the original, unaugmented test set. A fine-tuned EfficientNetV2B0 model achieved 99% accuracy on the augmented training data; however, when evaluated under real-world challenges, performance dropped to 44.33%, significantly lower than the 69.66% achieved in the brain tumor benchmark. This stark contrast underscores the inherent difficulty of AD classification, exacerbated by low-resolution axial-only scans and minimal interclass visual differences. Although the robustness methodology yielded partial performance recovery, the results highlight the need for higher-resolution, multi-planar imaging and domain-specific preprocessing strategies. This study advances the understanding of robustness in DL-based neuroimaging and underscores the importance of tailored augmentation pipelines for reliable Alzheimer’s diagnosis in clinical settings.