Optimizing Brain Tumor Detection Using Deep Learning: A Comparison of Adam and SGD Optimizers

Goal -

The primary goal of this study is to analyze how the choice of optimizer affects the accuracy of a brain tumor detection model, and to classify brain tumors using deep learning techniques such as Convolutional Neural Networks (CNNs) and Artificial Neural Networks (ANNs).

Introduction-

Brain tumors are among the most aggressive diseases, affecting both children and adults. Timely detection and classification of brain tumors are crucial for effective treatment planning.

Tumors are classified into various types, including:

  • Benign Tumors
  • Malignant Tumors
  • Pituitary Tumors

Magnetic Resonance Imaging (MRI) is the most effective technique for detecting brain tumors. Radiologists examine these MRI images for tumor diagnosis.

With advancements in machine learning and deep learning, automated classification techniques can aid in improving the accuracy of tumor detection, especially in the early stages.

Dataset Description-

This dataset, sourced from Kaggle, is divided into training and testing sets with the following image distribution:

  • Training Set: 2,870 images
  • Testing Set: 394 images

Folder Structure-

Training:

  • glioma_tumor
  • meningioma_tumor
  • no_tumor
  • pituitary_tumor

Testing:

  • glioma_tumor
  • meningioma_tumor
  • no_tumor
  • pituitary_tumor

Implementation-

The model is built using Python and libraries such as Pandas, NumPy, Matplotlib, and PyTorch. The architecture is based on a modified version of AlexNet, with added convolutional and fully connected layers to optimize performance.

PyTorch's ImageFolder dataset and DataLoader classes are used to read the images from their class folders and group them into batches for efficient processing. Since the images have varying dimensions (e.g., 512x512, 350x350), they are resized to a uniform 512x512 with the Resize transform from torchvision.transforms so that the CNN receives inputs of a consistent size.
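
A minimal sketch of this loading pipeline, assuming the Kaggle folder names above; the batch size shown is a placeholder rather than a value from the article:

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Resize every image to 512x512 and convert it to a tensor
transform = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
])

# ImageFolder infers each image's label from its subfolder name
train_data = datasets.ImageFolder("Training", transform=transform)
test_data = datasets.ImageFolder("Testing", transform=transform)

# DataLoader groups images into batches; the training set is shuffled each epoch
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)
test_loader = DataLoader(test_data, batch_size=32, shuffle=False)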

Model Summary -

The model is a CNN-based architecture, similar to AlexNet but with additional layers to capture deeper features for more accurate classification.

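The article does not reproduce the exact layer configuration, so the sketch below is only an illustration of an AlexNet-style CNN with additional convolutional and fully connected layers for the four classes; all layer sizes are assumptions:

import torch.nn as nn

class TumorCNN(nn.Module):
    """Illustrative AlexNet-style CNN with extra conv and fc layers."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
        )
        # Adaptive pooling keeps the classifier input fixed regardless of image size
        self.pool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(4096, 1024), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.pool(self.features(x)))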

Optimizers-

torch.optim is the PyTorch package that implements a variety of optimization algorithms.

This study focuses mainly on finding a suitable batch size and optimizer. The two optimizers compared are Adaptive Moment Estimation (Adam) and Stochastic Gradient Descent (SGD).

Adam Optimizer:

  • Uses adaptive learning rates for each parameter.
  • Computational cost is higher due to additional calculations for adaptive learning rates.
  • Often converges faster on complex architectures.

SGD Optimizer:

  • Uses a fixed learning rate.
  • Lower computational cost since it only relies on gradients.
  • Slower convergence but can generalize better in some cases.
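
A minimal sketch of how the two optimizers are created with torch.optim, reusing the TumorCNN class from the sketch above; the learning rates are assumed values, not taken from the article:

import torch.nn as nn
import torch.optim as optim

model = TumorCNN()
criterion = nn.CrossEntropyLoss()

# Adam maintains adaptive, per-parameter learning rates
adam_optimizer = optim.Adam(model.parameters(), lr=1e-3)

# SGD applies one fixed learning rate to every parameter
sgd_optimizer = optim.SGD(model.parameters(), lr=1e-2)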

Training Time Comparison-

The training time required for each combination of optimizer and batch size is recorded and visualized, providing insight into the computational efficiency of each method at different batch sizes.
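
One way to record these timings, reusing the names from the sketches above; the epoch count, learning rates, and batch sizes here are assumptions:

import time

def train(model, loader, optimizer, criterion, epochs=10):
    """Runs a standard training loop and returns the elapsed wall-clock time."""
    start = time.time()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return time.time() - start

for batch_size in (1, 32, 64):
    loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
    for name, opt_cls, lr in (("Adam", optim.Adam, 1e-3), ("SGD", optim.SGD, 1e-2)):
        model = TumorCNN()
        optimizer = opt_cls(model.parameters(), lr=lr)
        elapsed = train(model, loader, optimizer, nn.CrossEntropyLoss())
        print(f"{name}, batch_size={batch_size}: {elapsed:.1f}s")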


Visualizations-

Accuracy and loss curves are plotted for Adam and SGD at each batch size:

  • Batch Size = 1
  • Batch Size = 32
  • Batch Size = 64

Conclusion & Findings -

  1. Effect of Batch Size: When the batch size is 1, training behaves like pure stochastic gradient descent (SGD), because parameters are updated after each individual training example. This leads to more erratic accuracy and loss values.
  2. Mini-Batch Gradient Descent: To implement mini-batch gradient descent, the batch size must be greater than 1 but smaller than the full dataset; batch gradient descent, by contrast, uses the entire training set as a single batch (see the sketch after this list).
  3. SGD vs. Adam: For batch sizes of 32 and 64, the accuracy of the SGD model improves quickly, whereas Adam improves more gradually, suggesting that Adam needs more epochs to reach its best accuracy. In these experiments, Adam therefore converged more slowly than SGD, with additional per-step cost from its adaptive learning-rate calculations.
  4. Convergence: The SGD model converges faster but exhibits more instability due to noisy gradient updates. In contrast, Adam, while slower, provides a more stable convergence.
  5. Recommendations:

  • For limited computational resources, SGD is computationally efficient but may lead to erratic results.
  • When more computational power is available and accuracy is the priority, Adam provides better results in the long run.
  • Mini-Batch Gradient Descent strikes a balance between SGD and batch gradient descent, offering stability and efficiency.
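
As a sketch of how these three regimes map onto PyTorch, the only change is the batch_size passed to DataLoader (reusing train_data from the loading sketch above):

from torch.utils.data import DataLoader

# batch_size = 1                 -> stochastic gradient descent (update per example)
# 1 < batch_size < len(dataset)  -> mini-batch gradient descent
# batch_size = len(dataset)      -> batch gradient descent (update per full pass)
sgd_loader = DataLoader(train_data, batch_size=1, shuffle=True)
mini_batch_loader = DataLoader(train_data, batch_size=32, shuffle=True)
full_batch_loader = DataLoader(train_data, batch_size=len(train_data), shuffle=True)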

Link to Project
