From the course: Deep Learning with Python: Optimizing Deep Learning Models
Mini-batch gradient descent - Python Tutorial
Mini-batch gradient descent
- [Presenter] Mini-batch gradient descent aims to combine the advantages of both batch gradient descent and stochastic gradient descent by updating the model parameters based on the gradient computed from a small batch of training examples. This batch size is typically larger than one, as in SGD, but smaller than the total dataset, as in batch gradient descent. Picture this as navigating down the hill using information from a small group of nearby paths. This approach allows for both the speed of SGD and the stability of batch gradient descent. One of the primary benefits of mini-batch gradient descent is its computational efficiency. By processing batches of data, it leverages the power of vectorization and optimized hardware like GPUs and TPUs. This can significantly speed up computations compared to processing single samples as in SGD. The use of mini-batches allows the algorithm to make efficient use of memory hierarchies and parallel processing capabilities, reducing the time per…
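Below is a minimal sketch of a mini-batch update loop in plain NumPy, assuming a simple linear model trained with mean-squared-error loss. The function name, batch size, and learning rate are illustrative choices, not code from the course; the point is that each parameter update uses the gradient from one small, vectorized batch rather than a single sample or the full dataset.

    import numpy as np

    def minibatch_gradient_descent(X, y, lr=0.01, batch_size=32, epochs=10, seed=0):
        """Fit a linear model y ≈ X @ w + b using mini-batch gradient descent.

        Illustrative sketch only: assumes a linear model and MSE loss.
        """
        rng = np.random.default_rng(seed)
        n_samples, n_features = X.shape
        w = np.zeros(n_features)
        b = 0.0

        for epoch in range(epochs):
            # Shuffle once per epoch so each mini-batch is a fresh random subset.
            order = rng.permutation(n_samples)
            for start in range(0, n_samples, batch_size):
                idx = order[start:start + batch_size]
                X_batch, y_batch = X[idx], y[idx]

                # Gradient of the MSE loss computed on this mini-batch only
                # (vectorized, so it maps well onto GPU/TPU-style hardware).
                error = X_batch @ w + b - y_batch
                grad_w = 2.0 * X_batch.T @ error / len(idx)
                grad_b = 2.0 * error.mean()

                # One parameter update per mini-batch.
                w -= lr * grad_w
                b -= lr * grad_b
        return w, b

    # Toy usage on synthetic data: recover w ≈ [2, -3], b ≈ 0.5.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 2))
    y = X @ np.array([2.0, -3.0]) + 0.5 + rng.normal(scale=0.1, size=1000)
    w, b = minibatch_gradient_descent(X, y, lr=0.05, batch_size=32, epochs=50)

With batch_size=1 this loop would behave like SGD, and with batch_size=len(X) it would behave like batch gradient descent; a moderate batch size in between is what gives the method its mix of SGD-like speed and batch-like stability.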
Contents
- Common loss functions in deep learning (5m 4s)
- Batch gradient descent (3m 32s)
- Stochastic gradient descent (SGD) (2m 55s)
- Mini-batch gradient descent (3m 37s)
- Adaptive Gradient Algorithm (AdaGrad) (4m 43s)
- Root Mean Square Propagation (RMSProp) (2m 40s)
- Adaptive Delta (AdaDelta) (1m 47s)
- Adaptive Moment Estimation (Adam) (3m 8s)