From the course: Deep Learning with Python: Optimizing Deep Learning Models


Mini-batch gradient descent


- [Presenter] Mini-batch gradient descent aims to combine the advantages of both batch gradient descent and stochastic gradient descent by updating the model parameters based on the gradient computed from a small batch of training examples. This batch size is typically larger than one, as in SGD, but smaller than the total dataset, as in batch gradient descent. Picture this as navigating down the hill using information from a small group of nearby paths. This approach allows for both the speed of SGD and the stability of batch gradient descent. One of the primary benefits of mini-batch gradient descent is its computational efficiency. By processing batches of data, it leverages the power of vectorization and optimized hardware like GPUs and TPUs. This can significantly speed up computations compared to processing single samples as in SGD. The use of mini-batches allows the algorithm to make efficient use of memory hierarchies and parallel processing capabilities, reducing the time per…
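The update rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's own code: it fits a one-dimensional linear model to synthetic data, and the learning rate, batch size, epoch count, and data-generating coefficients (3.0 and 2.0) are all assumed values chosen for the example. The key point is the inner loop, which computes the gradient from a shuffled slice of the data rather than from a single sample (SGD) or the full dataset (batch gradient descent).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus a little noise (illustrative values)
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.normal(size=1000)

w, b = 0.0, 0.0     # model parameters
lr = 0.1            # learning rate (assumed)
batch_size = 32     # larger than 1 (SGD) but much smaller than the dataset

for epoch in range(20):
    perm = rng.permutation(len(X))            # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]  # one mini-batch of indices
        xb, yb = X[idx, 0], y[idx]
        err = (w * xb + b) - yb
        # Gradient of mean squared error, averaged over the mini-batch only
        grad_w = 2.0 * np.mean(err * xb)
        grad_b = 2.0 * np.mean(err)
        w -= lr * grad_w
        b -= lr * grad_b
```

After training, `w` and `b` land close to the true coefficients 3.0 and 2.0. Because each gradient is a mean over 32 examples computed as one vectorized NumPy operation, the loop also shows the efficiency point from the narration: the per-update cost is amortized across the whole batch.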
