Choosing the Right Acceleration Unit for Deep Learning on Google Colab
CPU, GPU or TPU?
Newcomers are often confused by the different processing and acceleration units available in Google Colab. Depending on your workload, you might want to run it on CPUs, GPUs, or TPUs. In general, you can decide which hardware fits your workload best using the following guidelines:
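Before applying these guidelines, it helps to check what the current Colab runtime actually exposes. A minimal sketch, assuming TensorFlow 2.x (which Colab preinstalls):

```python
import tensorflow as tf

# Print every device the current runtime exposes to TensorFlow.
# Switch hardware via Runtime > Change runtime type in Colab.
print(tf.config.list_physical_devices())
```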
CPUs
- Quick prototyping that requires maximum flexibility
- Simple models that do not take long to train (see the example after this list)
- Small models with small effective batch sizes
- Models that are dominated by custom TensorFlow operations written in C++
- Models that are limited by available I/O or the networking bandwidth of the host system
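For example, a throwaway prototype like the following is perfectly at home on a CPU runtime. This is only a sketch; the layer sizes and random data are placeholders:

```python
import tensorflow as tf

# Pin the work to the CPU; for a model this small that is plenty.
with tf.device('/CPU:0'):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(20,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')

    # Toy data with a small effective batch size.
    x = tf.random.normal((256, 20))
    y = tf.random.normal((256, 1))
    model.fit(x, y, batch_size=32, epochs=2, verbose=0)
```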
GPUs
- Models that are not written in TensorFlow or cannot be written in TensorFlow
- Models for which source does not exist or is too onerous to change
- Models with a significant number of custom TensorFlow operations that must run at least partially on CPUs
- Models with TensorFlow ops that are not available on Cloud TPU
- Medium-to-large models with larger effective batch sizes (a setup sketch follows this list)
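Before training on a GPU runtime, confirm that TensorFlow can actually see the GPU (the free Colab tier does not always grant one). A minimal check using standard TensorFlow calls:

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print("GPUs visible to TensorFlow:", gpus)

if gpus:
    # Optional: allocate GPU memory on demand instead of all at once.
    # Must be called before the GPU is first used.
    tf.config.experimental.set_memory_growth(gpus[0], True)
```

Once a GPU is visible, Keras and most TensorFlow ops are placed on it automatically, so typical models need no further code changes.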
TPUs
- Models dominated by matrix computations
- Models with no custom TensorFlow operations inside the main training loop
- Models that train for weeks or months
- Larger and very large models with very large effective batch sizes (an initialization sketch follows this list)
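Unlike GPUs, a Colab TPU must be detected and initialized explicitly before use. A minimal sketch of the usual TensorFlow 2.x setup, with a fallback to the default strategy when no TPU is attached:

```python
import tensorflow as tf

try:
    # In Colab, the resolver finds the attached TPU without arguments.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
    print("Running on TPU")
except ValueError:
    # No TPU attached; use whatever the runtime provides instead.
    strategy = tf.distribute.get_strategy()
    print("Running on CPU/GPU")

# Build the model inside the strategy scope so its variables land
# on the chosen device(s).
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
```

Note that `tf.distribute.TPUStrategy` assumes TensorFlow 2.3 or newer; older releases expose it as `tf.distribute.experimental.TPUStrategy`.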
Cloud TPUs are not suited to the following workloads:
- Linear algebra programs that require frequent branching or are dominated by element-wise algebra. TPUs are optimized for fast, bulky matrix multiplication, so a workload that is not dominated by matrix multiplication is unlikely to perform well on TPUs compared to other platforms.
- Workloads that access memory in a sparse manner; sparse access patterns do not map well onto the TPU's dense matrix hardware.
- Workloads that require high-precision arithmetic. For example, double-precision arithmetic is not suitable for TPUs (see the precision sketch after this list).
- Neural network workloads that contain custom TensorFlow operations written in C++. Specifically, custom operations in the body of the main training loop are not suitable for TPUs.
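If a model does move to TPU, it pays to lean into the precision the hardware favours. A minimal sketch, assuming the Keras mixed-precision API (TensorFlow 2.4+):

```python
import tensorflow as tf

# TPU matrix units are built for bfloat16 compute; keeping variables
# in float32 preserves numerical stability. float64 has no fast path.
tf.keras.mixed_precision.set_global_policy('mixed_bfloat16')
```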