Setting up a Computer for Deep Learning Programming with Keras, TensorFlow, CUDA and VSCode.
For ML engineers and data scientists, setting up TensorFlow (TF) is like running a hot knife through butter. Beginners, however, often struggle to set it up correctly, and the experience can be frustrating. Here is a quick guide to installing and configuring TF with VSCode on Windows.
The Basics
This section clarifies the basic terminology of the development environment that an ML engineer should know. TensorFlow (TF) is an open-source, cross-platform, multi-language library for Deep Learning. TF can run on both CPU and GPU (i.e., a CUDA-capable Nvidia GPU); running on a GPU is typically much faster for training. This article sets up the TF environment for the following specs.
The figure above summarizes the technology stack required to get a Windows system ready for running a Deep Learning program with a Python 3.6+ interpreter. The following is a summary of the stack and the purpose of each of its constituents.
In summary, a Deep Learning Python program imports Keras, which uses TensorFlow as its backend; TensorFlow in turn leverages cuDNN to run the code on the GPU through the CUDA API.
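To make the layering concrete, here is a minimal sketch that asks an installed TensorFlow which CUDA and cuDNN versions it was built against. The function name `build_info_report` is only illustrative, and the sketch assumes a TensorFlow version that provides `tf.sysconfig.get_build_info()`; on a CPU-only build the CUDA and cuDNN entries are expected to be missing, so they are read with `.get()`.

```python
import importlib.util


def build_info_report():
    """Report the TensorFlow version and the CUDA/cuDNN versions it was built with."""
    report = {"tensorflow": None, "cuda": None, "cudnn": None}
    if importlib.util.find_spec("tensorflow") is None:
        return report  # TensorFlow is not installed in this interpreter

    import tensorflow as tf  # imported lazily so the check above can fail gracefully

    report["tensorflow"] = tf.__version__
    build = tf.sysconfig.get_build_info()  # dict-like; keys differ between CPU and GPU builds
    report["cuda"] = build.get("cuda_version")
    report["cudnn"] = build.get("cudnn_version")
    return report


if __name__ == "__main__":
    for layer, version in build_info_report().items():
        print(f"{layer}: {version}")
```

If every value prints as `None`, the interpreter in use most likely does not have TensorFlow installed at all.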
Installation
Follow the steps below to set up a Deep Learning programming environment.
Tensorflow-GPU installation
# step 5.1. Create virtual environment 'tf'
conda create --name tf python=3.9
# step 5.2. Exit from the 'base' environment and activate 'tf'
conda deactivate
conda activate tf
# step 5.3. Update all Conda packages
conda update --all
# step 5.4. Install CUDA and CUDNN
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
# step 5.5. Update the Python package manager (pip) within the 'tf' environment
python -m pip install --upgrade pip
# step 5.6. Install TensorFlow and associated packages
python -m pip install "tensorflow<2.11" keras
# step 5.7. Verify
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
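Listing the GPU in step 5.7 shows that TensorFlow can see the device, but not that it actually computes on it. As an optional extra check, the sketch below multiplies two small random matrices and returns the device on which the result was placed. The helper name `gpu_smoke_test` and the matrix size are assumptions chosen for illustration; run it inside the activated 'tf' environment.

```python
import importlib.util


def gpu_smoke_test(size=256):
    """Multiply two random matrices; return the device string of the result.

    Returns None when TensorFlow is not installed in this interpreter.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None

    import tensorflow as tf

    a = tf.random.uniform((size, size))
    b = tf.random.uniform((size, size))
    product = tf.matmul(a, b)  # placed on the GPU by default when one is available
    return product.device


if __name__ == "__main__":
    print("matmul ran on:", gpu_smoke_test())
```

A device string ending in `GPU:0` suggests the computation ran on the GPU; one ending in `CPU:0` means TensorFlow fell back to the CPU.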
6. In the Anaconda Prompt, locate the "conda" command and add its location to the Path environment variable. This allows CMD or PowerShell to call the "conda" command outside the Anaconda Prompt. Without this, VSCode will not be able to find the "conda" command when running a Python file from the "conda" environment.
(base) C:\Users\sapta>where conda
C:\Users\sapta\miniconda3\Library\bin\conda.bat
C:\Users\sapta\miniconda3\Scripts\conda.exe
C:\Users\sapta\miniconda3\condabin\conda.bat
C:\Users\sapta>conda --version
conda 4.12.0
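The same check can be done from Python, which is handy when the interpreter VSCode launches sees a different Path than your terminal does. This is a small sketch using only the standard library; the helper names `find_conda` and `path_contains` are illustrative, not part of conda.

```python
import os
import shutil


def find_conda():
    """Return the full path of the first `conda` launcher on Path, or None."""
    return shutil.which("conda")


def path_contains(directory):
    """Return True if `directory` is one of the entries of the Path variable."""
    entries = os.environ.get("PATH", "").split(os.pathsep)
    target = os.path.normcase(os.path.normpath(directory))
    return any(
        os.path.normcase(os.path.normpath(entry)) == target
        for entry in entries
        if entry
    )


if __name__ == "__main__":
    print("conda found at:", find_conda())
```

If `find_conda()` returns `None` in a fresh CMD or PowerShell session, the Path update has not taken effect yet; opening a new terminal (or restarting VSCode) is usually needed after editing environment variables.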
7. Open PowerShell and set the Execution Policy to unrestricted for all. Without this, VSCode will not be able to activate the "tf" environment.
PS C:\Windows\system32> Set-ExecutionPolicy Unrestricted
Execution Policy Change
The execution policy helps protect you from scripts that you do not trust. Changing the execution policy might expose
you to the security risks described in the about_Execution_Policies help topic at
https:/go.microsoft.com/fwlink/?LinkID=135170. Do you want to change the execution policy?
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is "N"): A
PS C:\Windows\system32>
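To confirm the change took effect, you can read the policy back. The sketch below simply calls PowerShell's `Get-ExecutionPolicy -List` from Python and returns its text output; the function name `execution_policies` is an assumption for illustration, and the function returns `None` on machines where no `powershell` executable is found on Path.

```python
import shutil
import subprocess


def execution_policies():
    """Return the output of `Get-ExecutionPolicy -List`, or None without PowerShell."""
    powershell = shutil.which("powershell")
    if powershell is None:
        return None
    result = subprocess.run(
        [powershell, "-NoProfile", "-Command", "Get-ExecutionPolicy -List"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(execution_policies())
```

The listing shows one policy per scope, so you can see which scope the `Unrestricted` setting was applied to.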
VSCode Configuration
print('hello world !!')
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Preprocess the data (these are NumPy arrays)
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255
y_train = y_train.astype("float32")
y_test = y_test.astype("float32")
# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]
model.compile(
    optimizer=keras.optimizers.RMSprop(),  # Optimizer
    # Loss function to minimize
    loss=keras.losses.SparseCategoricalCrossentropy(),
    # List of metrics to monitor
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
print("Fit model on training data")
history = model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=2,
    # We pass some validation data for
    # monitoring validation loss and metrics
    # at the end of each epoch
    validation_data=(x_val, y_val),
)
# Evaluate the model on the test data using `evaluate`
print("Evaluate on test data")
results = model.evaluate(x_test, y_test, batch_size=128)
print("test loss, test acc:", results)
# Generate predictions (probabilities -- the output of the last layer)
# on new data using `predict`
print("Generate predictions for 3 samples")
predictions = model.predict(x_test[:3])
print("predictions shape:", predictions.shape)
Output
(tf) PS C:\Users\sapta\OneDrive\Desktop\UniML> c:; cd 'c:\Users\sapta\OneDrive\Desktop\UniML'; & 'C:\Users\sapta\miniconda3\envs\tf\python.exe' 'c:\Users\sapta\.vscode\extensions\ms-python.python-2022.20.0\pythonFiles\lib\python\debugpy\adapter/../..\debugpy\launcher' '64268' '--' 'c:\Users\sapta\OneDrive\Desktop\UniML\tesr.py'
2022-12-08 01:41:45.977775: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-08 01:41:47.221220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13577 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Ti Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 2s 0us/step
Fit model on training data
Epoch 1/2
2022-12-08 01:41:53.037569: I tensorflow/stream_executor/cuda/cuda_blas.cc:1614] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
782/782 [==============================] - 6s 4ms/step - loss: 0.3440 - sparse_categorical_accuracy: 0.9025 - val_loss: 0.2009 - val_sparse_categorical_accuracy: 0.9432
Epoch 2/2
782/782 [==============================] - 3s 3ms/step - loss: 0.1598 - sparse_categorical_accuracy: 0.9517 - val_loss: 0.1493 - val_sparse_categorical_accuracy: 0.9553
Evaluate on test data
79/79 [==============================] - 0s 3ms/step - loss: 0.1527 - sparse_categorical_accuracy: 0.9524
test loss, test acc: [0.15272361040115356, 0.9524000287055969]
Generate predictions for 3 samples
1/1 [==============================] - 0s 129ms/step
predictions shape: (3, 10)
(tf) PS C:\Users\sapta\OneDrive\Desktop\UniML>
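The `predictions shape: (3, 10)` line means the model returns one row of ten class probabilities per sample. To turn those rows into digit labels, take the index of the largest value in each row. The sketch below does this in plain Python; `to_labels` is an illustrative helper, and the example probabilities are made up for demonstration, not taken from the run above.

```python
def to_labels(probabilities):
    """Return, for each row, the index of its largest probability."""
    labels = []
    for row in probabilities:
        row = list(row)
        labels.append(row.index(max(row)))
    return labels


# Hypothetical softmax outputs for two samples (each row sums to 1)
example = [
    [0.01, 0.02, 0.05, 0.02, 0.01, 0.01, 0.01, 0.85, 0.01, 0.01],
    [0.02, 0.03, 0.80, 0.03, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02],
]
print(to_labels(example))  # [7, 2]
```

Since `model.predict` returns a NumPy array, `predictions.argmax(axis=1)` gives the same labels in one call.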
Congratulations!! Your system can now learn Deep... with GPU!!
References