Neural Network and Deep Learning (Part 2) - Python Code
Overview
In the first part of my neural network series, I provided an overview of neural networks. In this part, I will explain how to implement these concepts using Python. We'll leverage the TensorFlow module, which provides mathematical functions to process each neuron, and the sklearn module, which helps normalize data and evaluate the results produced by the neural network.
Background
In this example, I use air particle data from an area to demonstrate the process. The goal is to analyze the data over time, classify it based on CO (carbon monoxide) levels, and predict NOx (Nitrogen Oxides) concentrations.
Two key aspects of this analysis are classification (labeling readings by CO level) and prediction (estimating NOx concentrations).
This is an example of the data we currently have.
Classification Task
Now I will explain how to do the classification task.
1) Import the necessary modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score
from joblib import dump  # used later to save the column means to a .pkl file
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2
2) Load the Data and Correct Incorrect Data for Further Processing
# Load the dataset
file_path = 'ClassPred.xlsx' # Change this to the correct file path
data = pd.read_excel(file_path, engine='openpyxl')
# Replace -1 values with NaN for easier manipulation
data.replace(-1, np.nan, inplace=True)
# Calculate the mean of each column (ignoring NaN values)
column_means = data.mean()
# Save the means to a file
dump(column_means, 'column_means.pkl')
# Replace NaN values with the mean of each column
data.fillna(column_means, inplace=True)
# Remove any non-numeric columns, including date and time
data = data.select_dtypes(include=[np.number])
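To quickly verify that the mean imputation worked, here is an optional one-line sanity check:
# Confirm that no missing values remain after mean imputation
print(data.isna().sum().sum())  # expected output: 0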
3) Define a threshold. To keep things simple, this example uses the mean of the CO values in the data as the threshold.
# Calculate the mean of CO concentrations
co_mean = data['CO(GT)'].mean()
print(f"Mean CO concentration for thresholding: {co_mean}")
# Binary classification threshold (adjustable as needed)
threshold = co_mean
data['CO_high'] = (data['CO(GT)'] > threshold).astype(int)
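As a quick optional sanity check, you can inspect the class balance produced by this threshold:
# Fraction of samples in each class; a mean-based threshold usually
# gives a roughly balanced split, but it is worth confirming
print(data['CO_high'].value_counts(normalize=True))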
4) Drop CO(GT) from the features. Because the CO_high label is derived directly from CO(GT), keeping it as an input would leak the answer to the model. Instead, the network should learn to infer CO levels from the concentrations of the other air particles, which influence the CO concentration.
# Drop 'CO(GT)' from the features
data = data.drop(columns=['CO(GT)'])
5) We then divide the data into two parts: training and testing datasets. The training data is used to train the neural network, while the testing data is used to evaluate whether the neural network's output, based on the training data, is accurate enough for classification. Typically, the data is split in an 80:20 ratio, but a 90:10 split can also be used, depending on the specific requirements of the task.
# Split the data into training and testing sets
X = data.drop('CO_high', axis=1) # Features
y = data['CO_high'] # Target variable
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) # 20% test, 80% train
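As a side note, if the classes turn out to be imbalanced, train_test_split also accepts a stratify argument that preserves the class proportions in both splits. A minimal variant:
# Same split, but keeping the ratio of CO_high classes identical
# in the training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)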
6) Next, we normalize the data. Normalization rescales each numeric feature to a common range (here, 0 to 1 via min-max scaling). This prevents features with large numeric ranges from dominating the others, which improves the performance and stability of the neural network.
# Normalization of the dataset (only numeric columns)
numeric_cols = X_train.columns # Select only input feature columns
scaler = MinMaxScaler()
scaler.fit(X_train[numeric_cols]) # Fit scaler to training input features only
X_train[numeric_cols] = scaler.transform(X_train[numeric_cols]) # Transform the training input features
X_test[numeric_cols] = scaler.transform(X_test[numeric_cols]) # Transform the test input features
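For reference, MinMaxScaler applies the simple min-max formula (x - min) / (max - min). This small sketch reproduces it on a made-up column:
import numpy as np
# Toy column of values (hypothetical, for illustration only)
x = np.array([2.0, 4.0, 6.0, 10.0])
# Min-max scaling maps every value into [0, 1]
x_scaled = (x - x.min()) / (x.max() - x.min())
print(x_scaled)  # [0.   0.25 0.5  1.  ]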
7) With the data now fully cleaned, we arrive at the most exciting and crucial part of this article: training the neural network.
# Define the neural network model with an additional layer
model = Sequential()
model.add(Dense(16, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=l2(0.01)))
model.add(Dense(1, activation='sigmoid'))
This code can be explained as follows:
model = Sequential()
This creates a sequential model, in which layers are stacked one after another. More layers of neurons (nodes) can be added to the sequential model as needed.
model.add(Dense(16, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=l2(0.01)))
This step adds a fully connected (dense) layer with 16 neurons (nodes). Each neuron connects to every input feature or to every neuron from the previous layer (as we saw in the first article, all nodes are fully connected). ReLU (Rectified Linear Unit) is applied as the activation function for the layer's output. ReLU introduces non-linearity (the output is no longer just a weighted sum of the inputs) and is usually a good activation function for the first hidden layer.
kernel_regularizer=l2(0.01)
This code adds L2 regularization to the weights of this layer, penalizing large weight values to help prevent overfitting. Overfitting occurs when the neural network model learns not only the underlying patterns in the training data but also the noise and specific details that are irrelevant.
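To make the penalty concrete, here is a minimal sketch of what L2 regularization adds to the loss (the weight values are made up for illustration):
import numpy as np
# L2 penalty: lambda * sum of squared weights, here with lambda = 0.01
weights = np.array([0.5, -1.2, 0.3])  # hypothetical layer weights
l2_penalty = 0.01 * np.sum(weights ** 2)
print(l2_penalty)  # 0.0178; this is added to the loss, so large weights cost more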
model.add(Dense(1, activation='sigmoid'))
This step adds the output layer to the model, which uses sigmoid as its activation function. Sigmoid squashes the output to a value between 0 and 1, which can be read as the probability of the positive class, making it well suited to a binary classification task.
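As a quick illustration of the two activation functions, here is a minimal NumPy sketch, independent of the model above:
import numpy as np

def relu(x):
    # ReLU: negative inputs become 0, positive inputs pass through unchanged
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid: squashes any input into the open interval (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # approximately [0.119 0.378 0.5 0.622 0.881]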
8) Now that we have defined the various parameters suitable for the neural network model, we can proceed to initialize and run our neural network.
# Instantiate the optimizer with a custom learning rate
custom_lr = 0.001 # Example: Setting the learning rate to 0.001
optimizer = Adam(learning_rate=custom_lr)
# Compile the model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
# Train the model
history = model.fit(X_train, y_train, epochs=100, batch_size=32, validation_split=0.2)
This code can be explained as follows:
custom_lr = 0.001 # Example: Setting the learning rate to 0.001
optimizer = Adam(learning_rate=custom_lr)
The learning rate controls how large a step the optimizer takes when it updates the model's weights: too large and the model can overshoot good solutions, too small and training becomes slow. It is generally recommended to start with a small learning rate, observe the performance of the neural network, and adjust it as confidence in the model grows. The Adam optimizer used here adapts the effective step size for each weight individually during training, which makes it a popular and effective choice for training neural networks.
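If you prefer not to hand-tune the rate, Keras also provides callbacks that adjust it during training. A minimal sketch (the factor and patience values below are arbitrary examples, not tuned for this dataset):
from tensorflow.keras.callbacks import ReduceLROnPlateau
# Halve the learning rate whenever validation loss stops improving
# for 5 consecutive epochs, down to a floor of 1e-5
lr_schedule = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-5)
# Pass it to model.fit via the callbacks argument:
# model.fit(..., callbacks=[lr_schedule])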
# Compile the model
model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
The model is then compiled using the binary cross-entropy loss function, which is well suited to binary classification tasks. A loss function is a mathematical tool that measures the difference between the predicted output of a machine learning model and the actual target values. During training, the same loss is also computed on a validation set (a 20% split of the training data, set up in the next code snippet) to monitor generalization. Binary cross-entropy provides the feedback that guides the optimization process to improve the model's predictions over time.
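To make the loss concrete, here is a minimal sketch that computes binary cross-entropy by hand on a few made-up predictions:
import numpy as np
# Made-up true labels and predicted probabilities (illustration only)
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.4])
# Binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p))
bce = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
print(bce)  # ~0.34; confident, correct predictions drive the loss toward 0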
# Train the model
history = model.fit(X_train, y_train, epochs=100, batch_size=32, validation_split=0.2)
The model.fit method is used to train the neural network by iteratively passing the training data (X_train, y_train) through the model and adjusting its weights to minimize the loss function. Here's a breakdown of its key components:
epochs=100: the number of complete passes through the training data.
batch_size=32: the number of samples processed before the weights are updated.
validation_split=0.2: holds out 20% of the training data to monitor performance on data the model does not train on.
This method allows the model to learn progressively and improve its predictions by minimizing the difference between predicted and actual values.
9) After training, we will evaluate the results produced by the model.
# Evaluate the model
_, accuracy = model.evaluate(X_test, y_test)
y_pred = (model.predict(X_test) > 0.5).astype(int) # Predict the class labels
# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", conf_matrix)
# Plotting the training and validation loss
plt.figure(figsize=(6, 4))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss Plot for the Classification Task')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
# Plotting the training and validation accuracy
plt.figure(figsize=(6, 4))
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Accuracy Plot for the Classification Task')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
This code can be explained as follows:
_, accuracy = model.evaluate(X_test, y_test)
Once the model is trained, its performance is evaluated by comparing the predictions to the test dataset. This step calculates the loss and measures accuracy based on the test dataset, providing insights into how well the model generalizes to unseen data.
y_pred = (model.predict(X_test) > 0.5).astype(int) # Predict the class labels
This code predicts the probability of the positive class (a value between 0 and 1) for each sample in X_test. For binary classification, probabilities greater than 0.5 are converted to 1 (positive class), while those of 0.5 or below are converted to 0 (negative class).
The remaining code evaluates the model using a confusion matrix, accuracy, precision, and visualizations of the training and validation loss and accuracy.
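The accuracy_score and precision_score functions imported in step 1 can compute the headline metrics directly from these predictions:
# Summary metrics computed from the test-set predictions
acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred)
print(f"Accuracy: {acc:.4f}")
print(f"Precision: {prec:.4f}")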
Classification Task Results
This is the result of the neural network model, evaluated on the test data.
The confusion matrix shows the counts of correct and incorrect predictions for each class, which can be interpreted as:
Accuracy: 86.96% of all predictions (both positive and negative) are correct.
Precision: Of all the samples predicted as positive, 82.21% are actually positive.
For most tasks, these results are sufficient and ready to be presented.
The loss and accuracy plots show how the model learns over time and indicate whether it has learned and generalized the data well.
Loss Plot:
Both training and validation loss decrease steadily and stabilize as epochs progress, showing that the model has converged.
Accuracy Plot:
Both training and validation accuracy plateau at around 87%, suggesting strong predictive performance.
In general, this is the basic process of creating a neural network model for classification. In a future post, I will explain how to create a similar model for predicting continuous values. Both classification and prediction are highly versatile and powerful techniques applicable to a wide range of tasks.