Color and Classification
Image classification is one of the basic tasks in Machine Learning (ML), and specifically in Computer Vision (CV). Many deep learning and shallow learning architectures exist today that solve this problem with high accuracy. Yet many questions about this seemingly simple task still do not have clear answers.
Having clear answers to such fundamental questions will allow us to take the field to the next level.
Today AI is more artificial and less intelligent. The goal is to make it less artificial and more intelligent.
The most common representation of an image is a matrix in which each of the R, G, and B channels can take values from 0 to 255 (256 values). For an image of size 300x300x3, each channel of each pixel can take a value from 0 to 255. The number of permutations and combinations a neural network architecture needs to learn for such an image is of the order of 256**(3*300*300), an astronomically large value. Does a human eye need to understand all these permutations and combinations to classify an image?
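To get a feel for that number (a back-of-the-envelope calculation, not part of the original experiments), the size of the raw input space can be computed directly:

```python
# Size of the raw input space for a 300x300 RGB image with 256 levels
# per channel: 256**(300*300*3) distinct images.
import math

channels = 300 * 300 * 3
# Computing the exact integer is possible but pointless; the number of
# decimal digits already tells the story.
digits = channels * math.log10(256)
print(f"256**{channels} has about {digits:.0f} decimal digits")
```

The count has over 650,000 decimal digits, which makes it clear that no network (or eye) can be enumerating this space explicitly.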
What are the ways we can reduce these permutations and combinations? One simple scheme is to bucket pixel values into a coarse set of levels as below. All other values are mapped to the nearest value in the bucket list. You lose a lot of information in such a scheme, but do you really lose much accuracy? Does bucketizing help us generalize better and protect us from adversarial attacks, at the cost of some accuracy?
Today we equate accuracy with generalization, which is questionable. Generalization is the ability to abstract information in such a way that future outcomes are rarely impacted even if the underlying data representation changes.
b_255 - Original Image - [0, 1, 2, 3, 4, .............. 251, 252, 253, 254, 255]
b_16 - Bucket range of 16 - [0, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 255]
b_5 - Bucket range of 5 - [0, 50, 100, 150, 200, 250]
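The mapping described above can be sketched in a few lines (a minimal illustration; the function name `bucketize` and the exact nearest-value tie-breaking are assumptions, not the code used in the experiments):

```python
# Map every pixel to the nearest value in a fixed bucket list.
import numpy as np

b_16 = np.array([0, 16, 32, 48, 64, 80, 96, 112, 128, 144,
                 160, 176, 192, 208, 224, 240, 255], dtype=np.uint8)
b_5 = np.array([0, 50, 100, 150, 200, 250], dtype=np.uint8)

def bucketize(pixels, buckets):
    """Replace each pixel with the nearest value in `buckets`."""
    # Broadcast to shape (..., len(buckets)), then pick the closest bucket.
    dist = np.abs(pixels[..., None].astype(int) - buckets.astype(int))
    return buckets[dist.argmin(axis=-1)]

print(bucketize(np.array([7, 130, 254], dtype=np.uint8), b_16))  # → [0 128 255]
print(bucketize(np.array([7, 130, 254], dtype=np.uint8), b_5))   # → [0 150 250]
```

The same function works unchanged on a full HxWx3 image array, since the broadcasting is over the trailing axis only.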
Image formats matter when saving bucketized images. The difference between lossy and lossless compression needs to be understood: a lossy format like jpg may not store and restore the exact pixel values, while a lossless format like png will, which makes png better suited for studying the impact of bucketizing exactly. If we train on the same million bucketized images saved in two different formats (jpg and png), keeping all other parameters constant, do we get the same accuracy? The answer is no: the permutations and combinations of pixel values in the jpg version can be higher than in the png version, even though it is the same million images trained on the same neural network topology for the same number of epochs. The outcomes differ across image formats even though the information is exactly the same from the perspective of a human eye.
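A quick way to see the lossy/lossless difference (a sketch assuming Pillow and NumPy are installed; the image size, random content, and bucket values here are arbitrary illustrations):

```python
# Bucketize a small random image, round-trip it through PNG (lossless)
# and JPEG (lossy), and compare what comes back.
import io
import numpy as np
from PIL import Image

b_5 = np.array([0, 50, 100, 150, 200, 250], dtype=np.uint8)

def bucketize(pixels, buckets):
    """Replace each pixel with the nearest value in `buckets`."""
    dist = np.abs(pixels[..., None].astype(int) - buckets.astype(int))
    return buckets[dist.argmin(axis=-1)]

rng = np.random.default_rng(0)
img = bucketize(rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8), b_5)

def roundtrip(arr, fmt):
    """Encode the array in the given format and decode it back."""
    buf = io.BytesIO()
    Image.fromarray(arr).save(buf, format=fmt)
    buf.seek(0)
    return np.asarray(Image.open(buf))

png_back = roundtrip(img, "PNG")
jpg_back = roundtrip(img, "JPEG")

print("PNG exact:", np.array_equal(png_back, img))        # lossless: True
print("JPEG distinct values:", len(np.unique(jpg_back)))  # typically far more than 6
```

The PNG round trip preserves the 6 bucket values exactly; the JPEG round trip smears them across many more pixel levels, which is exactly the extra "permutation and combination" the text describes.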
What is the impact of bucketizing image pixels on image classification? The statistics of the data used for training and testing are given below. The images were randomly collected from the internet and are not part of any standard dataset. The b_16 and b_5 images are in jpg format, so their pixel values will not be exact bucket ranges of 16 and 5 respectively. The results obtained here should be replicable on different images and on exact image formats like png.
Number of classes (Training): 981, Total images (Training): 424594
Number of classes (Testing): 981, Total images (Testing): 107867
Neural Network Topology: resnet18
A series of experiments was conducted to answer some fundamental questions.
The above results show that bucketizing does have an impact on accuracy (not generalization), but the impact is not in proportion to the amount of information lost through bucketization. For example, restricting each pixel value from [0 ... 255] to just 6 values [0, 50, 100, 150, 200, 250] is a reduction of approximately 42.67:1 in the number of levels, whereas the accuracy reduction is at most 2 to 3%. Losing large amounts of information through bucketing does not lead to an equal reduction in accuracy.
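One added observation (not from the original experiments) that may partly explain the mild accuracy drop: the 42.67:1 figure counts pixel levels, but in bits per channel the reduction is much smaller, from 8 bits down to log2(6) ≈ 2.58 bits:

```python
# Compare the level-count ratio with the bits-per-channel reduction.
import math

levels_ratio = 256 / 6          # ≈ 42.67:1 reduction in pixel levels
bits_before = math.log2(256)    # 8 bits per channel originally
bits_after = math.log2(6)       # ≈ 2.58 bits per channel after bucketing

print(f"levels ratio: {levels_ratio:.2f}:1")
print(f"bits per channel: {bits_before:.0f} -> {bits_after:.2f}")
```

Seen this way, b_5 still retains roughly a third of the per-channel information, which is more consistent with a 2–3% accuracy drop than the raw 42.67:1 ratio suggests.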
Training a model on original images (b_255) and inferring on bucketed images (b_16 or b_5) has a major impact on accuracy. The reverse, training on bucketed images (b_16 or b_5) and inferring on original images (b_255), loses less accuracy. The best accuracy is always obtained when training and inferring on the same bucket size. Surprisingly, training on b_5 and inferring on b_16 reduces accuracy by a large margin. This raises an interesting question: how sensitive are these neural networks to exact pixel values? Did the model trained on b_5 perform worst on b_16 images simply because the intersection of the pixel sets [0, 50, 100, 150, 200, 250] and [0, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 255] is very limited?
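The overlap in question can be checked directly (plain Python, with the values copied from the bucket lists above):

```python
# How much do the b_5 and b_16 pixel-value sets actually overlap?
b_16 = {0, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 255}
b_5 = {0, 50, 100, 150, 200, 250}

print(sorted(b_5 & b_16))  # → [0]
```

The two sets share only the value 0, so a model trained on b_5 sees pixel values at inference time (on b_16 images) that it has essentially never encountered in training.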
The Top 1 accuracy loss from b_255 to b_5 is slightly higher than the Top 5 accuracy loss. For most practical purposes, the accuracy loss is minimal compared to the information lost from the image.
Robustness against adversarial attacks is important when such technologies are deployed at large scale. Bucketing reduces the permutations and combinations of pixel values that a neural network has to understand. No experiment was conducted here to test whether b_5 is more robust than b_255 or b_16 against adversarial attacks.
The concept of bucketizing image pixels is, in some sense, similar to post-training quantization of weights. No experiment was conducted to compare the two approaches in terms of speed and accuracy.
We are in the early stages of understanding AI and its limitations. Like any other technology, it has the potential to influence future systems and the way humans interact and function. Today we tend to see two sets of people when talking about AI: one that is highly optimistic without knowing or acknowledging the limitations of current AI technology, and another that thinks the hype around AI is high and an AI winter is coming. I would like to take an intermediate view of the technology as opposed to either extreme. Fundamental understanding of any technology allows us to improve it over time. This article was one such attempt to understand what is happening under the hood from an image classification perspective. The hope is that in the future we will have models with fewer parameters that are more robust in production.
Great article, Rajeev M A. Good to see a quantifiable review of this topic. The phrase "AI is more artificial and less intelligent" catches the eye. In today's world, almost every other product claims to be driven by AI, without much substantiation of accuracy.
Nice one. I have not come across the term "bucketize" before, especially in the image pixel context. It looks similar to quantization of image pixel values. Any specific reason to prefer "bucketize" over "quantize"?