A note on generalizability in machine learning
Does your feature space or latent space describe what you think it does? Is the discrimination between classes based on properties you would assume as a human observer, or did the model catch on to an irrelevant difference? Are you ready to move from the comfy surroundings of well-defined datasets to assess the generalisation capability of your model in the murky waters of new, previously unseen data?
These are questions I've returned to many times working with machine learning and image processing.
I had an unsettling feeling when evaluating the discrimination power of a number of texture descriptors: what if they picked up substantial differences between classes in aspects other than what we perceived as texture differences? To reduce the risk of some obvious sources of bias, I normalised the intensity across the datasets to have the same mean and standard deviation.
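The intensity normalisation described above can be sketched as a simple shift-and-scale per patch. This is a minimal illustration, not the exact code used in the thesis, and the function name is mine:

```python
import numpy as np

def normalize_intensity(patch, target_mean=0.0, target_std=1.0):
    """Shift and scale a patch so its intensities match a target mean and std."""
    mean, std = patch.mean(), patch.std()
    if std == 0:
        # A flat patch carries no contrast to rescale; return the target mean.
        return np.full_like(patch, target_mean, dtype=float)
    return (patch - mean) / std * target_std + target_mean

# Example: a bright and a dark patch end up on the same intensity scale.
rng = np.random.default_rng(0)
bright = rng.normal(200.0, 30.0, size=(64, 64))
dark = rng.normal(50.0, 5.0, size=(64, 64))
for p in (bright, dark):
    q = normalize_intensity(p)
    print(round(float(q.mean()), 4), round(float(q.std()), 4))
```

Note that this removes first- and second-order intensity statistics only; as the experiment below shows, descriptors can still separate classes on other grounds.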
Along the same line of thought, I decided to create a small dataset back in 2013 while writing my PhD thesis (link). My hypothesis was that the texture descriptors I investigated could pick up differences not only in texture but also in imaging conditions, including noise levels, despite the standard intensity normalisation.
I used my DSLR to take photos of two samples of canvas, representing two similar but still different textures. I took images while varying the ISO setting of the camera. Changing the ISO setting on a digital camera changes the analog amplification of the signal from the sensor elements, which is why higher ISO levels allow for shorter exposure times but result in higher levels of noise in the images. In short, I used higher ISO levels to acquire images with gradually increasing levels of noise. In the figure below you see an example of a texture patch acquired with increasing ISO settings.
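The effect of raising ISO can be illustrated with a toy sensor model: the scene signal stays the same, but the gain amplifies the sensor noise along with it. This is an assumed additive-Gaussian model for illustration only, not a model of the actual camera's noise characteristics:

```python
import numpy as np

rng = np.random.default_rng(42)
clean = rng.uniform(0.3, 0.7, size=(64, 64))  # stand-in for a texture patch

# Toy model: each doubling of gain (a stand-in for doubling ISO)
# amplifies the per-pixel sensor noise before readout.
residual_stds = []
for gain in (1, 2, 4, 8):
    noise = rng.normal(0.0, 0.01, size=clean.shape)
    noisy = clean + gain * noise
    residual_stds.append(float(np.std(noisy - clean)))
    print(gain, round(residual_stds[-1], 4))
```

The residual noise level grows roughly in proportion to the gain, which is the gradually increasing noise the dataset was designed to capture.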
I applied the texture descriptors I was investigating, followed by Linear Discriminant Analysis (LDA) (link to Wiki) on the respective feature space, considering each texture-ISO combination as a separate class. LDA finds a linear subspace that minimises intra-class variance while maximising inter-class variance. This gives us a feeling for how well the classes can be discriminated in a given feature space.
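As a minimal sketch of this step, here is an LDA projection of a synthetic stand-in feature space with scikit-learn. The class structure (texture-ISO combinations) and all dimensions are made up for illustration; the thesis used real descriptor outputs:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for a texture feature space: 4 classes
# (e.g. 2 textures x 2 ISO levels), 20 samples each, 10 features.
rng = np.random.default_rng(0)
n_per_class, n_features = 20, 10
centers = rng.normal(0.0, 3.0, size=(4, n_features))  # class means
X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_class, n_features))
               for c in centers])
y = np.repeat(np.arange(4), n_per_class)

# Project onto the first two discriminant axes, as in the figure below.
lda = LinearDiscriminantAnalysis(n_components=2)
Z = lda.fit_transform(X, y)
print(Z.shape)
```

Plotting `Z` coloured by `y` is exactly the kind of visualisation shown in the figure: if classes separate along these axes, the feature space can tell them apart, whether or not the separation is based on what you consider texture.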
The figure below is Fig. 4.9 from my thesis and shows the first two dimensions of the LDA space of each respective feature space. Here I also subsampled the texture patches, from 192x192-pixel patches down to 12x12 pixels, until there was very little difference between the images.
Without spending time on the individual texture descriptors, it's worth mentioning that they represent a few different types, including filter banks, co-occurrence matrix based methods, and local binary pattern based methods.
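To make the last family concrete, here is a from-scratch sketch of the simplest local binary pattern variant (eight neighbours, radius one), reduced to a histogram feature vector. It is a basic illustration of the idea, not the specific LBP implementation evaluated in the thesis:

```python
import numpy as np

def lbp_3x3(img):
    """8-neighbour local binary pattern codes for the interior pixels."""
    c = img[1:-1, 1:-1]
    # Neighbours clockwise from the top-left; each contributes one bit
    # depending on whether it is at least as bright as the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.uint8)
    h, w = img.shape
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (n >= c).astype(np.uint8) << bit
    return codes

def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes: the texture feature vector."""
    codes = lbp_3x3(img)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(1)
patch = rng.integers(0, 256, size=(32, 32)).astype(np.int16)
feat = lbp_histogram(patch)
print(feat.shape)
```

Because the codes depend on local brightness orderings, such a histogram is sensitive to fine-grained pixel variation, which is one plausible route by which noise level leaks into the feature space.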
It becomes evident that more or less all of these descriptors can pick up differences not only between the two texture classes but also between each one of the eight ISO levels. Training a machine learning model to do texture classification on this type of data without taking this into account would result in poor generalizability in terms of coping with different noise levels. And varying imaging conditions, with varying noise levels as a result, is a very common situation in real-world applications.
Learnings
Some of my learnings from this experiment are:
- Try to understand your latent space or feature space in your machine learning application.
- Plot and visualize your data often. Much can be learnt before trying to validate complex models.
- Consider model generalizability for the heterogeneity of new, real-world data. For example, what imaging conditions and resulting effects is the model likely to encounter?
- Challenge your model, see if it picks up meaningful differences and ignores irrelevant aspects.
Conclusions
I would be happy if revisiting these ideas and writing these notes helped you in any way, especially if it means that you start plotting and visualizing your data even more from now on!
Oh, one more thing: I made the texture dataset I used here available for everyone via my website (link). The rest of the texture datasets from my thesis are also there (link).
Best regards,
Gustaf Kylberg, PhD
"Plot and visualize your data often" sounds like sweet music in my ears :) Thanks for posting your findings, interesting indeed!
Very interesting, Gustaf. Thanks for sharing this!