Deep Learning: Deep Dreaming


Introduction

Deep Learning has made remarkable recent progress in many fields such as image classification and speech recognition. Deep Learning architectures usually have many layers; a medium-sized network typically consists of 10-30 stacked layers of artificial neurons.

After training (exposing the network to many examples), each successive layer extracts higher and higher-level features, up to the final layer. In the case of image classification, the final layer makes the decision on what the image shows.

For example, the first layers respond to (get excited by) basic edges, while intermediate layers interpret more complex features such as shapes or object parts. The final few layers assemble those parts into complete objects like faces and animals.

"One of the challenges of neural networks is understanding what exactly goes on at each layer or even at each neuron."

Here, I used the well-known VGG16 network, a convolutional neural network architecture named after the Visual Geometry Group at Oxford. It was used to win the ILSVRC (ImageNet) competition in 2014, and to this day it is still considered an excellent vision model. The network can be downloaded from here 
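As a side note, loading the pre-trained VGG16 takes only a couple of lines in Keras. A minimal sketch, assuming TensorFlow/Keras is installed (the ImageNet weights are downloaded automatically on first use):

```python
from tensorflow.keras.applications import VGG16

# include_top=True keeps the final 1000-way ImageNet classifier head.
model = VGG16(weights="imagenet", include_top=True)

# The "16" counts the 13 convolutional + 3 fully connected weight layers;
# Keras reports 23 layers in total (input, pooling, flatten, etc. included).
print(len(model.layers))
```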

In the case of image classification, one way to understand what happens at the neuron level is to turn the network upside down: ask the network to optimize a random input image so that it maximizes the output of a certain neuron. The resulting image tells us what excites that particular neuron and makes it fire.

Here is a sample of input images optimized for 9 neurons in layer 5_3 of the VGG16 network. You can see feathers, eyes, faces and the shapes of different creatures.

The random input image is optimized using gradient ascent. Note that in this case we are dealing with an already trained network: we are not searching for good parameters as usual, but for a good input (image) that maximizes a certain neuron's output.

As discussed, in image classification, neurons in the first layers are usually excited (fire) by certain textures and edges, while neurons in the final layers are excited by almost complete shapes (classes).
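A minimal sketch of this neuron-maximization idea in TensorFlow/Keras. The layer name `block5_conv3` corresponds to layer 5_3 above; the filter index and step size are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

# Feature extractor up to the layer whose neurons we want to visualize.
base = VGG16(weights="imagenet", include_top=False)
extractor = tf.keras.Model(inputs=base.input,
                           outputs=base.get_layer("block5_conv3").output)

# Start from a grayish random image.
img = tf.Variable(tf.random.uniform((1, 224, 224, 3)) * 0.25 + 0.5)
filter_index = 0  # which filter (neuron map) to excite -- illustrative choice

losses = []
for step in range(30):
    with tf.GradientTape() as tape:
        activation = extractor(img)
        # Objective: mean activation of the chosen filter.
        loss = tf.reduce_mean(activation[..., filter_index])
    grads = tf.math.l2_normalize(tape.gradient(loss, img))  # stabilize step size
    img.assign_add(10.0 * grads)  # gradient *ascent*: move uphill, not downhill
    losses.append(float(loss))
```

The only difference from ordinary training is what the gradient flows into: the network's weights stay frozen and the pixels of `img` are the variables being updated.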

Moreover, we can ask the network to enhance an input image in such a way as to favor a certain class. Here is an example for the sea snake class:

The result resembles an average of the sea snake images seen during training, and it also includes the textures that typically surround a sea snake.
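The class-maximization variant only changes the quantity being pushed uphill: the network's score for a chosen output class instead of a single hidden neuron. A sketch, assuming TensorFlow/Keras; class index 65 for "sea snake" is an assumption worth verifying with `decode_predictions`:

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

model = VGG16(weights="imagenet", include_top=True)
class_index = 65  # "sea snake" in the Keras ImageNet ordering (assumed; verify)

img = tf.Variable(tf.random.uniform((1, 224, 224, 3)) * 0.25 + 0.5)

scores = []
for step in range(20):
    with tf.GradientTape() as tape:
        # Softmax probability of the chosen class for the current image.
        score = model(img)[0, class_index]
    grads = tf.math.l2_normalize(tape.gradient(score, img))
    img.assign_add(5.0 * grads)  # ascend on the class score
    scores.append(float(score))
```

In practice, maximizing the pre-softmax logit tends to give cleaner images than the softmax probability used here; the softmax output keeps this sketch short.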

Why is this important?

By examining these outputs, we get a clearer understanding of how the network makes its decisions and classifications.

Deep Dreaming

We feed the network an image (not a random one), let the network analyze and recognize the image up to a layer of our selection, and ask the network to enhance whatever it detected.

Making the "dream" images is very simple. Essentially it is just a gradient ascent process that tries to maximize the L2 norm of activations of a particular DNN layer.

Imagine the input is a cloud image and some clouds look like a bird: the final layers of the network will recognize the cloud as a bird and try to enhance this bird, and so on.

Lower layers, on the other hand, tend to produce strokes and patterns, because these layers are trained to recognize edges.
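Putting it together, a bare-bones dreaming loop might look like the sketch below (TensorFlow/Keras assumed; the layer name and step size are illustrative, and a real photo would replace the random starting tensor):

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False)
dream_layer = "block4_conv2"  # lower layers -> strokes, higher -> objects
extractor = tf.keras.Model(inputs=base.input,
                           outputs=base.get_layer(dream_layer).output)

# In practice you would load a photo here (e.g. with tf.keras.utils.load_img);
# a random tensor keeps the sketch self-contained.
img = tf.Variable(tf.random.uniform((1, 224, 224, 3)))

losses = []
for step in range(20):
    with tf.GradientTape() as tape:
        acts = extractor(img)
        # Squared L2 norm of the whole layer's activations.
        loss = tf.reduce_sum(tf.square(acts))
    grads = tf.math.l2_normalize(tape.gradient(loss, img))
    img.assign_add(0.05 * grads)  # small steps amplify what the layer detects
    losses.append(float(loss))
```

Changing `dream_layer` from a `block1_*` to a `block5_*` layer is what moves the results from edge-like strokes to the animal-face imagery shown below.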

Here are some examples:

Original Image


Dream using early layers (edges)


Dream using middle layers (simple shapes)

Dream using final layers (faces of animals and birds, scary!)

In USA :)


Want to generate your own dreams?

Here is the deep dream generator, enjoy!

Source code is here


Regards,

