Replicating the Visual System with a Machine

Visual Context:

We see the outer world through our eyes. This information, in the form of light, is passed to our brain. How do we recognize the object we see? How do we distinguish a person as male or female? How do we tell that a round thing with square patches on it is a football rather than just any ball?

Vision is one of the hardest jobs the brain performs: about one third of the brain is involved in processing visual information through complex operations. These operations include recognizing the objects seen through the eyes. From childhood onward we learn continuously to identify objects, and we rely on that learning whenever we see something. The typical process of the visual system is shown below, highlighting the eyes, the relevant part of the brain, and the exchange of information between them:

Figure No.1. Visual System

Deep learning exploits gigantic data sets to produce powerful models. But what can we do when our data sets are comparatively small? Transfer learning by fine-tuning deep nets offers a way to leverage existing data sets to perform well on new tasks. Broadly speaking, transfer learning takes a deep neural network that has already been trained on one data set and reuses what it has learned as the starting point for training on another data set.
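To make the idea concrete, here is a minimal sketch in plain Python. It is not the code used for this article. The function `frozen_features` is a hypothetical stand-in for a pre-trained network whose weights never change, and the six-point data set is invented purely for illustration. Only the small new classifier on top (the "head") is trained, which is the essence of transfer learning with a frozen base.

```python
import math

# Toy data set invented for illustration: two small clusters, labels 0 and 1.
SAMPLES = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
LABELS = [0, 0, 0, 1, 1, 1]

def frozen_features(x):
    """Hypothetical stand-in for a pre-trained base network; never updated."""
    return [x[0], x[1], x[0] * x[1], 1.0]

def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit a new logistic-regression head on top of the frozen features."""
    weights = [0.0] * 4
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = frozen_features(x)
            z = sum(w * f for w, f in zip(weights, feats))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            # Gradient step on the head's weights only; the base is untouched.
            weights = [w - lr * (p - y) * f for w, f in zip(weights, feats)]
    return weights

def predict(weights, x):
    """Return the predicted class (0 or 1) for a new point."""
    z = sum(w * f for w, f in zip(weights, frozen_features(x)))
    return 1 if z > 0 else 0

head = train_head(SAMPLES, LABELS)
print(predict(head, (0.1, 0.1)), predict(head, (0.9, 0.9)))
```

In a real project the frozen part would be a large network such as InceptionV3 and the head would be a new output layer, but the division of labor is the same: the base supplies features, and only the head learns from the small data set.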

Machine Perception:

The following is a demonstration of machine perception: seeing objects through a camera and predicting their names. We performed this task in two different domains, object recognition and facial recognition.

Using a pre-trained model, we classified objects by placing an object in front of a web-cam and feeding the captured image to the deep learning model. For this purpose we placed a ball in front of the web-cam; Figure No.2 shows the image that was used for prediction.

Figure No.2. Ball captured through web-camera

The prediction we got from the model is:

It's amazing that I placed a colored ball in front of the camera and the model predicted it as a bubble and a balloon. But if we look at the picture of the ball above, it does resemble a bubble because of its round shape and color distribution.
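A classifier of this kind typically returns one score per class, and the names reported are the highest-scoring ones. The sketch below shows how such a ranking could be produced. The helper `top_k` is hypothetical, and the class list and scores are made up for illustration; they are not the model's actual output.

```python
def top_k(probabilities, class_names, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]

# Made-up scores for illustration only, NOT the real output of the model.
class_names = ["soccer ball", "bubble", "ping-pong ball", "balloon"]
probabilities = [0.10, 0.55, 0.05, 0.30]

for label, score in top_k(probabilities, class_names, k=2):
    print(f"{label}: {score:.2f}")
```

Looking at the top few labels rather than only the first one makes it easier to see when the model is torn between visually similar classes.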

The second example is facial recognition. For this purpose we scraped images from Facebook and applied transfer learning: we froze the top layers of the InceptionV3 model and trained it on two new classes of images, 'Hamza' and 'Talha'.

The prediction is made in two ways: (i) reading a file from the local disk for testing purposes, and (ii) using the web-cam to capture an image. The two images can be seen in Figure No.3 and Figure No.4 respectively.

Figure No.3. Image uploaded from local drive.

The prediction the model returned for this image is [1,0], which indicates that the model favors the first class and predicted this image as 'Hamza'. The result is also shown below:

Figure No.4. Image captured through web-cam

The prediction for the image captured through the web-cam is shown below:

We can see above that the model again predicted in favor of the first class, i.e. 'Hamza'.
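Reading an output such as [1,0] comes down to finding the position of the largest value and looking up the class name at that position. A minimal sketch, assuming the class order is 'Hamza' first and 'Talha' second as described above (the helper `decode` is hypothetical):

```python
# Class order assumed from the text: index 0 is 'Hamza', index 1 is 'Talha'.
CLASS_NAMES = ["Hamza", "Talha"]

def decode(prediction, class_names=CLASS_NAMES):
    """Map a model output vector to the name of its highest-scoring class."""
    best_index = max(range(len(prediction)), key=lambda i: prediction[i])
    return class_names[best_index]

print(decode([1, 0]))
```

The same function works when the model returns probabilities instead of a hard 0/1 vector, since it only compares the values.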





