From the course: Hands-On AI: Image Processing with Python

Image representation

- [Instructor] There are a few ways to represent images in a computer. One of the most popular is the raster image representation, which is the type of image we will see in this course. Up ahead, we will get to know raster images and some important details about them. First, a raster image consists of a matrix of pixels, which are the small dots or squares that compose a picture like this smiley. Images may encode color, or they may be grayscale or even black and white. There are several file formats used to store pictures. Some of these formats involve compression, but not all of them. And lastly, an image may be stored at different resolutions.
Let's start tweaking some pictures. In this notebook, let me show you how you can create raster images in grayscale and color with integers and floating-point numbers. Let's look at the first cell, where I'll create a simple grayscale image. Notice that I'm importing NumPy and Matplotlib's Pyplot module in lines 4 and 5. In line 6, the assignment of retina to figure formats is done to get the best resolution from Matplotlib's imshow, which is the image show function. You'll see this line in every notebook ahead.
Next, I'm creating a three-by-three NumPy array, and I'm initializing it with numbers between 0 and 255. These numbers represent shades of gray, where 0 is black and 255 is white. Look at the comments to make sense of these shades. Lastly, in line 12, I'm displaying the image with Pyplot's imshow function. This function takes an image object as an argument, and optionally a color map. For grayscale, we'll use the gray color map.
Let's run it. Notice that the plot shows the row and column coordinates. Since this is a three-by-three matrix, the image is extremely magnified. In the next cell, I am printing the image with the print function to see how it looks in a text representation. I'm also printing the data type of one pixel. As you can see, it's a matrix of 64-bit integers.
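
A minimal sketch of the grayscale cell described above. The pixel values and the name `gray_image` are illustrative assumptions, not the ones in the course notebook, and the notebook-only retina setting is left out because it is not plain Python.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical 3x3 grayscale image (values are assumptions for illustration).
# 0 is black, 255 is white, and the values in between are shades of gray.
gray_image = np.array([
    [0,   64,  128],   # black, dark gray, mid gray
    [64,  128, 192],   # dark gray, mid gray, light gray
    [128, 192, 255],   # mid gray, light gray, white
])

# The gray color map draws low values dark and high values bright.
# In a notebook the plot renders inline; in a script, call plt.show().
plt.imshow(gray_image, cmap="gray")

# Text representation of the image and the data type of one pixel.
print(gray_image)
print(type(gray_image[0, 0]))
```

With a color map, imshow by default stretches the smallest value in the array to black and the largest to white. This example contains both 0 and 255, so the shades come out as the comments say. For an array that does not span the full range, passing `vmin=0, vmax=255` pins the scale. The integer width NumPy picks by default depends on the platform and NumPy version, so the printed type is usually, but not always, 64-bit.
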
Moving on to the next cell, let me show you the same thing for a three-by-three color image. This time, each pixel is encoded as a red, green, blue triad. That is to say, this image has three channels of intensity per pixel. Each number represents the intensity of its color. We'll talk about colors shortly. Look at line 10 and notice that this time I turned off the display of the X and Y axes in Pyplot. Also, notice that the image show function doesn't need a color map anymore because this image has all the color information it needs. Let's run it. You may want to take a moment to pause the video and confirm that these colors match the ones in the comments above. And once again, let's see its text representation and the type of an intensity value. As expected, this is a three-dimensional array of 64-bit integers. Lastly, let me show you the same image, but encoded with floating-point numbers between 0 and 1. This is the same image we saw before because, for color images, the image show function reads floating-point values on a 0 to 1 scale and integer values on a 0 to 255 scale. Yep, that's the same image. And now let's look at its text representation and type. This image is made of 64-bit floating-point numbers. Oh yeah, let's move on.
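
A rough sketch of the two color cells, side by side. The colors, the names `color_image` and `float_image`, and the division by 255 used to get the floating-point version are assumptions for illustration. The notebook may define its floating-point values directly.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical 3x3 RGB image: each pixel is a [red, green, blue] triad
# with integer intensities from 0 to 255 (colors chosen for illustration).
color_image = np.array([
    [[255, 0, 0],   [0, 255, 0],     [0, 0, 255]],      # red, green, blue
    [[255, 255, 0], [0, 255, 255],   [255, 0, 255]],    # yellow, cyan, magenta
    [[0, 0, 0],     [128, 128, 128], [255, 255, 255]],  # black, gray, white
])

# The same picture with floating-point intensities between 0 and 1.
float_image = color_image / 255

# Draw both versions next to each other.
fig, (ax_int, ax_float) = plt.subplots(1, 2)
for ax, image in ((ax_int, color_image), (ax_float, float_image)):
    ax.imshow(image)   # no color map needed for RGB data
    ax.axis("off")     # hide the X and Y axes

# Shape is (rows, columns, channels); dtype is the type of each intensity.
print(color_image.shape, color_image.dtype)
print(float_image.shape, float_image.dtype)
```

Both panels should look identical, since imshow treats integer RGB data as 0 to 255 and floating-point RGB data as 0 to 1, and clips anything outside those ranges.
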
