The JPEG Compression Algorithm
A *.jpg image is based on a variation of the 2-Dimensional Fourier Transform. A 2-Dimensional Fourier Transform is required for image files since images are 2-D. The 2-D transform is equivalent to taking the l-D transform of each row, and then taking the l-D transform of each column, for an image matrix.
The problem with the Fourier Transform is that it leaves us with a real component and an imaginary component for each pixel in the image. By symmetrically extending the image before taking the 2-D Fourier transform, the imaginary, or sine, terms can be forced to zero.
The images used in this post are raw 8-bit grayscale images.
The Discrete Cosine Transform (DCT)
The DCT is simply a separate mathematical method for generating the
Cosine Transform without explicitly using the Fourier Transform. It is
based on defining a matrix [C] such that
If an image is divided into blocks of size 8x8, then the matrix C must also be 8x8 and becomes
If a matrix U represents the pixels in an 8x8 block of an image, then the DCT is define by the equation
[V] = [C] [U] [CT]
using matrix multiplication and [CT] as the transpose of matrix [C].
[V] becomes a sparse matrix with few large coefficients. For example,
which represents the first 8x8 sub block of a grayscale test file. The actual transform file, V, is rounded to integer values and becomes
which contains a large number of zeroes, but retains almost all of the information in the original sub image [U]. Entropy analysis of the original matrix [U] reveals that a compression of 1.77 to 1 is possible using Huffman encoding. However, a compression of 25.6 to 1 is possible using Huffman encoding of the matrix [V].
This method is the basis of the Joint Photographic Expert Group (JPEG) algorithm in which images are divided into 8x8 sub blocks; each sub block is processed independently by a DCT algorithm, and then the resulting file is Huffman encoded.
A compression scheme using the DCT can be summarized in the following steps:
- Divide the image into 8x8 blocks.
- Perform the DCT on each 8x8 block.
- Replace values close to zero with zero.
- Huffman encode the resulting file.
Decompression involves the following:
- Huffman decode the compressed file.
- Perform the inverse DCT on each 8x8 block.
The entire test image was compressed at a rate of 25:1.
The JPEG algorithm is basically an 8x8 DCT with some enhancements. JPEG is generally considered feasible for compression of still images in the 10:1 to 25:1 range.
Another standard, Motion Picture Expert Group (MPEG), is used for motion video and exploits the images differences frame-by-frame. MPEG is considered feasible for the compression of real time video images at the rate of 50:1.
I remembered my master lessons.. You have beautiful share..👍🏼