Optimize Docker builds
Background:
Optimizing Docker builds is important for several reasons. First, it helps to reduce the size of Docker images, which can lead to faster image builds, deployments, and downloads. Smaller images also consume less disk space and bandwidth, which can be beneficial in resource-constrained environments, such as when deploying to edge devices or storing Docker images in a cloud-based container registry.
Second, optimizing Docker builds can improve the performance of containerized applications. By minimizing the number of layers in Docker images, optimizing build steps, and reducing the amount of unnecessary files or dependencies, Docker images can be built and executed more efficiently, resulting in faster startup times, lower resource usage, and improved application performance.
There are multiple ways to optimize Docker images. In this article we are going to talk about two of them: minimizing the number of layers and multi-stage builds.
There are several other techniques that can further optimize Docker image management, such as signal handling, limiting the number of running processes, and more. However, we will delve into those topics in future articles for a more comprehensive discussion on optimizing Docker image builds.
Now, without further ado, let's begin our exploration.
Minimizing the number of layers
As we are aware, Docker uses a layered architecture to construct images from the instructions in the Dockerfile. Instructions like `RUN`, `ADD`, and `COPY` create filesystem layers, while instructions such as `FROM`, `WORKDIR`, and `USER` only add metadata to the image. In practice, this means every additional `RUN`, `ADD`, or `COPY` instruction adds a layer, so how these instructions are combined directly affects the total layer count of the resulting image.
Therefore, it is crucial to be mindful of the number and order of instructions used in a Dockerfile to minimize the total number of layers and optimize the size and efficiency of the resulting Docker image. Careful consideration of instruction usage can significantly impact the performance and resource utilization of Docker images, ultimately leading to more efficient and optimized containerized applications.
So, how can we minimize the number of layers in Docker images?
Here are some best practices:

- Combine related shell commands into a single `RUN` instruction using `&&`, instead of spreading them across multiple `RUN` instructions.
- Order instructions from least to most frequently changing, so that Docker's build cache can reuse unchanged layers.
- Clean up temporary files (such as package-manager caches) in the same instruction that created them, so they never end up baked into a layer.

When merging layers, it's also important to consider Docker's caching mechanism, which reduces build times by reusing unchanged layers. Avoid combining layers that change frequently with layers that rarely change; this lets Docker make full use of its cache while still maintaining the layer separation needed for efficient image management.
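As a small sketch of these practices applied to a Node project like the one used later in this article (the `npm prune` step is illustrative, not required):

```dockerfile
FROM node:18
WORKDIR /usr/code

# Copy the dependency manifests first: they change rarely, so the
# expensive `npm install` layer below stays cached across code edits.
COPY package*.json .
RUN npm install

# The application source changes often, so it comes last.
COPY . .

# Combine related commands into a single RUN instruction so they
# produce one layer instead of two.
RUN npm run build && npm prune --production
```

If we instead ran `npm install` after `COPY . .`, every source-code edit would invalidate the cached dependency layer and force a full reinstall on each build.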
Multi-Stage Builds
A multi-stage build in Docker is a technique where multiple build stages are defined in a single Dockerfile to create a final Docker image. Each stage can have its own set of instructions, and the output of one stage can be used as the input for the next stage. This allows for efficient and streamlined Docker image builds, where intermediate build artifacts are discarded, resulting in smaller and more optimized final Docker images.
This technique takes the best practice of avoiding unnecessary files in Docker images, such as using minimal base images, to a higher level of optimization. It lets us include only the files that are truly required at runtime, excluding dev dependencies or anything needed only during the build. For example, in a React project we may only need the build output if we are not using the development server from react-scripts, resulting in a much smaller image because we eliminate unnecessary files.
For example:
FROM node:18 as build
WORKDIR /usr/code
COPY package* .
RUN npm install && npm i -g serve
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["serve", "-s", "build", "-l", "3000"]
In this approach, we used the full node:18 image as the base and then installed the dependencies, built the project, and served it, all within a single image. As a result, the Docker image contains the source code, node_modules, and the build output. However, the source code and node_modules are not needed to run the application, so there is no reason to ship them in the image.
Let's check the image size of this build: 1.36GB.
Now let's try breaking this into two stages: one is the build stage and the other is the main stage (or, as I like to call it, the runtime image).
FROM node:18 as build
WORKDIR /usr/code
COPY package* .
RUN npm install
COPY . .
RUN npm run build
FROM node:alpine as main
WORKDIR /usr/code
COPY --from=build /usr/code/build /usr/code/build
RUN npm i -g serve
EXPOSE 5000
CMD ["serve", "-s", "build", "-l", "5000"]
In this case, we have split the Docker image into two distinct stages - one for the build process and the other for the final deployment, containing only the necessary build and serve dependencies.
We used the COPY instruction with the --from flag to transfer the build output from the build stage to the main stage, specifying which stage to copy from and what to copy.
Let's check the image size of this build: 188MB.
That's a total reduction of ~86%.
NOTE: We used the node:18 base image for the build stage, but we opted for node:alpine for the main stage. This is because the main stage only needs to serve the build output and doesn't require any additional dependencies. In fact, we could use an even smaller image that's specifically designed for serving static files, such as Nginx, which weighs in at a mere 60MB.
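As a sketch of that idea, the main stage could swap node:alpine and serve for an Nginx image (assuming the build output lands in /usr/code/build and the base image's default configuration is acceptable):

```dockerfile
# Stage 1: build the React app, same as before
FROM node:18 as build
WORKDIR /usr/code
COPY package* .
RUN npm install
COPY . .
RUN npm run build

# Stage 2: serve the static build with Nginx instead of node + serve
FROM nginx:alpine as main
# Nginx serves files from this directory by default
COPY --from=build /usr/code/build /usr/share/nginx/html
EXPOSE 80
# No CMD needed: the base image already starts nginx in the foreground
```

This drops Node.js from the runtime image entirely, since serving static files doesn't require it.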
Refer to the demo project in the docker-demo repository on the code-rks GitHub account.
Thanks for reading my article! If you found it helpful, feel free to follow Rohit Kumar Shaw for more content like this.