Optimize Docker builds
Background:
Optimizing Docker builds is important for several reasons. First, it helps to reduce the size of Docker images, which can lead to faster image builds, deployments, and downloads. Smaller images also consume less disk space and bandwidth, which can be beneficial in resource-constrained environments, such as when deploying to edge devices or storing Docker images in a cloud-based container registry.
Second, optimizing Docker builds can improve the performance of containerized applications. By minimizing the number of layers in Docker images, optimizing build steps, and reducing the amount of unnecessary files or dependencies, Docker images can be built and executed more efficiently, resulting in faster startup times, lower resource usage, and improved application performance.
There are multiple ways to optimize Docker images. In this article we are going to talk about two of them: minimizing the number of layers and multi-stage builds.
There are several other techniques that can further optimize Docker image management, such as signal handling, limiting the number of running processes, and more. However, we will delve into those topics in future articles for a more comprehensive discussion on optimizing Docker image builds.
Now, without further ado, let's begin our exploration.
Minimizing the number of layers
As we are aware, Docker uses a layered architecture to construct images from the instructions in the Dockerfile. Instructions like `RUN`, `ADD`, and `COPY` create filesystem layers, while instructions such as `FROM`, `WORKDIR`, and `USER` only add metadata to the image. In practice, this means every additional `RUN`, `ADD`, or `COPY` instruction adds a layer, so how these instructions are combined directly affects the total layer count of the resulting image.
Therefore, it is crucial to be mindful of the number and order of instructions used in a Dockerfile to minimize the total number of layers and optimize the size and efficiency of the resulting Docker image. Careful consideration of instruction usage can significantly impact the performance and resource utilization of Docker images, ultimately leading to more efficient and optimized containerized applications.
So, how can we minimize the number of layers in Docker images?
Here are some best practices:

- Combine related shell commands into a single `RUN` instruction using `&&`, instead of spreading them across multiple `RUN` instructions.
- Order instructions from least to most frequently changing, so that Docker's build cache can reuse unchanged layers.
- Clean up temporary files (such as package-manager caches) in the same instruction that created them, so they never end up baked into a layer.

When merging layers, it's also important to consider Docker's caching mechanism, which reduces build times by reusing unchanged layers. Avoid combining layers that change frequently with layers that rarely change; this lets Docker make full use of its cache while still maintaining the layer separation needed for efficient image management.
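As a small sketch of these practices applied to a Node project like the one used later in this article (the `npm prune` step is illustrative, not required):

```dockerfile
FROM node:18
WORKDIR /usr/code

# Copy the dependency manifests first: they change rarely, so the
# expensive `npm install` layer below stays cached across code edits.
COPY package*.json .
RUN npm install

# The application source changes often, so it comes last.
COPY . .

# Combine related commands into a single RUN instruction so they
# produce one layer instead of two.
RUN npm run build && npm prune --production
```

If we instead ran `npm install` after `COPY . .`, every source-code edit would invalidate the cached dependency layer and force a full reinstall on each build.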
Multi-Stage Builds
A multi-stage build in Docker is a technique where multiple build stages are defined in a single Dockerfile to create a final Docker image. Each stage can have its own set of instructions, and the output of one stage can be used as the input for the next stage. This allows for efficient and streamlined Docker image builds, where intermediate build artifacts are discarded, resulting in smaller and more optimized final Docker images.
This technique takes the best practice of avoiding unnecessary files in Docker images, such as using minimal base images, to a higher level of optimization. It lets us include only the files that are truly required at runtime, excluding dev dependencies or anything needed only during the build. For example, in a React project we may only need the build output if we are not using the development server from react-scripts, resulting in a much smaller image because we eliminate unnecessary files.
For example:
FROM node:18 as build
WORKDIR /usr/code
COPY package* .
RUN npm install && npm i -g serve
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["serve", "-s", "build", "-l", "3000"]
In this approach, we used the full node:18 image as the base and then installed the dependencies, built the project, and served it, all within a single image. As a result, the Docker image contains the source code, node_modules, and the build output. However, the source code and node_modules are not needed to run the application, so there is no reason to ship them in the image.
Let's check the image size of this build: 1.36GB.
Now let's try breaking this into two stages: one is the build stage and the other is the main stage (or, as I like to call it, the runtime image).
FROM node:18 as build
WORKDIR /usr/code
COPY package* .
RUN npm install
COPY . .
RUN npm run build
FROM node:alpine as main
WORKDIR /usr/code
COPY --from=build /usr/code/build /usr/code/build
RUN npm i -g serve
EXPOSE 5000
CMD ["serve", "-s", "build", "-l", "5000"]
In this case, we have split the Docker image into two distinct stages - one for the build process and the other for the final deployment, containing only the necessary build and serve dependencies.
We used the COPY instruction with the --from flag to transfer the build output from the build stage to the main stage, specifying which stage to copy from and what to copy.
Let's check the image size of this build: 188MB.
That's a total reduction of ~86%.
NOTE: We used the node:18 base image for the build stage, but we opted for node:alpine for the main stage. This is because the main stage only needs to serve the build output and doesn't require any additional dependencies. In fact, we could use an even smaller image that's specifically designed for serving static files, such as Nginx, which weighs in at a mere 60MB.
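As a sketch of that idea, the main stage could swap node:alpine and serve for an Nginx image (assuming the build output lands in /usr/code/build and the base image's default configuration is acceptable):

```dockerfile
# Stage 1: build the React app, same as before
FROM node:18 as build
WORKDIR /usr/code
COPY package* .
RUN npm install
COPY . .
RUN npm run build

# Stage 2: serve the static build with Nginx instead of node + serve
FROM nginx:alpine as main
# Nginx serves files from this directory by default
COPY --from=build /usr/code/build /usr/share/nginx/html
EXPOSE 80
# No CMD needed: the base image already starts nginx in the foreground
```

This drops Node.js from the runtime image entirely, since serving static files doesn't require it.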
Refer to the demo project in the docker-demo repository on the code-rks GitHub account.
Thanks for reading my article! If you found it helpful, feel free to follow Rohit Kumar Shaw for more content like this.