Docker: Developing a microservice-based system using Docker - the know-hows and pitfalls

Docker is a lightweight application engine that deploys virtual-machine-like containers which can run processes inside them. Docker containers share the host's kernel and system-level resources, which facilitates easy deployment and multi-tenancy. Each container has its own network and process space, as well as a layered union-mount file system. The three main components of Docker are the Docker client, the Docker daemon or server (which exposes a REST API), and the containers themselves.

Deploying an application that runs inside a Docker container is as easy as the following steps:

  1. Shipping a custom Docker image, built on top of an off-the-shelf base image
  2. Running a script to build the image, or pulling it from a registry, to spawn containers
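As a concrete sketch of those two steps, the commands might look like the following (the image and container names are hypothetical, and a Docker daemon must be running):

```shell
# Build a custom image from a Dockerfile in the current directory
docker build -t myorg/webapp:1.0 .

# ...or pull a prebuilt image from a registry instead
docker pull myorg/webapp:1.0

# Spawn a container from the image, detached in the background
docker run -d --name webapp myorg/webapp:1.0
```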

Building a complex dockerized system

In today’s world of microservice architectures composed of independent, process-like services, Docker paves the way for efficient design, development and deployment. In order to develop a dockerized architecture, one needs to focus on the following areas.

  • Services - Docker lends itself naturally to service-oriented architecture, or SOA. How will one reorganize an application into micro-level, self-sufficient services that can communicate with each other? Let us look at a quintessential end-to-end eCommerce system that is more complex than just a web application. It should have the following components.
  1. Web application / front end - You need the web application server (and the WARs it serves) to be dockerized. The starting point is to write your own Dockerfile to create images for the containers.
  2. Backend / database instance - The database instance itself can live in a container. When a query is submitted, the handler can contact the Docker container running the database instance to process it.
  3. Scripts and daemons - There might be a monitoring service, a cleanup job, or some other daemon running on the system. They can live in one container or several. The principle behind microservice architecture prompts us to isolate such services and run them independently in separate containers. Managing too many containers, and the images that spawn them, can get tricky; the key design decision is to identify which services to dockerize and find the trade-off.
  • Networking - Docker solves port conflicts in multi-tenancy by mapping container ports to physical ports on the host, either dynamically or through static configuration, a process that Docker abstracts away. Docker containers are also assigned IPs that are not discoverable outside the host. When services are not co-located, one might need the host IPs, or need to configure the containers with unique, discoverable IPs.
  • The data - Docker does not work with every filesystem, so depending on how files and other data are stored in the system, this might become a tall order. In general, though, you can expose a volume on the host, or even dockerize a volume as a data-only container, and make it available to dockerized services. In brief, data can be ported too.
  • Handling - The system might need a resource manager like YARN to allocate containers. ZooKeeper or Consul can take care of failover; Consul has built-in support for configuration management as well.
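For the web front end above, a minimal Dockerfile might look like the following sketch (the Tomcat base image and the WAR name are assumptions for illustration):

```dockerfile
# Start from an off-the-shelf application server base image
FROM tomcat:8

# Copy the application WAR into the server's deployment directory
COPY target/shop.war /usr/local/tomcat/webapps/

# Document the port the web server listens on inside the container
EXPOSE 8080
```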
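The networking and data points above can be sketched with the standard docker run flags (the ports, paths and image names here are hypothetical):

```shell
# Map container port 8080 to a static host port 80 (use -P for dynamic mapping)
docker run -d -p 80:8080 --name web myorg/webapp:1.0

# Expose a host directory as a volume inside the database container
docker run -d -v /srv/pgdata:/var/lib/postgresql/data --name db postgres
```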

The price to pay

There is no free lunch! I am not talking about the limitations of architecting a microservice-based system in general, nor about issues the Docker community has been discussing at large, such as monitoring and automation. Rather, I am trying to cast some insight into the specific overheads and development challenges likely to arise while dockerizing an application.

Service discovery - Where there is multi-tenancy, there is the problem of discovery. That is an added overhead to begin with, but in the long run it ensures good design, maintainability and performance. There is more good news: as Docker becomes the new big thing, service discovery tools have been proactive about supporting it. Consul and ZooKeeper work just fine, as they are tailored towards SOA.
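With Consul, for instance, registering a dockerized service can be as simple as dropping a service definition file into its configuration directory (the service name, port and health-check URL here are illustrative):

```json
{
  "service": {
    "name": "webapp",
    "port": 8080,
    "check": {
      "http": "http://localhost:8080/health",
      "interval": "10s"
    }
  }
}
```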

Missing features - As is evident from the very active Docker GitHub community, there are a ton of feature requests and work in progress - for instance, container self-registration, self-inspection, and copying files from host to container. For now, there are workarounds and off-the-shelf clients that serve the purpose.

Multiple processes - Let us say the dockerized application has a monitoring daemon that continuously monitors all services. It needs to reside in every container, monitoring each service. But Docker is still maturing in the multi-process aspect of a container: it is very good with microservices, but not yet great with a hybrid of SOA and legacy. Again, there are off-the-shelf solutions, such as running a Supervisor daemon, that help with this.
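A Supervisor configuration that runs a service alongside its monitoring daemon inside one container might look like this sketch (the program names and script paths are assumptions):

```ini
[supervisord]
; Keep supervisord in the foreground so it serves as the container's main process
nodaemon=true

[program:webapp]
command=/usr/local/bin/start-webapp.sh

[program:monitor]
command=/usr/local/bin/monitor-daemon.sh
```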

Data in container - When a container goes down, what happens to its data and current state? Both need a backup and recovery strategy, and although there are several ideas toward a solution, they are not automated (and hence not very scalable) yet.
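One common manual strategy is to keep state in a volume and archive it through a throwaway container (the container name and data path here are hypothetical):

```shell
# Mount the db container's volumes plus the current host directory,
# then archive the data directory onto the host
docker run --rm --volumes-from db -v $(pwd):/backup ubuntu \
    tar cvf /backup/db-data.tar /var/lib/postgresql/data
```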

Image management - Docker uses a layered union file system to manage images: a writable container layer sits on top of read-only image layers. With multiple containers, committing changes writes to these images in the form of additional layers stacked on top of the platform and container layers. Managing these layers of images can become quite hard.
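Since every Dockerfile instruction adds a layer, one small mitigation is to collapse related commands into a single RUN instruction, as in this sketch (the package name is illustrative):

```dockerfile
# Three separate layers:
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# One layer instead:
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
```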

To sum it up, Docker is an emerging tool for developing microservice-based architectures on a large scale. With its evolving feature set, there is room for experimentation and improvement when it comes to dockerizing complex systems.
