Thoughts on Requirements for a Continuous Deployment System

A while back I wrote about the problems associated with using scripts for deployment. Scripts are error-prone, require frequent updates, don’t provide meaningful error messages, and don’t clean up after themselves when they fail. The solution is to use a continuous deployment system. So, whether you’re considering writing a deployment system or would rather adopt an available one, what should you look for? What are the requirements?

I’m a product manager, so I find it easiest to derive requirements from use cases. That’s where I’m going to start.

Use Cases 

I could create a long laundry list, but let’s start with the basics. Think minimum viable product. These use cases describe a usable system and seem pretty straightforward, so I won’t expound on the need for each.

  • Deploy a service from my repository to a target environment
  • Update the service when code changes
  • Re-deploy the service
  • Roll-back to a previous release
  • Teardown the service

Continuous Deployment Requirements 

Provide for the most common use cases

Deploy: This is clearly the simplest use case. The system must be able to acquire the necessary artifacts from a repository, deploy the code, create any storage, configure any necessary networks, start the components of your service, and complete any post-start software steps required.

Configure: I run into users all the time who have older software in containers that requires some configuration after startup. A good deployment system should offer a mechanism for performing this configuration after each container starts.

Tear down: The opposite of Deploy. The system must be able to shut down the service components in an orderly fashion, including any steps required by third-party software, destroy storage, and eliminate network configurations.

Redeploy: This is different from Deploy in that the service is already running. If your app is completely stateless then perhaps this is the same as Teardown followed by Deploy, but that’s not the case in most services we see.

Update: This is basic. Your system must be able to update your service in various ways: architecturally (adding or removing components), by updating individual components to the latest versions, by reconfiguring components, and so on. Update needs to take into account dependencies between components so that the service doesn’t go down while updating.
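To make the dependency point concrete, here is a minimal sketch of how an update order could be derived. The component names and the `DEPENDS_ON` map are hypothetical, and I’m assuming that updating a component’s dependencies before the component itself is the right policy, which won’t hold for every service. It uses `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical service: each component maps to the components it depends on.
DEPENDS_ON = {
    "web": {"api"},
    "api": {"db", "cache"},
    "db": set(),
    "cache": set(),
}


def update_order(depends_on):
    """Return the components ordered so that dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


order = update_order(DEPENDS_ON)
print(order)
```

A dependency cycle raises `graphlib.CycleError`, which is the kind of problem you want reported before an update starts rather than halfway through it.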

Roll back: In an era when folks are commonly told they should fail forward, truly moving quickly means that the rollout of new code must be “safe.” As part of that, imho, you must be able to roll back easily, since failing forward isn’t quantifiable.

Use a declarative model

Declarative models were controversial a decade ago, but today more developers are familiar with their advantages. To avoid writing and revising procedural code, you describe the service to be deployed, its dependencies, requirements, manual steps, quality checks, and more, so the system can determine the best way to achieve them.
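As a rough illustration of the idea, the sketch below compares what is running with what has been declared and works out the actions itself. The `plan` function, the component names, and the image tags are all made up for this example; a real system would track far more than an image per component.

```python
def plan(current, desired):
    """Compare the running state with the declared state and list the actions needed."""
    actions = []
    # Components that are declared but not running yet.
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("start", name, desired[name]))
    # Components that are running but no longer declared.
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("stop", name, current[name]))
    # Components present in both whose declared image has changed.
    for name in sorted(desired.keys() & current.keys()):
        if current[name] != desired[name]:
            actions.append(("update", name, desired[name]))
    return actions


# Hypothetical state: component name -> container image.
current = {"web": "nginx:1.24", "api": "shop-api:7"}
desired = {"web": "nginx:1.25", "api": "shop-api:7", "worker": "shop-worker:3"}

for action in plan(current, desired):
    print(action)
```

Adding the `worker` component meant editing the description, not writing new deployment code.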

Adapt to architectural changes 

Hopefully this is obvious, but if you’re writing code each time you make an architectural change then you don’t have a continuous deployment system, you have a script manager.

Provide Health Checks

Despite all the checks and redundancy built into modern infrastructure, failures still occur. A file may not copy correctly. A message may get dropped. Access may be denied for an operation. Or a new library picked up during a build may not support an older call. Some folks believe there’s little need to preplan for these failures, that you should “fail forward,” but failing forward isn’t quantifiable. How will you answer “when will we be back up?”

To provide the confidence continuous deployment requires, a deployment system must provide the ability to check the successful completion of each step in the process. The easiest way to accomplish this is to allow calling out to user-specified scripts, which allows for quick, easy integration with the numerous systems you already have deployed. I’m not backtracking from my earlier statement that the whole point is to get away from scripts. These quality checks aren’t part of the deployment engine; they are simple go/no-go checks. As such they can be exceptionally short, simple, and completely stateless. For more popular systems like Docker, Kubernetes, Jenkins, AWS, and others, the system should provide pre-integrated checks for the most common use cases.
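Here is a minimal sketch of what calling out to a go/no-go check might look like. I’m assuming the convention that exit code 0 means “go” and anything else means “no-go”. The `run_check` helper and the two stand-in checks are hypothetical, and the checks use the Python interpreter itself so the example runs anywhere.

```python
import subprocess
import sys


def run_check(command, timeout_seconds=30):
    """Run a user-specified check. Exit code 0 means go, anything else means no-go."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A check that cannot be started or hangs counts as no-go.
        return False, f"check could not complete: {exc}"
    output = (result.stdout or result.stderr).strip()
    return result.returncode == 0, output


# Stand-ins for real check scripts (hypothetical messages).
passing = [sys.executable, "-c", "print('db reachable')"]
failing = [sys.executable, "-c", "import sys; sys.exit('port 5432 closed')"]

for check in (passing, failing):
    ok, message = run_check(check)
    print("go" if ok else "no-go", "-", message)
```

The check scripts hold no state and make no decisions about what happens next. Whether to continue or stop stays with the deployment system.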

Support multiple deployment modes

As soon as your service comprises more than a single container, there are multiple ways to deploy. Even at just two containers, the issue of serial versus parallel deployment can come up. If part of your service uses duplicate containers, for instance web servers, should they all be updated at once or through one of a couple of rolling upgrade processes? Most commercial services are far more complex than that, of course, and a continuous deployment system should be able to accommodate these needs. At a minimum, blue/green deployments, A/B testing, and canary deployments need to be included.
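To illustrate one of these modes, here is a small sketch of a rolling upgrade over duplicate containers. The replica names, the batch size, and the `update` and `healthy` callbacks are assumptions made for the example. A real rolling upgrade also has to drain traffic and wait for readiness, which I’ve left out.

```python
def rolling_batches(replicas, batch_size):
    """Split the replicas into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [replicas[i:i + batch_size] for i in range(0, len(replicas), batch_size)]


def rolling_update(replicas, batch_size, update, healthy):
    """Update batch by batch and stop at the first batch that fails its health check.

    Returns (replicas updated and healthy, the batch that failed or an empty list).
    """
    done = []
    for batch in rolling_batches(replicas, batch_size):
        for replica in batch:
            update(replica)
        if not all(healthy(replica) for replica in batch):
            return done, batch
        done.extend(batch)
    return done, []


# Hypothetical web tier with five duplicate containers.
web_servers = ["web-1", "web-2", "web-3", "web-4", "web-5"]
print(rolling_batches(web_servers, 2))
```

Setting `batch_size` to the number of replicas gives the “all at once” behavior, and setting it to 1 gives a fully serial update.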

Support multiple user roles

Sounds simple enough, but various users have different contextual needs. If you’re using certain relational databases, for instance, your DBA may need to log in after a container is started but before the service can start. A release manager might wish to control which releases get deployed to production but allow fully automatic releases to development environments.

Include manual steps in the process

In DevOps we’re always striving to automate everything. In reality, however, getting to full automation is a long journey for anything except brand new services that use only the latest middleware. That’s a luxury most folks don’t have. To provide deployment for the largest possible set of services, the deployment system must include manual steps in the process. That includes alerting the appropriate users when a manual step is encountered, accepting input that the step was completed or failed, and allowing for an override.
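A rough sketch of how a manual step could fit into an automated process is below. The `ManualStep` type, the `Outcome` values, and the `notify` and `wait_for_response` callbacks are all hypothetical; in practice they would be wired to whatever alerting and UI the system provides.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    COMPLETED = "completed"
    FAILED = "failed"
    OVERRIDDEN = "overridden"


@dataclass
class ManualStep:
    name: str
    assignee: str


def run_manual_step(step, notify, wait_for_response):
    """Alert the assignee, wait for their answer, and say whether to proceed."""
    notify(step.assignee, f"Manual step waiting: {step.name}")
    outcome = wait_for_response(step)
    # A completed step and an explicit override both let the deployment continue.
    proceed = outcome in (Outcome.COMPLETED, Outcome.OVERRIDDEN)
    return proceed, outcome


# Example wiring with stand-in callbacks.
step = ManualStep(name="Initialize database schema", assignee="dba")
proceed, outcome = run_manual_step(
    step,
    notify=lambda who, message: print(f"to {who}: {message}"),
    wait_for_response=lambda s: Outcome.COMPLETED,
)
print(proceed, outcome.value)
```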

Provide visibility

In the rush to automate everything, I’ve noticed many folks gloss over visibility. I’ve watched DevOps engineers scan logs and log into systems to determine if a script completed successfully. New team members often spend months getting up to speed on the system architecture. Plus, anyone who can’t code is left with no way to determine the activity or state of the system.

To achieve continuous delivery, users need to feel deployments are safe and reliable, and imho one of the most effective ways to accomplish that is to provide visual interfaces in addition to APIs and a CLI. If designed well, visuals can provide an understanding of architecture, status, and activity at a glance. Sharing becomes simple, which allows more team members to contribute.

Deploy multiple copies of your service

Can you run your service locally and then deploy it to 3 regions (NA, LA, EMEA) in a few minutes? This becomes a major bottleneck for teams trying to test locally and then scale globally. Your deployment system should take care of this for you without requiring rebuilds or waiting for a script to be rewritten to scale differently in the cloud.
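One way to picture this is a single service description plus small per-target overrides, sketched below. The `BASE` description, the replica counts, and the `render` helper are invented for illustration. The point is only that the same image goes everywhere and nothing is rebuilt or rewritten per target.

```python
# One description of the service (hypothetical values).
BASE = {"service": "shop", "image": "shop-api:7", "replicas": 1}

# Per-target overrides. An empty dict means "use the base as is".
TARGETS = {
    "local": {},
    "NA": {"replicas": 6},
    "LA": {"replicas": 3},
    "EMEA": {"replicas": 4},
}


def render(base, targets):
    """Produce one concrete deployment description per target."""
    return {
        name: {**base, **overrides, "target": name}
        for name, overrides in targets.items()
    }


deployments = render(BASE, TARGETS)
for name, description in deployments.items():
    print(name, description["image"], description["replicas"])
```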

Support different target environments

I’m not implying a deployment system has to support every possible target environment; many have so few capabilities in common that such a system would be absurdly complex to build and use. On the other hand, a system that ties you tightly to a single instance of a single environment obviously isn’t useful. For instance, when we built our Skopos system we chose containers as the common defining factor, but we support multiple targets that utilize Docker container packaging, including all flavors of Docker, ECS and soon K8S.

Integrated troubleshooting

First, we need to be able to tell if something went wrong. That doesn’t mean running a separate script or logging into systems to check their status; rather, we need meaningful error messages from the system telling us whether an operation completed. Second, we need a way to find out in greater detail what was actually done, so the system must provide logging in the context of the operation.
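Both points can be sketched with Python’s standard `logging` module: every log line carries the operation it belongs to, and a failure surfaces as an error that names the operation, the step, and the reason. The `StepFailed` exception, the logger name, and the step names are assumptions for this example.

```python
import io
import logging


class StepFailed(Exception):
    """Raised when a step fails. The message names the operation, step, and reason."""

    def __init__(self, operation, step, reason):
        super().__init__(f"{operation}: step '{step}' failed: {reason}")
        self.operation = operation
        self.step = step


def make_logger(stream):
    """Build a logger whose lines include the operation they belong to."""
    logger = logging.getLogger("deploy-sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s [%(operation)s] %(message)s"))
    logger.handlers = [handler]
    return logger


def run_operation(operation, steps, logger):
    """Run the steps in order, logging each one in the context of the operation."""
    log = logging.LoggerAdapter(logger, {"operation": operation})
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            log.error("step '%s' failed: %s", name, exc)
            raise StepFailed(operation, name, exc) from exc
        log.info("step '%s' completed", name)


def copy_config():
    raise PermissionError("access denied")


stream = io.StringIO()
try:
    run_operation(
        "update shop",
        [("pull image", lambda: None), ("copy config", copy_config)],
        make_logger(stream),
    )
except StepFailed as failure:
    error_message = str(failure)
    print(error_message)
print(stream.getvalue(), end="")
```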

Integrate with your toolchain

API

To enable continuous deployment, when new code is ready a deployment should begin. By implication, that means that either the deployment system must be able to recognize new code on its own or that it has to accept

Plugins

You shouldn’t have to rebuild your DevOps pipeline when you adopt a continuous deployment system. Webhooks and plugins that support your existing applications are essential. Now, just plug in and go.

I wish I could say we recognized the need for a continuous deployment system up front and came up with this list of requirements for engineering, but I can’t. After trying to deploy our own growing containerized application, we realized the pipeline we were building had become a tangled web. At that point the team recognized the need for a system. Even then, our list of requirements wasn’t complete. To come up with this detailed list of requirements that could apply to a variety of companies, we interviewed DevOps teams facing the same challenges to understand their needs. I’m sure it will grow further as we talk with more users.



More articles by Bert Armijo
