CircleCI Workflows Experience
We recently migrated our automated testing configuration from a single CircleCI 2.0 build to CircleCI Workflows.
Result:
Average build time dropped from 7.5 minutes to 4.5 minutes. It is worth highlighting that the time saved grows with the level of parallelisation, which in turn comes at the cost of a more complex testing configuration. Time reduction is not the sole metric to optimise in such an endeavour.
Context:
The project is a Python application whose tests run in a Docker container. The testing flow is: download dependent Python packages (0.5 min with cache, 3 min without) => pylint (1 min) => pytest (6 min) => version bump (0.1 min, master branch only). The linting and testing steps can run in parallel, which gives a Fan-Out/Fan-In workflow. We further split our test suite into two subsets to increase the level of parallelisation, because testing is the longest-running task in the workflow.
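Since the config below invokes `python -m pytest -m test1` and `-m test2`, the two-subset split presumably relies on pytest markers. A minimal sketch of how tests might be tagged (test names and bodies are invented for illustration; only the marker names `test1`/`test2` come from the config):

```python
# Splitting a pytest suite with markers, as implied by the
# `pytest -m test1` / `pytest -m test2` invocations in the config.
# Test names and bodies below are placeholders, not our real tests.
import pytest

@pytest.mark.test1
def test_parsing():
    # collected only by the test1 job (`pytest -m test1`)
    assert int("42") == 42

@pytest.mark.test2
def test_formatting():
    # collected only by the test2 job (`pytest -m test2`)
    assert "{}".format(7) == "7"
```

Registering the marker names under a `markers =` section in `pytest.ini` avoids unknown-marker warnings and lets you use `--strict-markers`.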
More context (if you are interested):
If you use Docker to execute the workflow, each job downloads a fresh image instead of reusing the parent job's container, so the work done in the first job does not carry over to subsequent jobs. We circumvent this by saving the dependency files into a cache and restoring that cache at the start of each subsequent job. This adds roughly 15 seconds of overhead per job, but the overall time reduction is still positive.
defaults: &defaults
  working_directory: ~/helloworld
  docker:
    - image: python:2.7
      environment:
        MONGO_HOST: ubuntu@localhost
    - image: mongo:latest

version: 2
jobs:
  build:
    <<: *defaults
    steps:
      - checkout
      - restore_cache:
          key: helloworld-{{ checksum "requirements.txt" }}
      - run: |
          virtualenv ~/venv
          . ~/venv/bin/activate
          pip install -r requirements.txt
      - save_cache:
          key: helloworld-{{ checksum "requirements.txt" }}
          paths:
            - "~/.cache/pip"
            - "~/venv"
  lint:
    <<: *defaults
    steps:
      - checkout
      - restore_cache:
          key: helloworld-{{ checksum "requirements.txt" }}
      - run: |
          virtualenv ~/venv
          . ~/venv/bin/activate
          make lint
  test1:
    <<: *defaults
    steps:
      - checkout
      - restore_cache:
          key: helloworld-{{ checksum "requirements.txt" }}
      - run: |
          virtualenv ~/venv
          . ~/venv/bin/activate
          python -m pytest -v -l -m test1 tests
  test2:
    <<: *defaults
    steps:
      - checkout
      - restore_cache:
          key: helloworld-{{ checksum "requirements.txt" }}
      - run: |
          virtualenv ~/venv
          . ~/venv/bin/activate
          python -m pytest -v -l -m test2 tests
  deploy:
    <<: *defaults
    steps:
      - checkout
      - run: |
          pip install bumpversion
          git config --global user.name "CircleCI"
          git config --global user.email "bumpversion@circleci.com"
          bumpversion minor
          git push origin master --tags
workflows:
  version: 2
  build-test-and-deploy:
    jobs:
      - build
      - lint:
          requires:
            - build
      - test1:
          requires:
            - build
      - test2:
          requires:
            - build
      - deploy:
          requires:
            - lint
            - test1
            - test2
          filters:
            branches:
              only: master
Discussion:
The overall experience has been very positive. We have already migrated other projects that stand to gain significant time reductions from parallelisation.
References:
- Flow charts used here are adapted from the CircleCI website.
:)
I just realized you almost never need to bust pip's cache. If the pip cache and the venv are cached separately, the time spent downloading packages is still saved, while the venv stays clean and up to date. A dirtier approach is to cache them together under an invalidation key independent of `requirements.txt`, for example `key: "v1"`; then when `requirements.txt` changes, the venv is not rebuilt from scratch and only the changed dependencies are installed.
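A minimal sketch of the separate-cache idea (the cache key names here are illustrative, not from our actual config):

```yaml
- restore_cache:
    key: helloworld-pip-v1   # static key: pip's download cache rarely needs busting
- restore_cache:
    key: helloworld-venv-{{ checksum "requirements.txt" }}
- run: |
    virtualenv ~/venv
    . ~/venv/bin/activate
    # even when the venv cache misses, installs hit the warm pip cache
    pip install -r requirements.txt
- save_cache:
    key: helloworld-pip-v1
    paths:
      - "~/.cache/pip"
- save_cache:
    key: helloworld-venv-{{ checksum "requirements.txt" }}
    paths:
      - "~/venv"
```

CircleCI caches are immutable, so `save_cache` is a no-op when the key already exists; the static pip key is written once and simply reused until you bump it (e.g. to `helloworld-pip-v2`).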