Using GitLab Runners in Network Pipelines


For the past few days I've been looking into GitLab CI/CD for creating network pipelines. I considered Jenkins, Travis CI, and Concourse, but decided to go with GitLab since I already have most of my repos there and it's pretty straightforward to use. GitLab's documentation on its CI/CD tooling is great, so I won't go into much detail on setting it up.

My idea behind this article is to demonstrate a simple two-stage pipeline that uses GitLab runners to execute our jobs:

  • Build: Using the Docker executor, we spin up a container on a remote host to validate/lint YAML syntax. If there's a problem, the script prints an error message and exits with a non-zero code, which our CI system reports as a failure. The same container then runs an Ansible playbook that deploys a given configuration to our lab.
  • Test: Using the shell executor, which runs builds locally on the machine where the runner is installed, we run a Batfish validation against our network to make sure the number of established BGP sessions meets an expected value.

Let's start by creating the following repo structure:

(Screenshot: repository directory structure.)

Directories and Files:

  • batfish-snapshots: this is where our playbook dumps full router configs for Batfish to analyze.
  • deploy: location of our deploy files generated by playbook
  • templates: Jinja2 templates
  • tests: our Python script to validate YAML syntax will go in here
  • vars: YAML variable files
  • .gitlab-ci.yml: pipeline configuration file; this is where we define our stages and jobs
  • interface-config.yml: our Ansible playbook

GitLab Runners:

(Screenshot: the two runners registered for the project, each with its tag.)
As you can see, we have two runners activated for our project. The use of tags (the blue labels in the screenshot) is very important here, as tags are how we indicate which runner picks up each build job.

Defining our pipeline:

Inside our .gitlab-ci.yml file we define the structure and order of the pipeline: what to execute on our runners, and what decisions to make when specific conditions are encountered:

stages:
  - build
  - test

build_job:
  stage: build
  tags:
    - build
  script:
    - cd tests/ && python validate_yaml.py
    - cd ../ && ansible-playbook -i hosts interface-config.yml --extra-vars "wf_ticket=22046"

test_job:
  stage: test
  tags:
    - test
  script:
    - python3 /home/lab/jromero-batfish/batfish-assertion.py

Both of our stages are defined here; you can see the use of tags, and the scripts that become the jobs run within each stage.

Our Ansible playbook, Batfish assertion, and YAML linter:

interface-config.yml

---
- name: Generate deploy files from vars  
  hosts: neteng-lab
  connection: local
  gather_facts: no
  roles:
    - juniper.junos
  vars_files:
    - vars/{{inventory_hostname}}.yml


  tasks:
    - name: Generating Deploy Files..
      template:
        src: "{{ item.src }}"
        dest: "{{ item.dest }}"
        mode: 0777
      with_items:
        - {src: 'templates/interface-config.j2',dest: 'deploy/{{wf_ticket}}-{{inventory_hostname}}.conf'}
      delegate_to: localhost


    - name: Push Configuration to Lab
      juniper_junos_config:
        config_mode: "exclusive"
        load: "merge"
        src: "deploy/{{wf_ticket}}-{{inventory_hostname}}.conf"
        commit: false
      register: response
    - name: Print the complete response.
      debug:
        var: response


    - name: "Pull configs for batfish analysis"
      juniper_junos_config:
        retrieve: "committed"
        dest: "batfish-snapshots/configs/{{ inventory_hostname }}"
      register: response
    - name: Print the complete response.
      debug:
        var: response
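The templating task above can be sketched outside Ansible. The real templates/interface-config.j2 isn't shown in this article, so the template body below is only an illustration of the vars-to-config render step the playbook performs:

```python
from jinja2 import Template

# Hypothetical contents of templates/interface-config.j2 -- an
# illustration only; the article does not show the real template.
TEMPLATE = """\
interfaces {
{% for intf in interfaces %}
    {{ intf.name }} {
        description "{{ intf.description }}";
        unit 0 { family inet { address {{ intf.address }}; } }
    }
{% endfor %}
}
"""

# Variables shaped like the vars/<inventory_hostname>.yml file
# shown later in the article.
host_vars = {
    "interfaces": [
        {"name": "ge-0/0/1",
         "address": "10.70.0.1/30",
         "description": "PTP to vMX-02"},
    ]
}

config = Template(TEMPLATE, trim_blocks=True, lstrip_blocks=True).render(**host_vars)
print(config)
```

Ansible writes the rendered text to deploy/{{wf_ticket}}-{{inventory_hostname}}.conf, which the next task loads onto the device.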

batfish-assertion.py - a pybatfish assertion that runs on our shell executor; it analyzes the config snapshots to check for at least one established BGP session.

import pandas as pd
from pybatfish.client.commands import *
from pybatfish.datamodel import Edge, Interface
from pybatfish.datamodel.answer import TableAnswer
from pybatfish.datamodel.flow import (HeaderConstraints, PathConstraints)
from pybatfish.question import bfq, load_questions


# batfish host
bf_session.host = "localhost"
load_questions()


bf_set_network('neteng-lab')
bf_init_snapshot('/home/lab/repos/network-ci-pipeline/batfish-snapshots', name='neteng-lab', overwrite=True)


pd.set_option('display.min_rows', 400)
pd.set_option('display.max_rows', 400)


bgpSessStat = bfq.bgpSessionStatus(nodes='vmx-01', remoteNodes='vmx-02', status='Established').answer().frame()
print(bgpSessStat)


assert len(bgpSessStat[bgpSessStat.Established_Status == "ESTABLISHED"]) == 1, "BGP session Down"
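The shell-executor job fails the pipeline purely through this assertion's non-zero exit code. Independent of a live Batfish service, the assertion pattern can be exercised against a stub of the answer frame; only the Established_Status column the script filters on is modeled here, and the column set is an assumption:

```python
import pandas as pd

# Stub of the frame bfq.bgpSessionStatus(...).answer().frame() returns;
# the real frame has more columns, this models only what the check uses.
bgp_sess_stat = pd.DataFrame({
    "Node": ["vmx-01"],
    "Remote_Node": ["vmx-02"],
    "Established_Status": ["ESTABLISHED"],
})

# Same pattern as the article's script: filter to established
# sessions, then assert on the row count.
established = bgp_sess_stat[bgp_sess_stat.Established_Status == "ESTABLISHED"]
assert len(established) == 1, "BGP session Down"
print(len(established))
```

If the session is down, the filter returns zero rows, the AssertionError propagates, Python exits non-zero, and GitLab marks the test job failed.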

validate_yaml.py

#!/usr/bin/env python
import os
import sys
import yaml
# YAML_DIR is the location of the directory where the YAML files are kept
YAML_DIR = "%s/../vars/" % os.path.dirname(os.path.abspath(__file__))
# loop over the YAML files and try to load them
for filename in os.listdir(YAML_DIR):
    yaml_file = "%s%s" % (YAML_DIR, filename)

    if os.path.isfile(yaml_file) and yaml_file.endswith(".yml"):
        try:
            with open(yaml_file) as yamlfile:
                configdata = yaml.safe_load(yamlfile)
        # If there was a problem importing the YAML, we print an error
        # message and quit with a non-zero error code (which will
        # trigger our CI system to indicate failure)
        except Exception:
            print("%s failed YAML import" % yaml_file)
            sys.exit(1)
sys.exit(0)
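To see the linter's failure mode in isolation: yaml.safe_load raises a YAMLError on a malformed document. The snippet below uses an illustrative bad file similar to (but not identical to) the broken variable file committed later in the article:

```python
import yaml

# A deliberately malformed variable file -- the bare "name" line has
# no colon, so the parser cannot complete the mapping key. This is an
# illustration, not the repo's actual vars file.
bad_yaml = """\
---
interfaces:
  -
    address: 10.70.0.1/30
    description: "PTP to vMX-02"
name
"""

try:
    yaml.safe_load(bad_yaml)
    result = "valid"
except yaml.YAMLError:
    result = "invalid"

print(result)
```

In the CI job the same exception path hits sys.exit(1), which is all GitLab needs to fail the build stage.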

Testing the pipeline:

As soon as we commit to our master branch, the pipeline starts executing the jobs defined in our CI config file (this depends on your environment; you may prefer to have it run on merge requests instead):

We'll start by committing an invalid YAML variable file to see if it gets picked up; this should trigger a pipeline run:

--- 
interfaces: 
  - 
    address: 10.70.0.1/30
    description: "PTP to vMX-02"
name
  
:ge-0/0/1

Towards the bottom you can see the job failed because our script was unable to load the improperly formatted YAML file.

(Screenshot: build job output showing the failed YAML import.)

Our pipeline also shows what stage failed, with our test stage being skipped as a result:

(Screenshot: pipeline view with the build stage failed and the test stage skipped.)

We'll fix this next, then break our BGP session to test the Batfish validation job in our second stage:

We will configure the wrong BGP peer IP on the interface, causing our session to go down

---
interfaces:
  -
    address: 10.70.0.1/30
    description: "PTP to vMX-02"
    name: ge-0/0/1

Similar to our previous test case, we can see this job failed with an assertion error, since the Batfish analysis did not return what we expected.

(Screenshot: test job output showing the AssertionError.)

Pipeline showing stage failed for the given run:

(Screenshot: pipeline view showing the failed test stage for this run.)

Let's correct our BGP peer IP in our variable file and commit to our repo again.

Upon committing we can now see both stages passed:

(Screenshot: pipeline view with both stages passed.)

There are things this article doesn't cover, like setting up the Ansible container, runner server configuration, etc.; there are plenty of how-tos online for that already. I hope this is helpful to other network engineers on this journey.
