Illustration by Dusan Mirkovic

Easy Clustering with Docker Swarm - Designing CI/CD Systems

2019-08-23 python engineering docker Cristian Medina

Building code is sometimes as simple as executing a script. But a full-featured build system requires a lot more supporting infrastructure to handle multiple build requests at the same time, manage compute resources, distribute artifacts, etc.

After our last chapter discussing build events, this next iteration in the CI/CD design series covers how to spin up a container inside Docker Swarm to run a build and test it.

What is Docker Swarm

When running the Docker engine in Swarm mode, you’re effectively creating a cluster. Docker manages a number of compute nodes and their resources, scheduling work across them that runs inside containers.

It handles scaling across nodes while maintaining overall cluster state, such that you can adjust how many worker containers are running in the cluster, automatically failover when nodes go offline, etc.

It also builds the networking hooks necessary so that containers can communicate with each other across multiple host nodes. It does load balancing, rolling updates, and a number of other functions you would expect from cluster technologies.

Cluster setup

When compute hosts form part of a Docker Swarm, they can either run in manager mode or as regular worker nodes. The nodes are able to host containers as directed by the managers.

One Swarm can have multiple managers, and the managers themselves can also host containers. Their job is to track the state of the cluster and spin up containers across nodes as needed. This allows for redundancy across the cluster such that you can lose one or more managers or nodes and keep basic operations running. More details on this are available in the official Docker Swarm documentation.

To create a swarm you need to run the following command on your first manager (which also serves as your first node).

docker swarm init

The previous command will tell you what to run in each of your nodes in order to join that swarm. It usually looks like this:

docker swarm join --token SOME-TOKEN SOME_IP:SOME_PORT

The docker daemon listens on a unix socket located at /var/run/docker.sock by default. This is great for local access, but if you need remote access, you’ll have to enable tcp sockets explicitly.

Enabling remote access

Docker Swarm provides a good API for managing services, but for our particular use case, we need a feature that’s not available at the swarm level and requires individual access to the nodes. Part of the reason is that we’re using Swarm for a special case that it wasn’t built for: running a one-off short-lived container - more on this later.

You’ll have to enable remote access on your node daemons in order to connect to them directly. Doing so seems to vary a bit between Linux versions, distributions and the location of docker config files. However, the main objective is the same: you must add a -H tcp://IP_ADDRESS:2375 option into the daemon service execution, where IP_ADDRESS is the interface it listens on.

You’ll find that most examples set it to 0.0.0.0 so that anyone can connect to it, but I would recommend limiting it to a specific address for better security - more below.

I was on an Ubuntu image that used the file in /lib/systemd/system/docker.service to define the daemon options. You just find the line that starts with ExecStart=... or has a -H in it, and add the extra -H option mentioned previously.

Don’t forget you have to reload daemon configs and restart docker after making the change:

sudo systemctl daemon-reload
sudo service docker restart

I’ve seen other distributions that track these settings under /etc/default/docker, and yet another that uses a file in /etc/systemd/system/docker.service.d/. You should google for docker daemon enable tcp or docker daemon enable remote api paired with your OS flavor to be sure.
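After restarting the daemon, it can help to confirm that a node really accepts TCP connections before pointing the build service at it. The snippet below is a minimal sketch using only the standard library; the function name daemon_port_is_open is made up for illustration, and the default port is the 2375 value used above. It only checks that something is listening, not that it is a Docker daemon.

```python
import socket


def daemon_port_is_open(host, port=2375, timeout=3.0):
    """Return True when a TCP connection to host:port succeeds.

    This only proves that something is listening on the port; it does not
    verify that the listener is a Docker daemon.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

Calling daemon_port_is_open('IP_ADDRESS') from the machine that runs the build service also tells you whether a firewall sits between the two hosts.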

Security implications

Given the nature of what you can do with Docker, it’s important to point out that enabling TCP sockets for remote access is a very serious security risk. It basically opens your system to remote code execution because anyone could connect to that socket and start or stop containers, view logs, modify network resources, etc.

To mitigate this, you’ll want to enable TLS certificate verification along with TCP sockets. This makes the daemon validate that the certificate presented by each client is signed by a pre-defined certificate authority (CA).

You create the certificate authority and sign any client certs before distributing them to the compute systems that will perform the orchestration - usually your build service.

Steps on how to generate the certificates, perform the signing and enable the verification options are available in the Docker documentation for protecting the daemon.
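On the client side, the docker Python module can present a signed certificate when it connects. What follows is a sketch under a few assumptions: the helper names are hypothetical, the certificates live in a single directory, and the files use the ca.pem, cert.pem and key.pem names from Docker’s guide. Adjust all of these to your own layout.

```python
from pathlib import Path


def client_cert_paths(cert_dir):
    """Return (ca, cert, key) paths, assuming the file names from Docker's TLS guide."""
    base = Path(cert_dir)
    return base / "ca.pem", base / "cert.pem", base / "key.pem"


def connect_with_tls(host, port, cert_dir):
    """Create a DockerClient that presents a client cert and verifies the daemon."""
    import docker  # third-party module: pip install docker

    ca, cert, key = client_cert_paths(cert_dir)
    tls_config = docker.tls.TLSConfig(
        client_cert=(str(cert), str(key)),  # proves who we are to the daemon
        ca_cert=str(ca),                    # CA used to check the daemon's cert
        verify=True,
    )
    return docker.DockerClient(base_url=f"tcp://{host}:{port}", tls=tls_config)
```

The import sits inside the function so the path helper stays usable on machines where the docker module isn’t installed.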

Services, Tasks and Containers

When running in swarm mode, Docker terminology changes a little. You’re no longer concerned only with containers and images, but also with tasks and services.

A service defines all the pieces that make up an application running in the cluster. These pieces are tasks, and each task is a definition for a container.

For example, if you have Application ABC that runs a Flask API that you wish to load balance across two nodes, you define one ABC service with two tasks. The swarm takes care of keeping them running on two nodes (even if there are more nodes in the cluster) and also configures the network so that the service is available over the same port regardless of the node you’re connected to.

The number of tasks to run is part of a replication strategy that the swarm uses to determine how many copies of the tasks to keep running in the cluster. Not only can you set it to a specific number, but you can also configure it in a global mode that runs a copy of the tasks on every node of the swarm.

These concepts are simple in principle, but can get tricky when you’re trying to do more complicated things later. So I recommend you check out the service documentation for more information on how it all works.

Using this terminology to describe our use case: for every new build request, you’ll run a new service in the swarm that contains one instance of a task and one container that performs the build. The swarm scheduler will take care of provisioning it on whatever node is available. The container should delete itself once the work completes.

Alternatives

As mentioned earlier, Docker Swarm and its concepts are meant for maintaining long-running replicated services inside a cluster. But we have the very specific case of single-copy ephemeral services executed for every build.

We don’t care about high availability or load balancing features; we want Swarm for its container scheduling capabilities.

While simple to do, since it wasn’t built for this, it can feel like we’re forcing things together. So another option is to build our own scheduler (or use an existing one) and have it execute work inside containers.

This isn’t hard with existing task systems similar to Celery or Dramatiq that use work queues like RabbitMQ to distribute container management tasks.

Along the same lines you can reuse distributed compute systems for the same goal. I’ve deployed Dask successfully for this purpose. It even simplified some workflows and enabled others that wouldn’t be possible otherwise.

I know I’ve said this many times before, but I’ll say it again. Like most choices in software engineering (and in life), you’re always exchanging one set of problems for another. In this case, you exchange workflow complexity for infrastructure and maintenance complexity because you now have to keep worker threads running and listening to queues, as well as handle updates to those threads as your software evolves. This is completely abstracted for you by using a Docker Swarm.

Orchestration with Python

The official Python library maintained by the Docker folks is the docker module. It wraps all the main constructs and is straightforward to use. I’ve been leveraging it for a while now.

The library interacts with the docker daemon REST API. The daemon’s command interfaces use HTTP verbs on resource URLs to transfer JSON data. For example: listing containers is a GET to /containers/json, creating a volume is a POST to /volumes/create, etc.

If you’re interested, visit the Docker API Reference for more details.
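To make the REST interface concrete, here is a rough sketch that performs the GET to /containers/json over the unix socket using only the standard library. The class and function names are my own for illustration; in practice the docker module does all of this for you, and this version skips API versioning and detailed error handling.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks over a unix socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # "localhost" is only used to fill in the Host header
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.settimeout(self.timeout)
        self.sock.connect(self.socket_path)


def list_containers(socket_path="/var/run/docker.sock"):
    """Issue GET /containers/json and return the decoded JSON body."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/containers/json")
        response = conn.getresponse()
        body = response.read()
        if response.status != 200:
            raise RuntimeError(f"Daemon answered with HTTP {response.status}")
        return json.loads(body)
    finally:
        conn.close()
```

Each entry in the returned list is a dictionary describing one container, which is the same data the module exposes through client.containers.list().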

APIClient vs DockerClient

The docker module itself exposes two “layers” of communication with the daemon. They manifest as different client classes: APIClient and DockerClient. The former is a lower-level wrapper around the interface endpoints directly, while the latter is an object-oriented layer of abstraction on top of that client.

For our purpose today, we’ll be able to stick with instances of DockerClient to perform all operations. Going to the lower-level is rare, but sometimes required. Both interfaces are well documented in the link shared earlier.

Creating a client is very simple. After installing the module with pip install docker, you can import the client class and instantiate it without any parameters. By default it will connect to the unix socket mentioned earlier.

from docker import DockerClient
dock = DockerClient()

The DockerClient class follows a general “client.resource.command” architecture that makes it intuitive to use. For example: you can list containers with client.containers.list(), or view image details with client.images.get('python:3-slim').

Each resource object has methods for generic actions like list(), create() or get(), as well as those specific to the resource itself like exec_run() for containers.

All attributes are available with the .attrs property in the form of a dictionary, and the reload() method fetches the latest information on a resource and refreshes the instance.

To accomplish our goals, you’ll need to create a service for each build, find the node where its task executes, make some changes to the container inside that task and start the container.

The execution script

Configuring a swarm service that executes a build is only half the battle. The other half is writing the code that follows the directives we defined in an earlier chapter to build, test, record results and distribute artifacts. This is what I call the execution script.

We’ll discuss specifics on how the script itself works in future articles. For now it’s sufficient to know that it’s the command that each build container runs whenever it starts. This is relevant because it brings up another issue we have to handle: the execution script is written in Python, but the build container is not required to have Python installed.

I’ve designed and implemented continuous integration systems that require Python to execute, as well as those that don’t. One adds complexity and constraints to the developers using the system, the other to the maintainers of the system.

If it’s a known fact that you’ll only ever build Python code, then this doesn’t really matter because the docker images used in the code repositories being built already have Python installed.

If this is not the case, then you may see a substantial increase in build times, complexity and maybe even a limitation in supported images because the developers have to install Python into the container as part of their build.

My choice these days is to package the execution script such that it’s runnable inside any docker image. I documented my attempts in a previous article about packaging Python modules as executables, where I concluded on using PyInstaller to do the job. Details on how to produce a package are included there.

Building and Executing Tests

With all the ingredients at hand, it’s time to dive into the code that makes a new service and runs the build. As defined in earlier chapters of this series, the following steps assume that:

The repository being built has a YAML file in its root directory with the build directives.
This configuration file contains an image directive defining the Docker image to use with the build.
There’s an environment directive as well where the user can set environment variables.
The webhook functions handling build events have retrieved build configuration info and stored it in a dictionary called config and provide pull request info in a pr dict.

Creating the service

When making the service, you need to consider what to name it, and how to find it when searching the swarm.

Service names must be unique and, in order to simplify infrastructure maintenance, they should be descriptive. I prefer to use a suffix of -{repo_owner}-{repo_name}-{pr_number}-{timestamp}. There are also string length limits on these names, so be careful not to get too creative.
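One way to stay inside the length limit is to trim the owner and repository portion while keeping the pull request number and timestamp intact. The helper below is only a sketch: make_service_name is a hypothetical name, and the 63-character limit is an assumption that you should confirm against your Docker version.

```python
from datetime import datetime

MAX_NAME_LENGTH = 63  # assumed limit; confirm against your Docker version


def make_service_name(owner, repo, pr_number, now=None, max_length=MAX_NAME_LENGTH):
    """Build forge-{owner}-{repo}-{pr}-{timestamp}, trimming owner/repo to fit."""
    now = now or datetime.now()
    prefix = "forge-"
    suffix = f"-{pr_number}-{now.strftime('%Y%m%dT%H%M%S')}"

    # Space left for the descriptive part once the fixed parts are accounted for
    room = max_length - len(prefix) - len(suffix)
    if room < 1:
        raise ValueError("max_length is too small for the fixed parts of the name")

    # Trim, then drop any separator left dangling at the cut point
    middle = f"{owner}-{repo}"[:room].rstrip("-._")
    return f"{prefix}{middle}{suffix}"
```

Trimming the middle rather than the end keeps the timestamp, which is the part that guarantees uniqueness between builds of the same pull request.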

Because you don’t want duplicate builds of the same pull request, you also need the ability to programmatically search the swarm for a running build. In other words, if I’m executing a build for a given pull request, there’s no point in allowing the build to complete if I committed new code in that same PR. Not only are you wasting resources, but even if the build finishes successfully, it’s useless because it’s already out of date.

To handle this situation, I use labels. A service can have one or multiple labels with metadata about what it is and what it’s doing. The Docker API also provides methods for filtering based on this metadata.

The code that follows leverages those functions to determine whether a build is already running and stops it before creating a new service.

import logging
import time
from datetime import datetime

import docker

...

def execute(pr, action=None, docker_node_port=None):
    ...

    # Get environment variables defined in the config
    environment = config['environment'] if 'environment' in config and isinstance(config['environment'], dict) else {}

    # Get network ports definition from the config
    ports = docker.types.EndpointSpec(ports=config['ports']) if 'ports' in config and isinstance(config['ports'], dict) else None

    environment.update({
        'FORGE_INSTALL_ID': forge.install_id,
        'FORGE_ACTION': 'execute' if action is None else action,
        'FORGE_PULL_REQUEST': pr['number'],
        'FORGE_OWNER': owner,
        'FORGE_REPO': repo,
        'FORGE_SHA': sha,
        'FORGE_STATUS_URL': pr['statuses_url'],
        'FORGE_COMMIT_COUNT': str(pr['commits'])
    })
    logging.debug(f'Container environment\n{environment}')

    # Connect to the Docker daemon
    dock = docker.DockerClient()

    # Stop any builds already running on the same pr
    for service in dock.services.list(filters={'label': [f"forge.repo={owner}/{repo}", f"forge.pull_request={pr['number']}"]}):
        logging.info(f"Found service {service.name} already running for this PR")

        # Remove the service
        service.remove()

    # Create the execution service
    service_name = f"forge-{owner}-{repo}-{pr['number']}-{datetime.now().strftime('%Y%m%dT%H%M%S')}"
    logging.info(f"Creating execution service {service_name}...")

    service = dock.services.create(
        config['image'],
        command="/forgexec",
        name=service_name,
        env=[f'{k}={v}' for k, v in environment.items()],
        restart_policy=docker.types.RestartPolicy('none'),
        labels={
            'forge.repo': f'{owner}/{repo}',
            'forge.pull_request': str(pr['number']),
        }
    )

Note that before creating the service, we’re not only grabbing the environment variables defined in the build config, but also adding extras that describe the action we’re taking. They pass relevant information about the repository and pull request being built onto the execution script.

As you can see, we’re using .services.list() to get a list of services currently running in the swarm filtered with the labels we described earlier. If a service exists, calling service.remove() will also get rid of its containers.
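Because the same labels are written when creating a service and read when filtering, it may be worth building both from one place so they can’t drift apart. This is a sketch with hypothetical helper names; the client argument is anything that offers services.list(filters=...), such as a DockerClient instance.

```python
def build_labels(owner, repo, pr_number):
    """Metadata attached to every build service."""
    return {
        "forge.repo": f"{owner}/{repo}",
        "forge.pull_request": str(pr_number),
    }


def label_filters(labels):
    """Convert a labels dict into the filters argument used by services.list()."""
    return {"label": [f"{key}={value}" for key, value in labels.items()]}


def remove_running_builds(client, labels):
    """Remove every service carrying all of the given labels.

    Returns the names of the removed services so the caller can log them.
    """
    removed = []
    for service in client.services.list(filters=label_filters(labels)):
        service.remove()
        removed.append(service.name)
    return removed
```

The dictionary returned by build_labels() can be passed unchanged as the labels argument when creating the service.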

Creating the service is a call to .services.create() where we pass:

The container image that the service is based on - defined in our build config.
A command to execute when it starts, which is the name of our execution script - forgexec in this case.
The service name.
Environment variables are defined as a list of strings formatted as NAME=VALUE, so we convert them from our environment dict using a list comprehension.
The restart policy tells the swarm whether to restart a task’s container after it exits or fails. A build should run exactly once, so we set it to none.
The labels with the metadata we described earlier.

Getting task information

Once the swarm creates the service, it takes a few seconds before it initializes its task and provisions the node and container that runs it. So we wait until it’s available by checking for the total number of tasks assigned to the service:

    # Wait for service, task and container to initialize
    while True:
        tasks = service.tasks()
        if tasks:
            task = tasks[0]
            if 'ContainerID' in task['Status'].get('ContainerStatus', {}):
                break
        time.sleep(1)

There’s only one task in a build service, so we can always assume it’s the first one in the list. Each task is a dictionary with attributes describing the container that runs it.

There are two delays here: one before the task is assigned and one before a container spins up for the task. So it’s necessary to wait until container information is available within the task details before continuing.
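A loop like this spins forever if the swarm can never schedule the task, for example when the image doesn’t exist. A variation with a deadline might look like the sketch below. The function name and the default timeout are my own assumptions, and service is anything with a tasks() method that returns a list of dictionaries.

```python
import time


def wait_for_container_id(service, timeout=60.0, interval=1.0):
    """Poll service.tasks() until the first task reports a container ID.

    Returns (task, container_id) or raises TimeoutError when the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        tasks = service.tasks()
        if tasks:
            # A build service has a single task, so the first entry is ours
            container_status = tasks[0].get("Status", {}).get("ContainerStatus", {})
            container_id = container_status.get("ContainerID")
            if container_id:
                return tasks[0], container_id
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No container was assigned within {timeout} seconds")
        time.sleep(interval)
```

Catching the TimeoutError gives you a natural place to remove the stuck service and report a failed build.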

Copying the execution script into the container

As discussed earlier, you need to copy our packaged execution script into each build container in order for it to start - the equivalent of a docker cp.

Copying data into a container requires the files to be packaged as a tar archive, which can optionally be compressed. So at the very beginning of our event server script we create a tar.gz file using the tarfile Python module:

if __name__ == '__main__':
    # Setup the execution script tarfile that copies into containers
    with tarfile.open('forgexec.tar.gz', 'w:gz') as tar:
        tar.add('forgexec.dist/forgexec', 'forgexec')

This means we have a forgexec.tar.gz file available to transfer with the container.put_archive() function that the docker module provides. Do this every time the webhook event server starts and overwrite any existing file to make sure that you’re not using stale code.
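If you’d rather not keep an archive on disk, the same tar data can be built in memory and passed along as bytes. The following is a sketch, not the approach used in this series: build_archive is a hypothetical helper, and it forces an executable mode on the file on the assumption that the packaged script needs it.

```python
import io
import tarfile


def build_archive(source_path, name_in_container, mode=0o755):
    """Return an uncompressed tar archive, as bytes, holding a single file."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        # Record the file under the name it should have inside the container
        info = tar.gettarinfo(source_path, arcname=name_in_container)
        info.mode = mode  # make sure the script is executable once extracted
        with open(source_path, "rb") as source:
            tar.addfile(info, source)
    return buffer.getvalue()
```

The result of build_archive('forgexec.dist/forgexec', 'forgexec') can then serve as the data argument of container.put_archive().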

Transferring files into a container requires us to connect to the swarm node directly. There’s no interface at the docker service level to help us do that. This is why we had to enable remote access earlier.

First we get information about the docker node from the task and then we make a new DockerClient instance that connects to the node:

    node = dock.nodes.get(task['NodeID'])
    nodeclient = docker.DockerClient(f"{node.attrs['Description']['Hostname']}:{docker_node_port}")

This time, the instantiation uses the tcp port in which the nodes are listening (configured during cluster setup) and the hostname of the node. Depending on your network and DNS setup, you may want to use the socket module to help with domain name resolution. Something like socket.gethostbyname(node.attrs['Description']['Hostname']) might be good enough.
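The name resolution step could be wrapped in a small helper that falls back to the raw hostname when the lookup fails. Treat this as a sketch: node_base_url is a made-up name, and the resolve parameter exists only so the lookup can be swapped out, for instance in tests.

```python
import socket


def node_base_url(node_attrs, port, resolve=socket.gethostbyname):
    """Build the tcp:// URL for a swarm node from its attrs dictionary."""
    hostname = node_attrs["Description"]["Hostname"]
    try:
        address = resolve(hostname)
    except OSError:
        # Resolution failed; let the Docker client try the hostname as-is
        address = hostname
    return f"tcp://{address}:{port}"
```

With this in place, the client for a node becomes docker.DockerClient(node_base_url(node.attrs, docker_node_port)).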

At this point we can directly access the container, copy the file into it and start it:

    # Get container object
    container = nodeclient.containers.get(task['Status']['ContainerStatus']['ContainerID'])

    # Copy the file
    with open('forgexec.tar.gz', 'rb') as f:
        container.put_archive(path='/', data=f.read())

    # Start the container
    container.start()

Putting it together

Here’s our new execute() function merged with code from the webhook event handling chapter:

def execute(pr, action=None, docker_node_port=None):
    """Kick off .forge.yml test actions inside a docker container"""

    logging.info(f"Attempting to run {'' if action is None else action} tests for PR #{pr['number']}")

    owner = pr['head']['repo']['owner']['login']
    repo = pr['head']['repo']['name']
    sha = pr['head']['sha']

    # Select the forge for this user
    forge = forges[owner]

    # Get build info
    config = get_build_config(owner, repo, sha)

    if config is None or config.get('image') is None or config.get('execute') is None:
        logging.info('Unable to find or parse the .forge.yml configuration')
        return

    # Get environment variables defined in the config
    environment = config['environment'] if 'environment' in config and isinstance(config['environment'], dict) else {}

    # Get network ports definition from the config
    ports = docker.types.EndpointSpec(ports=config['ports']) if 'ports' in config and isinstance(config['ports'], dict) else None

    environment.update({
        'FORGE_INSTALL_ID': forge.install_id,
        'FORGE_ACTION': 'execute' if action is None else action,
        'FORGE_PULL_REQUEST': pr['number'],
        'FORGE_OWNER': owner,
        'FORGE_REPO': repo,
        'FORGE_SHA': sha,
        'FORGE_STATUS_URL': pr['statuses_url'],
        'FORGE_COMMIT_COUNT': str(pr['commits'])
    })
    logging.debug(f'Container environment\n{environment}')

    # Connect to the Docker daemon
    dock = docker.DockerClient(docker_host)

    # Stop any builds already running on the same pr
    for service in dock.services.list(filters={'label': [f"forge.repo={owner}/{repo}", f"forge.pull_request={pr['number']}"]}):
        logging.info(f"Found service {service.name} already running for this PR")

        # Remove the service
        service.remove()

    # Create the execution service
    service_name = f"forge-{owner}-{repo}-{pr['number']}-{datetime.now().strftime('%Y%m%dT%H%M%S')}"
    logging.info(f"Creating execution service {service_name}...")

    service = dock.services.create(
        config['image'],
        command="/forgexec",
        name=service_name,
        env=[f'{k}={v}' for k, v in environment.items()],
        restart_policy=docker.types.RestartPolicy('none'),
        # mounts=[],
        labels={
            'forge.repo': f'{owner}/{repo}',
            'forge.pull_request': str(pr['number']),
        }
    )

    # Wait for service, task and container to initialize
    while True:
        tasks = service.tasks()
        if tasks:
            task = tasks[0]
            if 'ContainerID' in task['Status'].get('ContainerStatus', {}):
                break
        time.sleep(1)

    node = dock.nodes.get(task['NodeID'])
    nodeclient = docker.DockerClient(f"{node.attrs['Description']['Hostname']}:{docker_node_port}")
    container = nodeclient.containers.get(task['Status']['ContainerStatus']['ContainerID'])

    with open('forgexec.tar.gz', 'rb') as f:
        container.put_archive(path='/', data=f.read())

    container.start()

What’s next?

This article gave you the details needed to use Docker Swarm for provisioning compute that builds code inside a cluster. Along with the previous chapters on handling repository events and defining the directives required to execute a build, you’re ready to move on to the next piece that covers the execution script itself. You’ll need to configure the build environment, run commands and execute tests for the different stages in the pipeline.
