Painless Status Reporting in GitHub Pull Requests - Designing CI/CD Systems

Continuing the build service discussion from the Designing CI/CD Systems series, we’re now at a good point to look at reporting status as code passes through the system.

At the very minimum, you want to communicate build results to your users, but it’s worth examining other steps in the process that also provide useful information.

The code for reporting status isn’t a major feat. However, using it to enforce build workflows can get complicated when implemented from scratch.

Since the pipeline covered in earlier articles already integrates with GitHub, we can simplify things by taking advantage of GitHub’s features. Specifically, we can use the Status API to convey information directly into pull requests, and use repository settings to gate merges based on those statuses.

Reporting status to GitHub

We had a brief discussion of this API in a previous article about integrating pytest results with GitHub. It also covered GitHub Apps and how to authenticate them into the REST API. Today we’ll discuss more details about the Status API itself, keeping in mind a pre-existing App.

Reporting pull request status is a simple HTTP POST request to the status endpoint of the relevant PR. You can find that URL as part of the webhook event details related to the PR.

Here’s a sample of pull request details sent with events received by the webhook handler discussed in an earlier post:

...
"pull_request": {
    "url": "https://api.github.com/repos/Codertocat/Hello-World/pulls/2",
    "id": 279147437,
    "node_id": "MDExOlB1bGxSZXF1ZXN0Mjc5MTQ3NDM3",
    "html_url": "https://github.com/Codertocat/Hello-World/pull/2",
    "diff_url": "https://github.com/Codertocat/Hello-World/pull/2.diff",
    "patch_url": "https://github.com/Codertocat/Hello-World/pull/2.patch",
    "issue_url": "https://api.github.com/repos/Codertocat/Hello-World/issues/2",
    "number": 2,
    "state": "open",
    "locked": false,
    "title": "Update the README with new information.",
    "user": {
        "login": "Codertocat",
        "id": 21031067,
        "node_id": "MDQ6VXNlcjIxMDMxMDY3",
        "avatar_url": "https://avatars1.githubusercontent.com/u/21031067?v=4",
        "gravatar_id": "",
        ...
    },
    "statuses_url": "https://api.github.com/repos/Codertocat/Hello-World/statuses/ec26c3e57ca3a959ca5aad62de7213c562f8c821",
    ...
}

An HTTP POST to the URL in the statuses_url field shown above carries the status details in its JSON payload. You should specify the following:

A state field with possible values of success, failure, error, or pending. It shows as an icon next to the status in the GitHub PR.
A context field that identifies the status as coming from your build system.
A description field with a short one-liner explaining the reason for the status. This text appears in the pull request, so try to be concise.
A target_url field that points to the detailed log output. It shows in the PR as a “Details” link to the right of the description.
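
As a rough illustration of the fields above, here is a minimal sketch of a helper that assembles the payload. The `build_status_payload` name, the `build-system` context, and the log URL are hypothetical. The 140-character cap is an assumption based on the limit GitHub has enforced for descriptions, so verify it against the current API reference.

```python
VALID_STATES = ('error', 'failure', 'pending', 'success')
MAX_DESCRIPTION = 140  # assumed limit, verify against the GitHub docs


def build_status_payload(state, context=None, description=None, target_url=None):
    """Assemble the JSON body for a POST to a statuses_url"""

    if state not in VALID_STATES:
        raise ValueError(f"Invalid state {state!r}, expected one of {VALID_STATES}")

    payload = {'state': state}

    if context is not None:
        payload['context'] = context

    if description is not None:
        # Keep the one-liner short so it displays cleanly in the PR
        payload['description'] = description[:MAX_DESCRIPTION]

    if target_url is not None:
        payload['target_url'] = target_url

    return payload


payload = build_status_payload(
    'pending',
    context='build-system',  # hypothetical context name
    description='Running install steps',
    target_url='https://builds.example.com/logs/1234',  # hypothetical log URL
)
```

Validating the state locally means a typo surfaces as a clear exception in your build system instead of an error response from the API.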

Creating the POST request in Python is a simple call to requests.post() on the status endpoint with the given payload. Here’s some sample Python that also includes authentication code for a GitHub App.

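import logging

import jwt
import requests

from datetime import datetime


GITHUB_API = 'https://api.github.com'
ACCEPT_HEADER = 'application/vnd.github.machine-man-preview+json'

# Placeholders: substitute your own App ID and private key location
APP_ID = 1234
PRIVATE_KEY_PATH = 'private-key.pem'

# This number is per user / organization
DEFAULT_INSTALLATION_ID = 42
VALID_GH_RESPONSE_CODES = (200, 201, 204)

install_token = None


class GitHubAPIError(Exception):
    """Raised when the GitHub API returns an unexpected response"""

    def __init__(self, url, method, status_code, text):
        super().__init__(f"{method} {url} failed with {status_code}: {text}")
        self.status_code = status_code


def create_app_jwt():
    """Create a short-lived JWT that identifies the App itself (assumes PyJWT 2.x)"""

    now = int(datetime.now().timestamp())

    with open(PRIVATE_KEY_PATH) as key_file:
        private_key = key_file.read()

    # GitHub rejects App JWTs that are valid for longer than 10 minutes
    return jwt.encode({'iat': now, 'exp': now + 540, 'iss': str(APP_ID)}, private_key, algorithm='RS256')


def authenticate_as_installation(ident=DEFAULT_INSTALLATION_ID):
    """Authenticate to GitHub as an installation"""

    global install_token

    # Installation tokens are requested with the App JWT, not a previous token
    resp = requests.post(
        f'{GITHUB_API}/app/installations/{ident}/access_tokens',
        headers={'accept': ACCEPT_HEADER, 'authorization': f'Bearer {create_app_jwt()}'}
    )

    if resp.status_code != 201:
        logging.error(f"Unable to authenticate as installation {ident}")
        raise GitHubAPIError(resp.url, 'POST', resp.status_code, resp.text)

    install_token = resp.json()['token']
    logging.info("Authentication completed")


def post(endpoint, payload, retry=True):
    """Issue an HTTP POST to the given GitHub endpoint"""

    resp = requests.post(
        endpoint,
        json=payload,
        headers={'accept': ACCEPT_HEADER, 'authorization': f'Bearer {install_token}'}
    )

    if resp.status_code == 401 and retry:
        # Our token is missing or expired, get a new one and retry once
        logging.debug(f"Received 401 from GitHub API:\n{resp.content}\nRequesting new installation token...")
        authenticate_as_installation()

        return post(endpoint, payload, retry=False)

    elif resp.status_code not in VALID_GH_RESPONSE_CODES:
        # Received an unexpected error
        logging.error(f"Received error response from GitHub API - {resp.status_code}:\n{resp.content}")
        raise GitHubAPIError(resp.url, 'POST', resp.status_code, resp.text)

    return resp.json()


def create_status(status_url, state, context=None, description=None, target=None):
    """Create a new Status at the given status_url"""

    data = {'state': state}

    if context is not None:
        data['context'] = context

    if description is not None:
        data['description'] = description

    if target is not None:
        data['target_url'] = target

    return post(status_url, data)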
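
To tie this back to the webhook payload shown earlier, the sketch below pulls statuses_url out of an event and hands it to a sender. The `report_pr_status` name, the `build-system` context, and the injectable `send` callable are assumptions made so the example runs without network access. In practice you would pass `create_status` from the listing above.

```python
def report_pr_status(event, state, description, send, context='build-system'):
    """Report a status on the pull request described by a webhook event"""

    # The statuses_url is tied to the head commit of the pull request
    status_url = event['pull_request']['statuses_url']

    return send(status_url, state, context=context, description=description)


# Trimmed-down event using the sample values shown earlier
event = {
    'pull_request': {
        'number': 2,
        'statuses_url': 'https://api.github.com/repos/Codertocat/Hello-World/statuses/ec26c3e57ca3a959ca5aad62de7213c562f8c821',
    }
}

sent = []


def fake_send(status_url, state, **fields):
    """Stand-in for create_status() that records the call instead of sending it"""
    sent.append((status_url, state, fields))


report_pr_status(event, 'pending', 'Build queued', send=fake_send)
```

Passing the sender as an argument also makes the webhook handler easy to unit test, since no request leaves the machine.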

Here’s what it looks like when Travis CI reports status into a pull request using this API:

Sample status report

For more details on the API, check out the GitHub documentation for Statuses.

Deciding what to report

Build complexity varies greatly. It can go from simple one-line commands to large time-consuming scripts. This makes it hard to decide what’s important enough to bubble up to the user. Too little or too much information, and you risk making it hard to debug problems.

Debugging is the one thing you can be certain users will attempt with your system.

Things don’t always work as expected, code changes have unintended consequences, and unit tests need updates. The likelihood that code commits work immediately, especially if the build process is running sophisticated tests, is very low.

You should also consider human nature. When you automate a task, people tend to use it as much as possible, whether it makes sense or not for their use case.

It doesn’t matter if you intend the build to start after developers perform local testing in a feature branch. It’s normal to see them commit and push code simply to avoid typing out the command that runs tests locally. I mean, after all, it’s automated, isn’t it?

Keep in mind that the primary use of status reports and log messages is to provide debug information to your customer, and it becomes easier to discern which pieces are relevant.

Don’t forget to provide enough data to prove that your automated system is working. It avoids the natural blame game that ensues when something breaks.

Your status messages and log information should make the following points explicit:

The automated system knows a new code commit arrived and is starting work.
The process started correctly and determined that the repo build configuration (remember the YAML file from before) is syntactically correct and adequately conveys the work to do.
The build is passing through a significant phase of execution, like installation, testing, or deployment.
The system is executing specific commands, which it shows to the user.
Clearly communicate success and failure where needed, but don’t overuse those words in messages, or the data becomes hard to skim through.
Always provide links to more detailed information where appropriate.
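
One way to turn the checklist above into code is to wrap each build phase so that it reports when it starts, echoes every command it runs, and reports the outcome. This is only a sketch: `run_phase`, the `report` callback, and the in-memory `log` list are hypothetical, and a real system would stream output to its log store instead.

```python
import shlex
import subprocess
import sys


def run_phase(name, commands, report, log, log_url=None):
    """Run the commands of one build phase, reporting status along the way"""

    report('pending', f"Running {name} steps", log_url)

    for command in commands:
        # Echo the command so users can see exactly what ran
        log.append(f"$ {shlex.join(command)}")
        result = subprocess.run(command, capture_output=True, text=True)
        log.append(result.stdout)

        if result.returncode != 0:
            log.append(result.stderr)
            report('failure', f"{name} step exited with code {result.returncode}", log_url)
            return False

    report('success', f"{name} steps completed", log_url)
    return True


reports, log = [], []


def record(state, description, target_url):
    """Stand-in reporter that collects statuses instead of posting them"""
    reports.append((state, description))


# Use the current interpreter as a portable stand-in for real build commands
ok = run_phase('install', [[sys.executable, '-c', 'print("installing")']], record, log)
failed = run_phase('test', [[sys.executable, '-c', 'import sys; sys.exit(3)']], record, log)
```

The state field already carries the outcome, so the descriptions can stay factual and short.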

The context field functions as a grouping or section for each status and always shows the latest report. That means three separate messages with the same context, like “installing dependencies”, “executing tests”, and “tests completed”, appear as one status entry in the PR checks section. As each one arrives on the GitHub server, it merely overwrites the icon, text, and link of the previous one in the same context.
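
The overwrite behavior can be modeled in a few lines. This is a local simulation of what the PR displays, not a call to GitHub, and the `build` and `lint` context names are made up for the example.

```python
def latest_by_context(statuses):
    """Return the most recent status for each context"""

    latest = {}

    for status in statuses:
        # A later status replaces the earlier one with the same context
        latest[status['context']] = status

    return latest


# Statuses in the order they arrive
arrived = [
    {'context': 'build', 'state': 'pending', 'description': 'installing dependencies'},
    {'context': 'build', 'state': 'pending', 'description': 'executing tests'},
    {'context': 'lint', 'state': 'success', 'description': 'linting completed'},
    {'context': 'build', 'state': 'success', 'description': 'tests completed'},
]

visible = latest_by_context(arrived)
```

Four reports collapse into two visible entries, one per context, each showing only its final message.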

Since you can have multiple statuses with different contexts, it’s worth doing a short analysis of the best way to group them. Given the executing system discussed so far and the statuses listed above, these are reasonable groupings and transitions assuming a base context name of forge for our system:

A forge context reports progress around the build system and base execution itself. Imagine transitions of: “Starting up”, “Running install steps”, “Running execute steps”, “Execution completed successfully”, etc.
forge-linter shows linting results separate from the build itself, which allows us to set separate requirements for linters (see below). Similar to the previous one, it shows “Setting up linters”, “Executing linters”, “Linting completed successfully”.
forge-deploy handles the deployment sections of the staging or production directives. These messages are more like “Starting staging deployment” and “Deployment completed successfully.”
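
A small lookup table is one way to keep these groupings consistent across the build system. The phase names and the `status_for` helper below are hypothetical. Only the three context names come from the groupings above.

```python
BASE_CONTEXT = 'forge'

# Which context each kind of build phase reports under (hypothetical phase names)
PHASE_CONTEXTS = {
    'startup': BASE_CONTEXT,
    'install': BASE_CONTEXT,
    'execute': BASE_CONTEXT,
    'lint': f'{BASE_CONTEXT}-linter',
    'staging': f'{BASE_CONTEXT}-deploy',
    'production': f'{BASE_CONTEXT}-deploy',
}


def status_for(phase, state, description):
    """Build a status payload under the context that owns the given phase"""
    return {'context': PHASE_CONTEXTS[phase], 'state': state, 'description': description}


transitions = [
    status_for('startup', 'pending', 'Starting up'),
    status_for('install', 'pending', 'Running install steps'),
    status_for('lint', 'pending', 'Executing linters'),
    status_for('lint', 'success', 'Linting completed successfully'),
    status_for('execute', 'success', 'Execution completed successfully'),
    status_for('staging', 'pending', 'Starting staging deployment'),
]

# The distinct entries a pull request would end up showing
contexts = sorted({status['context'] for status in transitions})
```

Centralizing the mapping means a new phase only needs one new dictionary entry to land in the right group.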

Using statuses in build workflows

After running your first build, status reports start flowing into the repository. At this point, you can use the GitHub repo settings to add gates into the merge workflow based on the state of each entry.

To enable it, browse to the Branches section of your Settings page and add branch protection rules by clicking Add Rule. To follow the workflow discussed so far, fill out the fields on the next page with the following:

Set the branch name pattern to master.
Check Require pull request reviews before merging. One review is enough for now, but also check Dismiss stale pull request approvals when new commits are pushed. That way, if a new commit arrives after a PR is approved, you can’t merge it without another approval. In other words, you always review the most recent code.
Check Require status checks before merging. Once selected, you’ll see a list of the context fields in the statuses reported so far. For this branch, you want to require a passing state for forge and forge-linter.
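You should also set Require branches to be up to date before merging. The main reason is that status checks then run against the code as it will look once merged with the latest base branch, rather than against a stale snapshot.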
Check Include administrators.
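
If you would rather script these settings than click through the UI, GitHub also exposes branch protection through its REST API. The sketch below only builds the request body, mirroring the checkboxes above. The `branch_protection_payload` helper is hypothetical, and the field names reflect the branch protection endpoint as I understand it, so check them against the current API reference before relying on them.

```python
import json


def branch_protection_payload(contexts, approvals=1):
    """Build a request body mirroring the protection settings listed above"""

    return {
        'required_status_checks': {
            'strict': True,  # require branches to be up to date before merging
            'contexts': list(contexts),
        },
        'enforce_admins': True,  # include administrators
        'required_pull_request_reviews': {
            'dismiss_stale_reviews': True,
            'required_approving_review_count': approvals,
        },
        'restrictions': None,  # no limits on who can push
    }


# Would be sent with an HTTP PUT to a URL of this shape (OWNER and REPO are placeholders):
#   https://api.github.com/repos/OWNER/REPO/branches/master/protection
body = branch_protection_payload(['forge', 'forge-linter'])
print(json.dumps(body, indent=2))
```

Sending the request requires credentials with administrative access to the repository, which a build system's App may not have by default.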

To complete the workflow for our branching strategy, you’ll want to repeat this process but with the production branch. Except in this case, you should also require the forge-deploy status.

Checks API

One last point before going into how to make your own log viewer is that there’s another GitHub API for reporting: the Checks API. We covered it in more detail in the pytest article mentioned earlier. Using it might save you the trouble of having to build the system described below, though you’ll have less control over the user interface.

Manual test approvals

Another interesting use case for the Status API is to show results for manual tests. While you can do this with a pull request review instead, I think that adding a qa-approval status requirement to the final merge into production is a fantastic way of explicitly showing a formal approval step from the test organization. It’s easy enough to write a CLI command they can execute to submit the status, or even create an integration with your chat system.
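
As a rough idea of what such a CLI could look like, the sketch below parses a few arguments and turns them into a qa-approval status payload. The `qa-approve` program name, its flags, and the helper functions are all hypothetical, and the final step of posting the payload to the statuses_url is left out.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line for a hypothetical qa-approve tool"""

    parser = argparse.ArgumentParser(prog='qa-approve', description='Record a manual QA result on a commit')
    parser.add_argument('statuses_url', help='statuses_url of the pull request under test')
    parser.add_argument('--tester', required=True, help='name of the person signing off')
    parser.add_argument('--reject', action='store_true', help='report a failed test pass')
    parser.add_argument('--report-url', help='link to the manual test report')

    return parser.parse_args(argv)


def qa_status(args):
    """Translate parsed arguments into a qa-approval status payload"""

    state = 'failure' if args.reject else 'success'
    verb = 'Rejected' if args.reject else 'Approved'
    payload = {'state': state, 'context': 'qa-approval', 'description': f"{verb} by {args.tester}"}

    if args.report_url:
        payload['target_url'] = args.report_url

    return payload


# OWNER, REPO, and SHA are placeholders for a real statuses_url
args = parse_args(['https://api.github.com/repos/OWNER/REPO/statuses/SHA', '--tester', 'Sam'])
payload = qa_status(args)
```

Because the context is fixed to qa-approval, a branch protection rule can require it by name just like the automated statuses.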

What’s next?

Today we examined how to break down the build process to keep users aware of the code flowing through your pipeline. Now that you know how to report status at different stages of the build process, it’s time to look at expanding that some more with a detailed log viewer.
