Awesome Webhook Handling with Sanic - Designing CI/CD Systems
After covering how to design a build pipeline and define build directives in the continuous builds series, it’s time to look at handling events from a code repository.
As internet standards evolved over the years, the HTTP protocol has become more prevalent. It's easier to route, simpler to implement, and more reliable. This ubiquity makes it easier for applications that traverse or live on the public internet to communicate with each other. As a result, the idea of webhooks came to be as an "event-over-HTTP" mechanism.
With GitHub as the repository management platform, we have the advantage of using their webhook system to communicate user actions over the internet and into our build pipeline.
Events
Since webhooks occur over HTTP, receiving an event simply means you need a web server. Its main role is to route an HTTP POST to a function that peeks into the payload and decides what to do with it. In other words, you need to build a REST API.
Before creating the API, let's determine which events are relevant to perform a build given the branching strategy and design discussed in the first article. We can reference GitHub's documentation of their webhook specification. It covers which actions trigger which events, what the payload looks like, what its attributes mean, etc.
As a reminder, the pipeline starts whenever someone opens a pull request against the master
branch. Tests execute against freshly built code. When everything passes and the changes are approved, the PR is merged. This triggers a new request to merge the changes into production, which then automatically executes deployment steps.
Following the webhook specification, you’ll find that user actions are grouped into categories, one of which is for pull requests. The most important actions to care about are below.
opened
Creating a new pull request generates this event. The body contains enough information for our REST API to determine whether there’s work to do, what to do, and how to report status.
closed
This action doesn't only trigger when canceling a pull request, but also when successfully merging one. The handler will have to look into the payload in order to discern which one it was, and react accordingly.
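A minimal sketch of that check, assuming the parsed webhook payload sits in a dictionary called details (the merged flag lives under the pull_request key, as the full handler later in this article shows):

def handle_close(details):
    """Distinguish a merged pull request from one that was simply canceled."""
    if details['pull_request']['merged']:
        # Closed because it merged - continue to the next stage of the pipeline
        ...
    else:
        # Closed without merging - nothing to do
        ...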
The Web Server
Instead of using Flask as the REST server, I chose to go with a newer framework called Sanic. It's a simple framework with an interface similar to Flask, but it's more performant and comes with versioning and other batteries built in, meaning you won't need to install supporting plugins to cover the basics.
Sanic is also built to support asynchronous functions. In other words, you'll be able to "await" long-running tasks - something quite useful for a build system. Even if you don't need it right away, it's good to be ready for it.
Creating the app
The base Sanic object that implements the REST API application is enough to create the server, no special parameters required. It’s just a matter of instantiating it.
from sanic import Sanic
app = Sanic()
Routes
After instantiating the application, you'll have access to decorators per HTTP verb. You use these decorators to assign the URL for a function, the API version in which the function is available, etc. They work much like Flask's @app.route() decorator.
The versioning mechanism enables us to create new endpoints or update the signature of older ones without disrupting the existing API. Incrementing the version number is enough because the final URL will prepend it: /v1/endpoint vs /v2/endpoint.
@app.post('/listen', version=1)
async def listen(request):
"""Listen for GitHub events"""
...
Notice that the endpoint function is defined with async def, meaning it's not a regular function but a coroutine. This makes it usable inside an asyncio event loop and enables its code to await asynchronous tasks.
If used correctly, you'll get a decent performance gain as the interpreter suspends execution of tasks that are waiting on IO. A good example would be to await calls made with aiohttp instead of using requests when retrieving information from other network services.
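As an illustration, here's a minimal sketch of a coroutine that retrieves JSON from another service with aiohttp instead of blocking the worker with requests (the function name is purely illustrative):

import aiohttp

async def fetch_json(url):
    """Retrieve JSON from another service without blocking the event loop."""
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.json()

Any endpoint defined with async def can then await fetch_json(...) while Sanic keeps serving other requests in the meantime.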
Running the app
After putting all the routes together, running a Sanic application is a simple call to app.run() as shown below. The function takes several other parameters, including the number of worker processes to run with (it defaults to a single worker) and an SSL context with which to serve HTTPS requests.
if __name__ == '__main__':
app.run(host=arguments['--address'], port=int(arguments['--port']))
Handlers
Events from GitHub will come with extra headers to help validate and route them. The X-GITHUB-EVENT header contains the event category with which to differentiate, for example, a push from a pull_request event. If the header isn't present, then the request did not come from a GitHub server.
A short if-else inside the main listener function can call into each handler based on the type of event. You don't have to handle every one, only those that are relevant to your application. The following example does just that.
import logging
from sanic import response
@app.post('/listen', version=1)
async def listen(request):
"""Listen for GitHub events"""
if 'X-GITHUB-EVENT' not in request.headers:
logging.info("Not a GitHub event - Ignoring")
return response.json({}, headers=RESPONSE_HEADERS)
event = request.headers['X-GITHUB-EVENT']
# Parse the JSON payload so each handler can inspect it
body = request.json
if event == 'pull_request':
handle_pullrequest_event(body)
elif event == 'push':
handle_push_event(body)
elif event == 'issues':
handle_issue_event(body)
else:
# Ignore it
logging.info(f"Received unhandled GitHub {event} event")
return response.json({}, headers=RESPONSE_HEADERS)
Response headers
In order to fully comply with HTTP standards while helping browsers and custom clients to better understand the response to a REST request, it’s important to include extra headers along with the response body.
The most relevant one in our case is the content-type header, used to indicate that the API is returning JSON. Sanic uses a dictionary, like the one below, to specify any extra headers to include with a response.
RESPONSE_HEADERS = {
'Content-Type': 'application/json'
}
Depending on the final use case, you may need to specify others, but this is enough to get you started.
Subscribing
Webhooks are managed in the Settings section of a repository. Adding a new one requires the address where the webserver is listening, the content-type in which it expects the body - application/json in our case - and the events you want to send.
Deciding which events are relevant can get complicated. You don’t necessarily know everything that’s needed ahead of time, and it might be non-obvious when looking at the webhook documentation.
Subscriptions are by categories, in our case we only need the Pull Request category and everything that comes with it.
On another note, ticking the "Send me everything" option is an interesting exercise to help discover events you might care about. You could configure the handler to pprint() the JSON body when requests come in, and make a dummy repo with which to try different actions.
I’ve done this many times to help me understand not just what each event is, but the lingo or naming convention that GitHub uses.
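Here's a minimal sketch of what such an exploratory handler could look like, reusing the app and RESPONSE_HEADERS objects defined earlier:

from pprint import pprint

@app.post('/listen', version=1)
async def listen(request):
    """Dump every incoming GitHub payload while exploring the event catalog."""
    pprint(request.json)
    return response.json({}, headers=RESPONSE_HEADERS)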
Another mechanism for subscribing to events is through GitHub Apps. When defining the application itself, you specify which events you're interested in. Then any user that installs the app just needs to enable it for their repos.
More details on GitHub Apps and how they work are available in this article.
Security
Any time you build a service that’s exposed to the public internet, security is a big concern. This section covers some of the measures you can put in place to help validate that events are coming from a trusted source (a GitHub server in this case), as well as securing the request contents.
SSL and HTTPS
It’s easy to configure webhooks to use HTTPS, but getting a real SSL certificate to serve from Sanic is more complicated and sometimes costly.
Regardless, it's very important to use HTTPS to protect your web server and your code. It encrypts the request contents, so failing to use it means that event traffic - including commit messages and other sensitive information - is exposed to third parties.
GitHub allows you to disable SSL cert verification when calling on the webhook REST API.
This means you can use self-signed certificates as a way to get started and test things out, but I don’t recommend using them in production.
Services like Let's Encrypt provide free certs, and some DNS providers (like Amazon Route53) also generate them for free. Unfortunately, it's up to you to do some research here to pick the most convenient option.
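Once you do have a certificate and key, serving HTTPS directly from Sanic is a matter of building an SSL context and handing it to app.run(). A minimal sketch, with placeholder paths for wherever your provider stores the files:

import ssl

# Build a server-side SSL context; the file paths below are placeholders
context = ssl.create_default_context(purpose=ssl.Purpose.CLIENT_AUTH)
context.load_cert_chain('/etc/certs/fullchain.pem', keyfile='/etc/certs/privkey.pem')

app.run(host='0.0.0.0', port=443, ssl=context)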
IP address filtering
One of the simplest ways to restrict access to the webhook REST API is by limiting the IP addresses that can route to it. It's not hard to do with existing cloud services, all of which have some form of firewall setting built for this.
AWS does it through Security Groups. Head over to the EC2 console and select the instance that's running the API. The details pane at the bottom of the screen will list one or more security groups. Follow the link to one of the groups and edit the table to change the rules.
In Digital Ocean, the process is slightly different but conceptually the same. Select the Manage -> Networking sidebar entry and pick the Firewalls tab. Select the firewall that applies to your droplet and edit the rules from there.
GitHub publicizes the IP addresses that they use to communicate with external apps. The latest info is available here and it says their servers use the following ranges: 140.82.112.0/20, 192.30.252.0/22.
In other words, if you’re on public GitHub, the systems that will send HTTP requests to your API will have an IP address in one of those ranges.
If a request comes from anyone else, then it’s not GitHub.com and you should reject it.
The same mechanism applies in a GitHub Enterprise installation, but you’ll have to contact your IT team to get the address ranges used in your organization.
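If you'd also like to enforce the filter inside the service itself, here's a minimal sketch using Python's ipaddress module and Sanic's request.ip attribute, hard-coding the ranges published at the time of writing (the helper name is illustrative):

from ipaddress import ip_address, ip_network

# GitHub's published ranges; check their documentation for the current list
GITHUB_NETWORKS = [ip_network('140.82.112.0/20'), ip_network('192.30.252.0/22')]

def from_github(address):
    """Return True when the client address belongs to one of GitHub's published ranges."""
    return any(ip_address(address) in network for network in GITHUB_NETWORKS)

Calling from_github(request.ip) at the top of the listener lets you reject a request before doing any other work.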
Event signatures
It's also possible to configure webhooks to include a cryptographic signature based on a secret token that you provide GitHub. You use the token locally to compute a hash of the webhook data and validate that it matches the hash that GitHub sends in the X-Hub-Signature header.
When generating any type of secret like this, make sure to use cryptographic randomness. Create it with a random function backed by a system with enough entropy to guarantee randomness. Pseudo-random number generators are a real thing and commonplace in computing; you'd be surprised at how repeatable they are.
I like using the passlib package to do this appropriately, but Python 3 also provides a built-in secrets module that does a good job.
import secrets
secrets.token_hex()
For more details on how GitHub computes the hash and how to verify it, have a look at Securing your webhooks.
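GitHub's documented scheme computes an HMAC hex digest of the raw request body using your secret and sends it in the header prefixed with the hash name. A minimal sketch of that verification (the function name is illustrative; in Sanic, request.body holds the raw bytes and request.headers['X-Hub-Signature'] holds the value to compare against):

import hashlib
import hmac

def signature_is_valid(secret, payload, signature_header):
    """Compare our own HMAC-SHA1 digest of the payload against the X-Hub-Signature value."""
    # secret and payload are both expected as bytes
    digest = hmac.new(secret, payload, hashlib.sha1).hexdigest()
    # compare_digest avoids leaking timing information during the comparison
    return hmac.compare_digest(f"sha1={digest}", signature_header)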
Processing payloads
Pull request events include quite a bit of information about the code changes involved, the PR details and the users that interact with it.
You'll be processing that information, pulling out the relevant pieces needed by each step of the build. The content is JSON, and the Sanic request.json property contains the parsed dictionary.
The most important keys required to build and test code are:
- action - tells us the type of pull request event. As mentioned earlier, we mostly care about opened and closed actions.
- number - the identifier GitHub uses when referring to the PR.
- repository['name'] or repository['full_name'] - tells your API which repo this event references.
- repository['owner']['login'] - gives the username that manages the repository.
- sender['login'] - the user that created the pull request.
- pull_request['statuses_url'] - the location of the Status API where we can report build progress so it shows up in the PR summary. We'll have a separate article detailing how this works in the near future.
- pull_request['head']['sha'] - the commit that you want to merge.
- pull_request['base']['ref'] - points to the branch that you're merging into. This will determine whether you run deployment steps or the regular test execution steps.
- pull_request['commits'] - shows the number of code commits in this PR. You can use this when naming jobs or release candidate packages. I like adding an rc{commit_count} suffix when creating Python packages to test with.
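As a quick illustration, here's a minimal sketch that pulls those fields out of the parsed payload (the helper name and the returned dictionary are just one way to organize it):

def summarize_pull_request(details):
    """Collect the pieces of a pull_request payload that the build steps care about."""
    pr = details['pull_request']
    return {
        'action': details['action'],
        'number': details['number'],
        'repo': details['repository']['full_name'],
        'owner': details['repository']['owner']['login'],
        'sender': details['sender']['login'],
        'statuses_url': pr['statuses_url'],
        'sha': pr['head']['sha'],
        'base_ref': pr['base']['ref'],
        'commits': pr['commits'],
    }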
This is all the information you need to make an HTTP request back to GitHub that downloads the build specification. That’s the YAML file that describes how to build the repo. We discussed specifics on how to get that file in a previous article.
The full code
The code below includes extra items to help complete the service that we haven’t discussed. Specifically:
- I use docopt a lot to create CLIs and this is no exception. It works with the docstring at the top of the file to define the arguments.
- The forge module referenced here contains the common functions used to authenticate as a GitHub App and make requests to the developer API. Those details come from articles mentioned earlier, but you'll see more of it in future ones.
- When the service starts, it initializes a dictionary with instances of the different application installations configured by our users. These "forges" will automatically re-authenticate when used after their tokens expire.
- There's an extra /status endpoint used only to check whether the server is up and listening. This is very helpful when doing debugging, but may also be required if you're routing through cloud load balancers.
- References to Docker and related parameters will be covered in the next article.
"""GitHub continuous integration service.
Usage:
server.py [options]
Options:
-a --address ADDRESS Interface on which to listen.
[default: 0.0.0.0]
-p --port PORT Port to listen on.
[default: 9098]
-d --debug Enable debug mode.
"""
__version__ = '0.0.1'
import logging
import yaml
from docopt import docopt
from sanic import Sanic, response
from forge import Forge, GitHubAPIError
RESPONSE_HEADERS = {
'Content-Type': 'application/json'
}
# The Sanic application that serves the endpoints defined below
app = Sanic()
@app.get('/status', version=1)
async def status(request):
"""Serve as a status check to verify the API is up and running"""
return response.json({'status': 'up', 'version': __version__}, headers=RESPONSE_HEADERS)
@app.post('/listen', version=1)
async def listen(request):
"""Listen for GitHub events"""
if 'X-GITHUB-EVENT' not in request.headers:
logging.info("Not a GitHub event - Ignoring")
return response.json({}, headers=RESPONSE_HEADERS)
event = request.headers['X-GITHUB-EVENT']
body = request.json
if event == 'pull_request':
handle_pullrequest_event(body)
else:
logging.info(f"Received unhandled GitHub {event} event")
return response.json({}, headers=RESPONSE_HEADERS)
def handle_pullrequest_event(details):
"""Handle Pull Request event actions"""
pr = details['pull_request']
logging.info(f"Pull Request #{pr['number']} {details['action']} in repo {details['repository']['full_name']}")
# Ignore PR actions that we don't care about
if details['action'] not in ('closed', 'opened', 'synchronize', 'reopened'):
logging.info(f"Ignoring {details['action']} action")
return
if details['action'] == 'closed':
if pr['merged']:
# PR closed because it merged
logging.info(f"Pull Request #{pr['number']} MERGED")
if pr['base']['ref'] == 'master':
# Merged to master, attempt a staging deployment
deploy_staging(pr)
elif pr['base']['ref'] == 'production':
# Merged to production, attempt a production deployment
deploy_production(pr)
else:
# PR closed without merging, do nothing
logging.info(f"Pull Request #{pr['number']} CLOSED - Ignoring")
elif pr['base']['ref'] == 'master':
# Opened, reopened or synchronized PR on the master branch, run tests
logging.info(f"Pull Request #{pr['number']} issued on master")
execute(pr, docker_host=DOCKER_HOST, docker_node_port=DOCKER_NODE_PORT)
elif pr['base']['ref'] == 'production':
# Opened, reopened or synchronized PR on the production branch, run staging tests
logging.info(f"Pull Request #{pr['number']} issued on production")
execute(pr, action='staging', docker_host=DOCKER_HOST, docker_node_port=DOCKER_NODE_PORT)
def get_build_config(owner, name, ref='master', forge=None):
"""Grab build actions for this repo"""
logging.info(f"Retrieving build config for {owner}/{name}@{ref}")
if forge is None:
forge = forges[owner]
try:
return yaml.safe_load(forge.download_file(owner, name, '.forge.yml', ref))
except GitHubAPIError as e:
if e.status_code == 404:
return None
raise e
def execute(pr, action=None, docker_host=None, docker_node_port=None):
"""Kick off .forge.yml test actions inside a docker container"""
logging.info(f"Attempting to run {'' if action is None else action} tests for PR #{pr['number']}")
owner = pr['head']['repo']['owner']['login']
repo = pr['head']['repo']['name']
sha = pr['head']['sha']
# Select the forge for this user
forge = forges[owner]
# Get build info
config = get_build_config(owner, repo, sha)
if config is None or config.get('image') is None or config.get('execute') is None:
logging.info('Unable to find or parse the .forge.yml configuration')
return
# Run the execution steps
...
if __name__ == '__main__':
# Setup logging configuration
logging.basicConfig(format='[%(levelname)s] %(asctime)s - %(funcName)s: %(message)s', level=logging.INFO)
# Parse command line
arguments = docopt(__doc__, version=__version__)
# Enable debug if needed
if arguments['--debug']:
logging.getLogger().setLevel(logging.DEBUG)
# Get the list of installations for this GitHub Application
forge = Forge(APPLICATION_ID)
logging.info("Lighting the forges...")
installs = forge.list_installations()
# Cache the installations and the client used to talk to them
forges = {install['account']['login']: Forge(APPLICATION_ID, install_id=install['id']) for install in installs}
# Set the web server address
...
# Set the docker parameters
...
logging.info("Starting server...")
app.run(host=arguments['--address'], port=int(arguments['--port']))
Other event actions of note for a build system
This build pipeline mostly works with the few pull request actions described earlier. But given that you just built a REST API, there are other actions you can explore. Here are a few interesting ones, some of which we'll use in future articles:
pull_request
Labeled, unlabeled, assigned, unassigned actions can tell you how developers are classifying the code.
push
This triggers whenever a developer pushes changes to any branch and can help you create other types of continuous build systems, or other pieces of it. I’ve used it in the past to automatically generate documentation.
issues
This group of actions allows you to develop automation around issue management. They can help format issues, automatically assign owners, or even mirror them to other management systems like JIRA (not that I’m a fan of it). The supported actions are: opened, closed, edited, reopened, deleted, labeled, milestoned, unlabeled, demilestoned, assigned, unassigned.
issue_comment and pull_request_review
Separate from the issue actions themselves, you can get notifications whenever anyone creates, edits or deletes issue comments and PR reviews. Use these for notifying folks in a chat application of pending work, or even for automatic gating of code merges.
status
As discussed in an earlier point, we’ll go over the Status API in more detail in a future article. It allows you to report and act on progress changes throughout the execution steps.
check_run
These actions and the Checks API provide a nice interface for users to review build and test results, and even request reruns of the execution steps. Some of it was covered in an earlier article about integrating GitHub with Pytest, and I expect we'll discuss more of it in the future as well.
What’s next
At this point you have a system that can receive events from GitHub, parse them, and determine whether it should execute a build. We have the blueprints necessary for an automated pipeline; the next step is to orchestrate the Docker Swarm compute needed to perform a build and test it.