Asyncio in Python 3.7
The release of Python 3.7 introduced a number of changes into the async world. There are a lot of quality-of-life improvements, some affect compatibility across versions, and some are new features to help manage concurrency better. This article will go over the changes, their implications, and the things you’ll want to watch out for if you’re a maintainer. Some may affect you even if you don’t use asyncio.
For a quick summary of all the changes, the official “What’s New in Python 3.7” page is a great resource. Let’s dive in.
New Reserved Keywords
The async and await keywords are now reserved. This was needed in order to cement the asynchronous constructs we’ve been using since 3.5. I think it’s a good step forward. However, it creates a certain set of complications for any existing code that uses those keywords, regardless of whether it uses asyncio or not.
It’s common to use the word async as a function parameter or a variable that denotes whether to execute code while performing a blocking wait or not. There are already quite a few modules broken because of this. However, the fix is easy: rename any such variables and parameters. To keep with common Python patterns, I would use async_ instead.
# This is a syntax error
def do_some_long_op(async=True):
    ...

# This is not
def do_some_long_op(async_=True):
    ...

do_something()

# This is also a syntax error
async = 'this work could be asynchronous'

# This is not
async_ = 'this work could be asynchronous'

do_some_other_thing()
If you’re the author of a Python package, please have a look through your code for these keywords. A simple find-and-replace is enough.
Context Variables
Version 3.7 now allows the use of context variables within async tasks. If this is a new concept to you, it might be easier to picture it as global variables whose values are local to the currently running coroutines.
Consider a client-server system where the server is managing a connection to a specific client address. To operate on the client you might need to pass the address around between coroutines. But with context variables, any client communication task can reference the information with the same name, while the value differs depending on which client it’s talking to (each client is in a different context).
The example below (taken from the documentation) illustrates this scenario fairly well. An asyncio server executes handle_request() when a new client connects. This sets the client_addr_var context variable, which we then access in render_goodbye() without having to pass it as a parameter.
import asyncio
import contextvars

client_addr_var = contextvars.ContextVar('client_addr')

def render_goodbye():
    # The address of the currently handled client can be accessed
    # without passing it explicitly to this function.
    client_addr = client_addr_var.get()
    return f'Good bye, client @ {client_addr}\n'.encode()

async def handle_request(reader, writer):
    addr = writer.transport.get_extra_info('socket').getpeername()
    client_addr_var.set(addr)

    # In any code that we call, it is now possible to get the
    # client's address by calling 'client_addr_var.get()'.
    while True:
        line = await reader.readline()
        ...  # process the line; break when the client is done

    writer.write(render_goodbye())
    writer.close()

async def main():
    srv = await asyncio.start_server(handle_request, '127.0.0.1', 8081)
    async with srv:
        await srv.serve_forever()

asyncio.run(main())
Python has similar constructs (such as threading.local()) for doing this very thing across threads. However, those were not sufficient in the async world because each thread can run multiple coroutines, each needing its own context. Having asyncio support context variables directly solves that issue.
To further expand on the concept, the coroutine scheduling functions loop.call_soon(), loop.call_soon_threadsafe(), loop.call_later(), loop.call_at(), and Future.add_done_callback() now accept an optional keyword-only context parameter so that tasks can manage their context automatically.
For more details check out PEP 567 and the contextvars module.
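To see the context parameter in action, here is a minimal sketch; the request_id variable and the callbacks are made up for illustration. A snapshot taken with contextvars.copy_context() can be handed to loop.call_soon(), and the callback then sees the values from that snapshot rather than the current ones:

```python
import asyncio
import contextvars

request_id = contextvars.ContextVar('request_id', default='none')
seen = []

def callback():
    # Reads whichever value is current in the context it runs in.
    seen.append(request_id.get())

async def main():
    loop = asyncio.get_running_loop()

    request_id.set('req-42')
    snapshot = contextvars.copy_context()  # freeze the current values

    request_id.set('req-99')
    loop.call_soon(callback, context=snapshot)  # sees the snapshot
    loop.call_soon(callback)  # default: copies the context at this point

    await asyncio.sleep(0)  # let the scheduled callbacks run

asyncio.run(main())
print(seen)  # ['req-42', 'req-99']
```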
A quick warning before moving on: Just because you can do this doesn’t mean you should do it. Overuse will make your code appear magical and therefore hard to read. Since our ultimate goal is to write readable code, please think about it carefully before you decide to use it.
New asyncio.run() Function
This is one of the quality-of-life improvements. In previous versions, to properly execute a coroutine you had to manage the event loop yourself. This can get a little verbose and, because it’s not hard, a lot of people make assumptions about the loop (usually that they’re the only ones using it). This tends to get them into trouble when running alongside other code.
With a call to asyncio.run(), we can now automatically create a loop, run a task on it, and close it when complete. Not only does this make your code shorter and easier to read, but it also manages the loop with a common mechanism. However, it’s very important to note that because it closes the loop when done, this function restricts how the loop can be used by other parts of the code. Meaning, it’s really intended as an entrypoint into asynchronous code that is only used once, not as a bread-and-butter “run this next task” call.
async def some_async_task():
    ...

# Before Python 3.7
loop = asyncio.get_event_loop()
loop.run_until_complete(some_async_task())
loop.close()

# After Python 3.7
asyncio.run(some_async_task())
This is still a provisional API, so be careful how you use it. The interface may change in the future.
Simpler Task Management
Along the same lines, there’s a new asyncio.create_task() function that creates tasks inside the currently running loop, instead of having to get the loop first and calling create_task() on it. While it makes code shorter and more readable, it also makes the loop selection implicit, so you’ll have to keep that in mind when scanning through code.
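As a quick sketch of the difference (the fetch() coroutine below is made up for illustration):

```python
import asyncio

async def fetch(n):
    # Stand-in for some real asynchronous work.
    await asyncio.sleep(0)
    return n * 2

async def main():
    # Before 3.7 you would write:
    #   loop = asyncio.get_event_loop()
    #   tasks = [loop.create_task(fetch(n)) for n in range(3)]
    # Now the running loop is implicit:
    tasks = [asyncio.create_task(fetch(n)) for n in range(3)]
    return [await t for t in tasks]

results = asyncio.run(main())
print(results)  # [0, 2, 4]
```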
We also see the addition of current_task() and all_tasks() to the base asyncio module. They make reasoning about async code a bit easier, especially since both functions take a loop argument to explicitly express the loop on which we’re operating. Their predecessors were class methods on the asyncio.Task class, which are now deprecated. If you maintain a package that depends on asyncio, you’ll want to start moving any Task.current_task() or Task.all_tasks() calls to these new interfaces.
A common use case to illustrate the change is when canceling tasks on your way out of an async block:
try:
    loop.run_forever()
except KeyboardInterrupt:
    # Cancel pending tasks and stop the loop

    # Previous to Python 3.7
    asyncio.gather(*asyncio.Task.all_tasks()).cancel()

    # After the changes in 3.7
    asyncio.gather(*asyncio.all_tasks()).cancel()
Simpler Event Loop Management
It’s not hard to get the current event loop by using asyncio.get_event_loop(), but you have to make separate calls to determine its state of execution. Most folks who are new to asyncio are not aware of the distinction, nor do they realize that different modules can make their own loops instead of interacting with the one loop in which you’re usually operating.
The addition of asyncio.get_running_loop() will help determine the active event loop, raising a RuntimeError if there’s no loop running. While I expect this to see less overall usage than the other changes, it will definitely help the many modules that depend on running things in their own loop. It simplifies interoperability between different modules that use asyncio.
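A minimal sketch of the distinction: get_running_loop() only succeeds inside a running loop and raises otherwise.

```python
import asyncio

def where_am_i():
    # get_running_loop() raises RuntimeError when no loop is running,
    # unlike get_event_loop(), which may quietly create one.
    try:
        asyncio.get_running_loop()
        return 'inside a running loop'
    except RuntimeError:
        return 'no running loop'

async def main():
    return where_am_i()

print(where_am_i())         # no running loop
print(asyncio.run(main()))  # inside a running loop
```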
Also in the same category, the Task, Server and Future classes now have get_loop() methods to determine which loop they are running on. This completes the loop management picture by helping us find which loops our tasks are scheduled on or, even better, which loops our futures are waiting on. Again, not something I would expect your everyday async code to need, but there are several of us who make frameworks or modules and will find the functionality valuable.
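For instance, here is a small sketch checking that tasks and futures report the loop that owns them:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()

    task = asyncio.create_task(asyncio.sleep(0))
    future = loop.create_future()

    # New in 3.7: ask a Task or Future which loop it belongs to.
    same_task_loop = task.get_loop() is loop
    same_future_loop = future.get_loop() is loop

    await task
    return same_task_loop, same_future_loop

print(asyncio.run(main()))  # (True, True)
```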
Callback updates
When using call_soon() or call_soon_threadsafe(), we normally get a Handle object back that we can use to cancel the call. Now there’s a Handle.cancelled() method to determine whether the call was already canceled. Meaning, we can better handle interrupts or exceptions that may cancel a call without our knowledge.
Another change is that cancelled tasks will no longer log exceptions. This fixes a nuisance where you might exit an application with running coroutines, and the act of interrupting the calls would log exceptions. These messages could mislead the user into thinking that there was some previously unhandled problem, when really they were caused by the interruption itself.
Along the same lines of managing callbacks, loop.call_later() now returns callback objects with a new when() method that tells us the absolute timestamp at which the callback is expected to run.
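Both additions can be sketched together; the 60-second callback below is made up and never actually runs:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()

    # call_later() returns a TimerHandle; when() gives the absolute
    # loop time at which the callback is scheduled to run.
    handle = loop.call_later(60, print, 'never runs')
    delay = handle.when() - loop.time()

    before = handle.cancelled()  # False: still scheduled
    handle.cancel()
    after = handle.cancelled()   # True: the call was cancelled

    return delay, before, after

delay, before, after = asyncio.run(main())
print(before, after)  # False True
```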
Async Context Managers
Another quality-of-life improvement. We now have the asynccontextmanager() decorator for producing async context managers without the need for a class that implements __aenter__() and __aexit__(). It behaves exactly like the contextmanager() decorator that we use today for synchronous code. They also added a new AbstractAsyncContextManager and AsyncExitStack to complement their synchronous cousins.
If you’re not familiar with the concept, asynchronous context managers await at the async with line before entering their code block. To illustrate, imagine you want to access a web API asynchronously to obtain a list of resources. Before executing the list call, you have to log in and use the resulting token in your list call.
from contextlib import asynccontextmanager

@asynccontextmanager
async def login(username, password):
    # Wait for the login to complete and return the token
    token = await _login_to_web_api(username, password)
    try:
        # Execute the context block
        yield token
    finally:
        # Logout
        await _logout_from_web_api(token)

async def list_resources():
    async with login(username, password) as token:
        # We are now logged in and have a valid token
        return await _list_resources(token)
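The web API helpers above are placeholders, so here is a self-contained sketch that also shows the new AsyncExitStack, which enters any number of async context managers and unwinds them in reverse order on exit:

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

log = []

@asynccontextmanager
async def resource(name):
    # A toy async context manager that records enter/exit order.
    log.append(f'open {name}')
    try:
        yield name
    finally:
        log.append(f'close {name}')

async def main():
    async with AsyncExitStack() as stack:
        for name in ('db', 'cache'):
            await stack.enter_async_context(resource(name))
        log.append('work')

asyncio.run(main())
print(log)  # ['open db', 'open cache', 'work', 'close cache', 'close db']
```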
Performance Improvements
Several functions are now optimized for speed; some were even reimplemented in C. Here’s the list:
- asyncio.get_event_loop() is now 15 times faster.
- asyncio.gather() is 15% faster.
- asyncio.sleep() is two times faster when the delay is zero or negative.
- asyncio.Future callback management is optimized.
- Reduced overhead for the asyncio debug mode.
Server and Socket Improvements
There are lots of improvements to async servers and sockets. But unless you work in this world, most of them are transparent to you, so I’ve just summarized them below for completeness:
- We have a new loop.start_tls() method to upgrade an existing connection to TLS.
- loop.sock_recv_into() lets us read data from a socket directly into a buffer, reducing the number of data copies.
- A BufferedProtocol class was added provisionally to implement streaming protocols with manual control over the receive buffer.
- We can now wait for a stream writer to close with StreamWriter.wait_closed(), as well as determine whether a writer is in the process of closing with StreamWriter.is_closing().
- A new loop.sock_sendfile() lets us send files with os.sendfile where possible.
- We can now control when an asyncio.Server begins serving during creation with the start_serving keyword and the Server.start_serving() function. Plus we have Server.is_serving() to determine its state.
- Server objects are now asynchronous context managers which will automatically close the server.
- The loop.create_datagram_endpoint() method gained support for Unix sockets.
- The asyncio.open_connection() and asyncio.start_server() functions, the loop.create_connection(), loop.create_server() and loop.create_accepted_socket() methods and their UNIX variants now accept the ssl_handshake_timeout keyword argument.
- We can now use ReadTransport.is_reading() to determine the reading state of a transport. Additionally, calls to ReadTransport.resume_reading() and ReadTransport.pause_reading() are now idempotent.
- Loop methods which accept socket paths now support passing path-like objects.
- In asyncio, TCP sockets on Linux are now created with the TCP_NODELAY flag set by default.
- asyncio.Server.sockets now returns a copy of the internal list of server sockets, instead of returning it directly.
Generator Exception Handling
PEP 479 is now fully implemented in 3.7. This fixes the situation in which a generator that raises StopIteration may mask a real problem somewhere in the call stack. From now on, directly or indirectly raising StopIteration in coroutines and generators will be transformed into a RuntimeError exception instead. I suggest looking at the PEP for more information.
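A small sketch of the new behavior, using a deliberately broken generator:

```python
def broken_gen():
    yield 1
    # Before PEP 479 this would silently end the generator; in 3.7
    # it surfaces as a RuntimeError instead of being swallowed.
    raise StopIteration

gen = broken_gen()
print(next(gen))  # 1
try:
    next(gen)
except RuntimeError:
    print('StopIteration was turned into RuntimeError')
```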
Other Miscellaneous Changes
- object.__aiter__() methods can no longer be declared as asynchronous.
- Support for directly awaiting instances of asyncio.Lock and other asyncio synchronization primitives has been deprecated. An asynchronous context manager must be used in order to acquire and release the synchronization resource.
- The asyncio.windows_utils.socketpair() function has been removed. Use the socket.socketpair() function instead.
- asyncio no longer exports the selectors and _overlapped modules as asyncio.selectors and asyncio._overlapped. Replace from asyncio import selectors with import selectors.
- loop.sock_recv(), loop.sock_sendall(), loop.sock_accept(), loop.getaddrinfo() and loop.getnameinfo() are now actual coroutines, where before they just returned futures.
- We have two new event loop policies called WindowsSelectorEventLoopPolicy and WindowsProactorEventLoopPolicy.
Summarizing
As you can see, the core Python community is putting a large effort into expanding, simplifying and optimizing Python’s async capabilities. There’s still a lot of work to do, but most of it will happen through evolution, as we collectively learn the best ways of interacting with, designing and using asynchronous code. I expect lots of churn in this area in the years to come.