Python: Mental notes about Asyncio

Here are some mental notes after spending a couple of hours with the new Asyncio module, based on the PEP 3156 spec.

The Asyncio module is the final confirmation that the multithreading war is over in Python, unless some core developers are already implementing another approach such as Transactional Memory / Automatic Mutual Exclusion. The developers preferred to start improving the asynchronous I/O API to fight the war against I/O bottlenecks instead of offering a real lock-free multithreading API. Of course, this approach makes perfect sense given the magnitude of the work involved.

Regarding the Asyncio API, I think it was designed to tackle, or rather standardize, the proliferation of coroutine/event-based state machine libraries such as Gevent, Tornado, Eventlet and many others that try to solve the same problem: event-based I/O access and cooperative coroutines. It also appears to me that this API wasn't designed to be used directly by application developers; rather, it was designed for developers who write libraries on top of it.

The Asyncio API offers a single event loop per process, which you can obtain with the get_event_loop function. You can start adding event sources to it directly: callbacks, coroutines, tasks, futures, pipes and others. Please note that, like any other asynchronous orchestration library, all your code must be asynchronous, which means that every library you use must be asynchronous-ready too; otherwise the scheduler could time out or block your callback queue.
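For example, a minimal sketch of grabbing the per-process loop and attaching a plain callback could look like this (the on_tick name is just illustrative):

import asyncio

loop = asyncio.get_event_loop()

def on_tick(message):
    # A plain synchronous callback; it must not block,
    # otherwise it stalls everything else in the callback queue.
    print(message)
    loop.stop()

# Schedule the callback on the single per-process loop and run it.
loop.call_soon(on_tick, 'hello from the loop')
loop.run_forever()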

The Asyncio API exposes a few methods for scheduling callbacks, call_soon(), call_later() and call_at(), which fit almost all the common use cases; of course, all of them are pretty self-explanatory and well documented.
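As a rough sketch (the show callback is made up), the three methods only differ in when the callback fires:

import asyncio

loop = asyncio.get_event_loop()

def show(label):
    print(label)

# Run as soon as the loop gets control back.
loop.call_soon(show, 'soon')

# Run after a relative delay, in seconds.
loop.call_later(2, show, 'two seconds later')

# Run at an absolute time on the loop's internal clock.
loop.call_at(loop.time() + 3, show, 'at loop.time() + 3')

# Give the delayed callbacks time to fire, then stop.
loop.call_later(4, loop.stop)
loop.run_forever()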

A little detail that caught my attention is the existence of two separate concepts, coroutines and Tasks; I am still trying to figure out the exact difference between them, because a coroutine can be turned into a Task directly.
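My current understanding, sketched below, is that a coroutine is just the generator-based function, while a Task is a Future subclass that drives that coroutine on the loop; wrapping one in asyncio.Task() is enough to schedule it (the compute coroutine here is only an illustration):

import asyncio

@asyncio.coroutine
def compute(x):
    # A bare coroutine: nothing runs until something drives it.
    yield from asyncio.sleep(0.1)
    return x * 2

loop = asyncio.get_event_loop()

# Wrapping the coroutine in a Task schedules it on the loop and gives
# back a Future-like object we can wait on or attach callbacks to.
task = asyncio.Task(compute(21))
result = loop.run_until_complete(task)
print(result)  # 42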

The Asyncio module also offers a clean, high-level suite for managing network connections, which makes the kind of work that is painful in Twisted easy for the rest of us. This will be the topic of a future article :).

My most common use case for a coroutine framework is to attach a set of coroutines and explicitly wait for their completion; I also expect a way to notify sibling coroutines about events and to block them, or even block the entire loop if needed. The Asyncio module covers all of these needs perfectly, so it is a perfect standard replacement for my code that uses Gevent and Greenlets.
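Under those requirements, a hedged sketch of how I would map my Gevent-style usage onto Asyncio (the worker and starter coroutines are invented for the example) could be:

import asyncio

@asyncio.coroutine
def worker(name, started):
    # Block this coroutine until a sibling signals the shared event.
    yield from started.wait()
    print('%s running' % name)

@asyncio.coroutine
def starter(started):
    # Simulate some setup work, then wake up every waiting sibling.
    yield from asyncio.sleep(0.1)
    started.set()

loop = asyncio.get_event_loop()
started = asyncio.Event()

# Attach the whole set of coroutines and explicitly wait for completion,
# the same way I would join a group of Greenlets.
coros = [worker('a', started), worker('b', started), starter(started)]
loop.run_until_complete(asyncio.wait(coros))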

On the other hand, the API also exposes a class called asyncio.Future, which looks pretty similar to jQuery's $.Deferred object in JavaScript. I noticed that I was using $.Deferred a lot. A classic use case in JavaScript would be something like the following code:

var f = function() {
    // Wrap an asynchronous call (asynFunction stands in for any async
    // operation) into a promise that resolves with its results.
    return $.Deferred(function(d) {
        asynFunction().done(function(results) {
            d.resolve(results);
        });
    }).promise();
};

var w = $.when(f(), f(), f());

w.done(function() {
    console.log(arguments);
});

w.fail(function() {
    console.log('failed');
});

In this case I am passing a collection of $.Deferred objects to the $.when function, which adds them to a callback queue and waits for them to reach a completed or failed state. Writing something pretty similar with the new Asyncio module looks like this:

#!/usr/bin/env python3.3
# -*- coding: utf-8 -*-

import asyncio

# Split a list into consecutive slices of `step` elements.
chunk = lambda ulist, step: map(lambda i: ulist[i:i+step],
                                range(0, len(ulist), step))

loop = asyncio.get_event_loop()
chunks = chunk(list(range(0, 1000)), 8)

def sort_me(chunk, future):
    # Resolve the future with the sorted chunk, much like d.resolve().
    future.set_result(sorted(chunk))

def create_future(chunk):
    # Create a pending future and schedule the callback that resolves it.
    f = asyncio.Future()
    loop.call_soon(sort_me, chunk, f)
    return f

@asyncio.coroutine
def run():
    for chunk in chunks:
        yield from create_future(chunk)

def main():
    loop.run_until_complete(run())

main()

This code is not doing anything interesting; it is just sorting chunks of sets of integers. Please note that the coroutines are driven using the new yield from syntax, and the loop method run_until_complete plays roughly the same role as the JavaScript $.when example above.
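Strictly speaking, run_until_complete only drives the run() coroutine, which consumes the futures one by one; if I wanted something even closer to $.when, where all futures are queued together and their results collected at the end, I would probably reach for asyncio.wait. A hedged, self-contained variation of the code above:

import asyncio

loop = asyncio.get_event_loop()

def sort_chunk(chunk, future):
    future.set_result(sorted(chunk))

def create_future(chunk):
    f = asyncio.Future()
    loop.call_soon(sort_chunk, chunk, f)
    return f

@asyncio.coroutine
def run_all(chunks):
    # Queue every future at once, the way $.when does, then wait
    # until all of them are resolved; asyncio.wait returns (done, pending).
    done, pending = yield from asyncio.wait([create_future(c) for c in chunks])
    return [f.result() for f in done]

chunks = [list(range(i, i + 8)) for i in range(0, 1000, 8)]
results = loop.run_until_complete(run_all(chunks))
print(len(results))  # 125 sorted chunks, in completion order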

I think the Asyncio module is a very relevant step towards making Python more reliable in the face of the I/O bottlenecks that computers hit today, with a nice API; it is also a good step towards standardizing the growing set of asynchronous orchestration libraries.

I wish there were something more streamlined, like the jQuery $.Deferred API, with fewer methods and cleaner interfaces, but I understand that the core Asyncio module has to cover all of the user scenarios.

Soon I will continue writing more about this module.


Jorge Niedbalski


Software Engineer, focused on automation.