Thursday, June 27, 2013

Anonymous Function Blocks in Python

Python has anonymous functions in the form of lambdas, but they are limited to a single expression. For the most part, this is enough (especially now that print() is a function in Python 3), but there are cases where being able to have multiple statements would be useful. Right now, the way to do this in Python is to use a named, nested function:

def upload_data(dest, *urls):
    def _fetch(url):
        data = fetch_url(url)
        sent = 0
        for line in data:
            sent += send_data(dest, line)
        return sent
    return map(_fetch, urls)

Now, this could be rewritten as a set of expressions, but what if we had multi-statement anonymous functions?
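For reference, here is a rough sketch of what the expression-only rewrite might look like in today's Python. The fetch_url and send_data functions are stand-ins invented so the example runs; the post never defines them, and the real ones would presumably do network I/O.

```python
# Hypothetical stand-ins so the sketch runs; not defined anywhere in the post.
def fetch_url(url):
    # Pretend every URL yields two lines of data.
    return [url + " line 1", url + " line 2"]

def send_data(dest, line):
    # Pretend to send the line and report how many characters went out.
    dest.append(line)
    return len(line)

def upload_data(dest, *urls):
    # The loop-and-accumulate body collapses into a single sum() expression.
    return map(
        lambda url: sum(send_data(dest, line) for line in fetch_url(url)),
        urls,
    )

dest = []
totals = list(upload_data(dest, "a", "b"))
```

It works, but only because this particular body happens to reduce to a sum; anything with real control flow would not fold down this neatly.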

The Idea

I’m not calling this a proposal because, quite frankly, I’m not sure it’s worth the effort, and I certainly don’t have the time or energy to try and champion it. I also haven’t looked to see if someone else has already had the same idea: it just occurred to me and I thought I’d write it down. The thought of wading through python-ideas to see if someone already had the same idea does not strike me as a good use of time.

Anyway, I was working on implementing function annotations for IronPython, and I realized that the arrow operator (->) was not used anywhere else in the grammar, and as far as I could tell was completely unambiguous – there’s no existing Python code that would contain the arrow. So, rather than being able to put multi-line lambdas anywhere (like C# or C++), what if they were restricted, like Ruby’s blocks? Python can’t use do like Ruby does, but maybe it could use the arrow instead?

Then, because no existing Python functions expect blocks, there needs to be a way to refer to a block in a statement. I decided to copy Ruby’s use of &, but in a slightly different way – as a placeholder for the block attached to that statement. A bare & is also not valid Python code, and I could not think of anything it could combine with that would be currently valid code.


The result is something like this:

def upload_data(dest, *urls):
    return map(&, urls) -> (url):
        data = fetch_url(url)
        sent = 0
        for line in data:
            sent += send_data(dest, line)
        return sent

The -> is used to introduce the block; it’s followed by a parameter list, a :, and a suite, just like a normal funcdef. In fact, even the type annotations would be usable, although the resulting double arrow (map(&, foo) -> (f : int) -> str:) looks a bit weird.

OK, so it’s workable within the grammar (I actually implemented it in IronPython’s parser, just to be sure). What does it mean?


Semantically, these blocks are just a prettied-up version of the first function. The block is transformed into a nested function immediately before the statement with a generated name, and any block references (&) are replaced with the generated name. Some tricks would have to be played with line numbers to make debugging make sense, but that’s not insurmountable.

Multiple references would be allowed, and although I can’t think of a use case for that, it makes no sense to disallow it.

Even decorators (which are just functions, after all) can still be used:

map(my_decorator(&), foos) -> (foo):

There’s no reason they couldn’t be generators, either:

list(&()) -> ():
    i = 0
    while i < 10:
        yield i
        i += 1
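Under the same rewrite, the generator block above would presumably come out as an ordinary generator function that is called where &() appeared. The name _block_1 is again just a placeholder.

```python
def _block_1():  # placeholder for the generated name
    i = 0
    while i < 10:
        yield i
        i += 1

# &() becomes a call to the generated function, which returns a generator.
result = list(_block_1())
```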

The idea is to make them as close to named Python functions as possible. The object passed to map is still a function instance, so all existing Python functions that take a callable should be immediately usable.

Implicit Blocks

Explicitly passing around block references is necessary to deal with existing Python functions (and we all know “explicit is better than implicit”) but it’s kind of ugly. Borrowing, again, from Ruby, it would be nice to have blocks be implicit:

def map(&func, iterable):
    return [func(e) for e in iterable]

map(foos) -> (foo):

This gets a lot trickier to implement in the general case, where there might be multiple functions with implicit blocks in the same statement. A rule of “outermost-rightmost” would probably work. I’m not exactly sure what restrictions Ruby imposes.

Use Cases

Blocks are possible to implement, and probably not too hard either. However, that doesn’t mean they’re worth doing. There aren’t too many situations where you can’t use list comprehensions, generator expressions, or lambdas, and nested named functions already exist to handle the remaining cases.

There are a couple of things that they do make nicer, though. Implementing decorators that take arguments, for one:

def timed(name):
    return & -> (func):
        return functools.wraps(func)(&) -> (*args, **kwargs):
            with timer(name):
                return func(*args, **kwargs)
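For comparison, this is approximately how the same decorator has to be written today, with both nested functions given names. The timer context manager and the timings dict are assumptions added so the sketch runs; the post leaves timer undefined.

```python
import functools
import time
from contextlib import contextmanager

timings = {}  # hypothetical store for elapsed times, keyed by timer name

@contextmanager
def timer(name):
    # Stand-in for the post's undefined timer(); records elapsed seconds.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start

def timed(name):
    def decorator(func):  # the outer block, which now needs a name
        @functools.wraps(func)
        def wrapper(*args, **kwargs):  # the inner block, likewise
            with timer(name):
                return func(*args, **kwargs)
        return wrapper
    return decorator

@timed("add")
def add(a, b):
    return a + b

result = add(2, 3)
```

The two throwaway names, decorator and wrapper, are exactly what the block version gets rid of.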

Speaking of with statements, they wouldn’t be necessary with blocks:

def with_(obj, &func):
    return func(obj)

with_(open("foo.txt")) -> (f):

It’s not exactly the same, since the block cannot return from the enclosing function, and you’d need nonlocal to modify variables in the outer scope. A similar treatment could be applied to for as well.
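To see the nonlocal caveat in practice, here is a sketch in today's Python with the block written as a named nested function. This with_ goes further than the two-line version above: it calls __enter__ and __exit__ by hand, which is my assumption about what a real replacement for the with statement would need. The count_lines helper is invented for the example.

```python
import io

def with_(obj, func):
    # Drive the context manager protocol by hand instead of using `with`.
    value = obj.__enter__()
    try:
        result = func(value)
    except BaseException as exc:
        # Let the context manager decide whether to swallow the exception.
        if not obj.__exit__(type(exc), exc, exc.__traceback__):
            raise
        return None
    obj.__exit__(None, None, None)
    return result

def count_lines(stream):
    count = 0

    def body(f):
        nonlocal count  # required: body has its own scope, unlike a with suite
        for _ in f:
            count += 1

    with_(stream, body)
    return count

stream = io.StringIO("one\ntwo\nthree\n")
n = count_lines(stream)
```

Without the nonlocal line, count += 1 would raise UnboundLocalError, and a return inside body only leaves body, not count_lines.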

Finally, there are the many things Ruby does with its own blocks, such as Sinatra:

get('/hi') -> ():
    return 'Hello, World'

But Flask does basically the same thing in the confines of existing Python. Still, it is very nice syntax sugar.

The Verdict

I think the idea is sound – if blocks are added to Python, they should look something like this. The work required for blocks using explicit block references should be relatively simple for someone familiar with CPython. Implicit block references are harder, but probably still doable.

That said, the use cases aren’t enough to motivate me to want to implement it (except for possibly the decorator – I can never figure out what to name those nested functions). If anyone else wants to, feel free to reuse the syntax. And if someone else already had the same idea, my apologies.

Now that I’ve written it down, I can page this idea out and never think of it again.