Let's Talk About Closures

A closure is often described as a function that returns another function. More precisely, it's a function together with the variables it captured from the scope where it was defined. A lot of languages have them, but I seem to encounter closures mostly in JavaScript. Over the past two months I have been writing a lot of closures because of superstitions I've developed about needing them. After torturing myself writing closures in places I probably didn't have to, I decided to run some experiments to really understand the problem they solve. I'm writing this article as a way to record my results, so it's mostly for my own education. But also, I was trying to explain this stuff to my friends in the car, and I got it slightly wrong, so I wanted to clear that up.

First off, we're going to need to run some JavaScript snippets and see what happens, so we need some kind of logging. It so happens that I'll be running these experiments using the utility CodeRunner, which is a nice little app for running test scripts. However, CodeRunner doesn't route console.log to a place where you can see it. So, instead of using console.log, I'll use a line of jQuery to append text onto the HTML body.

function log(message)
{
    $("body").append($("<div>").text(message));
}

Now that we have a log function, we can run some tests! First, let's try something that definitely works the way we expect. We're going to count from 0 to 4 and log the numbers.

for( var i = 0; i < 5; i++ )
{
    log(i);
}

And the screen displays a nice, pleasing:

0
1
2
3
4

Great. Now, without changing the behavior, let's add a layer of complexity by wrapping the log statement in a function, and then calling that function.

for( var i = 0; i < 5; i++ )
{
    function foo()
    {
        log(i);
    }
    
    foo();
}

Again, the output:

0
1
2
3
4

Groovy.

Now's when the trouble starts. Each iteration, we're going to push foo onto a list foos. Then we'll call each function in the list in a separate loop.

var foos = [];

for( var i = 0; i < 5; i++ )
{
    function foo()
    {
        log(i);
    }
    
    foos.push(foo);
}

for( var j = 0; j < 5; j++ )
{
    foos[j]();
}

Now the output is:

5
5
5
5
5

HUH?

How could it print "5"? The loop body never ran with i equal to 5!

I found this really confusing at first. I expected the output 0 through 4, like before. My logic was: because foo is declared inside the loop, each iteration makes a different foo that prints the value of i for that iteration. But that's not how it works. Instead, each function in foos does the exact same thing: it prints i. There's only one i, and the loop only exits once i < 5 fails, which means i is 5 by the time any of the functions gets called. So every one of them prints 5.
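
One way to convince yourself that there's only one i is to change it by hand after the loop and call the functions again. This is a minimal sketch rather than one of my original experiments: it assumes an environment where console.log is visible (Node.js, say), and the names getters and read_all are made up for illustration. The functions return i instead of logging it, so the results can be gathered into a list.

```javascript
var getters = [];

for( var i = 0; i < 5; i++ )
{
    // Every function pushed here refers to the same variable i.
    getters.push(function() { return i; });
}

// Hypothetical helper: call every function and gather the results.
function read_all()
{
    return getters.map(function(f) { return f(); });
}

var after_loop = read_all();        // i is 5 once the loop has exited

i = 42;                             // change the one shared i by hand
var after_assignment = read_all();

console.log(after_loop.join(" "));        // 5 5 5 5 5
console.log(after_assignment.join(" "));  // 42 42 42 42 42
```

None of the functions remembered a number. They all look up i at the moment they are called.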

Side-note: I used j for the iterator in the second loop. If you use i instead, you get 0 through 4 again as the output, but that's a deceptive accident.
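
Here is a rough sketch of that accident, assuming console.log is visible (in Node.js, for example). The seen list is a name I made up so the results can be collected in one place. The second loop reuses i, so it resets the shared variable to 0 and steps it along, and each function happens to read the "right" value at the moment it is called. Calling two of the functions again afterwards gives the game away.

```javascript
var foos = [];
var seen = [];   // hypothetical list that records what each call observed

for( var i = 0; i < 5; i++ )
{
    foos.push(function() { seen.push(i); });
}

// Reusing i here resets the one shared variable and steps it again,
// so the calls record 0, 1, 2, 3, 4 by coincidence.
for( var i = 0; i < 5; i++ )
{
    foos[i]();
}

// After the second loop i is 5, so these two calls both record 5.
foos[3]();
foos[0]();

console.log(seen.join(" "));   // 0 1 2 3 4 5 5
```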

At first, I thought this might be a weird idiosyncrasy of JavaScript, but Python (my beloved Python) does essentially the same thing. This code...

foos = []
for i in range(0, 5):
    def foo():
        print(i)
    foos.append(foo)

for f in foos:
    f()

...prints out:

4
4
4
4
4

It's 4s, not 5s, because Python's for loop leaves i at the last value in the range instead of incrementing it past the end, but the principle is the same.

So, how do you make a list of 5 functions that print the numbers 0 through 4? Answer: you use a closure.

function make_number_logger(n)
{
    function foo()
    {
        log(n);
    }
    
    return foo;
}

// Start from an empty list so the functions from the earlier experiment are gone.
var foos = [];

for( var i = 0; i < 5; i++ )
{
    foos.push( make_number_logger(i) );
}

for( var j = 0; j < 5; j++ )
{
    foos[j]();
}

And now we're back to the output we're used to:

0
1
2
3
4

Everything is fine. Except... What? Why does that make a difference? Before, foo logged i at whatever value i was set to last. Now we have a function that logs n. Why doesn't it log n at whatever value n was set to last? What's the difference? Well, the difference is that n is the argument of make_number_logger and whenever that function gets called, a new, distinct n is created on a stack-frame. It's not the same n every time.

Normally, when a function returns, its stack-frame is disposed of. But here, we're returning foo, and foo knows about n, so the language has to hold on to the stack-frame (or at least the variables in it that foo uses) for that reference to still work. That's the key.
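
To see that each call really gets its own variables, and that they stay alive after the call returns, here's a small sketch. make_counter is a hypothetical example of mine, not something from the experiments above, and it assumes console.log is visible (in Node.js, for example).

```javascript
function make_counter()
{
    var count = 0;   // a fresh count is created on every call to make_counter

    function increment()
    {
        count++;
        return count;
    }

    return increment;
}

var a = make_counter();
var b = make_counter();

// a keeps counting on its own count; b starts from its own fresh one.
var results = [a(), a(), a(), b()];
console.log(results.join(" "));   // 1 2 3 1
```

If count were thrown away when make_counter returned, a couldn't keep counting. If the two calls shared one count, b() would give 4 instead of 1.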

This is where I have to apologize to my friends for explaining this wrong. What makes this work is not some magic about functions returning functions. To see what I mean, here are a couple more experiments.

function make_number_logger()
{
    function foo()
    {
        log(n);
    }
    
    return foo;
}

var n = 9;
var f = make_number_logger();

n = 17;
f();

Output:

17

It doesn't print "9" because n is global, so the line n=17; supplants the 9 with a 17. It doesn't matter that foo was returned by make_number_logger.

Also, suppose we don't return a function from make_number_logger, and instead assign it to a global variable?

function make_number_logger(n)  
{
    function foo()
    {
        log(n);
    }

    g_foo = foo;
}

make_number_logger(9);

n = 17;
g_foo();

Output:

9

This time, n is an argument again, so inside make_number_logger, n is a new n, distinct from the global one that gets the assignment to 17. JavaScript keeps that n alive because foo, which is now reachable through g_foo, still refers to it.
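
The same idea fixes the original loop without returning anything at all. The sketch below is my own illustration under a few assumptions: add_number_logger and loggers are made-up names, the inner function returns n instead of logging it so the results can be collected, and console.log is assumed to be visible (in Node.js, for example).

```javascript
var loggers = [];

function add_number_logger(n)
{
    // Nothing is returned. The inner function is stored in a global list,
    // and it still refers to this particular call's n.
    loggers.push(function() { return n; });
}

for( var i = 0; i < 5; i++ )
{
    add_number_logger(i);
}

var values = loggers.map(function(f) { return f(); });
console.log(values.join(" "));   // 0 1 2 3 4
```

What matters is that each function was created inside a call that had its own n, not how the function got out of that call.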

So, my conclusion is that I've been writing a bunch of closures I didn't need, and I've been getting the explanation wrong for a while. Hopefully, this article redeems me.