What’s in a name-less method?

When I first saw the announcements for the new Delphi 2009 features, a little more than a year ago, my reactions went something like this:

Unicode: Hmm… looks interesting.
Generics: YES! FINALLY!
Anonymous methods: …huh?

I think that’s pretty much how everyone reacted to anonymous methods at first.  The syntax is kinda ugly, and you could get lost trying to read through a procedure and finding another procedure declared in the middle of it like that.  And all this so you could have… drumroll please… a procedure with no name! Ta-da! Umm… yeah.  OK, whatever.

Things got a little more interesting, and their purpose a lot clearer, when people started talking about anonymous methods as closures: an anonymous method can capture local variables, retain their state, and be passed around.  That’s actually kinda cool!  But how does it work?  Well, let’s find out.

[code lang="delphi"]
program Project1;

{$APPTYPE CONSOLE}

uses
   SysUtils;

type
   TAnonProc = reference to procedure;

procedure doStuff;
var
   counter: integer;
   i: integer;
   countProc: TAnonProc;
begin
   counter := 0;
   countProc := procedure
                begin
                   inc(counter);
                end;
   for i := 1 to 10 do
      countProc;
   writeln(counter);
end;

begin
   try
      doStuff;
   except
   on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
   end;
   readln;
end.
[/code]

When I run this, the output is 10, showing that the closure has been modifying the actual local variable of the doStuff procedure and not its own private copy.  But how can that be?  Local variables are normally stored on the stack, yet this closure can be assigned to a reference that gets passed out of the scope of the procedure, after the stack frame is gone.  Well, let’s look at the assembly code for doStuff to get a sense of what’s really going on.  It may not be pretty, but it’s honest.  Nothing can hide from the assembly view.

[code lang="asm"]
Project1.dpr.16: begin
push ebp
mov ebp,esp
push $00
push $00
push ebx
push esi
xor eax,eax
push ebp
push $0040e8f1
push dword ptr fs:[eax]
mov fs:[eax],esp
mov dl,$01
mov eax,[$0040e7a8]
call TObject.Create
mov esi,eax
lea eax,[ebp-$08]
mov edx,esi
test edx,edx
jz $0040e891
sub edx,-$08
call @IntfCopy

Project1.dpr.17: counter := 0;
xor eax,eax
mov [esi+$0c],eax

Project1.dpr.18: countProc := procedure
lea eax,[ebp-$04]
mov edx,esi
test edx,edx
jz $0040e8a7
sub edx,-$10
call @IntfCopy

Project1.dpr.22: for i := 1 to 10 do
mov ebx,$0000000a

Project1.dpr.23: countProc;
mov eax,[ebp-$04]
mov edx,[eax]
call dword ptr [edx+$0c]

Project1.dpr.22: for i := 1 to 10 do
dec ebx
jnz $0040e8b1

Project1.dpr.24: writeln(counter);
mov eax,[$00411ccc]
mov edx,[esi+$0c]
call @Write0Long
call @WriteLn
call @_IOTest

Project1.dpr.25: end;
xor eax,eax
pop edx
pop ecx
pop ecx
mov fs:[eax],edx
push $0040e8f8
lea eax,[ebp-$08]
call @IntfClear
lea eax,[ebp-$04]
call @IntfClear
ret
jmp @HandleFinally
jmp $0040e8e0
pop esi
pop ebx
pop ecx
pop ecx
pop ebp
ret
[/code]

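Wow, that’s a lot of code!  The really interesting stuff here is what’s going on in the “begin” line.  After a bit of standard setup, it installs an implicit try..finally frame (the cleanup code at the end releases the interface references through @IntfClear), then does this: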

[code lang="asm"]
mov eax,[$0040e7a8]
call TObject.Create
mov esi,eax
lea eax,[ebp-$08]
mov edx,esi
test edx,edx
jz $0040e891
sub edx,-$08
call @IntfCopy
[/code]

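It places a class pointer in EAX, then calls TObject.Create to set up a new instance of that class, and stores the resulting object pointer in the ESI register.  Then it loads the address of a hidden local at [ebp-$08] into EAX (first parameter in the register calling convention) and copies ESI to EDX (second parameter).  Note that this hidden local is not countProc itself, which lives at [ebp-$04] as the later lines show; it appears to be a compiler-generated reference that keeps the closure object alive for the duration of doStuff.  If the object pointer isn’t nil, the odd-looking sub edx,-$08 adds 8 to it, so that EDX points at the object’s IInterface entry instead of at the start of the object.  Finally it calls System._IntfCopy, which is declared as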

[code lang="delphi"]procedure _IntfCopy(var Dest: IInterface; 
            const Source: IInterface);[/code]
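
To see that pointer adjustment from the Pascal side, here’s a minimal sketch, not taken from the original project.  It assigns a plain TInterfacedObject to an IInterface variable (the same kind of assignment that compiles down to an _IntfCopy call) and prints how far the interface pointer sits from the start of the object.  The program name is made up, and I’m assuming NativeInt is available in your compiler version.

[code lang="delphi"]
program IntfOffset;

{$APPTYPE CONSOLE}

var
   obj: TInterfacedObject;
   intf: IInterface;
begin
   obj := TInterfacedObject.Create;
   // Assigning an object to an interface variable stores a pointer
   // into the object, not a pointer to its start
   intf := obj;
   // Distance in bytes between the interface pointer and the object pointer
   writeln(NativeInt(Pointer(intf)) - NativeInt(Pointer(obj)));
end.
[/code]

On a 32-bit build I’d expect this to print the same 8 that shows up in sub edx,-$08, but treat that as an assumption about object layout and not a guarantee; a 64-bit compiler will lay things out differently.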

So now we have an interface reference to an object containing an anonymous method, used to implement the closure.  When we go to assign to counter locally, we get this:

[code lang="asm"]
Project1.dpr.17: counter := 0;
xor eax,eax
mov [esi+$0c],eax[/code]

Aha!  So it’s not stored on the stack after all; it’s treated more or less as a var parameter referring to a field inside the closure object.  It’s referenced as [esi + $0C], so our “local” variable is stored 12 bytes in.  That fits the rest of the listing: the IInterface reference was taken at offset $08, and the “anonymous method interface” for this closure is taken at offset $10, so counter sits right between the two interface entries of an object that implements two interfaces.  This also explains why an anonymous method can’t capture by-reference parameters or the Result variable.  A by-reference parameter points at storage owned by the caller, and Result canonically resides either in EAX or on the stack, depending on your calling convention.  The compiler needs to be able to reference a captured variable inside the closure object’s heap space, and it can’t do that if the variable’s already living somewhere else.
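
To make that concrete, here’s a rough hand-written equivalent of what the compiler seems to be generating for doStuff.  This is a sketch under assumptions, not the compiler’s actual output: IAnonProc, TCounterFrame and keepAlive are names I made up for illustration, and the real generated class and interface have hidden names of their own.

[code lang="delphi"]
program ClosureByHand;

{$APPTYPE CONSOLE}

type
   // Hypothetical stand-in for the hidden "anonymous method interface".
   // Invoke is the 4th method slot (after QueryInterface, _AddRef and
   // _Release), which matches the call through [edx+$0c] in the listing.
   IAnonProc = interface
      procedure Invoke;
   end;

   // Hypothetical stand-in for the compiler-generated closure object
   TCounterFrame = class(TInterfacedObject, IAnonProc)
   public
      counter: integer;   // the captured "local" lives here, on the heap
      procedure Invoke;   // body of the anonymous method
   end;

procedure TCounterFrame.Invoke;
begin
   inc(counter);
end;

procedure doStuff;
var
   frame: TCounterFrame;
   keepAlive: IInterface;   // plays the role of the hidden reference at [ebp-$08]
   countProc: IAnonProc;    // plays the role of countProc at [ebp-$04]
   i: integer;
begin
   frame := TCounterFrame.Create;
   keepAlive := frame;
   frame.counter := 0;      // "counter := 0" becomes a field write
   countProc := frame;
   for i := 1 to 10 do
      countProc.Invoke;
   writeln(frame.counter);  // prints 10, same as the original program
end;

begin
   doStuff;
end.
[/code]

Both interface variables are released when doStuff exits, and reference counting then frees the object, which mirrors the two @IntfClear calls at the end of the disassembly.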

This raises an interesting question:  What happens if you create two anonymous methods that both refer to the same local variable?  Where is the variable stored then?  I won’t waste space posting another big disassembly, but the answer is, “in the closure object for that procedure.”  Only one gets created, with multiple (anonymous) methods and multiple interface references.  (Incidentally, this is much like the classic Lisp and Scheme trick of building objects out of closures:  by wrapping multiple closures around the same set of variables.)
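
You can at least check the observable behavior without a disassembler.  In this small sketch (the program and the incProc/doubleProc names are mine, just for illustration), two anonymous methods capture the same counter.  If each one had its own copy, their changes wouldn’t be visible to each other or to doStuff.

[code lang="delphi"]
program SharedCapture;

{$APPTYPE CONSOLE}

type
   TAnonProc = reference to procedure;

procedure doStuff;
var
   counter: integer;
   incProc, doubleProc: TAnonProc;
begin
   counter := 1;
   // Both anonymous methods capture the same variable
   incProc := procedure
              begin
                 inc(counter);
              end;
   doubleProc := procedure
                 begin
                    counter := counter * 2;
                 end;
   incProc;            // counter = 2
   doubleProc;         // counter = 4
   incProc;            // counter = 5
   writeln(counter);   // prints 5
end;

begin
   doStuff;
end.
[/code]

The result is 5, which only works out if both methods and doStuff itself are all reading and writing the same storage.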

There’s more interesting stuff that can be found by poking around in closure objects, but this post is getting long enough, so that’s all for now.  I’ll go into more details about anonymous methods and closure objects in a few days.

7 Comments

  1. Anax says:

    Looks great; keep it up!

  2. François says:

    Very interesting indeed!

  3. […] been explained several times by several different people, including me in my last article about anonymous methods, that anonymous methods are implemented through interfaces.  And all it takes is a trivial little […]

  4. cheeceicy says:

    Sry for being offtopic – what Word Press theme are you using? Looks interesting!!

  5. Mason Wheeler says:

    It’s called Fluid Blue.

  6. […] And why are they all the same?  It has to do with the way anonymous methods work.  Take a look at my analysis from a while back and see if it becomes any […]