Wish list: Generics collapsing

One annoying thing I’ve noticed in building my script compiler is the way the use of generic collections tends to bloat up the size of your EXE.  I use generics for a lot of things; a compiler uses lists, stacks and lookup tables (dictionaries) all over the place.  When I was building it with DeHL, the compiler plus a very simple test frontend compiled into a 36 MB behemoth of a binary, and a quick look at the mapfile shows that the vast majority of that was DeHL collections.  Now that I’ve switched to the more simplified Collections library, it “only” takes 23 MB, a savings of about 33%.  But that’s still huge.  There has to be a better way.

Why does all this Generics code take up so much space?  Because generics aren’t actual code; they’re a template for creating actual code.  The compiler creates a new copy of the class for each instantiation of a generic class.  And this makes sense, to a certain extent.  You can’t reuse the same machine code implementation for, say, a TList<TMyObject> and a TList<IMyInterface>, because the interface version needs the compiler to generate the reference-counting code when working with the internal storage. Same for strings.  And records, integers and other value types need to be handled differently than objects and interfaces, which are reference types.  The compiler handles all that under the hood, and it makes our job a lot easier.

But… this has the side-effect of also making a bunch of silly, unnecessary copies.  For example, if you have a TList<TMyObject1> and a TList<TMyObject2>, the generated machine code for their routines will be identical, because there’s nothing in the difference between two classes that matters to TList<T>.  To illustrate, here’s a disassembly of the .Add methods of two different TList instantiations for two different class types.

Type 1:

[code lang="asm"]
53               push ebx
56               push esi
51               push ecx
8BF2             mov esi,edx
8BD8             mov ebx,eax
8B5308           mov edx,[ebx+$08]
42               inc edx
8BC3             mov eax,ebx
E856FDFFFF       call Generics + $43A074
8B4308           mov eax,[ebx+$08]
890424           mov [esp],eax
8B5304           mov edx,[ebx+$04]
893482           mov [edx+eax*4],esi
FF4308           inc dword ptr [ebx+$08]
33C9             xor ecx,ecx
8BD6             mov edx,esi
8BC3             mov eax,ebx
8B18             mov ebx,[eax]
FF5304           call dword ptr [ebx+$04]
8B0424           mov eax,[esp]
5A               pop edx
5E               pop esi
5B               pop ebx
C3               ret
[/code]

Type 2:

[code lang="asm"]
53               push ebx
56               push esi
51               push ecx
8BF2             mov esi,edx
8BD8             mov ebx,eax
8B5308           mov edx,[ebx+$08]
42               inc edx
8BC3             mov eax,ebx
E856FDFFFF       call Generics + $43AC54
8B4308           mov eax,[ebx+$08]
890424           mov [esp],eax
8B5304           mov edx,[ebx+$04]
893482           mov [edx+eax*4],esi
FF4308           inc dword ptr [ebx+$08]
33C9             xor ecx,ecx
8BD6             mov edx,esi
8BC3             mov eax,ebx
8B18             mov ebx,[eax]
FF5304           call dword ptr [ebx+$04]
8B0424           mov eax,[esp]
5A               pop edx
5E               pop esi
5B               pop ebx
C3               ret
[/code]

You can eyeball this, or paste it into BeyondCompare if you’d like, but they’re identical.  The only difference is in the CALL line; the disassembly shows a different offset on the right, because the two collection classes are each calling their own versions of GrowCheck.  But the actual machine code on the left is identical, because the CALL instruction uses a relative jump and the routines are arranged the same way in relation to each other.

This being true, it would be really nice if the Delphi linker understood this (it would have to be done in the linker, because you could have different instantiations of the same basic generic class in different units) and kept track of all the instantiations of each generic class and compared them against each other, method by method. When it found routines that were identical like this, it should strip out the duplicates and adjust call references, VMT slots, etc accordingly.

This would probably slow the linker down by a certain amount, and if it’s too slow there should probably be a compiler switch to disable it so it doesn’t interfere with coding productivity, but it’s certainly something I’d like to be have available when I’m making a release build.

17 Comments

  1. Markus says:

    HI Mason,

    Yes, generics in Delphi, as much as I don’t want to miss them anymore, have a tendency to come up with new surprises as you work with them. E.g. until recently I was not aware of the fact that generics really work like template classes that are built into specific classes for a specific type. With a colleague we were wondering how this would turn out with class variables in a generic class. Well, the result of some simple tests was, that given the following simplified class:
    TMyClass
    class var MyVar:T

    end;

    If you instantiate this class with the same type T in the SAME unit several times, then all this instances operate on the same class var.
    But if you instantiate the same TMyClass with the same type T in a different unit, then the compiler generates a new class in the other unit, therby having a separate class var. This behaviour is of course quite different from what we would expect from a non generic class.

    Nevertheless, although the Delphi editor (even in DXE) is completly helpless when it comes to generics, I really can’t think how programming would be like without generics.

  2. eric says:

    I’m using custom minimalist collections for that reason: most of the time you really only need very few methods, and standard ready-made collections carry everything plus the kitchen sink. You can always switch to a richer collection generic class when the need truly arises.
    For object collections with complex code (hashes, sort, etc.) a trick I use is to make the generic collection be merely a thin typed wrapper around a non-generic base collection workhorse. What you lose in overhead you can generally more than make up by having more efficient code in the non-generic workhorse (where you can safely use all the language features, and aren’t limited to a subset like in generic classes)

  3. Chris says:

    @Markus – ‘This behaviour is of course quite different from what we would expect from a non generic class.’

    I don’t know, the behaviour of class vars in a non-generic context can be pretty counter-intuitive – whenever inheritance is used, I frequently have to remind myself that a class var in the base class isn’t copied into each descendant class (i.e., the class var is just a global variable with restricted access, rather than tied specifically to the metaclass you access it through).

    • Markus says:

      @Chris,

      My point was maybe not quite clear, so let me give you an example to see the difference in behaviour between a generic vs non-generic class:

      Unit A

      TMyClass = class
      class var Int: Integer;
      end;

      TMyGenericClass = class
      class var a:T;
      end;

      var MyClass1, MyClass2: TMyclass;
      MyGenericClass1: TMyGenericClass1;
      MyGenericClass2: TMyGenericClass2;


      MyGenericClass1.a := 10;
      write(MyGenericClass2.a) // Results in 10
      MyClass1.int:= 20;
      write(MyClass2.int); // Results in 20

      Unit B

      uses Unit A

      var MyGenericClass3: TMyGenericClass;
      MyClass3: TMyClass;


      write(MyGenericClass3.a) // Result is 10
      write(MyClass3.int) // Results in 20

      Syntacticly not correct of course, but I hope you see my point: Although TMyGenericClass looks the same in Unit A and Unit B, there is one class instance in Unit A and another one in Unit B, whereas for the non generic class there is only one class instance in Unit A.
      It was just that what I tried to say in my first mail.

  4. Delfi Phan says:

    Generics were implemented as Templates in C++ decades ago.

    Even so, megabytes of additional binary is a phenomenal amount.

  5. 23 MB? Seriously?

    Perhaps it’s better to keep writing compilers in C 🙂

  6. If the linker would fold identical functions into one, how are you going to set a breakpoint on only one of them? (As breakpoints are placed as ‘int3’ opcodes over actual machine code – having only one version would trigger the breakpoint on all methods that are folded.) So even if this would ever be done, it would only work for optimise/release builds.

    PS: Visual Studio has a technique called LTCG (link time code generation), which goes a lot further than just merging identical methods – it actually optmizes the way arguments are passed, it changes alignment for better cache-line fills and whatnot. Really advanced stuff that I have yet to see in Delphi…

  7. Raynoch says:

    Great tinkhing! That really breaks the mold!

  8. You can drastically reduce the size of your executable if you have many different generic objects where the type parameter is a class with some trick.

    For example you want to use TList<T> without having all the code to be duplicated by the compiler. Just make a new TList<T: class> = class(TList<TObject>) and just override the methods that need to be strongly typed. The implementations will be one liners just calling inherited with some hardcasts to T. Also imo that has some nice side effect. You can easily put your TList<TDog> into some method that takes TList<TObject> as parameter.

    It’s not a real solution to the problem itself but a nice workaround for generics where T is a class and a bit cheated covariance.

  9. I remember a Borland Conference a long long time ago where Chuck J, then lead compiler architect, answered in reply to a question from the audience if Delphi would support generics that if they choose to implement that feature it would need to be better than a simple macro expander so for instance it should combine containers that hold references to objects to avoid the problem with large binaries that was common with C++ templates. Then Chuck left Borland and generics did not arrive to Delphi until over a decade later, ironically with that specific problem present in their implementation.

  10. […] and better Generics support.  Could it be that Embarcadero is finally adding proper Generics collapsing and improved constant array initialization, like I’ve been wanting for years?  That second […]

Leave a Reply