Wish list: Generics collapsing
One annoying thing I’ve noticed in building my script compiler is the way the use of generic collections tends to bloat up the size of your EXE. I use generics for a lot of things; a compiler uses lists, stacks and lookup tables (dictionaries) all over the place. When I was building it with DeHL, the compiler plus a very simple test frontend compiled into a 36 MB behemoth of a binary, and a quick look at the mapfile shows that the vast majority of that was DeHL collections. Now that I’ve switched to the more simplified Collections library, it “only” takes 23 MB, a savings of about 33%. But that’s still huge. There has to be a better way.
Why does all this Generics code take up so much space? Because generics aren’t actual code; they’re a template for creating actual code. The compiler creates a new copy of the class for each instantiation of a generic class. And this makes sense, to a certain extent. You can’t reuse the same machine code implementation for, say, a TList<TMyObject> and a TList<IMyInterface>, because the interface version needs the compiler to generate the reference-counting code when working with the internal storage. Same for strings. And records, integers and other value types need to be handled differently than objects and interfaces, which are reference types. The compiler handles all that under the hood, and it makes our job a lot easier.
But… this has the side-effect of also making a bunch of silly, unnecessary copies. For example, if you have a TList<TMyObject1> and a TList<TMyObject2>, the generated machine code for their routines will be identical, because there’s nothing in the difference between two classes that matters to TList<T>. To illustrate, here’s a disassembly of the .Add methods of two different TList instantiations for two different class types.
Type 1:
[code lang="asm"] 53 push ebx 56 push esi 51 push ecx 8BF2 mov esi,edx 8BD8 mov ebx,eax 8B5308 mov edx,[ebx+$08] 42 inc edx 8BC3 mov eax,ebx E856FDFFFF call Generics + $43A074 8B4308 mov eax,[ebx+$08] 890424 mov [esp],eax 8B5304 mov edx,[ebx+$04] 893482 mov [edx+eax*4],esi FF4308 inc dword ptr [ebx+$08] 33C9 xor ecx,ecx 8BD6 mov edx,esi 8BC3 mov eax,ebx 8B18 mov ebx,[eax] FF5304 call dword ptr [ebx+$04] 8B0424 mov eax,[esp] 5A pop edx 5E pop esi 5B pop ebx C3 ret [/code]
Type 2:
[code lang="asm"] 53 push ebx 56 push esi 51 push ecx 8BF2 mov esi,edx 8BD8 mov ebx,eax 8B5308 mov edx,[ebx+$08] 42 inc edx 8BC3 mov eax,ebx E856FDFFFF call Generics + $43AC54 8B4308 mov eax,[ebx+$08] 890424 mov [esp],eax 8B5304 mov edx,[ebx+$04] 893482 mov [edx+eax*4],esi FF4308 inc dword ptr [ebx+$08] 33C9 xor ecx,ecx 8BD6 mov edx,esi 8BC3 mov eax,ebx 8B18 mov ebx,[eax] FF5304 call dword ptr [ebx+$04] 8B0424 mov eax,[esp] 5A pop edx 5E pop esi 5B pop ebx C3 ret [/code]
You can eyeball this, or paste it into BeyondCompare if you’d like, but they’re identical. The only difference is in the CALL line; the disassembly shows a different offset on the right, because the two collection classes are each calling their own versions of GrowCheck. But the actual machine code on the left is identical, because the CALL instruction uses a relative jump and the routines are arranged the same way in relation to each other.
This being true, it would be really nice if the Delphi linker understood this (it would have to be done in the linker, because you could have different instantiations of the same basic generic class in different units) and kept track of all the instantiations of each generic class and compared them against each other, method by method. When it found routines that were identical like this, it should strip out the duplicates and adjust call references, VMT slots, etc accordingly.
This would probably slow the linker down by a certain amount, and if it’s too slow there should probably be a compiler switch to disable it so it doesn’t interfere with coding productivity, but it’s certainly something I’d like to be have available when I’m making a release build.
HI Mason,
Yes, generics in Delphi, as much as I don’t want to miss them anymore, have a tendency to come up with new surprises as you work with them. E.g. until recently I was not aware of the fact that generics really work like template classes that are built into specific classes for a specific type. With a colleague we were wondering how this would turn out with class variables in a generic class. Well, the result of some simple tests was, that given the following simplified class:
TMyClass
class var MyVar:T
…
end;
If you instantiate this class with the same type T in the SAME unit several times, then all this instances operate on the same class var.
But if you instantiate the same TMyClass with the same type T in a different unit, then the compiler generates a new class in the other unit, therby having a separate class var. This behaviour is of course quite different from what we would expect from a non generic class.
Nevertheless, although the Delphi editor (even in DXE) is completly helpless when it comes to generics, I really can’t think how programming would be like without generics.
I’m using custom minimalist collections for that reason: most of the time you really only need very few methods, and standard ready-made collections carry everything plus the kitchen sink. You can always switch to a richer collection generic class when the need truly arises.
For object collections with complex code (hashes, sort, etc.) a trick I use is to make the generic collection be merely a thin typed wrapper around a non-generic base collection workhorse. What you lose in overhead you can generally more than make up by having more efficient code in the non-generic workhorse (where you can safely use all the language features, and aren’t limited to a subset like in generic classes)
@Markus – ‘This behaviour is of course quite different from what we would expect from a non generic class.’
I don’t know, the behaviour of class vars in a non-generic context can be pretty counter-intuitive – whenever inheritance is used, I frequently have to remind myself that a class var in the base class isn’t copied into each descendant class (i.e., the class var is just a global variable with restricted access, rather than tied specifically to the metaclass you access it through).
@Chris,
My point was maybe not quite clear, so let me give you an example to see the difference in behaviour between a generic vs non-generic class:
Unit A
TMyClass = class
class var Int: Integer;
end;
TMyGenericClass = class
class var a:T;
end;
var MyClass1, MyClass2: TMyclass;
MyGenericClass1: TMyGenericClass1;
MyGenericClass2: TMyGenericClass2;
…
MyGenericClass1.a := 10;
write(MyGenericClass2.a) // Results in 10
MyClass1.int:= 20;
write(MyClass2.int); // Results in 20
Unit B
uses Unit A
var MyGenericClass3: TMyGenericClass;
MyClass3: TMyClass;
…
write(MyGenericClass3.a) // Result is 10
write(MyClass3.int) // Results in 20
Syntacticly not correct of course, but I hope you see my point: Although TMyGenericClass looks the same in Unit A and Unit B, there is one class instance in Unit A and another one in Unit B, whereas for the non generic class there is only one class instance in Unit A.
It was just that what I tried to say in my first mail.
Ups, all the angle brackets removed – I hope you can guess where they would belong.
Sorry, just looked over it again, and it would be hard to figure it out, so here the explanation:
TMyGenericClass1,2,3 are all of type T:integer
Generics were implemented as Templates in C++ decades ago.
Even so, megabytes of additional binary is a phenomenal amount.
23 MB? Seriously?
Perhaps it’s better to keep writing compilers in C 🙂
The current Delphi compiler is written in C/C++
DWScript stands at 340kB compiler + runtime, and 400kB with the basic classes & functions. 😉
If the linker would fold identical functions into one, how are you going to set a breakpoint on only one of them? (As breakpoints are placed as ‘int3’ opcodes over actual machine code – having only one version would trigger the breakpoint on all methods that are folded.) So even if this would ever be done, it would only work for optimise/release builds.
PS: Visual Studio has a technique called LTCG (link time code generation), which goes a lot further than just merging identical methods – it actually optmizes the way arguments are passed, it changes alignment for better cache-line fills and whatnot. Really advanced stuff that I have yet to see in Delphi…
Great tinkhing! That really breaks the mold!
You can drastically reduce the size of your executable if you have many different generic objects where the type parameter is a class with some trick.
For example you want to use TList<T> without having all the code to be duplicated by the compiler. Just make a new TList<T: class> = class(TList<TObject>) and just override the methods that need to be strongly typed. The implementations will be one liners just calling inherited with some hardcasts to T. Also imo that has some nice side effect. You can easily put your TList<TDog> into some method that takes TList<TObject> as parameter.
It’s not a real solution to the problem itself but a nice workaround for generics where T is a class and a bit cheated covariance.
I remember a Borland Conference a long long time ago where Chuck J, then lead compiler architect, answered in reply to a question from the audience if Delphi would support generics that if they choose to implement that feature it would need to be better than a simple macro expander so for instance it should combine containers that hold references to objects to avoid the problem with large binaries that was common with C++ templates. Then Chuck left Borland and generics did not arrive to Delphi until over a decade later, ironically with that specific problem present in their implementation.
All good implementation of C++ templates manage to solve this bloat problem these days.
[…] […]
[…] and better Generics support. Could it be that Embarcadero is finally adding proper Generics collapsing and improved constant array initialization, like I’ve been wanting for years? That second […]