Generic TObjectList layout has changed

I ran across a really strange bug at work yesterday: Access Violation when performing a certain operation in the program.

These are usually really simple.  Easy to reproduce, and the debugger takes you right to the problem, and there you are, confronted by a nil that someone forgot to initialize properly.  Usually.

This time, though, when I went to reproduce it, I was confronted by a read of an address around 80808080, which wasn’t what was in the stack trace from the exception report I got.  That 80 fill pattern is what FastMM’s FullDebugMode uses to mark freed memory.  So OK, this is a use-after-free bug, then.

Except it wasn’t.  The operation it was crashing on was retrieving the value of MyList[i].  Inspecting MyList in the debugger showed that it was not a freed object, and TObjectList<T> doesn’t contain any objects in its internal state; FItems is an array, the Count and Capacity properties are integers, the Comparer is an interface, and so on.

So I looked a bit more closely at the surrounding code, and saw something truly weird going on: the code was retrieving the list from a property of another object, and hard-casting it to TObjectList<T>.  That would make sense if that property were typed as TObject, but this one was actually typed as TObjectList!

That’s when it all came together.  Someone had written this code against a non-generic TObjectList back in the day.  Then, when we got generics, instead of casting objects to the right type when taking them out of the list, they cast the list itself to a generic list to make it easier to work with!  That accidentally worked just fine in earlier versions of the codebase, because the internal memory layouts of TObjectList and TObjectList<T> were compatible.  But when they tried to use this code on Delphi 10 Seattle, it crashed, because the generic list has a new field before FItems, with no corresponding change to the non-generic list!

Replacing the original list with the generic version fixed the bug.
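For anyone who can't switch the list type right away, here is a minimal sketch of the other option mentioned above: leave the non-generic list alone and cast each item as it comes out.  TWidget and TOwner are hypothetical names invented for this example, not types from the real codebase, and the problematic cast appears only in a comment.

```delphi
program HardCastSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Contnrs,               // non-generic TObjectList
  System.Generics.Collections;  // generic TObjectList<T>

type
  // Hypothetical item type, invented for this sketch.
  TWidget = class
  public
    Name: string;
  end;

  // Hypothetical owner exposing a non-generic list through a property.
  TOwner = class
  private
    FWidgets: System.Contnrs.TObjectList;
  public
    constructor Create;
    destructor Destroy; override;
    property Widgets: System.Contnrs.TObjectList read FWidgets;
  end;

constructor TOwner.Create;
begin
  inherited Create;
  FWidgets := System.Contnrs.TObjectList.Create(True); // list owns its items
end;

destructor TOwner.Destroy;
begin
  FWidgets.Free;
  inherited;
end;

var
  Owner: TOwner;
  W: TWidget;
  I: Integer;
begin
  Owner := TOwner.Create;
  try
    W := TWidget.Create;
    W.Name := 'first';
    Owner.Widgets.Add(W);

    // The pattern described in the post, left as a comment on purpose.
    // It treats one class's fields as if they were another's:
    //   TObjectList<TWidget>(Owner.Widgets)[0]

    // Safer: keep the list as it is and do a checked cast on each item.
    for I := 0 to Owner.Widgets.Count - 1 do
      Writeln((Owner.Widgets[I] as TWidget).Name);
  finally
    Owner.Free;
  end;
end.
```

The checked `as` cast raises an exception if an item is not a TWidget, so a type mistake shows up at the point of the cast instead of as a read from a garbage address later on.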

Everyone please remember, hard-casts are evil because they lead to problems like this.  Even if what you’re doing works now, that’s no guarantee that it will continue to work in future versions!

5 Comments

  1. Fronzel Neekburm says:

    Exactly. Very evil. The official Delphi components for Sybase Advantage Server (now SAP) also suffered from this: they also hard-cast lists. It seemed to work. Until the containers (like TList) changed in XE8.

  2. Thaddy says:

    It’s not evil.
    It is simply that Delphi programmers are often under-educated about what a hardcast means.
    It also tells you something about the quality of the engineers, both developers and testing engineers, currently employed by Embarcadero….
    Sigh.
    Hard casts are a valid programming paradigm.
    Programmers should know about its pitfalls.

  3. Thaddy says:

    I may add that if you are cocky enough to be sure about layout you should really be sure about its consequences.
    This kind of low level stuff is only temporarily valid by definition.

    Programmer mistake. Fired.
