Memory management: still a non-issue

I got assigned an interesting bug to fix today at work: performing a certain operation in our program caused an enormous memory leak, producing a FastMM report file that weighed in at over 150 MB, which represents a serious amount of leaked RAM.  A bit of debugging made it obvious that a certain interfaced object was at the root of the problem, and it had a refcount of 1 when the program ended.  I found the object that was holding a reference to it and went looking for what was keeping that one alive… and it turned out to have a refcount of over 4700 when the program ended!

And just to make things worse, it wasn’t just one place in the code where _AddRef was being called on this object; a bit of checking revealed multiple places that were calling it thousands of times.  This is where some people would freak out and say that maybe the garbage collection advocates are onto something; there’s no way you can ever hope to trace through all of that and come up with anything useful!

I took a bit of a different view of it.  If there’s no way you can ever hope to trace through all of that… then don’t; find a smarter way to do it.  If you have a data set large enough that it can only be understood statistically, then gather some statistics!

Turns out one of my coworkers at De Novo Software had come up with a system for tracking down leaks like this by instrumenting your code to log adds and releases.  It generates a logfile which you can load into his program for statistical analysis.  He showed me how to set it up, which consisted of adding one unit to the project and a few lines of code to the class I was trying to instrument, and in just a few minutes I was building the log.
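To give a feel for what "instrumenting your code to log adds and releases" means, here is a minimal sketch. It is written in Python rather than Delphi, and it is not my coworker's tool: the names RefLog, Tracked, balanced_user and leaky_user are all made up for illustration. The idea is simply that every refcount change records the operation plus the call stack that caused it.

```python
import traceback


class RefLog:
    """Collects (operation, call stack) records for later analysis."""

    def __init__(self):
        self.events = []

    def record(self, op):
        # Drop the two innermost frames: record() itself and the
        # add_ref()/release() method that called it.
        frames = traceback.extract_stack()[:-2]
        stack = tuple(frame.name for frame in frames)
        self.events.append((op, stack))


class Tracked:
    """Hypothetical manually ref-counted object with logging hooks."""

    def __init__(self, log):
        self.refcount = 0
        self._log = log

    def add_ref(self):
        self.refcount += 1
        self._log.record("AddRef")
        return self.refcount

    def release(self):
        self.refcount -= 1
        self._log.record("Release")
        return self.refcount


def balanced_user(obj):
    obj.add_ref()
    obj.release()


def leaky_user(obj):
    obj.add_ref()  # no matching release: this is the leak


log = RefLog()
obj = Tracked(log)
for _ in range(3):
    balanced_user(obj)
    leaky_user(obj)

print(obj.refcount, len(log.events))
```

Each entry in log.events ends with the name of the function that touched the refcount, which is the piece of information you need to group the calls later.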

When I loaded it into the viewer, I got a TTreeView full of data about where _AddRef and _Release were being called, and I could edit out irrelevant bits of data.  It was pretty intuitive for the most part, and soon I found there were two main places where _AddRef was being called over 4700 times, and one where _Release was being called over 4700 times.  The log file captured call stacks, so it was easy to see that one of the _AddRef groups matched the _Release group, which meant the problem was in the other _AddRef group.  From there, I had a stack trace, and it took about 2 minutes to track the problem to a minor misunderstanding about how records, pointers-to-records, and record copying worked.  So I fixed the code, rebuilt, and tested it, and the leak was gone.
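The analysis step can be sketched in a few lines. This is a Python illustration under my own assumptions, not the actual viewer: the call-site names and counts below are invented, and a real log would carry full call stacks rather than a single name. It groups the log by operation and call site, removes the pairs judged to balance each other, and leaves the suspects.

```python
from collections import Counter

# Made-up log records: (operation, call site).
events = (
    [("AddRef", "Cache.store")] * 50
    + [("Release", "Cache.evict")] * 50
    + [("AddRef", "Loader.copy_record")] * 50
    + [("AddRef", "Main.create")]
)

# How many times each call site performed each operation.
counts = Counter(events)

adds = sum(n for (op, _), n in counts.items() if op == "AddRef")
releases = sum(n for (op, _), n in counts.items() if op == "Release")
net = adds - releases  # references still outstanding at the end

# Pairs the analyst has judged to balance each other after reading the stacks.
matched = [(("AddRef", "Cache.store"), ("Release", "Cache.evict"))]

suspects = Counter(counts)
for add_key, release_key in matched:
    paired = min(suspects[add_key], suspects[release_key])
    suspects[add_key] -= paired
    suspects[release_key] -= paired
suspects = +suspects  # unary plus drops entries whose count fell to zero

for (op, site), n in suspects.most_common():
    print(f"{n:5d}  {op:8s} {site}")
```

Once the balanced pair is out of the picture, the largest remaining AddRef group is the first place to look.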

This is what I’ve been saying for years.  Proper tooling makes the “problems” of real memory management trivial, without needing to sacrifice the performance benefits, which experience has shown can be considerable, especially in memory-constrained environments such as mobile devices!

I asked my coworker, and he said that his tool “is not production ready,” but if that changes, I’ll make sure to post a link to it, since it’s really quite useful.

6 Comments

  1. Gad D Lord says:

    If he is up to making money, then convince him not to wait. If he is willing to open source it, then Google Code will wait for him.

  2. It’s sad that over 18 years after the introduction of Interfaces in Delphi there is still no proper tool to profile/trace them.
    And it’s also sad that people hold back things that might be super beneficial to others just because they are not polished.

  3. Eduardo says:

    Tell your co-worker that tool is EXACTLY what I need RIGHT NOW!!!

    I have a wild interface reference that is fighting to remain alive, and I want it to die!

    Ask him to open source it. We will give a hand and contribute to finish it.

  4. Bruce McGee says:

    Wow. The Drew Crawford post was really interesting (and long).

    It also makes me that much more interested in seeing ARC in Delphi’s desktop compilers.

  5. I developed an open source tool called RefCountTracer especially for detection of refcount-based interface leaks. It sounds like the tool from your colleague does something similar, but I did an optimized graphical representation of the results. You simply add two lines of code (in _AddRef and _Release) and it’ll show you exactly where the refcounts increase and decrease (including call stack, line number, etc.). Short-lived references are properly eliminated, so only the leaking ones are left, or those which are passed through the whole application. As it needs a call stack to work, you need MadExcept or Eurekalog (maybe someone can add JclDebug?).
    Have a look at it here: https://github.com/AquaSoftGmbH/RefCountTracer

  6. EMB says:

    Now I’m feeling like a kid looking at someone else’s ice-cream. And I like ice-cream… 🙁
