2020-04-22

Little surprises #3: gdb is smart

Where should a debugger report errors that occur in object destructors?

The obvious answer is that it should report them on the line where they occurred, which is right as far as it goes. But in many cases the system standard library doesn't have debugging symbols, so errors that occur in the destructors of, for instance, standard containers won't be reported with a line number.

No worries, however: you can grab a stack trace to start finding out where in your code the stupid is being injected. [1] But if you are using gdb you may be in for a surprise. Consider this little program:
/* some preamble omitted for brevity */
class dtorcrash {
public:
  dtorcrash(int i): m_i(i) {}
  ~dtorcrash(){ std::cout << 10/m_i << std::endl; }// Line 8
private:
  int m_i;
};


int main(int argv, char**argc) // (sic) names swapped relative to convention
{
  int i = 0;
  if (argv > 1)
    i = atoi(argc[1]);
    
  dtorcrash * heap = nullptr;
  {
    dtorcrash local0(i++);                         // Line 22
    dtorcrash local1(i++);
    heap = new dtorcrash(i++);
    std::cout << "Constructed." << std::endl;
  }                                                // Line 26 
  delete heap;                                     // Line 27
  std::cout << "Destructed." << std::endl;

  return EXIT_SUCCESS;
}
If the command-line argument evaluates to 0, -1, or -2, the program crashes: the three objects are constructed with consecutive values of i, so one of them holds zero and its destructor executes a division by zero.
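To see which object is the culprit for a given argument, here is a minimal sketch that mirrors the three post-increments in main(). The function crashing_object is a hypothetical helper of mine, not part of the program above, and it only does the bookkeeping; it never divides by anything.

```cpp
#include <string>

// Hypothetical helper (not in the original program): replays the three
// post-increments from main() and names the object whose m_i would be zero.
std::string crashing_object(int arg) {
  int i = arg;
  const int local0 = i++;  // value given to local0
  const int local1 = i++;  // value given to local1
  const int heap = i++;    // value given to the heap-allocated object
  // The three values are consecutive, so at most one of them is zero.
  if (local0 == 0) return "local0";
  if (local1 == 0) return "local1";
  if (heap == 0) return "heap";
  return "none";
}
```

So an argument of 0 blames local0, -1 blames local1, -2 blames the heap object, and anything else runs to completion.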

The interesting thing is what we see when we examine the stack traces in gdb following those crashes.

If you pass -2, the back trace reads
#0  0x0000555555554d32 in dtorcrash::~dtorcrash (this=0x555555768e70, __in_chrg=<optimized out>) at dtorcrash.cpp:8
#1  0x0000555555554c29 in main (argv=2, argc=0x7fffffffdfc8) at dtorcrash.cpp:27
Here (as in all cases) the actual offending line is reported as the d'tor of dtorcrash. Further, the location in the calling frame is identified as delete heap;. All as expected.

The interesting thing is the location in the calling frame when you supply 0 or -1 as the command-line argument. In that case gdb does not identify the closing brace of the scope on line 26 as the site of the offending call. Instead it identifies one of the declarations at the top of the block, even though the program prints Constructed. before crashing. For instance, if the argument is zero the back trace looks like
#0  0x0000555555554d32 in dtorcrash::~dtorcrash (this=0x7fffffffdeb4, __in_chrg=<optimized out>) at dtorcrash.cpp:8
#1  0x0000555555554c18 in main (argv=2, argc=0x7fffffffdfc8) at dtorcrash.cpp:22
where line 22 is the declaration of local0.

What the *&^% is going on?


The critical difference is that the weird behavior shows up when the crash is triggered by an object with automatic storage duration. For heap-allocated objects there is an explicit call to delete. [2]

On the other hand the end of a scope can trigger the reaping of multiple objects (indeed, the sample code is written so that two objects are reaped at the same close-brace).
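A minimal sketch of that situation, with the same shape as the block in main(): the Tracer class and destruction_order function are names I made up for illustration. Each Tracer records its name when its destructor runs, so the returned log shows what a single close-brace actually triggers.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type (not from the program above): appends its name
// to a shared log when it is destroyed.
class Tracer {
public:
  Tracer(std::string name, std::vector<std::string>& log)
    : m_name(std::move(name)), m_log(log) {}
  ~Tracer() { m_log.push_back(m_name); }
private:
  std::string m_name;
  std::vector<std::string>& m_log;
};

// Returns the names in the order the destructors ran.
std::vector<std::string> destruction_order() {
  std::vector<std::string> log;
  Tracer* heap = nullptr;
  {
    Tracer local0("local0", log);
    Tracer local1("local1", log);
    heap = new Tracer("heap", log);
  }             // both locals die here, in reverse order of declaration
  delete heap;  // the heap object dies only at this explicit call
  return log;
}
```

Calling destruction_order() gives local1, local0, heap: two destructor calls hang off the one close-brace, and only the third has a source line of its own.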

If gdb just told you that the crash occurred in a destructor triggered by that close-brace you wouldn't know which object was responsible, so instead it points you at the declaration of the offending object.

This means that you have to notice the destructor in the back trace and then scan the code to find the end of the scope associated with that declaration, but you know which object to check up on.

Nice.



[1] Every novice programmer thinks they've found a bug in a core tool at some point. They're almost always wrong. In my case I spent a couple of hours sure there was something wrong with the system implementation of atan2. There wasn't. It's embarrassing.

More experienced programmers get so used to assuming that their own code is the problem that it can be very hard for them to convince themselves that the problem is in someone else's code.

[2] It may be buried in some other object's d'tor, but it's there.
