Debuggers and Symbol Tables

Compiling a program turns a human-readable source file (set x = 3) into assembly language (load 3 into register) and finally into machine language (101010101…).

Debugging is stepping through a program to identify the cause of errors. But how do you step through a program that is stored as low-level machine instructions? (Some talented souls can read and decipher raw computer instructions — I cannot).

Symbol Tables

Enter the symbol table. This table maps instructions in the compiled binary program to their corresponding variable, function, or line in the source code. This mapping could be something like:

  • Program instruction => item name, item type, original file, line number defined…

[Aside: I’m not sure exactly how symbol tables are implemented. They could add tags to the source code (a SymbolID), or store the address of the instruction and map that to a variable declaration, variable use, line number, etc.]
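To make the second idea in the aside concrete, here is a minimal sketch of a table that stores addresses and maps them back to source items. It is a toy model, not how any real compiler lays out its debug information: the addresses, names, files, and line numbers are all made up for illustration.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    address: int       # where the item starts in the binary (invented here)
    name: str          # name used in the source code
    kind: str          # "function" or "variable"
    source_file: str   # file the item was defined in
    line: int          # line number of the definition


# Hypothetical entries, kept sorted by address so we can binary-search them.
SYMBOLS = [
    Symbol(0x1000, "main", "function", "app.c", 3),
    Symbol(0x1040, "compute_total", "function", "app.c", 12),
    Symbol(0x1090, "print_report", "function", "report.c", 5),
]
ADDRESSES = [s.address for s in SYMBOLS]


def lookup(address):
    """Return the nearest symbol at or before `address`, or None."""
    i = bisect.bisect_right(ADDRESSES, address) - 1
    return SYMBOLS[i] if i >= 0 else None


def describe(address):
    """Format an address the way a crash report might show it."""
    sym = lookup(address)
    if sym is None:
        return f"0x{address:x}: <unknown>"
    offset = address - sym.address
    return f"0x{address:x}: {sym.name}+0x{offset:x} ({sym.source_file}:{sym.line})"


print(describe(0x1044))  # 0x1044: compute_total+0x4 (app.c:12)
```

Given only a raw address from the running program, the table turns it back into a name and a place in the source, which is exactly the clue a debugger needs.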

Symbol tables may be embedded into the program, or stored as a separate file.

Symbol tables may not be created by default — the compiler must be told to create a “debug” version with a symbol table (the “-g” option for the GCC compiler). A program without the symbol table is called a “retail” build, and is more difficult to reverse-engineer — it has no information that maps the binary program to the original source code.

The symbol table does not include the source code, but can give clues about it by referring to the actual variable and function names. There are no variable names in compiled binary programs; operations work on registers and memory addresses.

Debuggers

A “debugger” is an application that reads the symbol table and lets a programmer walk through the program being debugged. It can execute and step through the program, showing the line of source code. This is great for fixing bugs — if your program crashes, reproduce the behavior and see the exact line in the source code that caused the crash. Fix the bug, and try again.

Debuggers can also

  • Set breakpoints: pause the program when it reaches a certain line of code (useful for checking error conditions)
  • Set variables: change internal program variables, and see how the software responds
  • Set watches and conditions: pause (aka break) the program when a certain condition is met, such as a variable reaching a certain value
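The features above can be sketched with a toy debugger. This is a simplified model under heavy assumptions: the "program" is a list of invented instructions, the line table stands in for the symbol table, and `ToyDebugger` is a hypothetical class, not the interface of any real debugger.

```python
# A tiny "compiled program": each instruction is (operation, variable, value).
PROGRAM = [
    ("set", "x", 3),    # instruction 0
    ("add", "x", 4),    # instruction 1
    ("set", "y", 10),   # instruction 2
    ("add", "x", 5),    # instruction 3
]

# Toy symbol table: instruction index -> source line (line numbers invented).
LINE_TABLE = {0: 10, 1: 11, 2: 12, 3: 13}


class ToyDebugger:
    def __init__(self, program, line_table):
        self.program = program
        self.line_table = line_table
        self.variables = {}
        self.pc = 0                 # index of the next instruction to run
        self.breakpoints = set()    # source line numbers to pause at
        self.condition = None       # optional function(variables) -> bool

    def current_line(self):
        """Source line of the next instruction, or None at the end."""
        return self.line_table.get(self.pc)

    def step(self):
        """Execute exactly one instruction."""
        op, name, value = self.program[self.pc]
        if op == "set":
            self.variables[name] = value
        elif op == "add":
            self.variables[name] += value
        self.pc += 1

    def cont(self):
        """Run until a condition holds, a breakpoint is reached, or the end."""
        while self.pc < len(self.program):
            self.step()
            if self.condition is not None and self.condition(self.variables):
                return "condition"
            if self.current_line() in self.breakpoints:
                return "breakpoint"
        return "finished"


dbg = ToyDebugger(PROGRAM, LINE_TABLE)
dbg.breakpoints.add(12)                        # set a breakpoint on line 12
print(dbg.cont(), dbg.current_line(), dbg.variables)
dbg.variables["x"] = 100                       # set a variable while paused
print(dbg.cont(), dbg.variables)
```

The first `cont()` stops just before line 12 with `x` equal to 7; after `x` is overwritten, the rest of the program runs with the new value. Assigning something like `lambda v: v.get("x", 0) > 5` to `condition` pauses as soon as that expression becomes true instead.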

We can infer a few facts about symbol tables:

  • A symbol table works for a particular version of the program; if the program changes, a new table must be made.
  • Debug builds are often larger and slower than retail (non-debug) builds; debug builds contain the symbol table and other ancillary information.
  • If you wish to debug a binary program you did not compile yourself, you must get the symbol tables from the author.
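The first point, that a table only fits the exact build it was made for, can be illustrated with a small check. This is a sketch under an assumption: it identifies a build by hashing the binary's bytes, which is one plausible scheme and not necessarily what any particular toolchain does. The function names and the dictionary layout are hypothetical.

```python
import hashlib


def build_id(binary: bytes) -> str:
    """Fingerprint a binary by hashing its contents (assumed scheme)."""
    return hashlib.sha256(binary).hexdigest()[:16]


def make_symbol_file(binary: bytes, symbols: dict) -> dict:
    """Bundle a symbol table with the fingerprint of the build it describes."""
    return {"build_id": build_id(binary), "symbols": symbols}


def load_symbols(binary: bytes, symbol_file: dict) -> dict:
    """Return the symbols only if they were made for this exact binary."""
    if symbol_file["build_id"] != build_id(binary):
        raise ValueError("symbol file does not match this binary")
    return symbol_file["symbols"]


version_1 = b"\x01\x02\x03"   # stand-in bytes for the original build
version_2 = b"\x01\x02\x04"   # stand-in bytes for a rebuilt program

symbol_file = make_symbol_file(version_1, {0: "main"})
print(load_symbols(version_1, symbol_file))   # matches, symbols are returned

try:
    load_symbols(version_2, symbol_file)
except ValueError as error:
    print("refused:", error)
```

Without a check like this, a debugger could happily pair a stale table with a new binary and point at the wrong source lines.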

7 Comments

  1. > If you wish to debug a binary program you did
    > not compile yourself, you must get the symbol
    > tables from the author.

    Not if you’re one of those “talented souls” who “can read and decipher raw computer instructions” :)

  2. You seem to imply that retail builds don’t have associated symbols at all. (Pardon the Windows jargon.) But whether you build debug or release in Visual Studio, you will see that a separate .pdb file is created which contains symbols.

  3. Thanks for the info. Yep, symbols may or may not be included depending on the build system. On UNIX, the symbols aren’t there unless you explicitly specify it.

  4. Sometimes we do need to include symbol info in our “Retail” release in order to debug it further. On some systems this info is stored externally and thus adds nothing to the final executable.

    Developers choose a “debug” version of a program for its extra run-time checks, and a “retail” one for being faster and optimized. Symbol info has to be explicitly disabled for either kind of build. But why bother, if the symbol tables live in an external file?!
