Quickstart - Debugging C and C++ memory errors
February 14, 2022
When programming in C or C++, memory access violations that result in segmentation faults (segfaults) are very common. They usually crash your program, leaving you with only an unhelpful error message and no clue as to where to look for the problem.
This page introduces two tools that'll help you debug these types of errors quickly. They are AddressSanitizer and GDB.
So let's consider this simple example program that looks up a letter in the alphabet based on the user's input:
    #include <stdio.h>
    #include <stdlib.h>

    const char* alphabet = "abcdefghijklmnopqrstuvwxyz";

    int main(int argc, char** argv) {
      int pos = atoi(argv[1]);
      char letter = alphabet[pos-1];
      printf("The %dth letter of the alphabet is %c\n", pos, letter);
    }
Here's how we can compile it and run it:
    $ gcc alphabet.c -o alphabet
    $ ./alphabet 4
    The 4th letter of the alphabet is d
And here are two ways to crash our program. Without an argument, argv[1] is NULL, so atoi ends up dereferencing a null pointer. With a number that's too large, we read far past the end of the alphabet string (an out-of-bounds read):
    $ ./alphabet
    Segmentation fault (core dumped)
    $ ./alphabet 123123
    Segmentation fault (core dumped)
Print stack traces and more with AddressSanitizer
This is where AddressSanitizer comes in. It's a tool that's available in GCC, Clang and other compilers. To use it, we add two options when compiling: -g to add debugging symbols and -fsanitize=address to enable AddressSanitizer:
    $ gcc alphabet.c -g -fsanitize=address -o alphabet
AddressSanitizer will make our program run noticeably slower (typically around 2x) and use more memory, so we won't usually use it in production. But during development, when our program crashes, it'll print helpful information about how our program failed. Let's take a look:
    $ ./alphabet 123123
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==789439==ERROR: AddressSanitizer: SEGV on unknown address 0x558b4c760112 (pc 0x558b4c7412bb bp 0x7ffebf7bfe60 sp 0x7ffebf7bfe40 T0)
    ==789439==The signal is caused by a READ memory access.
        #0 0x558b4c7412ba in main ./alphabet.c:8
        #1 0x7f9e1593b0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
        #2 0x558b4c74116d in _start (./alphabet+0x116d)

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV ./alphabet.c:8 in main
    ==789439==ABORTING
Among other things, it prints a stack trace that tells us the exact line at which our program failed:
    #0 0x558b4c7412ba in main ./alphabet.c:8
This turns out to be the line where we're accessing an element from the alphabet array that's out of bounds:
    char letter = alphabet[pos-1];
AddressSanitizer also helps with a number of other memory access issues, and can give us further information on when the problematic memory was allocated or freed. It can also help track down memory leaks.
Post-mortem debugging with GNU Debugger
Another way of finding out where and why our program crashed is using the GNU Debugger. For GDB to work properly, we need to add debugging symbols when compiling using the -g flag:
    $ gcc alphabet.c -g -o alphabet
We then start our program using GDB. First we call gdb alphabet to start the debugger itself. Then at the (gdb) prompt, we run the program with any arguments. Below we'll run our program with the problematic 123123 argument again:
    $ gdb alphabet
    GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
    Copyright (C) 2020 Free Software Foundation, Inc.
    ...
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from alphabet...
    (gdb) run 123123
    Starting program: ./alphabet 123123

    Program received signal SIGSEGV, Segmentation fault.
    main (argc=2, argv=0x7fffffffe388) at alphabet.c:8
    8         char letter = alphabet[pos-1];
    (gdb)
Here, GDB not only shows us where the program crashed (line 8, char letter = alphabet[pos-1];), it also allows us to perform post-mortem debugging: we can inspect the state of our program at the point where it crashed. For instance, we can print the value of the pos variable at the time of the crash:
    (gdb) print pos
    $1 = 123123
Another useful GDB feature is that it allows us to stop the execution of our program at any point to inspect variables, not only after a crash. For this, we set a breakpoint before we call run at the GDB prompt. In the following example, we'll set a breakpoint at line 7 and then run our alphabet program without any arguments:
    $ gdb alphabet
    GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
    Copyright (C) 2020 Free Software Foundation, Inc.
    ...
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from alphabet...
    (gdb) b alphabet.c:7
    Breakpoint 1 at 0x117c: file alphabet.c, line 7.
    (gdb) run
    Starting program: ./alphabet

    Breakpoint 1, main (argc=1, argv=0x7fffffffe3a8) at alphabet.c:7
    7         int pos = atoi(argv[1]);
    (gdb)
With this feature we're free to print out the value of variables at arbitrary points in our program, to try to better understand what's going on.