| <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> |
| <HTML> |
| |
| <HEAD> |
| <link rel="stylesheet" href="designstyle.css"> |
| <title>Gperftools Heap Leak Checker</title> |
| </HEAD> |
| |
| <BODY> |
| |
| <p align=right> |
| <i>Last modified |
| <script type=text/javascript> |
| var lm = new Date(document.lastModified); |
| document.write(lm.toDateString()); |
| </script></i> |
| </p> |
| |
| <p>This is the heap checker we use at Google to detect memory leaks in |
| C++ programs. There are three parts to using it: linking the library |
| into an application, running the code, and analyzing the output.</p> |
| |
| |
| <H1>Linking in the Library</H1> |
| |
| <p>The heap-checker is part of tcmalloc, so to install the heap |
| checker into your executable, add <code>-ltcmalloc</code> to the |
| link-time step for your executable. Also, while we don't necessarily |
| recommend this form of usage, it's possible to add in the heap checker at |
| run-time using <code>LD_PRELOAD</code>:</p> |
| <pre>% env LD_PRELOAD="/usr/lib/libtcmalloc.so" &lt;binary&gt;</pre> |
| |
| <p>This does <i>not</i> turn on heap checking; it just inserts the |
| code. For that reason, it's practical to just always link |
| <code>-ltcmalloc</code> into a binary while developing; that's what we |
| do at Google. (However, since any user can turn on the heap checker by |
| setting an environment variable, it's not necessarily recommended to |
| install heapchecker-linked binaries into a production, running |
| system.) Note that if you wish to use the heap checker, you must |
| also use the tcmalloc memory-allocation library. There is no way |
| currently to use the heap checker separate from tcmalloc.</p> |
| |
| |
| <h1>Running the Code</h1> |
| |
| <p>Note: For security reasons, the heap checker will not write to a file |
| -- and is thus not usable -- for setuid programs.</p> |
| |
| <h2><a name="whole_program">Whole-program Heap Leak Checking</a></h2> |
| |
| <p>The recommended way to use the heap checker is in "whole program" |
| mode. In this case, the heap-checker starts tracking memory |
| allocations before the start of <code>main()</code>, and checks again |
| at program-exit. If it finds any memory leaks -- that is, any memory |
| not pointed to by objects that are still "live" at program-exit -- it |
| aborts the program (via <code>exit(1)</code>) and prints a message |
| describing how to track down the memory leak (using <A |
| HREF="heapprofile.html#pprof">pprof</A>).</p> |
| |
| <p>The heap-checker records the stack trace for each allocation while |
| it is active. This causes a significant increase in memory usage, in |
| addition to slowing your program down.</p> |
| |
| <p>Here's how to run a program with whole-program heap checking:</p> |
| |
| <ol> |
| <li> <p>Set the environment variable <code>HEAPCHECK</code> to the <A |
| HREF="#types">type of heap-checking</A> to do. For instance, |
| to heap-check |
| <code>/usr/local/bin/my_binary_compiled_with_tcmalloc</code>:</p> |
| <pre>% env HEAPCHECK=normal /usr/local/bin/my_binary_compiled_with_tcmalloc</pre> |
| </ol> |
| |
| <p>No other action is required.</p> |
| |
| <p>Note that since the heap-checker uses the heap-profiling framework |
| internally, it is not possible to run both the heap-checker and <A |
| HREF="heapprofile.html">heap profiler</A> at the same time.</p> |
| |
| |
| <h3><a name="types">Flavors of Heap Checking</a></h3> |
| |
| <p>These are the legal values when running a whole-program heap |
| check:</p> |
| <ol> |
| <li> <code>minimal</code> |
| <li> <code>normal</code> |
| <li> <code>strict</code> |
| <li> <code>draconian</code> |
| </ol> |
| |
| <p>"Minimal" heap-checking starts as late as possible during |
| initialization, meaning you can leak some memory in your |
| initialization routines (that run before <code>main()</code>, say), |
| and not trigger a leak message. If you frequently (and purposefully) |
| leak data in one-time global initializers, "minimal" mode is useful |
| for you. Otherwise, you should avoid it in favor of the stricter modes.</p> |
| |
| <p>"Normal" heap-checking tracks <A HREF="#live">live objects</A> and |
| reports a leak for any data that is not reachable via a live object |
| when the program exits.</p> |
| |
| <p>"Strict" heap-checking is much like "normal" but has a few extra |
| checks that memory isn't lost in global destructors. In particular, |
| if you have a global variable that allocates memory during program |
| execution, and then "forgets" about the memory in the global |
| destructor (say, by setting the pointer to it to NULL) without freeing |
| it, that will prompt a leak message in "strict" mode, though not in |
| "normal" mode.</p> |
| |
| <p>"Draconian" heap-checking is appropriate for those who like to be |
| very precise about their memory management, and want the heap-checker |
| to help them enforce it. In "draconian" mode, the heap-checker does |
| not do "live object" checking at all, so it reports a leak unless |
| <i>all</i> allocated memory is freed before program exit. (However, |
| you can use <A HREF="#disable">IgnoreObject()</A> to re-enable |
| liveness-checking on an object-by-object basis.)</p> |
| |
| <p>"Normal" mode, as the name implies, is the one used most often at |
| Google. It's appropriate for everyday heap-checking use.</p> |
| |
| <p>In addition, there are two other possible modes:</p> |
| <ul> |
| <li> <code>as-is</code> |
| <li> <code>local</code> |
| </ul> |
| <p><code>as-is</code> is the most flexible mode; it allows you to |
| specify the various <A HREF="#options">knobs</A> of the heap checker |
| explicitly. <code>local</code> activates the <A |
| HREF="#explicit">explicit heap-check instrumentation</A>, but does not |
| turn on any whole-program leak checking.</p> |
| |
| |
| <h3><A NAME="tweaking">Tweaking whole-program checking</A></h3> |
| |
| <p>In some cases you want to check the whole program for memory leaks, |
| but waiting until after <code>main()</code> exits to do the first |
| whole-program leak check is too long: in a long-running |
| server, for example, you might want to check for leaks periodically while the |
| server is running. In this case, you can call the static method |
| <code>HeapLeakChecker::NoGlobalLeaks()</code> to verify that no global leaks have happened |
| as of that point in the program.</p> |
| |
| <p>Alternately, doing the check after <code>main()</code> exits might |
| be too late. Perhaps you have some objects that are known not to |
| clean up properly at exit. You'd like to do the "at exit" check |
| before those objects are destroyed (since while they're live, any |
| memory they point to will not be considered a leak). In that case, |
| you can call <code>HeapLeakChecker::NoGlobalLeaks()</code> manually, near the end of |
| <code>main()</code>, and then call <code>HeapLeakChecker::CancelGlobalCheck()</code> to |
| turn off the automatic post-<code>main()</code> check.</p> |
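| |
| <p>A minimal sketch of both patterns follows. It is an illustration, not |
| code from gperftools: <code>ServeWithPeriodicChecks</code>, |
| <code>FinalLeakCheck</code> and the <code>USE_GPERFTOOLS</code> build flag |
| are hypothetical names, and the stand-in class exists only so the sketch |
| compiles on a machine without gperftools installed.</p> |

```cpp
#include <cassert>

#ifdef USE_GPERFTOOLS  // hypothetical build flag chosen for this sketch
#include <gperftools/heap-checker.h>
#else
// Stand-in with the same two static methods, so the sketch builds without
// gperftools. It checks nothing and always reports "no leaks".
struct HeapLeakChecker {
  static bool NoGlobalLeaks() { return true; }
  static void CancelGlobalCheck() {}
};
#endif

// Hypothetical server loop: handles `requests` requests and runs a
// whole-program check after every `check_every` of them (must be >= 1).
// Returns how many of those checks reported a leak.
int ServeWithPeriodicChecks(int requests, int check_every) {
  int failed_checks = 0;
  for (int i = 1; i <= requests; ++i) {
    // ... handle request i here ...
    if (i % check_every == 0 && !HeapLeakChecker::NoGlobalLeaks()) {
      ++failed_checks;
    }
  }
  return failed_checks;
}

// Hypothetical helper for the second case: run the check by hand near the
// end of main(), then turn off the automatic post-main() check.
bool FinalLeakCheck() {
  const bool clean = HeapLeakChecker::NoGlobalLeaks();
  HeapLeakChecker::CancelGlobalCheck();
  return clean;
}
```

| <p>When built against the real library, the binary still has to be run |
| with <code>HEAPCHECK</code> set to one of the whole-program modes for |
| these calls to do any checking.</p> |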
| |
| <p>Finally, there's a helper macro for "strict" and "draconian" modes, |
| which require all global memory to be freed before program exit. This |
| freeing can be time-consuming and is often unnecessary, since libc |
| cleans up all memory at program-exit for you. If you want the |
| benefits of "strict"/"draconian" modes without the cost of all that |
| freeing, look at <code>REGISTER_HEAPCHECK_CLEANUP</code> (in |
| <code>heap-checker.h</code>). This macro allows you to mark specific |
| cleanup code as active only when the heap-checker is turned on.</p> |
| |
| |
| <h2><a name="explicit">Explicit (Partial-program) Heap Leak Checking</a></h2> |
| |
| <p>Instead of whole-program checking, you can check certain parts of your |
| code to verify they do not have memory leaks. This check verifies that |
| between two parts of a program, no memory is allocated without being freed.</p> |
| <p>To use this kind of checking code, bracket the code you want |
| checked by creating a <code>HeapLeakChecker</code> object at the |
| beginning of the code segment, and call |
| <code>NoLeaks()</code> at the end. These functions, and all others |
| referred to in this file, are declared in |
| <code>&lt;gperftools/heap-checker.h&gt;</code>. |
| </p> |
| |
| <p>Here's an example:</p> |
| <pre> |
| HeapLeakChecker heap_checker("test_foo"); |
| { |
| code that exercises some foo functionality; |
| this code should not leak memory; |
| } |
| if (!heap_checker.NoLeaks()) assert(NULL == "heap memory leak"); |
| </pre> |
| |
| <p>Note that adding in the <code>HeapLeakChecker</code> object merely |
| instruments the code for leak-checking. To actually turn on this |
| leak-checking on a particular run of the executable, you must still |
| run with the heap-checker turned on:</p> |
| <pre>% env HEAPCHECK=local /usr/local/bin/my_binary_compiled_with_tcmalloc</pre> |
| <p>If you want to do whole-program leak checking in addition to this |
| manual leak checking, you can run in <code>normal</code> or some other |
| mode instead: they'll run the "local" checks in addition to the |
| whole-program check.</p> |
| |
| |
| <h2><a name="disable">Disabling Heap-checking of Known Leaks</a></h2> |
| |
| <p>Sometimes your code has leaks that you know about and are willing |
| to accept. You would like the heap checker to ignore them when |
| checking your program. You can do this by bracketing the code in |
| question with an appropriate heap-checking construct:</p> |
| <pre> |
| ... |
| { |
| HeapLeakChecker::Disabler disabler; |
| &lt;leaky code&gt; |
| } |
| ... |
| </pre> |
| <p>Any objects allocated by <code>leaky code</code> (including inside any |
| routines called by <code>leaky code</code>) and any objects reachable |
| from such objects are not reported as leaks.</p> |
| |
| <p>Alternately, you can use <code>IgnoreObject()</code>, which takes a |
| pointer to an object to ignore. That memory, and everything reachable |
| from it (by following pointers), is ignored for the purposes of leak |
| checking. You can call <code>UnIgnoreObject()</code> to undo the |
| effects of <code>IgnoreObject()</code>.</p> |
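| |
| <p>For instance, a deliberately never-freed singleton could be marked |
| this way. This is only a sketch: <code>Registry</code>, |
| <code>GetRegistry</code> and the <code>USE_GPERFTOOLS</code> build flag are |
| hypothetical, and the stand-in class is there so the example compiles |
| without gperftools installed.</p> |

```cpp
#include <cassert>

#ifdef USE_GPERFTOOLS  // hypothetical build flag chosen for this sketch
#include <gperftools/heap-checker.h>
#else
// Stand-in so the sketch builds without gperftools; it ignores nothing.
struct HeapLeakChecker {
  static void IgnoreObject(const void*) {}
  static void UnIgnoreObject(const void*) {}
};
#endif

// Hypothetical type standing in for some long-lived program state.
struct Registry {
  int entries = 0;
};

// Returns a singleton that is intentionally never deleted. The object is
// handed to IgnoreObject() right after allocation, so it and anything
// reachable from it are excluded from leak reports.
Registry* GetRegistry() {
  static Registry* const registry = [] {
    Registry* r = new Registry;
    HeapLeakChecker::IgnoreObject(r);
    return r;
  }();
  return registry;
}
```

| <p>In modes that do live-object checking the static pointer already keeps |
| the object live, so the explicit call matters mainly in "draconian" |
| mode.</p> |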
| |
| |
| <h2><a name="options">Tuning the Heap Checker</a></h2> |
| |
| <p>The heap leak checker has many options, some that trade off running |
| time and accuracy, and others that increase the sensitivity at the |
| risk of returning false positives. For most uses, the range covered |
| by the <A HREF="#types">heap-check flavors</A> is enough, but in |
| specialized cases more control can be helpful.</p> |
| |
| <p> |
| These options are specified via environment variables. |
| </p> |
| |
| <p>This first set of options controls sensitivity and accuracy. These |
| options are ignored unless you run the heap checker in <A |
| HREF="#types">as-is</A> mode.</p> |
| |
| <table frame=box rules=sides cellpadding=5 width=100%> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_AFTER_DESTRUCTORS</code></td> |
| <td>Default: false</td> |
| <td> |
| When true, do the final leak check after all other global |
| destructors have run. When false, do it after all |
| <code>REGISTER_HEAPCHECK_CLEANUP</code> routines have run, typically much earlier in |
| the global-destructor process. |
| </td> |
| </tr> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_IGNORE_THREAD_LIVE</code></td> |
| <td>Default: true</td> |
| <td> |
| If true, ignore objects reachable from thread stacks and registers |
| (that is, do not report them as leaks). |
| </td> |
| </tr> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_IGNORE_GLOBAL_LIVE</code></td> |
| <td>Default: true</td> |
| <td> |
| If true, ignore objects reachable from global variables and data |
| (that is, do not report them as leaks). |
| </td> |
| </tr> |
| |
| </table> |
| |
| <p>These options modify the behavior of whole-program leak |
| checking.</p> |
| |
| <table frame=box rules=sides cellpadding=5 width=100%> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_MAX_LEAKS</code></td> |
| <td>Default: 20</td> |
| <td> |
| The maximum number of leaks to be printed to stderr (all leaks are still |
| emitted to file output for pprof to visualize). If negative or zero, |
| print all the leaks found. |
| </td> |
| </tr> |
| |
| |
| </table> |
| |
| <p>These options apply to all types of leak checking.</p> |
| |
| <table frame=box rules=sides cellpadding=5 width=100%> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_IDENTIFY_LEAKS</code></td> |
| <td>Default: false</td> |
| <td> |
| If true, generate the addresses of the leaked objects in the |
| generated memory leak profile files. |
| </td> |
| </tr> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_TEST_POINTER_ALIGNMENT</code></td> |
| <td>Default: false</td> |
| <td> |
| If true, check all leaks to see if they might be due to the use |
| of unaligned pointers. |
| </td> |
| </tr> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_POINTER_SOURCE_ALIGNMENT</code></td> |
| <td>Default: sizeof(void*)</td> |
| <td> |
| Alignment at which all pointers in memory are supposed to be located. |
| Use 1 if any alignment is ok. |
| </td> |
| </tr> |
| |
| <tr valign=top> |
| <td><code>PPROF_PATH</code></td> |
| <td>Default: pprof</td> |
| <td> |
| The location of the <code>pprof</code> executable. |
| </td> |
| </tr> |
| |
| <tr valign=top> |
| <td><code>HEAP_CHECK_DUMP_DIRECTORY</code></td> |
| <td>Default: /tmp</td> |
| <td> |
| Where the heap-profile files are kept while the program is running. |
| </td> |
| </tr> |
| |
| </table> |
| |
| |
| <h2>Tips for Handling Detected Leaks</h2> |
| |
| <p>What do you do when the heap leak checker detects a memory leak? |
| First, you should run the reported <code>pprof</code> command; |
| hopefully, that is enough to track down the location where the leak |
| occurs.</p> |
| |
| <p>If the leak is a real leak, you should fix it!</p> |
| |
| <p>If you are sure that the reported leaks are not dangerous and there |
| is no good way to fix them, then you can use |
| <code>HeapLeakChecker::Disabler</code> and/or |
| <code>HeapLeakChecker::IgnoreObject()</code> to disable heap-checking |
| for certain parts of the codebase.</p> |
| |
| <p>In "strict" or "draconian" mode, leaks may be due to incomplete |
| cleanup in the destructors of global variables. If you don't wish to |
| augment the cleanup routines, but still want to run in "strict" or |
| "draconian" mode, consider using <A |
| HREF="#tweaking"><code>REGISTER_HEAPCHECK_CLEANUP</code></A>.</p> |
| |
| <h2>Hints for Debugging Detected Leaks</h2> |
| |
| <p>Sometimes it is useful to know not only the exact code that |
| allocates the leaked objects, but also the addresses of the leaked objects. |
| Combined with additional logging in the program, for example, |
| this lets you track which subset of the allocations |
| made at a certain spot in the code are leaked. |
| <br/> |
| To get the addresses of all leaked objects, |
| define the environment variable <code>HEAP_CHECK_IDENTIFY_LEAKS</code> |
| to be <code>1</code>. |
| The object addresses will be reported in the form of addresses |
| of fake immediate callers of the memory allocation routines. |
| Note that the performance of doing leak-checking in this mode |
| can be noticeably worse than the default mode. |
| </p> |
| |
| <p>One relatively common class of leaks that don't look real |
| is the case of multiple initialization. |
| In such cases the reported leaks are typically things that are |
| linked from some global objects |
| that are initialized once and, supposedly, never modified again. |
| The non-obvious cause of the leak is frequently the fact that |
| the initialization code for these objects executes more than once. |
| <br/> |
| For example, if the code of some <code>.cc</code> file is included twice |
| in the binary, then the constructors for global objects defined in that file |
| will execute twice, leaking the things allocated on the first run. |
| <br/> |
| Similar problems can occur if object initialization is done more explicitly, |
| for example on demand by slightly buggy code |
| that does not always ensure only-once initialization. |
| </p> |
| |
| <p> |
| A rarer but even more puzzling problem is the use of improperly |
| aligned pointers (perhaps inside improperly aligned objects). |
| Normally such pointers are not followed by the leak checker, |
| so objects reachable only via such pointers are reported as leaks. |
| If you suspect this case, |
| define the environment variable <code>HEAP_CHECK_TEST_POINTER_ALIGNMENT</code> |
| to be <code>1</code> |
| and then look closely at the generated leak report messages. |
| </p> |
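| |
| <p>The effect of alignment on pointer discovery can be illustrated with |
| a toy scan. This is not the heap checker's code; <code>ScanFinds</code> |
| and <code>MisalignedHolder</code> are hypothetical helpers that only model |
| the idea of looking for a pointer's bit pattern at fixed strides.</p> |

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy scan: returns true if the bit pattern of `target` appears in `buf`
// at some offset that is a multiple of `alignment` (must be >= 1).
bool ScanFinds(const std::vector<unsigned char>& buf, std::uintptr_t target,
               std::size_t alignment) {
  for (std::size_t off = 0; off + sizeof(target) <= buf.size();
       off += alignment) {
    std::uintptr_t candidate;
    std::memcpy(&candidate, buf.data() + off, sizeof(candidate));
    if (candidate == target) return true;
  }
  return false;
}

// Builds a zero-filled buffer that stores `target` at byte offset 1,
// that is, deliberately misaligned.
std::vector<unsigned char> MisalignedHolder(std::uintptr_t target) {
  std::vector<unsigned char> buf(3 * sizeof(target), 0);
  std::memcpy(buf.data() + 1, &target, sizeof(target));
  return buf;
}
```

| <p>Scanning the buffer at a stride of <code>sizeof(void*)</code> misses |
| the stored value, while a stride of 1 finds it. This mirrors the |
| <code>HEAP_CHECK_POINTER_SOURCE_ALIGNMENT</code> option, where 1 means any |
| alignment is acceptable.</p> |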
| |
| <h1>How It Works</h1> |
| |
| <p>When a <code>HeapLeakChecker</code> object is constructed, it dumps |
| a memory-usage profile named |
| <code>&lt;prefix&gt;.&lt;name&gt;-beg.heap</code> to a temporary |
| directory. When <code>NoLeaks()</code> |
| is called (for whole-program checking, this happens automatically at |
| program-exit), it dumps another profile, named |
| <code>&lt;prefix&gt;.&lt;name&gt;-end.heap</code>. |
| (<code>&lt;prefix&gt;</code> is typically determined automatically, |
| and <code>&lt;name&gt;</code> is typically <code>argv[0]</code>.) It |
| then compares the two profiles. If the second profile shows |
| more memory use than the first, the |
| <code>NoLeaks()</code> function will |
| return false. For "whole program" checking, this will cause the |
| executable to abort (via <code>exit(1)</code>). In all cases, it will |
| print a message on how to process the dumped profiles to locate |
| leaks.</p> |
| |
| <h3><A name=live>Detecting Live Objects</A></h3> |
| |
| <p>At any point during a program's execution, all memory that is |
| accessible at that time is considered "live." This includes global |
| variables, and also any memory that is reachable by following pointers |
| from a global variable. It also includes all memory reachable from |
| the current stack frame and from current CPU registers (this captures |
| local variables). Finally, it includes the thread equivalents of |
| these: thread-local storage and thread heaps, memory reachable from |
| thread-local storage and thread heaps, and memory reachable from |
| thread CPU registers.</p> |
| |
| <p>In all modes except "draconian," live memory is not |
| considered to be a leak. We detect this by doing a liveness flood, |
| traversing pointers to heap objects starting from some initial memory |
| regions we know to potentially contain live pointer data. Note that |
| this flood might potentially not find some (global) live data region |
| to start the flood from. If you find such, please file a bug.</p> |
| |
| <p>The liveness flood treats every properly aligned byte |
| sequence as a potential pointer to a heap object, and considers it a valid |
| pointer whenever the current heap memory map contains an object at |
| the address those bytes represent. Some pointers into the interior of an |
| object (rather than to its start) are also recognized.</p> |
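| |
| <p>The flood can be pictured as a reachability walk over a map of heap |
| objects. The sketch below is a toy model under simplifying assumptions, |
| not the heap checker's implementation: <code>ToyHeap</code> and |
| <code>FindLeaks</code> are hypothetical names, only pointers to the start |
| of an object are recognized, and the "memory" of each object is just a |
| list of words.</p> |

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Addr = std::uintptr_t;
// Toy heap: object start address -> the words stored inside that object.
using ToyHeap = std::map<Addr, std::vector<Addr>>;

// Floods from the root words (standing in for globals, stacks and
// registers) and returns the start addresses of objects never reached.
std::set<Addr> FindLeaks(const ToyHeap& heap, const std::vector<Addr>& roots) {
  std::set<Addr> live;
  std::vector<Addr> pending(roots);
  while (!pending.empty()) {
    const Addr word = pending.back();
    pending.pop_back();
    const auto it = heap.find(word);
    if (it == heap.end()) continue;           // not the address of an object
    if (!live.insert(word).second) continue;  // object already visited
    // Every word inside a live object is itself a candidate pointer.
    pending.insert(pending.end(), it->second.begin(), it->second.end());
  }
  std::set<Addr> leaks;
  for (const auto& entry : heap) {
    if (live.count(entry.first) == 0) leaks.insert(entry.first);
  }
  return leaks;
}
```

| <p>Because any word that happens to equal an object's address marks that |
| object live, a stray value among the roots is enough to hide a leak in |
| this model.</p> |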
| |
| <p>As a result of this simple approach, it's possible (though |
| unlikely) for the flood to be inexact and occasionally result in |
| leaked objects being erroneously determined to be live. For instance, |
| random bit patterns can happen to look like pointers to leaked heap |
| objects. More likely, stale pointer data not corresponding to any |
| live program variables can be still present in memory regions, |
| especially in thread stacks. For instance, depending on how the local |
| <code>malloc</code> is implemented, it may reuse a heap object |
| address:</p> |
| <pre> |
| char* p = new char[1]; // new might return 0x80000000, say. |
| delete[] p; |
| new char[1]; // new might return 0x80000000 again |
| // This last new is a leak, but doesn't look like one: p still appears to point to it |
| </pre> |
| |
| <p>In other words, imprecisions in the liveness flood mean that for |
| any heap leak check we might miss some memory leaks. This means that |
| for local leak checks, we might report a memory leak in the local |
| area, even though the leak actually happened before the |
| <code>HeapLeakChecker</code> object was constructed. Note that for |
| whole-program checks, a leak report <i>does</i> always correspond to a |
| real leak (since there's no "before" to have created a false-live |
| object).</p> |
| |
| <p>While this liveness flood approach is not very portable and not |
| 100% accurate, it works in most cases and saves us from writing a lot |
| of explicit clean up code and other hassles when dealing with thread |
| data.</p> |
| |
| |
| <h3>Visualizing Leaks with <code>pprof</code></h3> |
| |
| <p> |
| The heap checker automatically prints basic leak info with stack traces of |
| leaked objects' allocation sites, as well as a pprof command line that can be |
| used to visualize the call-graph involved in these allocations. |
| The latter can be much more useful for a human |
| to see where/why the leaks happened, especially if the leaks are numerous. |
| </p> |
| |
| <h3>Leak-checking and Threads</h3> |
| |
| <p>At the time of HeapLeakChecker's construction and during |
| <code>NoLeaks()</code> calls, we grab a lock |
| and then pause all other threads so other threads do not interfere |
| with recording or analyzing the state of the heap.</p> |
| |
| <p>In general, leak checking works correctly in the presence of |
| threads. However, thread stack data liveness determination (via |
| <code>base/linuxthreads.h</code>) does not work when the program is |
| running under GDB, because the ptrace functionality needed for finding |
| threads is already in use by GDB. Conversely, the leak checker's |
| ptrace attempts might also interfere with GDB. As a result, running under GDB can |
| produce false leak reports. For this reason, the |
| heap-checker turns itself off when running under GDB.</p> |
| |
| <p>Also, <code>thread_lister</code> only works for Linux pthreads; |
| leak checking is unlikely to handle other thread implementations |
| correctly.</p> |
| |
| <p>As mentioned in the discussion of liveness flooding, thread-stack |
| liveness determination might mis-classify as reachable objects that |
| very recently became unreachable (leaked). This can happen when the |
| pointers to now-logically-unreachable objects are present in the |
| active thread stack frame. In other words, trivial code like the |
| following might not produce the expected leak checking outcome |
| depending on how the compiled code works with the stack:</p> |
| <pre> |
| int* foo = new int [20]; |
| HeapLeakChecker check("a_check"); |
| foo = NULL; |
| // May fail to trigger. |
| if (!check.NoLeaks()) assert(NULL == "heap memory leak"); |
| </pre> |
| |
| |
| <hr> |
| <address>Maxim Lifantsev<br> |
| <!-- Created: Tue Dec 19 10:43:14 PST 2000 --> |
| <!-- hhmts start --> |
| Last modified: Fri Jul 13 13:14:33 PDT 2007 |
| <!-- hhmts end --> |
| </address> |
| </body> |
| </html> |