Debugging Memory Issues
This page is designed to help Chromium developers debug memory issues.
When in doubt, reach out to firstname.lastname@example.org.
Investigating Reproducible Memory Issues
Let's say that there's a CL or feature that reproducibly increases memory usage when it's landed/enabled, given a particular set of repro steps.
- Take a look at the documentation for both taking and navigating memory-infra traces.
- Take two memory-infra traces. One with the reproducible memory regression, and one without.
- Load the memory-infra traces into two tabs.
- Compare the memory dump providers and look for the one that shows the regression. Follow the relevant link.
Regression in Malloc MemoryDumpProvider
Repeat the above steps, but this time also take a heap dump. Confirm that the regression is also visible in the heap dump, and then compare the two heap dumps to find the difference. You can also use diff_heap_profiler.py to perform the diff.
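The comparison step can be sketched in a few lines. This is only an illustration of the idea, not the implementation of diff_heap_profiler.py and not the real heap dump format: it assumes each heap dump has already been reduced to a hypothetical mapping from a stack trace (a tuple of frame names) to the number of live bytes allocated from that stack.

```python
def diff_heap_dumps(before, after, top=10):
    """Return the stacks whose live bytes grew the most between two dumps.

    `before` and `after` are hypothetical dicts mapping a stack trace
    (tuple of frame names) to live bytes. A stack missing from one dump
    is treated as 0 bytes in that dump.
    """
    all_stacks = set(before) | set(after)
    deltas = {
        stack: after.get(stack, 0) - before.get(stack, 0)
        for stack in all_stacks
    }
    # Keep only stacks that grew, largest growth first.
    grown = [(stack, delta) for stack, delta in deltas.items() if delta > 0]
    grown.sort(key=lambda item: item[1], reverse=True)
    return grown[:top]
```

The stacks at the top of the returned list are the first candidates to inspect for the regression.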
Regression in Non-Malloc MemoryDumpProvider
Hopefully the MemoryDumpProvider has sufficient information to help diagnose the leak. If the leaked object is allocated via malloc or new (it usually should be), you can also use the steps for debugging a Malloc MemoryDumpProvider regression.
Regression only in Private Footprint
- Repeat the repro steps, but instead of taking a memory-infra trace, use the following tools to map the process's virtual space:
  - On macOS, use vmmap.
  - On Windows, use Sysinternals VMMap.
  - On other OSes, use /proc/<pid>/smaps.
- The results should help diagnose what's happening. Contact the email@example.com mailing list for more help.
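On Linux-like systems, the smaps output can be long, so it often helps to aggregate it. The following is a minimal sketch that sums the Private_Clean and Private_Dirty fields per mapping name. The function names are made up for this example, and the sum is only a rough proxy chosen for illustration; it is not claimed to match how Chrome computes PrivateMemoryFootprint.

```python
import re
from collections import defaultdict

# Header line of a mapping: "start-end perms offset dev inode [pathname]".
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")
# Field line, for example: "Private_Dirty:        12 kB".
_FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")
_PRIVATE_FIELDS = ("Private_Clean", "Private_Dirty")


def private_kb_by_mapping(smaps_text):
    """Sum private kB per mapping name over all regions in smaps text."""
    totals = defaultdict(int)
    name = None
    for raw_line in smaps_text.splitlines():
        line = raw_line.rstrip()
        header = _HEADER.match(line)
        if header:
            # Regions without a pathname are grouped as anonymous memory.
            name = header.group(1).strip() or "[anonymous]"
            continue
        field = _FIELD.match(line)
        if field and name is not None and field.group(1) in _PRIVATE_FIELDS:
            totals[name] += int(field.group(2))
    return dict(totals)


def read_smaps(pid):
    """Read /proc/<pid>/smaps for a process the caller may inspect."""
    with open(f"/proc/{pid}/smaps") as smaps_file:
        return smaps_file.read()
```

Calling `private_kb_by_mapping(read_smaps(pid))` before and after the repro steps, then comparing the two results, shows which mappings grew.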
No observed regression
- If there isn't a regression in PrivateMemoryFootprint, then this might become a question of semantics for what constitutes a memory regression. Common problems include:
  - Shared Memory, which is hard to attribute, but is mostly accounted for in the memory-infra trace.
  - Binary size, which is currently not accounted for anywhere.
Investigating Heap Dumps From the Wild
For a small set of Chrome users in the wild, Chrome will record and upload anonymized heap dumps. This has the benefit of wider coverage for real code paths, at the expense of reproducibility.
These heap dumps can take some time to grok, but frequently yield valuable insight. At the time of this writing, heap dumps from the wild have resulted in real, high impact bugs being found in Chrome code ~90% of the time.
For an example investigation of a real heap dump, see this link.
- Raw heap dumps can be viewed in the trace viewer. See detailed instructions. This interface surfaces all available information, but can be overwhelming and is usually unnecessary for investigating heap dumps.
- Important note: Heap profiling in the field uses Poisson process sampling with a rate parameter of 10000. This means that for large/frequent allocations [e.g. >100 MB], the noise will be quite small [much less than 1%]. But there is noise so counts will not be exact.
- The heap dump summary typically contains all information necessary to diagnose a memory issue.
- The stack trace of the potential memory leak is almost always sufficient to tell the type of object being leaked, since most functions in Chrome have a limited number of calls to new and malloc.
- The next thing to do is to determine whether the memory usage is intentional. Very rarely, components in Chrome legitimately need to use many 100s of MBs of memory. In this case, it's important to create a MemoryDumpProvider to report this memory usage, so that we have a better understanding of which components are using a lot of memory. For an example, see Issue 813046.
- Assuming the memory usage is not intentional, the next thing to do is to figure out what is causing the memory leak.
  - The most common cause is adding elements to a container with no limit. Usually the code makes assumptions about how frequently it will be called in the wild, and something breaks those assumptions. Or sometimes the code to clear the container is not called as frequently as expected [or at all]. Example 1. Example 2.
  - Retain cycles for ref-counted objects. Example.
  - Straight up leaks resulting from incorrect use of APIs. Example 1. Example 2.
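The sampling noise mentioned in the important note above can be estimated with a short calculation. This sketch makes two assumptions that are not stated in this page: that the rate parameter of 10000 means one sample per 10000 allocated bytes on average, and that individual allocations are much smaller than that interval, so the number of samples for a call site is approximately Poisson distributed. Under those assumptions the relative standard deviation of the estimate is 1 / sqrt(expected samples).

```python
import math

# Assumed meaning of the rate parameter: mean bytes between samples.
SAMPLING_INTERVAL_BYTES = 10_000


def relative_noise(total_bytes, interval_bytes=SAMPLING_INTERVAL_BYTES):
    """Approximate relative standard deviation of a sampled byte total.

    Assumes many small allocations, so the sample count is roughly
    Poisson with mean total_bytes / interval_bytes.
    """
    if total_bytes <= 0 or interval_bytes <= 0:
        raise ValueError("total_bytes and interval_bytes must be positive")
    expected_samples = total_bytes / interval_bytes
    return 1.0 / math.sqrt(expected_samples)
```

With these assumptions, 100,000,000 bytes of small allocations correspond to about 10,000 expected samples and a relative noise of roughly 1%. Larger totals, or allocations that are individually large compared with the interval, give smaller noise.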
Taking a Heap Dump
Navigate to chrome://flags and search for memlog. There are several options that can be used to configure heap dumps. All of these options are also available as command line flags, for automated test runs [e.g. telemetry].
#memlog controls which processes are profiled. It's also possible to manually specify the process via the interface at
#memlog-in-process runs the profiling service within the Chrome browser process. By default, the service runs as a separate dedicated process.
#memlog-sampling-rate specifies the sampling interval in bytes. The lower the interval, the more precise the profile, at the cost of performance. The default value is 100KB, which is enough to observe allocation sites that allocate more than 500KB in total, where the total is the size of a single allocation multiplied by the number of such allocations at the same call site.
#memlog-stack-mode describes the type of metadata recorded for each allocation.
native stacks provide the most utility. The only time the other options should be considered is for Android official builds, most of which do not support
Once the flags have been set appropriately, restart Chrome and take a memory-infra trace. The results will have a heap dump.
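To get a feel for the #memlog-sampling-rate trade-off described above, the following sketch estimates how likely a call site is to appear in the profile at all. It assumes that local profiling uses the same kind of Poisson sampling described for field heap dumps and that allocations are small relative to the interval; both are assumptions made for illustration, and the function name is invented for this example.

```python
import math


def detection_probability(total_bytes, interval_bytes=100 * 1024):
    """Probability that a call site gets at least one sample.

    Models the number of samples as Poisson with mean
    total_bytes / interval_bytes, so P(no samples) = exp(-mean).
    """
    if total_bytes < 0 or interval_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and interval_bytes > 0")
    return 1.0 - math.exp(-total_bytes / interval_bytes)
```

Under this model, a call site that allocates 500KB in total with the default 100KB interval has a mean of 5 samples, so it is missed only with probability exp(-5), which is under 1%. A call site that allocates only one interval's worth of bytes is missed with probability exp(-1), or about 37% of the time.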