Enable sharding of AllocationRegister on desktop.
Previously, when native heap profiling was enabled, all allocations in a process
were gated by a global lock. This proved to be a significant performance hit on
macOS: profiling showed that 50-75% of all wall time was spent waiting on this
lock.
This CL introduces a new class, ShardedAllocationRegister, to handle sharding on
desktop platforms. It spreads allocation/backtrace information across 64
separate AllocationRegister instances. In addition, the sizes of the fixed-size
hash maps were changed: the number of allocation buckets was reduced by a factor
of 8, and the number of backtrace buckets was reduced by a factor of 16.
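The sharding idea can be illustrated with a minimal sketch. This is not the
code from this CL: the class name ShardedRegisterSketch, the per-shard
std::mutex, the std::unordered_map storage, and the pointer-hashing scheme are
all assumptions made for illustration. Only the shard count of 64 comes from
the description above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical sketch of a sharded allocation register. Each shard has its
// own lock, so threads touching different shards do not contend.
class ShardedRegisterSketch {
 public:
  static constexpr std::size_t kShardCount = 64;  // Shard count from the CL.

  void Insert(const void* address, std::size_t size) {
    Shard& shard = ShardFor(address);
    std::lock_guard<std::mutex> guard(shard.lock);
    shard.allocations[address] = size;
  }

  void Remove(const void* address) {
    Shard& shard = ShardFor(address);
    std::lock_guard<std::mutex> guard(shard.lock);
    shard.allocations.erase(address);
  }

  // Returns true and writes the recorded size if the address is tracked.
  bool Get(const void* address, std::size_t* size) {
    Shard& shard = ShardFor(address);
    std::lock_guard<std::mutex> guard(shard.lock);
    auto it = shard.allocations.find(address);
    if (it == shard.allocations.end())
      return false;
    *size = it->second;
    return true;
  }

  // Sums the entries of all shards, locking one shard at a time.
  std::size_t TotalCount() {
    std::size_t total = 0;
    for (Shard& shard : shards_) {
      std::lock_guard<std::mutex> guard(shard.lock);
      total += shard.allocations.size();
    }
    return total;
  }

 private:
  struct Shard {
    std::mutex lock;
    std::unordered_map<const void*, std::size_t> allocations;
  };

  // Assumed hashing scheme: multiply by a 64-bit odd constant and keep the
  // top 6 bits, which yields an index in [0, 64).
  Shard& ShardFor(const void* address) {
    static_assert(kShardCount == 64, "the shift below assumes 64 shards");
    const std::uint64_t value =
        static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(address));
    const std::size_t index =
        static_cast<std::size_t>((value * 0x9E3779B97F4A7C15ull) >> 58);
    assert(index < kShardCount);
    return shards_[index];
  }

  std::array<Shard, kShardCount> shards_;
};
```

In this sketch a caller never takes a lock itself; Insert, Remove, and Get each
lock only the shard that owns the address, which is what removes the single
point of contention.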
ShardedAllocationRegister is thread-safe, so its consumers no longer need to
acquire a lock when using the container. Each consumer still needs to know
whether heap profiling is enabled. This CL tracks that state in a
subtle::Atomic32, accessed with Acquire_Load and Release_Store. Using a bool to
record the state, with a base::Lock to synchronize reads and writes of the bool,
performs almost as badly as the global lock.
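The lock-free enabled check can be sketched in standard C++. This is an
illustration only: it uses std::atomic<std::int32_t> with acquire/release
ordering as a stand-in for subtle::Atomic32 with Acquire_Load and
Release_Store, and the names HeapProfilingFlagSketch and OnAllocationSketch
are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the "is heap profiling enabled" state. Readers do
// an acquire load instead of taking a lock on every allocation.
class HeapProfilingFlagSketch {
 public:
  void SetEnabled(bool enabled) {
    // Release store: writes made before this call are visible to any thread
    // whose acquire load observes the new value.
    enabled_.store(enabled ? 1 : 0, std::memory_order_release);
  }

  bool IsEnabled() const {
    return enabled_.load(std::memory_order_acquire) != 0;
  }

 private:
  std::atomic<std::int32_t> enabled_{0};
};

// Hypothetical allocation hook: does nothing unless profiling is enabled.
// Returns true if the allocation was counted.
inline bool OnAllocationSketch(const HeapProfilingFlagSketch& flag,
                               std::size_t* recorded_count) {
  assert(recorded_count != nullptr);
  if (!flag.IsEnabled())
    return false;  // Fast path: one atomic load, no lock.
  ++*recorded_count;
  return true;
}
```

The point of the sketch is the fast path: when profiling is disabled, each
allocation costs one atomic load rather than a lock acquisition.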
BUG=724651
Review-Url: https://codereview.chromium.org/2890363003
Cr-Original-Commit-Position: refs/heads/master@{#473696}
Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src
Cr-Mirrored-Commit: bd599af5d33647f0b498311eab05be79fac98e63
7 files changed