Windows Binary Sizes

Chrome binaries sometimes increase in size for unclear reasons and investigating these regressions can be quite tricky. There are a a few tools that help with investigating these regressions or just looking for wasted bytes in DLLs or EXEs. The tools are:

  • tools\win\ShowGlobals - this executable (must be built from a VS solution) uses DIA2 (and is based off of the dia2dump sample whose source comes with Visual Studio) to print all ‘interesting’ global variables. Just pass the path to a PDB file and details about any large or redundant global variables will be printed. Large global variables are simple enough to understand. Redundant or repeated global variables are when the same global ends up in the binary multiple times - sometimes dozens or hundreds of times. This happens when a non-integral global variable (typically a struct, array, float, or double, or an integral type with a non-trivial constructor) is defined as static or const in a header file, causing it to be instantiated in multiple translation units. The constants kWastageThreshold and kBigSizeThreshold can be used to configure how much information ShowGlobals reports. Making these command-line parameters is left as an exercise for the reader.
  • tools\win\pdb_compare_globals.py - this uses ShowGlobals.exe to get the interesting global variables from two PE files and then print what has changed, either different sizes or different interesting globals. This is designed for understanding size regressions, since binary size regressions, even when they are mostly due to code size, often have a correlated effect on global variables
  • tools\win\pe_summarize.py - this displays a summary of the size of the sections in the DLLs or EXEs whose paths are passed to it. If two copies of the same DLL/EXE are passed to it then the size differences are printed as a summary, to make regressions and improvements easier to see. This is most often used for measuring progress or seeing what section has grown.
  • tools\win\linker_verbose_tracking.py - this scans the output of “link /verbose” in order to see why a particular object file was linked in. This can either be because it was listed on the command line (perhaps it was in a source set) or because it was referenced (often indirectly) by an object file that was listed on the command line.
  • SymbolSort - this is a third party tool available on github that generates various reports, including a sorted-by-size list of all symbols, including functions. The -diff option to SymbolSort is particularly useful for identifying where a change in size is coming from, grouped in various ways. Usage: SymbolSort -in new\chrome.dll.pdb -diff old\chrome.dll.pdb -out change.txt
  • SymbolSort data can be visualized using webtreemap - see https://github.com/rongjiecomputer/bloat-win for an example
  • Static initializers waste code size and create process-private data. The source to a static_initializers tool that will look for symbol names associated with static initializers is in the Chromium repo in tools\win\static_initializers.

Details of how and when to use these tools are shown below.

If looking for size saving opportunities then “ShowGlobals.exe file.pdb” can be used to find duplicated or large global variables that may be unnecessary. Typical (shortened for this document) results look like this - the first set of entries are duplicated globals, the second set of entries are large globals:

#Dups DupSize Size Section Symbol-name

805 805 std::piecewise_construct

3 204 rgb_red

3 204 rgb_green

3 204 rgb_blue

187 187 WTF::in_place

4 160 extensions::api::g_factory

...

122784 2 kBrotliDictionary

65536 2 jpeg_nbits_table

57080 2 propsVectorsTrie_index

53064 3 unigram_table

50364 2 kNetworkingPrivate

47152 3 device::UsbIds::vendors_

...

The actual output is tab separated and can be most easily visualized by pasting into a spreadsheet to ensure that the columns line up.

Some fixed issues include:

  • webrtc regression was analyzed to find a fix - see https://bugs.chromium.org/p/chromium/issues/detail?id=734631#c14 for the many steps
  • We have 216 functions instances of color_xform_RGBA totaling 68,184 bytes, down from 300 instances totaling 146496 bytes (crbug.com/677450 and crbug.com/680973). This waste was found by using SymbolSort's -diff option to investigate a sizes regression.

The most egregious problems shown by these reports have been fixed but some remaining issues include:

The large kBrotliDictionary and jpeg_nbits_table arrays can be seen, but those are used and are in the read-only section, so there is nothing to be done.

When investigating a regression pdb_compare_globals.py can be used to find out what large or duplicated global variables have showed up between two builds. Just pass both PDBs and a summary of the changes will be printed. This uses ShowGlobals.exe to generate the list of interesting global variables and then prints a diff.

When investigating a regression (or testing a fix) it can be useful to use pe_summarize.py to print the size of all of the sections with a PE, or to compare to PE files (two versions of chrome.dll, for instance). This is the ultimate measure of success for a change - has it made the binary smaller or larger, and if so where?

If an unwanted global variable is being linked in then linker_verbose_tracking.py can be used to help answer the question of “why?” First you need to find out what object file defines the variable. For instance, ff_cos_131072 and the other ff_* globals are defined in rdft.c. When rdft.obj is pulled in then Chrome gets significantly bigger. Some of this is discussed in comments #25 to #27 here: https://bugs.chromium.org/p/chromium/issues/detail?id=624274#c27.

In order to get verbose linker output you need to modify the appropriate BUILD.gn file to add the /verbose linker flag. For chrome.dll I make the following modification:

diff --git a/chrome/BUILD.gn b/chrome/BUILD.gn

index 58586fc..c15d463 100644

--- a/chrome/BUILD.gn

+++ b/chrome/BUILD.gn

@@ -354,6 +354,7 @@ if (is_win) {

“/DELAYLOAD:winspool.drv”,

“/DELAYLOAD:ws2_32.dll”,

“/DELAYLOAD:wsock32.dll”,

  • “/verbose”,

]

if (!is_component_build) {

Then build chrome.dll, redirecting the verbose output to a text file:

ninja -C out\release chrome.dll >chrome_dll_01.txt

Then linker_verbose_tracking.py is used to find why a particular object file is being pulled in, in this case rdft.obj:

python linker_verbose_tracking.py chrome_dll_01.txt rdft

Typical output looks like this:

python linker_verbose_tracking.py chrome_dll_01.txt rdft

Read 662400 lines

Database loaded - 11453 xrefs found

rdft.obj referenced by

avfft.obj for symbol ff_rdft_init

avfft.obj referenced by

FFTFrameFFMPEG.obj for symbol av_rdft_calc

FFTFrameFFMPEG.obj referenced by

PeriodicWave.obj for symbol public: void __thiscall blink::FFTFrame::doInverseFFT(float *)

PeriodicWave.obj referenced by

SkGeometry.obj for symbol log2f

SkGeometry.obj referenced by

SkPath.obj for symbol int __cdecl SkChopQuadAtYExtrema(struct SkPoint const * const,struct SkPoint * const)

SkPath.obj referenced by

Command-line obj file: color_picker.obj for symbol public: void __thiscall SkPath::addOval(struct SkRect const &,enum SkPath::Direction)

In this case this tells us that rdft.obj is being pulled in indirectly through a chain of references that starts with color_picker.obj. color_picker.obj is in a source_set which means that it is on the command-line. I then change the source_set to a static_library and redo the steps. If all goes well then the unwanted .obj file will eventually stop being pulled in and I can use pe_summarize.py to measure the savings, and ShowGlobals to look for the next large global variable.

Sometimes this technique - of changing source_set to static_library - doesn’t help. In those case it can be important to figure out follow the chain of symbols and see if it can be broken. In this case above we can see that SkGeometry.obj (skia) is pulling in PeriodicWave.obj (Blink) because of log2f. Investigation showed that SkGeometry.obj referenced the math.h function log2f, while PeriodicWave.obj defined it as an inline function. The chain was broken by not defining log2f as an inline function (letting math.h do that) and a significant amount of code size was saved.

Note that some size regressions only happen with certain build configurations. Ideally all testing would be done with PGO builds, but that is unwieldy, so release (non-component?) builds are probably the best starting point. If a size regression reproes on this configuration then investigate there and fix it. But, it is advisable to verify the fix on a full official build - with both is_official_build and full_wpo_on_official set to true. Failing to do this test can lead to a fix that doesn’t actually help on the builds that we ship to customers, and this can go unnoticed for months.

See crrev.com/2556603002 for an example of using this technique. In this case it was sufficient to change a single source_set to a static_library. The size savings of 900 KB was verified using:

python pe_summarize.py out\release\chrome.dll size_reduction\chrome.dll

Size of out\release\chrome.dll is 42.127872 MB

name: mem size , disk size

.text: 33.900375 MB

.rdata: 6.325718 MB

.data: 0.718696 MB, 0.274944 MB

.tls: 0.000025 MB

CPADinfo: 0.000036 MB

.rodata: 0.003216 MB

.crthunk: 0.000064 MB

.gfids: 0.001052 MB

_RDATA: 0.000288 MB

.rsrc: 0.175088 MB

.reloc: 1.443124 MB

Size of size_reduction\chrome.dll is 41.211392 MB

name: mem size , disk size

.text: 33.188599 MB

.rdata: 6.164966 MB

.data: 0.707848 MB, 0.264704 MB

.tls: 0.000025 MB

CPADinfo: 0.000036 MB

.rodata: 0.003216 MB

.crthunk: 0.000064 MB

.gfids: 0.001052 MB

_RDATA: 0.000288 MB

.rsrc: 0.175088 MB

.reloc: 1.409388 MB

Change from out\release\chrome.dll to size_reduction\chrome.dll

.text: -711776 bytes change

.rdata: -160752 bytes change

.data: -10848 bytes change

.reloc: -33736 bytes change

Total change: -917112 bytes