|  | Deterministic builds | 
|  | ==================== | 
|  |  | 
|  | Chromium's build is deterministic. This means that building Chromium at the | 
|  | same revision will produce exactly the same binary in two builds, even if | 
|  | these builds are on different machines, in build directories with different | 
|  | names, or if one build is a clobber build and the other build is an incremental | 
|  | build with the full build done at a different revision. This is a project goal, | 
|  | and we have bots that verify that it's true. | 
|  |  | 
|  | Furthermore, even if a binary is built at two different revisions but none of | 
|  | the revisions in between logically affect a binary, then builds at those two | 
|  | revisions should produce exactly the same binary too (imagine a revision that | 
|  | modifies code `chrome/` while we're looking at `base_unittests`). This isn't | 
|  | enforced by bots, and it's currently not always true in Chromium's build -- but | 
|  | it's true for some binaries at least, and it's supposed to become more true | 
|  | over time. | 
|  |  | 
|  | Having deterministic builds is important, among other things, so that swarming | 
|  | can cache test results based on the hash of test inputs. | 
|  |  | 
|  | This document currently describes how to handle failures on the deterministic | 
|  | bots. | 
|  |  | 
|  | There's also | 
|  | https://www.chromium.org/developers/testing/isolated-testing/deterministic-builds; | 
|  | over time all documentation over there will move to here. | 
|  |  | 
|  | Handling failures on the deterministic bots | 
|  | ------------------------------------------- | 
|  |  | 
|  | This section describes what to do when `compare_build_artifacts` is failing on | 
|  | a bot. | 
|  |  | 
|  | The deterministic bots make sure that building the same revision of chromium | 
|  | always produces the same output. | 
|  |  | 
|  | To analyze the failing step, it's useful to understand what the step is doing. | 
|  |  | 
|  | There are two types of checks. | 
|  |  | 
|  | 1. The full determinism check makes sure that build artifacts are independent | 
|  | of the name of the build directory, and that full and incremental builds | 
|  | produce the same output. This is done by having bots that have two build | 
|  | directories: `out/Release` does incremental builds, and `out/Release.2` | 
|  | does full clobber builds. After doing the two builds, the bot checks | 
|  | that all built files needed to run tests on swarming are identical in the | 
|  | two build directories. The full determinism check is currently used on | 
|  | Linux and Windows bots. (`Deterministic Linux (dbg)` has one more check: | 
|  | it doesn't use reclient for the incremental build, to check that using | 
|  | reclient doesn't affect built files either.) | 
|  |  | 
|  | 2. The simple determinism check does a clobber build in `out/Release`, moves | 
|  | this to a different location (`out/Release.1`), then does another clobber | 
|  | build in `out/Release`, moves that to another location (`out/Release.2`), | 
|  | and then does the same comparison as done in the full build. Since both | 
|  | builds are done at the same path, and since both are clobber builds, | 
|  | this doesn't check that the build is independent of the name of the build | 
|  | directory, and it doesn't check that incremental and full builds produce | 
|  | the same results. This check is used on Android and macOS, but over time | 
|  | all platforms should move to the full determinism check. | 
|  |  | 
|  | ### Understanding `compare_build_artifacts` error output | 
|  |  | 
|  | `compare_build_artifacts` prints a list of all files it compares, followed by | 
|  | `": None`" for files that have no difference. Files that are different between | 
|  | the two build directories are followed by `": DIFFERENT(expected)"` or | 
|  | `": DIFFERENT(unexpected)"`, followed by e.g. `"different size: 195312640 != | 
|  | 195311616"` if the two files have different size, or by e.g. `"70 out of | 
|  | 5091840 bytes are different (0.00%)"` if they're the same size. | 
|  |  | 
|  | You can ignore lines that say `": None"` or `": DIFFERENT(expected)"`, these | 
|  | don't turn the step red. `": DIFFERENT(expected)"` is for files that are known | 
|  | to not yet be deterministic; these are listed in | 
|  | [`src/tools/determinism/deterministic_build_ignorelist.pyl`][1].  If the | 
|  | deterministic bots turn red, you usually do *not* want to add an entry to this | 
|  | list, but figure out what introduced the nondeterminism and revert that. | 
|  |  | 
|  | [1]: https://chromium.googlesource.com/chromium/src/+/HEAD/tools/determinism/deterministic_build_ignorelist.pyl | 
|  |  | 
|  | If only a few bytes are different, the script prints a diff of the hexdump | 
|  | of the two files. Most of the time, you can ignore this. | 
|  |  | 
|  | After this list of filenames, the script prints a summary that looks like | 
|  |  | 
|  | ``` | 
|  | Equals:           5454 | 
|  | Expected diffs:   3 | 
|  | Unexpected diffs: 60 | 
|  | Unexpected files with diffs: | 
|  | ``` | 
|  |  | 
|  | followed by a list of all files that contained `": DIFFERENT(unexpected)"`. | 
|  | This is the most interesting part of the output. | 
|  |  | 
|  | After that, the script tries to compute all build inputs of each file with | 
|  | a difference, and compares the inputs. For example, if a .exe is different, | 
|  | this will try to find all .obj files the .exe consists of, and try to compare | 
|  | these too. Nowadays, the compile step is usually deterministic, so this can | 
|  | usually be ignored too. Here's an example output: | 
|  |  | 
|  | ``` | 
|  | fixed_build_dir C:\b\s\w\ir\cache\builder\src\out\Release exists. will try to use orig dir. | 
|  | Checking verifier_test_dll_2.dll.pdb difference: (1 deps) | 
|  | ``` | 
|  |  | 
|  | ### Diagnosing bot redness | 
|  |  | 
|  | Things to do, in order of involvedness and effectiveness: | 
|  |  | 
|  | - Look at the list of files following `"Unexpected files with diffs:"` and check | 
|  | if they have something in common. If the blame list on the first red build | 
|  | has a change to that common thing, try reverting it and see if it helps. | 
|  | If many, seemingly unrelated files have differences, look for changes to | 
|  | the build config (Ctrl-F ".gn") or for toolchain changes (Ctrl-F "clang"). | 
|  |  | 
|  | - The deterministic bots try to upload a tar archive to Google Storage. | 
|  | Use `gsutil.py ls gs://chrome-determinism` to see available archives, | 
|  | and use e.g. `gsutil.py cp gs://chrome-determinism/Windows\ | 
|  | deterministic/9998/deterministic_build_diffs.tgz .` to copy one archive to | 
|  | your workstation. You can then look at the diffs in more detail. See | 
|  | https://bugs.chromium.org/p/chromium/issues/detail?id=985285#c6 for an | 
|  | example. | 
|  |  | 
|  | - Try to reproduce the problem locally. First, set up two build directories | 
|  | with identical args.gn. Then do a full build at the last known green revision | 
|  | in the first build directory: | 
|  |  | 
|  | ``` | 
|  | $ gn clean out/gn | 
|  | $ autoninja -C out/gn base_unittests | 
|  | ``` | 
|  |  | 
|  | Then, sync to the first bad revision (make sure to also run `gclient sync` | 
|  | to update dependencies), do an incremental build in the | 
|  | first build directory and a full build in the second build directory, and | 
|  | run `compare_build_artifacts.py` to compare the outputs: | 
|  |  | 
|  | ``` | 
|  | $ autoninja -C out/gn base_unittests | 
|  | $ gn clean out/gn2 | 
|  | $ autoninja -C out/gn2 base_unittests | 
|  | $ tools/determinism/compare_build_artifacts.py \ | 
|  | --first-build-dir out/gn \ | 
|  | --second-build-dir out/gn2 \ | 
|  | --target-platform linux | 
|  | ``` | 
|  |  | 
|  | This will hopefully reproduce the error, and then you can binary search | 
|  | between good and bad revisions to identify the bad commit. | 
|  |  | 
|  |  | 
|  | Things *not* to do: | 
|  |  | 
|  | - Don't clobber the deterministic bots. Clobbering a deterministic bot will | 
|  | turn it green if build nondeterminism is caused by incremental and full | 
|  | clobber builds producing different outputs. However, this is one of the | 
|  | things we want these bots to catch, and clobbering them only removes the | 
|  | symptom on this one bot -- all CQ bots will still have nondeterministic | 
|  | incremental builds, which is (among other things) bad for caching. So while | 
|  | clobbering a deterministic bot might make it green, it's papering over issues | 
|  | that the deterministic bots are supposed to catch. | 
|  |  | 
|  | - Don't add entries to `src/tools/determinism/deterministic_build_ignorelist.pyl`. | 
|  | Instead, try to revert commits introducing nondeterminism. |