| # GPU Pixel Testing With Gold |
| |
| This page describes various extra details of the Skia Gold service |
| that the GPU pixel tests use. For information on running the tests locally, see |
| [this section][local pixel testing]. For common information on triaging, |
| modification, or general pixel wrangling, see [GPU Pixel Wrangling] or these |
| sections ([1][pixel debugging], [2][pixel updating]) of the general GPU testing |
| documentation. |
| |
| [local pixel testing]: gpu_testing.md#Running-the-pixel-tests-locally |
| [GPU Pixel Wrangling]: pixel_wrangling.md |
| [pixel debugging]: gpu_testing.md#Debugging-Pixel-Test-Failures-on-the-GPU-Bots |
| [pixel updating]: gpu_testing.md#Updating-and-Adding-New-Pixel-Tests-to-the-GPU-Bots |
| |
| [TOC] |
| |
| ## Skia Gold |
| |
| [Gold][gold documentation] is an image diff service developed by the Skia team. |
| It was originally developed solely for Skia's usage and only supported |
| post-submit tests, but has been picked up by other projects such as Chromium and |
| PDFium and now supports trybots. Unlike other image diff solutions in Chromium, |
| comparisons are done in an external service instead of locally on the testing |
| machine. |
| |
| [gold documentation]: https://skia.org/dev/testing/skiagold |
| |
| ### Why Gold |
| |
| Gold has three main advantages over the traditional local image comparison |
| historically used by Chromium: |
| |
| 1. Triage time can be much lower. Because triaging is handled by an external |
| service, new golden images don't need to go through the CQ and wait for |
| waterfall bots to pick up the CL. Once an image is triaged in Gold, it |
| becomes immediately available for future test runs. |
| 2. Gold supports multiple approved images per test. It is not uncommon for |
| tests to produce images that are visually indistinguishable, but differ in |
| a handful of pixels by a small RGB value. Fuzzy image diffing can solve this |
| problem, but introduces its own set of issues such as possibly causing a test |
| to erroneously pass. Since most tests that exhibit this behavior only actually |
| produce 2 or 3 possible valid images, being able to say that any of those |
| images is acceptable is simpler and less error-prone. |
| 3. Better image storage. Traditionally, images had to either be included |
| directly in the repository or uploaded to a Google Storage bucket and pulled in |
| using the image's hash. The former allowed users to easily see which images were |
| currently approved, but storing large or numerous binary files in git is |
| generally discouraged due to the way git's history works. The latter worked |
| around the git issues, but made it much more difficult to actually see what was |
| being used since the only thing the user had to go on was a hash. Gold moves the |
| images out of the repository, but provides a GUI for easily seeing |
| which images are currently approved for a particular test. |
| |
| ### How It Works |
| |
| Gold consists of two main parts: the Gold instance/service and the `goldctl` |
| binary. A Gold instance in turn consists of two parts: a Google Storage bucket |
| that data is uploaded to and a server running on GCE that ingests the data and |
| provides a way to triage diffs. `goldctl` simply provides a standardized way |
| of interacting with Gold - uploading data to the correct place, retrieving |
| baselines/golden information, etc. |
| |
| In general, the following order of events occurs when running a Gold-enabled |
| test: |
| |
| 1. The test produces an image and passes it to `goldctl`, along with some |
| information about the hardware and software configuration that the image was |
| produced on, the test name, etc. |
| 2. `goldctl` checks whether the hash of the produced image is in the list of |
| approved hashes. |
|     1. If it is, `goldctl` exits with a non-failing return code and nothing else |
|     happens. At this point, the test is finished. |
|     2. If it is not, `goldctl` uploads the image and metadata to the storage |
|     bucket and exits with a failing return code. |
| 3. The server sees the new data in the bucket and ingests it, showing a new |
| untriaged image in the GUI. |
| 4. A user approves the new image in the GUI, and the server adds the image's |
| hash to the baselines. See the [Waterfall Bots](#Waterfall-Bots) and |
| [Trybots](#Trybots) sections for specifics on this. |
| 5. The next time the test is run, the new image is in the baselines, and |
| assuming the test produces the same image again, the test passes. |
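
The order of events above can be modelled with a small, self-contained sketch. This is not how `goldctl` is implemented; the function names, the in-memory `approved` and `uploaded` structures, and the use of SHA-256 as the image hash are all assumptions made purely for illustration.

```python
import hashlib


def run_gold_check(image_bytes, test_name, approved, uploaded):
    """Toy model of steps 1-2: pass only if the image's hash is approved.

    `approved` maps test name -> set of approved hashes (the baselines).
    `uploaded` stands in for the storage bucket. Both are hypothetical.
    """
    # SHA-256 is a stand-in; the hash Gold actually uses is not stated here.
    digest = hashlib.sha256(image_bytes).hexdigest()
    if digest in approved.get(test_name, set()):
        return True  # non-failing return code, nothing is uploaded
    uploaded.append((test_name, digest))  # "upload" image and metadata
    return False  # failing return code


def triage_positive(test_name, digest, approved):
    """Toy model of step 4: a user approves the image in the GUI."""
    approved.setdefault(test_name, set()).add(digest)


approved = {"FooTest": set()}
uploaded = []
image = b"fake image bytes"

first = run_gold_check(image, "FooTest", approved, uploaded)   # new image
triage_positive(*uploaded[0], approved)                        # approve it
second = run_gold_check(image, "FooTest", approved, uploaded)  # same image
```

In this model the first run fails and records one upload, and the second run passes once the hash has been approved, which mirrors steps 2 through 5.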
| |
| While this is the general order of events, there are several differences between |
| waterfall/CI bots and trybots. |
| |
| #### Waterfall Bots |
| |
| Waterfall bots are the simpler of the two bot types. There is only a single |
| set of baselines to worry about: whatever baselines were approved for a given |
| git revision. Additionally, any new images that are produced on waterfalls are |
| all lumped into the same group of "untriaged images on master", and any images |
| that are approved from here will immediately be added to the set of baselines |
| for master. |
| |
| Since not all waterfall bots have a trybot counterpart that can be relied upon |
| to catch newly produced images before a CL is committed, it is likely that a |
| change that produces new goldens on the CQ will end up making some of the |
| waterfall bots red for a bit, particularly those on chromium.gpu.fyi. They will |
| remain red until the new images are triaged as positive or the tests stop |
| producing the untriaged images. So, it is best to keep an eye out for a few |
| hours after your CL is committed for any new images from the waterfall bots that |
| need triaging. |
| |
| #### Trybots |
| |
| Trybots are a little more complicated when it comes to retrieving and approving |
| images. First, the set of baselines that are provided when requested by a test |
| is the union of the master baselines for the current revision and any baselines |
| that are unique to the CL. For example, if an image with the hash `abcd` is in |
| the master baselines for `FooTest` and the CL being tested has also approved |
| an image with the hash `abef` for `FooTest`, then the provided baselines will |
| contain both `abcd` and `abef` for `FooTest`. |
| |
| When an image associated with a CL is approved, the approval only applies to |
| that CL until the CL is merged. Once this happens, any baselines produced by the |
| CL are automatically merged into the master baselines for whatever git revision |
| the CL was merged as. In the above example, if the CL was merged as commit |
| `ffff`, then both `abcd` and `abef` would be approved images on master from |
| `ffff` onward. |
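
The `FooTest` example can be written out as set operations. This is a minimal sketch under stated assumptions: the dictionaries and function names are hypothetical, and it ignores the per-revision tracking that the real service performs.

```python
def baselines_for_trybot(master, cl_only):
    """Return the union of master baselines and CL-specific baselines."""
    tests = master.keys() | cl_only.keys()
    return {t: master.get(t, set()) | cl_only.get(t, set()) for t in tests}


def land_cl(master, cl_only):
    """Fold the CL's approvals into master once the CL is merged."""
    for test, digests in cl_only.items():
        master.setdefault(test, set()).update(digests)


master = {"FooTest": {"abcd"}}   # approved on master
cl_only = {"FooTest": {"abef"}}  # approved only for the CL under test

provided = baselines_for_trybot(master, cl_only)
master_before = {t: set(d) for t, d in master.items()}  # snapshot pre-merge

land_cl(master, cl_only)  # the CL lands, e.g. as commit `ffff`
```

Before the merge, `provided` contains both hashes while `master_before` contains only `abcd`; after `land_cl`, master contains both.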
| |
| ## Triaging Less Common Failures |
| |
| ### Triaging Images Without A Specific Build |
| |
| You can see all untriaged images that are currently being produced on ToT on |
| the [GPU Gold instance's main page][gpu gold instance]. To see the untriaged |
| images for a CL, substitute the Gerrit CL number into |
| `https://chrome-gpu-gold.skia.org/search?issue=[CL Number]&unt=true&master=true`. |
| |
| [gpu gold instance]: https://chrome-gpu-gold.skia.org |
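
If you build these links often, a small helper can do the substitution. The sketch below only reproduces the URL pattern shown above; the function name and the CL number are made up for illustration.

```python
from urllib.parse import urlencode

GOLD_INSTANCE = "https://chrome-gpu-gold.skia.org"


def untriaged_for_cl_url(cl_number):
    """Build the search URL for untriaged images on a Gerrit CL."""
    query = urlencode({"issue": cl_number, "unt": "true", "master": "true"})
    return f"{GOLD_INSTANCE}/search?{query}"


url = untriaged_for_cl_url(1234567)  # hypothetical CL number
```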
| |
| It's possible, particularly if a test is regularly producing multiple images, |
| for an image to be untriaged but not show up on the front page of the Gold |
| instance (for details, see [this crbug comment][untriaged non tot comment]). To |
| see all such images, visit [this link][untriaged non tot]. |
| |
| [untriaged non tot comment]: https://bugs.chromium.org/p/skia/issues/detail?id=9189#c4 |
| [untriaged non tot]: https://chrome-gpu-gold.skia.org/search?fdiffmax=-1&fref=false&frgbamax=255&frgbamin=0&head=false&include=false&limit=50&master=false&match=name&metric=combined&neg=false&offset=0&pos=false&query=source_type%3Dchrome-gpu&sort=desc&unt=true |
| |
| ### Finding A Failed Build |
| |
| If for some reason you know that a test run produced a bad image, but do not |
| have a direct link to the failed build (e.g. you found a bad image using the |
| untriaged non-ToT link from above), you may want to find the failed Swarming |
| task to help debug the issue. Gold currently provides a list of CLs that were |
| under test when a particular image was produced, but does not provide a link to |
| the build that produced it, so the following workaround can be used. |
| |
| Assuming the failure is relatively recent (within the past week or so), you can |
| use the flakiness dashboard to help find the failed run. To do so, substitute |
| the test name into |
| `https://test-results.appspot.com/dashboards/flakiness_dashboard.html#showAllRuns=true&testType=pixel_skia_gold_test&tests=[test_name]` |
| and scroll through the history until you find the failed build (represented by |
| a red square). Click on the build and follow the `Build log` link. This will |
| take you to the failed build, from which you can get to the Swarming task like |
| normal by scrolling to the failed step and clicking on the link for the failed |
| shard number. |
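
The dashboard link can be generated in the same spirit. This sketch assumes only the URL pattern quoted above; the helper name and the test name `Pixel_FooTest` are hypothetical.

```python
from urllib.parse import urlencode

DASHBOARD = (
    "https://test-results.appspot.com/dashboards/flakiness_dashboard.html"
)


def flakiness_url(test_name):
    """Build the flakiness dashboard URL for a pixel test."""
    # The dashboard reads its parameters from the URL fragment (after '#').
    fragment = urlencode({
        "showAllRuns": "true",
        "testType": "pixel_skia_gold_test",
        "tests": test_name,
    })
    return f"{DASHBOARD}#{fragment}"


dashboard_url = flakiness_url("Pixel_FooTest")  # hypothetical test name
```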
| |
| ### Triaging A Specific Image |
| |
| If for some reason an image is not showing up in Gold but you know the hash, you |
| can manually navigate to its page by substituting the test name and hash into |
| `https://chrome-gpu-gold.skia.org/detail?test=[test_name]&digest=[hash]`. |
| From there, you should be able to triage it as normal. |
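
As a convenience, the detail URL can be assembled programmatically. The helper below is a sketch that follows the pattern above; its name, the test name, and the hash value are placeholders rather than real data.

```python
from urllib.parse import urlencode


def image_detail_url(test_name, digest):
    """Build the Gold detail page URL for one image of one test."""
    query = urlencode({"test": test_name, "digest": digest})
    return f"https://chrome-gpu-gold.skia.org/detail?{query}"


detail_url = image_detail_url("Pixel_FooTest", "abcd")  # placeholder values
```

Using `urlencode` rather than string concatenation means any characters in the test name that are not URL-safe get escaped.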
| |
| If this happens, please also file a bug in [Skia's bug tracker][skia crbug] so |
| that the root cause can be investigated and fixed. It's likely that you will |
| be unable to edit the owner, CC list, etc. directly, in which case |
| ping kjlubick@ with a link to the filed bug to help speed up triaging. Include |
| as much detail as possible, such as links to the failed Swarming task and |
| the triage link for the problematic image. |
| |
| [skia crbug]: https://bugs.chromium.org/p/skia |
| |
| ## Working On Gold |
| |
| ### Modifying Gold And goldctl |
| |
| Although uncommon, changes to the Gold service and `goldctl` binary may be |
| needed. To do so, simply get a checkout of the |
| [Skia infrastructure repo][skia infra repo] and go through the same steps as |
| a Chromium CL (`git cl upload`, etc.). |
| |
| [skia infra repo]: https://skia.googlesource.com/buildbot/ |
| |
| The Gold service code is located in the `//golden/` directory, while `goldctl` |
| is located in `//gold-client/`. Once your change is merged, you will have to |
| either contact kjlubick@google.com to roll the service version or follow the |
| steps in [Rolling goldctl](#Rolling-goldctl) to roll the `goldctl` version used |
| by Chromium. |
| |
| ### Rolling goldctl |
| |
| `goldctl` is available as a CIPD package and is DEPSed in as part of `gclient |
| sync`. To update the binary used in Chromium, perform the following steps: |
| |
| 1. (One-time only) get an [infra checkout][infra repo] |
| 2. Run `infra $ ./go/env.py` and run each of the commands it outputs to change |
| your GOPATH and other environment variables for your terminal |
| 3. Update the Skia revision in [`deps.yaml`][deps yaml] to match the revision |
| of your `goldctl` CL (or some revision after it) |
| 4. Run `infra $ ./go/deps.py update` and copy the list of repos it updated |
| (include this in your CL description) |
| 5. Upload the changelist ([sample CL][sample roll cl]) |
| 6. Once the CL is merged, wait until the git revision of your merged CL shows |
| up in the tags of one of the instances for the [CIPD package][goldctl package] |
| 7. Update the [revision in DEPS][goldctl deps entry] to be the merged CL's |
| revision |
| |
| [infra repo]: https://chromium.googlesource.com/infra/infra/ |
| [deps yaml]: https://chromium.googlesource.com/infra/infra/+/91333d832a4d871b4219580dfb874b49a97e6da4/go/deps.yaml#432 |
| [sample roll cl]: https://chromium-review.googlesource.com/c/infra/infra/+/1493426 |
| [goldctl package]: https://chrome-infra-packages.appspot.com/p/skia/tools/goldctl/linux-amd64/+/ |
| [goldctl deps entry]: https://chromium.googlesource.com/chromium/src/+/6b7213a45382f01ac0a2efec1015545bd051da89/DEPS#1304 |
| |
| If you want to make sure that `goldctl` builds after the update before |
| committing (e.g. to ensure that no extra third party dependencies were added), |
| run the following after the `./go/deps.py update` step: |
| |
| 1. `infra $ ./go/deps.py install` |
| 2. `infra $ go install go.skia.org/infra/gold-client/cmd/goldctl` |
| 3. `infra $ which goldctl`, which should point to a binary in `infra/go/bin/` |
| 4. `infra $ goldctl` to make sure it actually runs |