This is a guide to writing pixel unit tests that verify the Ash UI. Ash pixel unit testing is image-based testing: it takes screenshots in test code and then compares the captured screenshots with benchmark images pixel by pixel. Therefore, Ash pixel unit testing can check UI features that are hard to verify through ordinary Ash unit tests, such as the appearance of a gradient shader.
Ash pixel unit testing is stable. With Skia Gold as the backend, it is straightforward to add or change benchmark images, and developers can use the Skia Gold UI to inspect the pixel differences in a failed test run.
This section teaches how to add a simple test that verifies widgets on the primary screen. The code below can be found here. If you are unfamiliar with Chrome testing, read this doc first.
class DemoAshPixelDiffTest : public AshTestBase {
 public:
  // AshTestBase:
  absl::optional<pixel_test::InitParams> CreatePixelTestInitParams()
      const override {
    return pixel_test::InitParams();
  }

  // … unrelated code
};

// Create top level widgets at corners of the primary display. Check the
// screenshot on these widgets.
TEST_F(DemoAshPixelDiffTest, VerifyTopLevelWidgets) {
  auto widget1 = …
  auto widget2 = …
  auto widget3 = …
  auto widget4 = …

  EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnPrimaryScreen(
      "check_widgets",
      /*revision_number=*/0,
      widget1.get(),
      widget2.get(),
      widget3.get(),
      widget4.get()));
}
DemoAshPixelDiffTest is a subclass of AshTestBase, just like ordinary Ash unit tests, with one difference: DemoAshPixelDiffTest overrides CreatePixelTestInitParams() to return pixel test initialization params. When pixel test init params are present, an AshPixelDiffer instance is built during test setup. AshPixelDiffer is a wrapper around the Skia Gold APIs that capture screenshots, upload them to the Skia Gold server, and return pixel comparison results. AshTestBase exposes the AshPixelDiffer instance through GetPixelDiffer().
The sample code's test body adds four widgets and then checks them by calling CompareUiComponentsOnPrimaryScreen(), an API provided by AshPixelDiffer that compares the pixels of the given UI components (such as views::View, views::Widget and aura::Window) with those of the benchmark image. Section 3.3 gives more details on this and other APIs.
The build target of ash pixel unit tests is ash_pixeltests. A sample command to build the tests:
.../chrome/src $ autoninja -C out/debug ash_pixeltests
The command to run the sample pixel unit test:
.../chrome/src $ out/debug/ash_pixeltests --gtest_filter=DemoAshPixelDiffTest.VerifyTopLevelWidgets
Options of running pixel tests:
--skia-gold-local-png-write-directory=DIR: this option specifies a directory in which to save the screenshots captured during pixel testing. DIR must be an absolute path to an existing directory; a relative path does not work. The saved screenshots' names follow the rule described in Section 2.4. Note that screenshots generated by local runs can differ slightly from those generated by CQ runs due to hardware differences.
--bypass-skia-gold-functionality: when this option is given, the image comparison functions such as AshPixelDiffTestHelper::ComparePrimaryFullScreen() always return true. This option is usually used along with --skia-gold-local-png-write-directory when comparing against a benchmark is not needed, e.g. when a user is developing a new test case and the benchmark image does not exist yet. A sample command combining both options is shown below.
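For example, the two options can be combined while developing a new test case so that screenshots are written locally without comparing against a benchmark. (The directory /tmp/ash_pixel_pngs below is only an illustrative placeholder; use any existing absolute path.)

.../chrome/src $ out/debug/ash_pixeltests \
    --gtest_filter=DemoAshPixelDiffTest.VerifyTopLevelWidgets \
    --skia-gold-local-png-write-directory=/tmp/ash_pixel_pngs \
    --bypass-skia-gold-functionality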
Developers do not need to do any extra work to add benchmarks beyond writing the pixel test code. (NOTE: approving benchmarks through Gold digests, as mentioned in the old user guide doc, is no longer required.) When a CL that contains a new pixel test case is merged, the corresponding new benchmarks are generated automatically.
Developers can follow Section 4.2 to preview the benchmarks generated by CQ runs before CL merge.
All committed benchmarks are listed in this link. Each benchmark's name follows this rule: {Test Suite Name}.{Test Case Name}.{Screenshot Name}.{Platform Suffix}, where (using the sample code as an example):
{Test Suite Name} is the test suite name (DemoAshPixelDiffTest);
{Test Case Name} is the test case name (VerifyTopLevelWidgets);
{Screenshot Name} is the screenshot name passed to the comparison API plus a revision suffix (check_widgets.rev_0);
{Platform Suffix} identifies the platform (ash).
Therefore, the full name of the benchmark image added by the sample code is DemoAshPixelDiffTest.VerifyTopLevelWidgets.check_widgets.rev_0.ash.
In a parameterized test, the "/" introduced by the TEST_P machinery is replaced by "." since a slash would lead to an illegal file path. Take the following code as an example:
INSTANTIATE_TEST_SUITE_P(RTL, AppListViewPixelRTLTest, testing::Bool());

TEST_P(AppListViewPixelRTLTest, Basics) {
  // … unrelated code
  EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnPrimaryScreen(
      "bubble_launcher_basics",
      /*revision_number=*/0,
      …);
}
The names of the committed screenshots are:
RTL.AppListViewPixelRTLTest.Basics.0.bubble_launcher_basics.rev_0.ash
RTL.AppListViewPixelRTLTest.Basics.1.bubble_launcher_basics.rev_0.ash
Updating benchmarks refers to updating the benchmarks of existing pixel tests. This happens, for example, when a CL under development changes a product feature and therefore breaks the corresponding pixel tests.
To update a benchmark, a developer should:
Find the broken test code and locate the code line that generates this benchmark.
Increase the revision number by one in code.
For example, the original test code is
EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnPrimaryScreen(
    "check_widgets",
    /*revision_number=*/1,
    widget1.get(),
    widget2.get(),
    widget3.get(),
    widget4.get()));
Then the code after the change is
EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnPrimaryScreen(
    "check_widgets",
    /*revision_number=*/2,
    widget1.get(),
    widget2.get(),
    widget3.get(),
    widget4.get()));
The benchmark image will be updated when the CL is merged. The updated benchmark can still be previewed by following the procedure in Section 4.2.
Read Ash pixel test failure triage for more information.
You can customize the pixel_test::InitParams structure. For example, the code below creates a pixel test that verifies the right-to-left (RTL) UI layout:
class DemoRTLTest : public AshTestBase {
 public:
  // AshTestBase:
  absl::optional<pixel_test::InitParams> CreatePixelTestInitParams()
      const override {
    pixel_test::InitParams init_params;
    init_params.under_rtl = true;
    return init_params;
  }

  // … unrelated code
};
Use AshPixelDiffer::CompareUiComponentsOnPrimaryScreen() to get the result of a pixel comparison between a screenshot of the primary display that is taken when the test runs and a previously-verified benchmark image.
The first parameter is the screenshot’s name.
The second parameter is the revision number. Use 0 as the initial revision number when adding a new benchmark, and increase it by one in the test code after each benchmark update. Section 5.1 explains why we should do this.
Besides the screenshot name string and the revision number, this function accepts any number of views::View pointers, aura::Window pointers and/or views::Widget pointers. In the screenshot taken by this API, only the pixels within the screen bounds of the objects referred to by the given pointers are visible; note that the screenshot is always the size of the primary display. The benchmark image generated by the sample code therefore shows only the widgets at the corners, while the remaining area is blacked out (see Fig 1).
Here is another example that compares the pixels within the app list bubble view and the shelf navigation widget:
// Verifies the app list view under the clamshell mode.
TEST_P(AppListViewPixelRTLTest, Basics) {
  // …
  EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnPrimaryScreen(
      "bubble_launcher_basics",
      /*revision_number=*/0,
      GetAppListTestHelper()->GetBubbleView(),
      GetPrimaryShelf()->navigation_widget()));
}
Its benchmark image is shown in Fig 1.
Fig 1: benchmark image generated by CompareUiComponentsOnPrimaryScreen()
AshPixelDiffer::CompareUiComponentsOnSecondaryScreen() is identical to AshPixelDiffer::CompareUiComponentsOnPrimaryScreen() except that it takes a screenshot of the secondary display rather than the primary display. Note that exactly two displays must be present to use this API.
Here is an example usage:
// Tests the UI of the notification center tray on a secondary display.
TEST_F(NotificationCenterTrayPixelTest,
       NotificationTrayOnSecondaryDisplayWithTwoNotificationIcons) {
  // …

  // Add a secondary display.
  UpdateDisplay("800x799,800x799");

  // Check the UI of the notification center tray on the secondary display.
  EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnSecondaryScreen(
      "check_view",
      /*revision_number=*/0,
      test_api()->GetTrayOnDisplay(display_manager()->GetDisplayAt(1).id())));
}
Use AshPixelDiffer::CompareUiComponentsOnRootWindow() to get the result of a pixel comparison between a screenshot of the specified root window (not necessarily the primary display's root window) that is taken when the test runs and a previously-verified benchmark image.
This API is nearly identical to AshPixelDiffer::CompareUiComponentsOnPrimaryScreen() except that it takes an additional first parameter: the root window of which the screenshot will be taken. The remaining parameters are the same.
Here is an example usage (note that this example is just a slightly more cumbersome version of the previous example for AshPixelDiffer::CompareUiComponentsOnSecondaryScreen()):
// Tests the UI of the notification center tray on a secondary display.
TEST_F(NotificationCenterTrayPixelTest,
       NotificationTrayOnSecondaryDisplayWithTwoNotificationIcons) {
  // …

  // Add a secondary display.
  UpdateDisplay("800x799,800x799");

  const display::Display& display =
      display::test::DisplayManagerTestApi(Shell::Get()->display_manager())
          .GetSecondaryDisplay();
  aura::Window* root_window = Shell::GetRootWindowForDisplayId(display.id());

  // Check the UI of the notification center tray on the secondary display.
  EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnRootWindow(
      root_window,
      "check_view",
      /*revision_number=*/0,
      test_api()->GetTrayOnDisplay(display.id())));
}
If a screenshot is unstable, its associated pixel test could be flaky. To ensure a stable UI, the pixel test performs several stabilization steps during setup.
Despite this, there are still some factors leading to flaky pixel tests. Some common flakiness sources are listed below:
Test writers should ensure that the UI is stable when taking screenshots.
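As a minimal, hedged sketch of one such precaution (ui::ScopedAnimationDurationScaleMode is a generic ui/compositor test helper rather than anything specific to the pixel test framework, and the widget setup below is only illustrative), a test can force layer animations to finish immediately so the UI has settled before the screenshot is taken:

// A sketch: stabilize the UI by making implicit layer animations complete
// immediately before the screenshot is captured.
TEST_F(DemoAshPixelDiffTest, VerifyStableWidget) {
  // Zero-length animations: the UI jumps straight to its final state.
  ui::ScopedAnimationDurationScaleMode zero_duration(
      ui::ScopedAnimationDurationScaleMode::ZERO_DURATION);

  // Create a widget to screenshot (CreateTestWidget() is an AshTestBase
  // helper; any stable UI under test works here).
  std::unique_ptr<views::Widget> widget = CreateTestWidget();

  EXPECT_TRUE(GetPixelDiffer()->CompareUiComponentsOnPrimaryScreen(
      "check_stable_widget", /*revision_number=*/0, widget.get()));
}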
A developer can preview the benchmarks generated by CQ runs before CL merge through the following steps:
Run linux-chromeos-rel and wait until ash_pixeltests completes.
Left-click the Gold UI element that shows "CL added at least one new image" (demonstrated below) to jump to the Skia Gold website.
Fig 2: Gold UI example
Fig 3: Triage log icon example
Fig 4: Digest link example
Fig 5: The generated benchmark example
Developers are encouraged to use the script check_pixel_test_flakiness.py to detect the flakiness in newly created pixel tests before landing their CLs.
This script detects flakiness by running the specified pixel test executable for multiple iterations and comparing the screenshots generated by neighboring iterations via file hashes.
A sample usage with the demo pixel test:
./tools/pixel_test/check_pixel_test_flakiness.py \
    --gtest_filter=DemoAshPixelDiffTest.VerifyTopLevelWidgets \
    --test_target=out/debug/ash_pixeltests \
    --root_dir=../.. \
    --output_dir=var
This command verifies DemoAshPixelDiffTest.VerifyTopLevelWidgets. If it detects flakiness, the screenshots generated by the pixel test are saved under a directory called "var" located in the same directory as the Chromium project directory (i.e. a file path like .../chromium/../var).
Please read the comment in the script for further details.
TakePrimaryDisplayScreenshotAndSave() is a debugging helper function that takes a screenshot of the full screen. You can use it for debugging even in ordinary (non-pixel) ash unit tests.
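A minimal sketch of how it might be called (the base::FilePath destination argument and the test fixture are assumptions; check the helper's declaration in the ash test support code for its exact signature):

// Hypothetical debugging use: dump the primary display to a PNG so the UI
// state can be inspected while investigating an ordinary ash unit test.
TEST_F(MyAshUnitTest, DumpScreenForDebugging) {
  // … drive the UI into the state to inspect …
  TakePrimaryDisplayScreenshotAndSave(
      base::FilePath("/tmp/ash_debug_screenshot.png"));
}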
Skia Gold does not map the set of benchmark images to code revisions. In other words, Skia Gold is not branch-based (read Section 5.5 for more details).
To handle branches in Ash pixel tests (such as CL reverting and cherry-picking), developers are required to set the benchmark version number in the test code. Please use “0” as the first version number when adding a new benchmark. After each benchmark update, increase the version number by one in the test code.
The following scenarios explain how this method works. Assume a pixel test called Foo whose current benchmark is "Foo.rev_0", and a CL that changes the UI and bumps the revision number so that the new benchmark is "Foo.rev_1".
Scenario 1: Land the CL on the main branch. After landing, the screenshots generated by Foo in CQ runs are compared with “Foo.rev_1”, which is expected.
Scenario 2: Revert the CL on the main branch. After reverting, the screenshots generated by Foo in CQ runs are compared with “Foo.rev_0”, which is expected.
Scenario 3: Similar to Scenario 1 but Foo also runs in the CQ of an old branch. After landing, the screenshots generated by the old branch CQ runs are compared with “Foo.rev_0” while the screenshots from the main branch CQ runs are compared with “Foo.rev_1”. All these behaviors are expected.
Scenario 4: Continue with Scenario 3 but also cherry-pick this CL into an old branch. After cherry-picking, the screenshots generated by the old branch CQ runs are compared with “Foo.rev_1”, which is expected.
The red box (the Skia Gold UI) is a feature created and maintained by the Skia Gold team. Ash pixel testing is not its first user; there are already many pixel tests on Windows and Android. It is a known issue that the Skia Gold UI may show up due to flaky tests even when those flaky tests do not actually block a CL from landing. One way to check whether your CL breaks any pixel test is to click on the red box: if the untriaged digest does not show any image, your CL should be good. Also, if your CL can land, your CL is good.
If your CL has landed, your CL is good: the CQ should reject a CL that breaks pixel tests. See the answer to question 5.2 for more information.
Benchmarks of Ash pixel tests cannot be removed manually. However, if a benchmark has not been matched in the most recent 2000 CQ test runs, it is removed from the Skia Gold server automatically.
Skia Gold is not branch-based; in other words, Skia Gold is not aware of branches. The following example explains this.
Let’s say there is a pixel test called Foo whose benchmark image is image A. There is an incoming CL that updates product code. With this CL, Foo generates a different screenshot denoted by image B.
Scenario 1: The CL is landed on the main branch. In this scenario, Skia Gold treats both A and B as the valid benchmarks of test Foo. Therefore any CL that generates a screenshot identical to either image A or image B in CQ runs passes test Foo.
Scenario 2: The CL is landed and then gets reverted. In this scenario, Skia Gold treats both A and B as valid benchmarks of test Foo, the same as in Scenario 1. This is why it is important to change benchmark names in tests after each update.