Desktop Web App Integration Testing Framework

Background

The WebAppProvider system has a very wide action space, and testing the state interactions between all of its subsystems is difficult. The integration testing framework is intended to help test critical user journeys for installable web apps in a scalable fashion.

The framework and process is broken down into the following pieces:

  1. A list of critical user journeys and the actions used to make those journeys.
  2. A script that can process these (along with information about which action is supported by which platform).
  3. Tests generated by the script that use the WebAppIntegrationTestDriver to execute actions.
  4. Coverage information about what percentage of the critical user journeys are covered.

When to add integration tests?

Any web app feature (or any code in general) should have a combination of unit tests and browser tests that focus on testing the specific feature itself. Unit tests are the least likely to become flaky, and allow fine-grained testing of a system. Browser tests are more likely to be flaky but enable testing with most of the system running. Regular unit tests or browser tests should be used for testing system parts in isolation and for testing minute details like handling of error cases.

Integration tests are required for all critical user journeys for the dPWA (or installable web app) product. If a feature is to be considered “supported” on the dPWA platform, then it MUST have its critical user journeys described and integration tests generated for those journeys.

Future Work

  • Change arguments from string values (e.g. "SiteA") to enumeration values (e.g. Site::kSiteA).
  • Making test generation friendlier for the developer.
    • Detecting incorrect tests.
    • Modifying tests automatically in common scenarios.

Terminology

Action

A primitive test operation, or test building block, that can be used to create coverage tests. The integration test framework may or may not support an action on a given platform. Actions can fall into one of three types:

  • State-change action
  • State-check action
  • Parameterized action

Actions can also be fully, partially, or not supported on a given platform. This information is used when generating the tests to run and the coverage report for a given platform. To make parsing easier, actions are always snake_case.

State-change Action

A state-change action is expected to change the state of Chrome or the web app provider system.

Examples: navigate_browser(SiteA), switch_incognito_profile, sync_turn_off, set_app_badge

State-check Action

Some actions are classified as “state check” actions, which means they do not change any state and only inspect the state of the system. In graph representations, state check actions are not given nodes, and instead live under the last non-state-check action.

All actions that start with check_ are considered state-check actions.

Examples: check_app_list_empty, check_install_icon_shown, check_platform_shortcut_exists(SiteA), check_tab_created

Action Arguments

When creating tests, a common scenario emerged where a given action could be applied to multiple different sites. For example, the “navigate the browser to an installable site” action was more useful if the “site” could be customized.

To accept arguments, list the argument types you wish to accept in the “Argument Types” column in the actions file. If a required argument type does not exist, please add it to that file.

To allow for future de-parsing when generating C++ tests, argument values (modes) are always PascalCase.

Default argument values

Each enumeration can specify a “default” enumeration value that is used if a test does not specify an argument value. The default is marked with a * character (for example, marking SiteA with a * would make it the default Site value).

Parameterized Action

To help with testing scenarios like the one outlined above, an action can be defined that references (or ‘turns into’) a set of non-parameterized actions. For example, an action install_windowed can be created that references the set of actions install_omnibox_icon, install_menu_option, install_create_shortcut_windowed, add_policy_app_windowed_shortcuts, and add_policy_app_windowed_no_shortcuts. When a test case includes this action, it will generate multiple tests, in each of which the parameterized action is replaced with one of the non-parameterized actions.

Arguments & Argument Forwarding

All output actions must have their arguments fully specified; defaults are not respected. This avoids implementation complexity and mistakes.

To “forward” arguments from the parent parameterized action to the output actions, you can use the bash-style argument syntax.

Example output actions for install_windowed, which has a single Site argument:

  • install_omnibox_icon($1),
  • install_create_shortcut_windowed($1),
  • add_policy_app_windowed_shortcuts($1),
  • etc

These actions use the first argument of the parent action as their argument.
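
As a purely illustrative example (using only action names mentioned above), a hypothetical test written as

  install_windowed(SiteA), check_platform_shortcut_exists(SiteA)

would be expanded into one test per output action, with the parameterized action replaced and the rest of the test unchanged:

  install_omnibox_icon(SiteA), check_platform_shortcut_exists(SiteA)
  install_create_shortcut_windowed(SiteA), check_platform_shortcut_exists(SiteA)
  add_policy_app_windowed_shortcuts(SiteA), check_platform_shortcut_exists(SiteA)
  ...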

Tests

A sequence of actions used to test the WebAppProvider system. A test that can be run by the test framework must not have any “parameterized” actions, as these are supposed to be used to generate multiple tests.

Unprocessed Required-coverage tests

This is the set of tests that, if all executed, should provide full test coverage for the WebAppProvider system. They currently live in this file as “unprocessed”.

Required-coverage tests (processed)

Processed tests go through the following steps from the unprocessed version in the file:

  • Tests with one or more “parameterized” actions have been processed to produce the resulting tests without parameterized actions.
  • Actions in tests that have arguments but do not specify them have the default argument added to them. Default arguments are known only if all argument types have a default value specified.

Platform-specific tests

The first column of the test specifies which platforms the test should be created for:

  • W = Windows
  • M = Mac
  • L = Linux
  • C = ChromeOS

Some tests need to be platform-specific. For example, all tests that involve “locally installing” an app are only applicable on Windows/Mac/Linux, as ChromeOS automatically locally installs all apps from sync. Because of this, tests must be able to specify which platforms they should run on.

Sync Partition and Default Partition

Due to some browsertest support limitations, certain actions are only supported in the sync testing framework. Because of this, the script supports a separate “partition” of tests for any test that uses sync actions. This means that at test output time, a test will either go in the “sync” partition or the “default” partition.

See the sync tests design doc for more information.

Script Design & Usage

The script takes the following information:

  • A list of action-based tests which fully test the WebAppProvider system (a.k.a. required-coverage tests).
  • A list of actions supported by the integration test framework (per-platform).

The results of running the script are:

  • Console output (to stdout) of the minimal number of tests (per-platform) to run to achieve the maximum coverage of the critical user journeys.
    • If tests already exist, then these are taken into account and not printed.
    • If any existing tests are unnecessary, then the script will inform the developer that they can be removed.
  • The resulting coverage of the system (with required-coverage tests as 100%).

See the design doc for more information and links.

Downloading test data

The test data is hosted in this spreadsheet. To download the latest copy of the data, run the included script:

./chrome/test/webapps/download_data_from_sheet.py

This will download the data from the sheet into csv files in the data/ directory:

  • actions.csv: This describes all actions that can be used in the required-coverage tests (processed or unprocessed).
  • coverage_required.csv: This is the full list of all tests needed to fully cover the Web App system. The first column specifies the platforms for testing, and the test starts on the second column.
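
For illustration only (the real rows are maintained in the spreadsheet and the downloaded csv), a coverage_required.csv row might look roughly like this, with the platform letters in the first column and the test's actions in the following columns:

  WMLC, navigate_browser(SiteA), check_install_icon_shown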

Generating test descriptions & coverage

Required test changes are printed and coverage files are written by running:

chrome/test/webapps/generate_framework_tests_and_coverage.py

This uses the files in chrome/test/webapps/data and existing browsertests on the system (see custom_partitions and default_partitions in generate_framework_tests_and_coverage.py) to:

1) Print to stdout all detected changes needed to browsertests.

To keep its complexity to a minimum, the script is not smart enough to automatically add/remove/move tests. Instead, it prints out the tests that need to be added or removed so that the browsertests match what it expects. It assumes:

  • Browsertests are correctly described by the TestPartitionDescriptions in generate_framework_tests_and_coverage.py.
  • Browsertests with the per-platform suffixes (e.g. _mac, _win, etc.) are only run on those platforms.

This process doesn't modify the browsertest files, so any test disabling done by sheriffs can remain. The script runner is thus expected to make the requested changes manually. In the rare case that a test is moving between files (for example, if a test is being enabled on a new platform), the script runner should be careful to copy any sheriff changes to the browsertest as well.

2) Generate per-platform processed required coverage tsv files in chrome/test/webapps/coverage

These are the processed required coverage tests with markers per action to allow a conditional formatter (like the one here) to highlight what was and was not covered by the testing framework.

  • These files also contain a coverage % at the top of the file. Full coverage is the percentage of actions in the processed required-coverage tests that were executed and fully covered by the framework. Partial coverage also counts actions that are only partially covered by the framework.
  • This includes loss of coverage from any disabled tests. Cool!

Exploring the tested and coverage graphs

To view the directed graphs that are generated to process the test and coverage data, the --graphs switch can be specified:

chrome/test/webapps/generate_framework_tests_and_coverage.py --graphs

This will generate:

  • coverage_required_graph.dot - The graph of all of the required test coverage. Green nodes are actions explicitly listed in the coverage list, and orange nodes specify partial coverage paths.
  • framework_test_graph_<platform>.dot - The graph that is now tested by the generated framework tests for the given platform, including partial coverage.

The graphviz library can be used to view these graphs. An install-free online version is here.
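
For example, assuming graphviz is installed locally, the dot command-line tool can render one of the generated files to an SVG:

dot -Tsvg coverage_required_graph.dot -o coverage_required_graph.svg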

Debugging Further

To help debug or explore further, please see the graph_cli_tool.py script which includes a number of command line utilities to process the various files.

Both graph_cli_tool.py and generate_framework_tests_and_coverage.py support the -v option to print informational logging.

WebAppIntegrationTestDriver and Browsertest Implementation

After the script has output the tests that are needed, they still need to be compiled and run by something. The WebAppIntegrationTestDriver is what runs the actions, and the browsertests themselves are put into specific files based on which partition they will be run in (default or sync), and which platforms will be running them.

Creating Action Implementations

The driver implements all actions for the generated tests. For tests in the sync partition (which require the functionality of the SyncTest base class), some actions are delegated to two_client_web_apps_integration_test_base.h.

Testing actions must:

  • Call the appropriate BeforeState*Action() and AfterState*Action() functions inside of their function body.
  • Wait until the action is fully completed before returning.
  • Try to exercise code as close to the user-level action as reasonably possible.
  • Accommodate multiple browsertests running at the same time on the trybot (be careful modifying global information on an operating system).
  • Ensure that, at the end of the test, all side effects are cleaned up in TearDownOnMainThread().
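
Below is a minimal, standalone C++ sketch (not the real WebAppIntegrationTestDriver; the class, actions, and hook signatures are simplified and hypothetical) of the shape an action implementation is expected to have: bracket the work with the before/after hooks, and do not return until the action has fully completed.

  // Hypothetical sketch only; the real driver's hooks take arguments and
  // interact with the browser.
  #include <iostream>

  class SketchDriver {
   public:
    // A state-change action: call the "before" hook, drive the UI as close
    // to the user-level action as possible, wait for completion, then call
    // the "after" hook.
    void InstallMenuOption() {
      BeforeStateChangeAction();
      // ... trigger the install via the menu and wait until it finishes ...
      AfterStateChangeAction();
    }

    // A state-check action: only inspects state, never changes it.
    void CheckTabCreated() {
      BeforeStateCheckAction();
      // ... compare the recorded before/after state snapshots here ...
      AfterStateCheckAction();
    }

   private:
    void BeforeStateChangeAction() { std::cout << "snapshot state (before)\n"; }
    void AfterStateChangeAction() { std::cout << "snapshot state (after)\n"; }
    void BeforeStateCheckAction() {}
    void AfterStateCheckAction() {}
  };

  int main() {
    SketchDriver driver;
    driver.InstallMenuOption();
    driver.CheckTabCreated();
    return 0;
  }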

To help with state-check actions, the state of the system is recorded before and after every state-change action. This allows state-check actions to detect any changes caused by the last state-change action. This state is stored in before_state_change_action_state_ and after_state_change_action_state_, and is generated by calling ConstructStateSnapshot().

When adding actions, it may be useful to add information into this state snapshot in order to verify the results of state-change actions within state-check actions. To do this:

  • Add a field onto the relevant state object.
  • Update the == and << operators.
  • Populate the field in the ConstructStateSnapshot() method.

Then, this field can be accessed in the before_state_change_action_state_ and after_state_change_action_state_ members appropriately.
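
The following is a small, standalone sketch of those three steps, using a hypothetical state object and a hypothetical bool field named badge_set; the real state objects, operators, and ConstructStateSnapshot() live in the driver and look different.

  #include <iostream>
  #include <ostream>

  // Hypothetical state object with the new field (step 1).
  struct AppState {
    bool badge_set = false;
  };

  // Step 2: keep the comparison and logging operators in sync with the field.
  bool operator==(const AppState& a, const AppState& b) {
    return a.badge_set == b.badge_set;
  }
  std::ostream& operator<<(std::ostream& os, const AppState& state) {
    return os << "{ badge_set: " << state.badge_set << " }";
  }

  // Step 3: populate the field when the snapshot is constructed.
  AppState ConstructStateSnapshot(bool badge_currently_set) {
    AppState state;
    state.badge_set = badge_currently_set;
    return state;
  }

  int main() {
    // A state-check action would compare the snapshots recorded before and
    // after the last state-change action.
    AppState before = ConstructStateSnapshot(/*badge_currently_set=*/false);
    AppState after = ConstructStateSnapshot(/*badge_currently_set=*/true);
    std::cout << "before: " << before << ", after: " << after << "\n";
    return (before == after) ? 1 : 0;
  }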

Running the tests on Mac

There is a history of browser_tests being disabled on Mac trybots & CQ (but they run on the waterfall). To ensure there are no Mac failures for changes to integration tests:

  • Click on the “Choose Trybots” button in Gerrit.
  • Filter for “mac”
  • Choose applicable builders from the “luci.chromium.try” section. Examples (from 2022.04.08):
    • mac11-arm64-rel
    • mac12-arm64-rel
    • mac_chromium_10.11_rel_ng
    • mac_chromium_10.12_rel_ng
    • mac_chromium_10.13_rel_ng
    • mac_chromium_10.14_rel_ng
    • mac_chromium_10.15_rel_ng
    • mac_chromium_11.0_rel_ng
    • mac_chromium_asan_rel_ng

Test runs on these bots MAY include other, unrelated failures. That is normal.

Disabling a Test

Tests can be disabled in the same manner as other integration/browser tests, using macros. See the documentation on disabling tests for more information.

Understanding and Implementing Test Cases

Actions are the basic building blocks of integration tests. A test is a sequence of actions. Each action has a name that must be a valid C++ identifier.

Actions are defined (and can be modified) in this sheet. Tests are defined (and can be modified) in this sheet.

Action Creation & Specification

Actions are the building blocks of tests.

This section is meant to help describe how the testing script works internally, but may not be helpful for those just looking to run the script.

Templates

To help make test writing less repetitive, actions are described as templates in the actions spreadsheet. Action templates specify actions while avoiding rote repetition. Each action template has a name (the action base name). Each action template supports arguments, which must be defined in the enumeration sheet. Parameter values must also be valid C++ identifiers.

An action template without arguments specifies one action whose name matches the template. For example, the check_tab_created template generates the check_tab_created action.

An action template with arguments generates one action for every combination of values of the specified argument types (N * M * ... actions for argument types with N, M, ... values). The action names are the concatenation of the template name and the corresponding value names, separated by an underscore (_). For example, the clear_app_badge template generates the clear_app_badge_SiteA and clear_app_badge_SiteB actions. A three-argument action like install_policy_app may generate many actions, for example: install_policy_app_SiteA_Windowed_NoShortcut, install_policy_app_SiteA_Windowed_Shortcut, install_policy_app_SiteA_Browser_NoShortcut, etc.

The templates also support parameterizing an action, which causes any test that uses the action to be expanded into multiple tests, one per specified output action. Arguments carry over into the output action by using bash-style string replacement of argument values. If an output action doesn't support a given argument value then that parameterization is simply excluded during test generation.

To see any “skipped” parameterized output functions, run the script with the -v option and check the log.
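
For example, to generate the tests while logging the skipped parameterizations:

chrome/test/webapps/generate_framework_tests_and_coverage.py -v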

Default Values

All argument types can mark one of their values as the default value by using a * character. This value is used if the test author does not specify one.

Note: Default values are not considered when resolving the output actions of parameterized actions.

Specifying an Argument

Human-friendly action names are a slight variation of the canonical names above.

Actions generated by argument-less templates have the same human-friendly name as their canonical name.

Actions generated by templates with arguments use parentheses to separate the template name from the value name. For example, the actions generated by the clear_app_badge template have the human-friendly names clear_app_badge(SiteA) and clear_app_badge(SiteB).

The template name can be used as the human-friendly name of the action generated by the template with the default value. For example, clear_app_badge is a shorter human-friendly name equivalent to clear_app_badge(SiteA).

Test Creation & Specification

Tests are created by specifying a sequence of actions.

For a step-by-step guide for creating a new integration test, see this guide.

Mindset

The mindset for test creation and organization is to exhaustively check every possible sequence of user actions. The framework will automatically combine tests that are identical except for state-check actions. Tests are currently organized by:

  1. Setup actions - The state change actions needed to enter the system state that is being tested.
  2. Primary state-change action/s - Generally one action that will change the system state after the setup.
  3. State check action - One state check action, checking the state of the system after the previous actions have executed.

Each test can have at most one state check action as the last action.

One way to enumerate tests is to think about “affected-by” action edges. These are pairs of actions where the first action affects the second. For example, the action set_app_badge will affect the action check_app_badge_has_value. Or, uninstall_from_app_list will affect check_platform_shortcut_exists. There are often different setup states that would affect these actions. Once these edges are identified, tests can be created around them.
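
For illustration, the set_app_badge → check_app_badge_has_value edge above could become a hypothetical test along these lines, following the setup / primary state-change / state-check structure:

  install_windowed(SiteA), set_app_badge, check_app_badge_has_value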

Creating a test

A test should be created that does the bare minimum necessary to set up the system state before exercising the primary state-change action and then checking the resulting state.

The framework is designed to be able to collapse tests that contain common non-‘state-check’ actions, so adding a new test does not always mean that a whole new test will be run by the framework. Sometimes it only adds a few extra state-check actions in an existing test.

If new actions are required for a test, see How to create WebApp Integration Tests for more information about how to add a new action.