Using wptrunner in Chromium (experimental)

wptrunner is the harness shipped with the web platform tests project for running the test suite. This user guide documents experimental support in Chromium for using wptrunner as a replacement for run_web_tests.py to run WPTs.

For general information on web platform tests, see web-platform-tests.org.

Differences from run_web_tests.py

The main differences between run_web_tests.py and wptrunner are that:

  1. wptrunner can communicate with a browser using the standard WebDriver protocol, whereas run_web_tests.py relies on content_shell's protocol mode and internal APIs.
  2. Due to (1), run_web_tests.py can only test the stripped-down content_shell, but wptrunner can also test the full Chrome binary through chromedriver.
  3. wptrunner should automatically support new upstream WPT features (e.g., new testdriver.js APIs).

Running Tests Locally

First, build the ninja target for the product you wish to test:

autoninja -C out/Release chrome_wpt
autoninja -C out/Release content_shell_wpt
autoninja -C out/Release system_webview_wpt   # `android_webview`
autoninja -C out/Release chrome_public_wpt    # `chrome_android` (Clank)

Then, from //third_party/blink/tools/, run the wptrunner wrapper script run_wpt_tests.py with one of the following commands:

run_wpt_tests.py [options] [test list]   # *nix
run_wpt_tests.bat [options] [test list]  # Windows

As with run_web_tests.py:

  • The test list may contain directories, files, or URLs (e.g., path/to/test.html?variant).
  • Providing a directory or file runs all contained tests.
  • Prefixing a path with virtual/<suite>/ runs that path's tests as virtual tests.
  • Omitting the test list will run all tests and virtual suites, including wpt_internal/.
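
For illustration, here are a few hypothetical invocations (the paths, the query parameter, and the virtual suite name my-suite are placeholders):

run_wpt_tests.py external/wpt/html/                         # every test under a directory
run_wpt_tests.py external/wpt/html/some-test.html?variant   # a single test variant
run_wpt_tests.py virtual/my-suite/external/wpt/html/        # the same directory, run as virtual tests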

Useful options:

  • -t <target>: Select which //out/ subdirectory to use. Defaults to Release.
  • -p <product>: Select which browser (or browser component) to test. Defaults to content_shell.
  • -v: Increase verbosity (may be provided multiple times). Passing -v also dumps browser logs.
  • --help: Show detailed usage and all available options.
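
Putting the options together, a sketch of a typical invocation (the build directory, product, and test path are examples only):

run_wpt_tests.py -t Debug -p chrome -v external/wpt/css/   # test the full Chrome binary from //out/Debug and dump browser logs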

Builders

A set of FYI builders continuously runs all WPTs (including wpt_internal/) on different platforms.

Each builder has an opt-in try mirror with the same name. To run one of these builders against a CL, click “Choose Tryjobs” in Gerrit, then search for the builder name. Alternatively, add a CL description footer of the form:

Cq-Include-Trybots: luci.chromium.try:<builder name>

to opt into select builders when submitting to CQ.
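
For example, to opt into the Linux FYI builder referenced later in this guide (assuming its try mirror keeps the same name, as described above), the footer would be:

Cq-Include-Trybots: luci.chromium.try:linux-wpt-fyi-rel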

Test Results

Results from the most recent run are placed under //out/<target>/layout-test-results/. The next test run renames this directory with a timestamp appended so that the previous results are preserved.

To view the test results locally, open layout-test-results/results.html in a browser. This results.html is the same results viewer that run_web_tests.py supports. On builders, results.html can be accessed from the “archive results for ...” → web_test_results link as an alternative to the built-in test results UI:

[Screenshots: the results.html build page link, the results.html viewer, and its query filter]

The artifacts that results.html serves are stored under layout-test-results/ with the same directory structure as their tests. run_wpt_tests.py produces similar kinds of artifacts as run_web_tests.py does:

  • *-{expected,actual,diff}.txt, *-pretty-diff.html: (Sub)test results in the text-based WPT metadata format and their diffs.
  • *-{expected,actual,diff}.png (reftest only): Screenshots of the reference and test pages, and the diff produced by image_diff.
  • *-crash-log.txt: Contains all logs from driver processes (e.g., chromedriver, logcat). For content_shell, this also includes the protocol mode stdout.
  • *-stderr.txt: (Sub)test results in unstructured form.
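
As a sketch, a failing reftest at a hypothetical path external/wpt/a/reftest.html run against the Release build might leave artifacts laid out roughly like this:

out/Release/layout-test-results/
  results.html
  external/wpt/a/reftest-actual.txt
  external/wpt/a/reftest-expected.txt
  external/wpt/a/reftest-diff.txt
  external/wpt/a/reftest-pretty-diff.html
  external/wpt/a/reftest-actual.png
  external/wpt/a/reftest-expected.png
  external/wpt/a/reftest-diff.png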

Testing Different Configurations

run_wpt_tests.py consumes the same VirtualTestSuites, FlagSpecificConfig, and SmokeTests/*.txt files that run_web_tests.py does. Other than the omission of non-WPT tests from test lists, run_wpt_tests.py honors the format and behavior of these files exactly as described for web tests.

run_wpt_tests.py also accepts the --flag-specific option to add a flag-specific configuration's arguments to the binary invocation.
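
For example, assuming the highdpi configuration from FlagSpecificConfig (also used in the examples below), a flag-specific run might look like:

run_wpt_tests.py --flag-specific=highdpi external/wpt/css/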

Expectations

wptrunner uses WPT metadata files to specify which tests should run and what results to expect. Each metadata file is checked in with an .ini extension appended to its corresponding test file's path:

external/wpt/folder/my-test.html
external/wpt/folder/my-test-expected.txt  <-- run_web_tests.py baseline
external/wpt/folder/my-test.html.ini      <-- wptrunner metadata

A metadata file is roughly equivalent to a run_web_tests.py baseline and the test's lines in web test expectation files. As the extension implies, metadata files follow an INI-like structured text format:

TestExpectations
# Flakily slow
crbug.com/3 external/wpt/a/reftest.html [ Pass Timeout ]

(This TestExpectations line is equivalent to the metadata file below.)

external/wpt/a/reftest.html.ini
[reftest.html]
  bug: crbug.com/3
  # Flakily slow
  expected: [PASS, TIMEOUT]

  • The brackets [...] start a (sub)test section whose contents follow in an indented block.
  • The section heading should contain either the subtest name or the test URL without the dirname (i.e., the basename plus query parameters, if any).
  • A section may contain <key>: <value> pairs. Important keys that wptrunner understands:
    • expected: A status (or list of possible statuses) to expect.
      • Common test statuses include TIMEOUT, CRASH, and either OK/ERROR for testharness tests to represent the overall harness status, or PASS/FAIL for non-testharness tests that only have a single result (e.g., reftests).
      • Common subtest statuses include PASS, FAIL, TIMEOUT, or NOTRUN.
      • For convenience, wptrunner expects OK or PASS when expected is omitted. Deleting the entire metadata file implies an all-PASS test.
    • disabled: Any nonempty value will skip the test or ignore the subtest result. By convention, disabled should contain the reason the (sub)test is disabled, with the literal neverfix for NeverFixTests.
  • # starts a comment that extends to the end of the line.
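
As a sketch of the disabled key, here are two hypothetical metadata files (the test names and the bug number are made up):

# flaky-test.html.ini
[flaky-test.html]
  disabled: crbug.com/123 times out flakily on all platforms

# never-fix-me.html.ini (the NeverFixTests equivalent)
[never-fix-me.html]
  disabled: neverfix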

Note: For testharness tests, the harness statuses OK/ERROR are orthogonal to PASS/FAIL and have different semantics:

  • OK only means all subtests ran to completion normally; it does not imply that every subtest PASSed. A test may FAIL subtests while still reporting the harness is OK.
  • ERROR indicates some problem with the harness, such as a WebDriver remote end disconnection.
  • PASS/FAIL represent passing or failing assertions against the browser under test.

testharness.js subtest expectations are represented by a section nested under the relevant test:

external/wpt/test-expected.txt
This is a testharness.js-based test.
PASS passing subtest
FAIL failing subtest whose name needs an escape []
Harness: the test ran to completion.

external/wpt/test.html.ini
[test.html]
  [failing subtest whose name needs an escape [\]]
    expected: FAIL

Conditional Values

run_web_tests.py reads platform- or flag-specific results from platform tags in TestExpectations, FlagExpectations/*, and baseline fallback. WPT metadata uses a Python-like conditional syntax instead to store all expectations in one file:

TestExpectations
[ Win Debug ] test.html [ Crash ]  # DCHECK triggered
[ Mac11-arm64 ] test.html [ Pass Timeout ]

external/wpt/test.html.ini
[test.html]
  expected:
    if os == "win" and debug: CRASH  # DCHECK triggered
    if port == "mac11-arm64": [PASS, TIMEOUT]
    # Resolves to this default value when no conditions
    # match. An `OK/PASS` here can be omitted because
    # it's implied by an absent value.
    PASS

wptrunner resolves a conditional value to the right-hand side of the first branch whose expression evaluates to a truthy value. Conditions can contain arbitrary Python-like boolean expressions, which are evaluated against properties, variables detected from the test environment. The properties available in Chromium are listed below:

  • os (str): OS family. Choices: linux, mac, win, android, ios.
  • port (str): Port name, which includes the OS version and architecture. Choices: see Port.ALL_SYSTEMS (e.g., mac12-arm64).
  • product (str): Browser or browser component. Choices: chrome, content_shell, chrome_android, android_webview, chrome_ios.
  • flag_specific (str): Flag-specific suite name. Choices: see FlagSpecificConfig (the empty string "" represents the generic suite).
  • virtual_suite (str): Virtual test suite name. Choices: see VirtualTestSuites (the empty string "" represents the generic suite).
  • debug (bool): Whether this is an is_debug build.
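
A hypothetical metadata file combining several of these properties might look like the sketch below (the test, the statuses, and the highdpi suite name are illustrative):

[test.html]
  expected:
    if product == "android_webview": TIMEOUT
    if os == "mac" and not debug: [PASS, FAIL]
    if flag_specific == "highdpi": FAIL
    PASS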

Parameterized Tests

In WPT, multiglobal .any.js tests and test variants are forms of parameterization where a test file may generate more than one test ID. The metadata for these parameterizations live in the same .ini file, but under different top-level sections. For example, a test external/wpt/a/b.any.js that generates .any.html and .any.worker.html scopes with variants ?c and ?d can express its expectations as:

TestExpectations
a/b.any.html?c [ Crash ]
a/b.any.html?d [ Crash ]
a/b.any.worker.html?c [ Timeout ]
a/b.any.worker.html?d [ Timeout ]

external/wpt/a/b.any.js.ini
[b.any.html?c]
  expected: CRASH
[b.any.html?d]
  expected: CRASH
[b.any.worker.html?c]
  expected: TIMEOUT
[b.any.worker.html?d]
  expected: TIMEOUT

Directory-Wide Expectations

To set expectations or disable tests under a directory without editing an .ini file for every contained test, create a special __dir__.ini file under the desired directory with top-level keys, which work identically to those for per-test metadata:

FlagExpectations/highdpi
# Redundant coverage
external/wpt/a/* [ Skip ]

external/wpt/a/__dir__.ini
disabled:
  if flag_specific == "highdpi": redundant coverage

Metadata closer to the affected test files takes precedence. For example, expectations set by a/b/c.html.ini override those of a/b/__dir__.ini, which in turn overrides a/__dir__.ini. The special value disabled: @False can selectively re-enable tests or directories disabled by an ancestor __dir__.ini.
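
For example, a sketch of re-enabling a subdirectory that an ancestor __dir__.ini disabled (the directory names are hypothetical):

external/wpt/a/__dir__.ini
disabled: redundant coverage

external/wpt/a/b/__dir__.ini
disabled: @False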

Update Tool

To update expectations in bulk for all tested configurations, blink_tool.py has an update-metadata subcommand that can trigger try builds and update expectations from the results (similar to rebaseline-cl). The workflow is very similar to rebaselining:

# Create a CL, if one has not been created already.
git cl upload

# Trigger try builds against the current patchset.
./blink_tool.py update-metadata

# After the try builds complete, collect the results and update expectations
# for `external/wpt/css/CSS2/` (sub)tests that only failed unexpectedly. Any
# test section updated will be annotated with `bug: crbug.com/123`.
./blink_tool.py update-metadata --bug=123 css/CSS2/

# Commit and upload the staged `.ini` files.
git commit -m "Update WPT expectations" && git cl upload

The WPT autoroller uses update-metadata to automatically suppress imported tests with new failures.

update-metadata can also suppress failures occurring on trunk:

# Suppress tests that caused any `linux-wpt-fyi-rel` CI builds 3000-3005
# (inclusive) to fail.
./blink_tool.py update-metadata --build=ci/linux-wpt-fyi-rel:3000-3005

Debugging Support

Passing the --no-headless flag to run_wpt_tests.py runs each test headfully and pauses execution after it completes. You can then interact with the paused test page, including with DevTools:

[Screenshot: wptrunner paused after a test]

Closing the tab or window will unpause wptrunner and run the next test.

In the future, we intend to support hooking up text-based debuggers like rr to test runs.

Known Issues

Please file bugs and feature requests against Blink>Infra with the wptrunner label.