The recommendations here are intended to help you write new tests that go through code review with a minimal number of round trips, remain useful as Blink evolves, and serve as an asset (rather than a liability) for the team.
While reading existing layout tests, please keep in mind that they represent snapshots taken over many years of an ever-evolving collective opinion of what good Web pages and solid tests should look like. Thus, it should not come as a surprise that most existing layout tests are not consistent with these recommendations, and are not even consistent with each other.
Tests should be concise, without compromising on the principles below. Every element and piece of code on the page should be necessary and relevant to what is being tested. For example, don't build a fully functional signup form if you only need a text field or a button.
Content needed to satisfy the principles below is considered necessary. For example, it is acceptable and desirable to add elements that make the test self-describing (see below), and to add code that makes the test more reliable (see below).
Content that makes test failures easier to debug is considered necessary (to maintaining a good development speed), and is both acceptable and desirable.
Tests should be as fast as possible, without compromising on the principles below. Blink has several thousand layout tests that are run in parallel, and avoiding unnecessary delays is crucial to keeping our Commit Queue in good shape.
Tests should be reliable and yield consistent results for a given implementation. Flaky tests slow down your fellow developers' debugging efforts and the Commit Queue.
window.setTimeout is again a primary offender here. Asides from wasting time on a fast system, tests that rely on fixed timeouts can fail when on systems that are slower than expected.
When adding or significantly modifying a layout test, use the command below to assess its flakiness. While not foolproof, this approach gives you some confidence, and giving up CPU cycles for mental energy is a pretty good trade.
third_party/blink/tools/run_web_tests.py path/to/test.html --repeat-each=100
The PSA on writing reliable layout tests. also has good guidelines for writing reliable tests.
Tests should be self-describing, so that a project member can recognize whether a test passes or fails without having to read the specification of the feature being tested.
Tests should require a minimal amount of cognitive effort to read and maintain.
Avoid depending on edge case behavior of features that aren't explicitly covered by the test. For example, except where testing parsing, tests should contain valid markup (no parsing errors).
Tests should provide as much relevant information as possible when failing.
testharness.js tests should prefer rich assert_ functions to combining
assert_true() with a boolean operator. Using appropriate
assert_ functions results in better diagnostic output when the assertion fails.
Tests should be as cross-platform as reasonably possible. Avoid assumptions about device type, screen resolution, etc. Unavoidable assumptions should be documented.
When possible, tests should only use Web platform features, as specified in the relevant standards. When the Web platform's APIs are insufficient, tests should prefer to use WPT extended testing APIs, such as
wpt_automation, over Blink-specific testing APIs.
Test pages should use the HTML5 doctype (
<!doctype html>) unless they specifically cover quirks mode behavior.
Tests should avoid using features that haven't been shipped by the actively-developed major rendering engines (Blink, WebKit, Gecko, Edge). When unsure, check caniuse.com. By necessity, this recommendation does not apply to the feature targeted by the test.
ES2015 is shipped by all major browsers under active development (except for modules), so using ES2015 features is acceptable.
At the time of this writing, ES2016 is not fully shipped in all major browsers.
Tests must be self-contained and not depend on external network resources.
<script> tags. Content shared by multiple tests should be placed in a
resources/ directory near the tests that share it. See below for using multiple origins in a test.
Test file names should describe what is being tested.
File names should use
snake-case, but preserve the case of any embedded API names. For example, prefer
Tests should use the UTF-8 character encoding, which should be declared by
<meta charset=utf-8>. A
<meta> tag is not required (but is acceptable) for tests that only contain ASCII characters. This guideline does not apply when specifically testing encodings.
<meta> tag must be the first child of the document's
<head> element. In documents that do not have an explicit
<meta> tag must follow the doctype.
No coding style is enforced for layout tests. This section highlights coding style aspects that are not consistent across our layout tests, and suggests some defaults for unopinionated developers. When writing layout tests for a new part of the codebase, you can minimize review latency by taking a look at existing tests, and pay particular attention to these issues. Also beware of per-project style guides, such as the ServiceWorker Tests Style guide.
== in C++.
=== everywhere is an easy default that saves you, your reviewer, and any colleague that might have to debug test failures, from having to reason about special cases for ==. At the same time, some developers consider
=== to add unnecessary noise when
== would suffice. While
=== should be universally accepted, be flexible if your reviewer expresses a strong preference for
For the reasons above, a reasonable default is to prefer
var, with the same caveat as above.
'use strict'; to the very top of a script, helps catch some errors, such as mistyping a variable name, forgetting to declare a variable, or attempting to change a read-only property.
Given that strict mode gives some of the benefits of using a compiler, adding it to every test is a good default. This does not apply when specifically testing sloppy mode behavior.
Some developers argue that adding the
'use strict'; boilerplate can be difficult to remember, weighs down smaller tests, and in many cases running a test case is sufficient to discover any mistyped variable names.
Promises are a mechanism for structuring asynchronous code. When used correctly, Promises avoid some of the issues of callbacks. For these reasons, a good default is to prefer promises over other asynchronous code structures.
When using promises, be aware of the execution order subtleties associated with them. Here is a quick summary.
Promise.newis executed synchronously, so it finishes before the Promise is created and returned.
catchare executed in separate microtasks, so they will be executed after the code that resolved or rejected the promise finishes, but before any other event handler.
A good default is to prefer classes over other OOP constructs, as they will make the code easier to read for many of your fellow Chrome developers. At the same time, most layout tests are simple enough that OOP is not justified.
When HTML pages do not explicitly declare a character encoding, browsers determine the encoding using an encoding sniffing algorithm that will surprise most modern Web developers. Highlights include a default encoding that depends on the user's locale, and non-standardized browser-specific heuristics.
The easiest way to not have to think about any of this is to add
<meta charset="utf-8"> to all your tests. This is easier to remember if you use a template for your layout tests, rather than writing them from scratch.
Tests that rely on the testing APIs exposed by WPT or Blink will not work when loaded in a standard browser environment. When writing such tests, default to having the tests gracefully degrade to manual tests in the absence of the testing APIs.
The document on layout tests with manual feedback describes the approach in detail and highlights the trade-off between added test weight and ease of debugging.