Tast Design Principles (go/tast-design)

This document lists principles that guided Tast's design and that should be kept in mind when considering future changes.

Running tests should be fast.

Iteratively running tests (regardless of whether one is adding or modifying tests, trying to reproduce a failure, or verifying that a system change has fixed a failing test) should be fast from the perspective of a developer waiting for the command to complete.

  • Operations should not take additional time to complete due to the choice of programming language.
  • Building and deploying tests and associated data must be fast. emerge adds ten or more seconds of overhead even when building a C++ program with no dependencies and an empty main() function, so in its present form it shouldn't be part of the build/deploy/run cycle.
  • The test system's overhead should be minimized. Nothing should be copied to the DUT when the test hasn't changed. All communication with the DUT should happen over a single persistent SSH connection, and round trips should be minimized on the critical path; otherwise, network latency kills performance when running tests on a DUT in a different geographical location.
  • Developers shouldn't need to edit tests on-device in order to iterate quickly. ChromeOS systems do not make for pleasant development environments. (They may support pleasant development environments within VMs, but that doesn't help for testing.)
  • Running a test shouldn't result in code being compiled. If a test needs additional executables to be installed on the DUT, then those executables should already be present in the system image. We already have a packaging system; use it.
  • Information about a test (e.g. its inclusion in a suite) should be available without needing to evaluate hundreds of scripts. Don't emulate a declarative language using an imperative interpreted language. (A registration sketch follows this list.)
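
One way to keep test metadata declarative is to register it as plain data at startup, so tooling can enumerate tests without executing any test logic. Here is a minimal Go sketch in the style of Tast's registration API; the import path, struct fields, and names are illustrative:

    package example

    import (
        "context"

        "chromiumos/tast/testing" // assumed import path
    )

    func init() {
        // Registration is a struct literal: tools can read a test's
        // description and suite membership without running it.
        testing.AddTest(&testing.Test{
            Func:     CheckFoo, // entry point; the test name derives from it
            Desc:     "Verifies that foo works",
            Contacts: []string{"someone@chromium.org"}, // hypothetical owner
            Attr:     []string{"group:mainline"},       // suite membership, declared as data
        })
    }

    func CheckFoo(ctx context.Context, s *testing.State) {
        // Test body goes here.
    }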

Tests should yield consistent results.

  • Minimize the number of moving pieces when a test is run on a DUT. The framework, and tests themselves, should do everything in their power to avoid operations that can fail intermittently. Avoid runtime dependencies on external resources like databases, websites, and other network services.

Adding or modifying a test should be easy.

  • Minimize boilerplate. For example, test names shouldn't appear repeatedly in the source (e.g. directory names, control files, filenames, test implementations, .ebuild files). We'd frown if we saw the same lengthy string constant repeated five or more times in a C++ program. In cases where repetition is unavoidable, there should be automatic checks that the names are consistent in all locations.
  • Developers shouldn't need to know the specifics of how the test system is integrated into ChromeOS. In the common case, they shouldn't need to edit .ebuild files when adding a test, run cros_workon when making changes, or set USE flags or build and deploy packages to run tests.

Test results should be easy to interpret.

  • A given run's output directory should be structured in a way that is easy to navigate.
  • Logs must be easy to read. The default log level should include messages that describe what's happening at any given time (e.g. no radio silence while the test is running on a remote host: see issue 715865), but not non-fatal warnings or errors. A separate log file should be written with full verbose output, and it should be trivial for both machines and humans to find the overall pass/fail status of all tests and the verbose output from an individual test.
  • Errors should be passed back to the top level of the test and logged there. When fatal errors are reported from deep within support libraries, test results are often difficult to interpret because the errors lack context. (An error-wrapping sketch follows this list.)
  • Detailed timing information should be written in a format readable by both humans and machines to make it easy to see why a test run was slow and track long-term performance trends.
  • System log information generated by the DUT while tests were running should be captured. It should be easy to compare timestamps in test results to timestamps from the DUT's system logs, even in the presence of clock skew.
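
In Go, one way to get that context is to wrap errors as they propagate, so the message logged at the top level traces the whole path. A minimal sketch using only the standard library (the helper names are hypothetical):

    package example

    import (
        "fmt"
        "os"
    )

    // readConfig sits deep in a support library. Instead of failing
    // fatally here, it returns the error annotated with local context.
    func readConfig(path string) ([]byte, error) {
        b, err := os.ReadFile(path)
        if err != nil {
            return nil, fmt.Errorf("reading config %q: %w", path, err)
        }
        return b, nil
    }

    // runTest is the test's top level, where the full chain surfaces:
    // "test setup failed: reading config "/etc/example.conf": ...".
    func runTest() error {
        if _, err := readConfig("/etc/example.conf"); err != nil {
            return fmt.Errorf("test setup failed: %w", err)
        }
        return nil
    }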

The test framework, and tests themselves, should be maintainable.

  • The framework should focus on running tests. Tasks like allocating DUTs and scheduling tests on them, reimaging or repairing DUTs, and displaying and archiving test results belong elsewhere.
  • There should be a clear separation between code that's used by tests and code that runs on developers' workstations or bots to deploy and run tests.
  • Avoid magic. Code that spells out what it's doing is easier to debug than code that relies on action at a distance (e.g. overriding __getattr__ or using setattr to dynamically set attributes in Python). Make code easy to trace unless there's an extremely compelling reason to do something fancy. (A short contrast follows this list.)
  • Avoid making test libraries ornate. Nobody wants to puzzle their way through complicated object hierarchies while trying to debug a failing test.
  • The code that supports tests must itself be thoroughly covered by unit tests.
  • Make it easy to disable a broken test until it can be fixed by its owners.
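
As a Go-flavored version of the contrast in the "avoid magic" point above (both snippets are contrived): setting a field through reflection works, but it hides data flow that a plain assignment makes obvious.

    package main

    import "reflect"

    type Config struct {
        Timeout int
    }

    func main() {
        c := &Config{}

        // Magic: a text search for "Timeout =" won't find this write,
        // and a typo in the string only fails at runtime.
        reflect.ValueOf(c).Elem().FieldByName("Timeout").SetInt(30)

        // Explicit: trivially traceable and checked by the compiler.
        c.Timeout = 30
    }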

The test system should have opinions about the right way to write tests.

Don't overwhelm developers with choices.

  • Keep logging simple. There should be one way to report test failures and one way to log informative messages. Don't permit non-fatal “warning”-level errors, as nobody does anything about them and they end up permanently cluttering logs.
  • Tests should be straightforward to read. Instead of distributing work across superclasses and overridden methods with non-obvious semantics (e.g. initialize(), setup(), warmup(), run_once(), postprocess()), implement each test in a single function, with initialization appearing at the beginning and teardown happening at exit (per language affordances), as in the sketch below.
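
In Go this style falls out naturally: one function per test, defer for teardown, one call for progress logging and one for failure. A sketch, with the State methods modeled on Tast's API and the helpers purely hypothetical:

    package example

    import (
        "context"

        "chromiumos/tast/testing" // assumed import path
    )

    // CheckFeature is the entire test: setup at the top, teardown via
    // defer, one way to log, one way to fail.
    func CheckFeature(ctx context.Context, s *testing.State) {
        cleanup, err := setUp(ctx)
        if err != nil {
            s.Fatal("Setup failed: ", err)
        }
        defer cleanup() // teardown runs even if the test fails below

        s.Log("Exercising the feature")

        if err := exercise(ctx); err != nil {
            s.Fatal("Feature check failed: ", err)
        }
    }

    // setUp and exercise stand in for test-specific logic.
    func setUp(ctx context.Context) (func(), error) { return func() {}, nil }
    func exercise(ctx context.Context) error        { return nil }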