The final product of this exercise will be:
A set of tools for taking the above textual descriptions and constructing a lab in a GCP project, in whole or in part. See DEPLOYER.
A basic set of end-to-end tests for Chromium, including the descriptions of the required resources. These tests will live in the Chromium source tree and will become part of our CI suite.
A mechanism by which these end-to-end tests can be invoked manually using binaries built on a developer workstation. See Expected Workflows.
There can be multiple types of tests. The one referred to here is a system test which is written in Python and uses Selenium + ChromeDriver to automate a Chrome binary.
The GREETER receives a run request from Luci Swarming and downloads the isolate. The isolate includes a system test script which is responsible for preparing and running a test suite within the lab.
We start our story when a build has completed and the build artifacts needed for running the test are ready. These artifacts may be produced on the waterfall or on a developer machine.
The collection of files needed for running the test in isolation are bundled together and uploaded to a Luci Isolate server. The process of uploading yields a digest that uniquely identifies the specific set of uploaded files and their relative layout.
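To make the idea concrete, here is a minimal sketch of how such a content digest could be derived. This is purely illustrative and is not the real Luci Isolate format: it hashes each file's contents together with its relative path, so the digest changes if any file's content or location within the tree changes.

```python
import hashlib
import os


def isolate_digest(root):
    """Hypothetical digest over a file tree (NOT the real Isolate format).

    Hashes every file's content and relative path, so any change to a
    file's bytes or its location in the tree yields a different digest.
    """
    entries = []
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                content_hash = hashlib.sha256(f.read()).hexdigest()
            entries.append((rel, content_hash))
    h = hashlib.sha256()
    for rel, content_hash in sorted(entries):
        h.update(rel.encode("utf-8"))
        h.update(content_hash.encode("ascii"))
    return h.hexdigest()
```

The important property, which the real Isolate server also provides, is that the digest uniquely identifies both the set of files and their relative layout.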
The contents of the isolate (i.e. the files contained therein) are as follows (Note that the exact layout may be different, but the content should be as described below):
system_tests/ is the contents of the similarly named directory in the Chromium source tree. See Source Locations for details. The isolate includes the entire contents of the directory excluding the doc/ subdirectory. These files will be transferred over to the enterprise lab for execution.
system_test_runner.py lives in system_tests/scripts/. It is the main entry point for system tests and is invoked via task_runner.py (source) on the GREETER. This script is described in SYSTEM TEST RUNNER.
assets/ directory contains the definitions of the assets that should exist in the enterprise lab. Files and directories in this directory can be referenced by the definitions in assets.textpb and are used to configure things like IIS and Active Directory.
assets.textpb is a text formatted protobuf describing all the assets needed by the system tests. Described in ASSET MANIFEST.
Everything in the selenium_tests/ directory comprises the tests that will be executed in the lab using Selenium. For example, another_test.py. Described in TEST.
Last, but not least, the browser binary being tested.
Once the isolate is uploaded to an isolate server, it is ready to be downloaded and used by the bots.
All assets required by the test suite are described in the ASSET MANIFEST. In this document the asset manifest is often referred to as assets.textpb, but in reality it will be a collection of files.
The manifest takes the form of a textproto file conforming to the Asset Description Schema. By virtue of the chosen schema, the file can trivially be broken into multiple files. This aspect is important since it is expected that some of the asset manifest fragments will be programmatically generated to account for the matrices of configurations that need to be supported. The lab does not care how or where the files were generated, but just that all the required pieces are there by the time the DEPLOYER is invoked.
To the fullest extent possible, the asset manifest also aims to be independent of the specifics of the Google Cloud Platform. A different implementation of the DEPLOYER, for example, could theoretically target a different cloud hosting solution without changing the asset manifests.
What is an asset? It's just about anything physical or virtual that needs to be specified in order for the test fixtures to be unambiguously deployed. Assets include descriptions of networks including subnets and address ranges, peering between these networks, VPNs, virtual machines including parameters required to provision them, Windows Active directory domains, domain users, domain groups, Active Directory containers, trust relationships between the domains, Windows AD group policy templates, group policy objects that should be deployed to each AD container, and so on.
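As an illustration only, a fragment of assets.textpb describing a few such assets might look like the following. The field names here are hypothetical and are not the real Asset Description Schema; the point is that each asset is named and can refer to other assets by name.

```textproto
# Hypothetical fragment; field names are illustrative, not the real schema.
network {
  name: "primary"
  address_range: "10.128.0.0/20"
}
ad_domain {
  name: "example.test"
  domain_controller: "dc-1"
  network: "primary"
}
windows_user {
  name: "joe"
  domain: "example.test"
  member_of: "Domain Admins"
}
```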
The Asset Description Schema is designed for expressiveness, but not necessarily for readability or writability since it is expected that much of the asset descriptions will be programmatically generated.
What doesn't go in the manifest? The asset manifest should be independent of its hosting environment. As such, the manifest does not prescribe a specific hosting Google Compute project, nor does it specify hosting environment specific parameters like static external IPs, SSL certificates, references to Google Cloud Storage buckets, and StackDriver logging parameters. Such parameters are defined in the HOST ENVIRONMENT as described below.
Examples of assets include:
For more information about the structure of assets, see ASSET MANIFEST below.
While the ASSET MANIFEST describes what goes into the enterprise lab, the HOST ENVIRONMENT describes the environment in which the lab itself sits. For example, the HOST ENVIRONMENT defines:
The Google Cloud Project where the enterprise lab is to be instantiated. All cloud resources described in the ASSET MANIFEST will be created within this single project.
Google Cloud Storage buckets to use during deployment. These storage buckets are used for transferring files to be used by GCE instances during startup and for configuration.
External IPs. When GCE instances need to be externally visible, they need to be assigned external IPs. The ASSET MANIFEST refers to such external IPs by name, and the HOST ENVIRONMENT maps these names to IP addresses.
GCE Instance Images. These are the images that are used as source disks when creating GCE instances. E.g. an ASSET MANIFEST might declare that an instance be based on win10. The HOST ENVIRONMENT describes the image family and project that should be used to locate the image, e.g.:
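A hypothetical HostEnvironment fragment for such a mapping (field names illustrative, not the real schema):

```textproto
# Hypothetical fragment; field names and values are illustrative.
machine_image {
  name: "win10"
  image_family: "win10-enterprise"
  project: "my-private-images-project"
}
```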
Google Compute Engine has a growing inventory of public images. However the enterprise lab will need to rely on private images for the purpose of testing against Windows 7 and Windows 10 which are currently popular in enterprise but not supported on GCE. Source images for these operating systems cannot be made available publicly due to licensing issues. Hence the HOST ENVIRONMENT needs to specify how these images are to be located. See Private Google Compute Images.
Permanent Assets. These are assets that can't be deployed automatically. Such assets must already exist by the time the DEPLOYER gets around to constructing assets. Permanent assets include external physical labs, and virtual machine instances that need to be deployed manually due to licensing and activation requirements.
Unlike the ASSET MANIFEST, the HOST ENVIRONMENT is necessarily private. Each instantiation of the enterprise lab needs to specify its own HOST ENVIRONMENT during the Bootstrapping process.
The HOST ENVIRONMENT takes the form of a textproto file that is made available during Bootstrapping. It conforms to the HostEnvironment message in the Asset Description Schema.
Between the HOST ENVIRONMENT and the ASSET MANIFEST, these files contain all the information necessary to construct an instance of the enterprise lab from scratch. The details of how this deployment happens is described in the Deployment Details section.
Test execution proper starts when a GREETER VM receives a notification that a test is ready to be deployed to the test machines. This notification includes the identifier of the ISOLATE containing the files needed to run the test suite.
A GREETER is a swarming bot (described here). As such, a notification for a new test run takes the form of a run command issued to the bot from the swarming server.
In addition to being swarming bots, GREETERs are …
Once a GREETER VM is notified of an isolate (via swarming or otherwise), it invokes the isolate via task_runner.py (source). This downloads and invokes the embedded test script from the ISOLATE. The embedded test script for system tests is described in SYSTEM TEST RUNNER.
A single GREETER and a single DEPLOYER can be presumed to exist in the lab. It is possible for any and all other test fixtures and TEST HOST VMs to not exist, in which case they will be created by the DEPLOYER. GREETERs are the only VMs that can be exposed externally.
This is the logic in the file named system_test_runner.py, which is included in the ISOLATE and exists in the Chromium source tree. It is extracted and invoked by the swarming bot code running on the GREETER. The script itself also runs on the GREETER and depends on the cel_py library (described below).
The SYSTEM TEST RUNNER needs access to the test environment specific code for invoking the DEPLOYER and setting up the communication channels between the SYSTEM TEST RUNNER and HOST TEST RUNNERs. The API for accessing these services is provided by a Python library called cel_py (described below) which is installed on GREETERs as well as TEST HOSTs.
The SYSTEM TEST RUNNER performs the following operations:
If this step completes successfully, then all the TEST HOSTs described in the ASSET MANIFEST can be assumed to exist and be operational.
As a part of the deployment results, the DEPLOYER sends back a comprehensive inventory of the software that's running in the lab. This inventory includes software versions and hotfix information. The SYSTEM TEST RUNNER passes along this information to
Deploys the tests to all the TEST HOSTs identified in the ASSET MANIFEST as follows:
Constructs an archive containing host_test_runner.py (described in the HOST TEST RUNNER section) and the contents of the system_tests folder from the ISOLATE.
Writes the archive to a Google Cloud Storage bucket.
Publishes a message to the Google Cloud Pub/Sub topic tests that includes the location in Google Cloud Storage of the archive.
The TEST HOSTs created by the DEPLOYER subscribe to this Pub/Sub topic and start their tests accordingly.
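Since the archive itself lives in Cloud Storage, the Pub/Sub message only needs to carry a pointer to it. A sketch of what such a payload might contain follows; the field names are hypothetical, and in the real flow the bytes would be handed to the google-cloud-pubsub publisher client rather than returned.

```python
import json


def make_test_announcement(bucket, object_path, isolate_digest):
    """Build a (hypothetical) payload for the 'tests' topic.

    Pub/Sub message data must be bytes, hence the final encode().
    """
    payload = {
        "archive": "gs://{}/{}".format(bucket, object_path),
        "isolate_digest": isolate_digest,
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def parse_test_announcement(data):
    """A TEST HOST subscriber decodes the message symmetrically."""
    return json.loads(data.decode("utf-8"))
```

Keeping the payload to a Cloud Storage pointer plus a digest keeps messages small and lets any number of TEST HOSTs fetch the same archive independently.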
SYSTEM TEST RUNNER must subscribe to topics named results-<hostname> for each hostname corresponding to the known TEST HOSTs. This is the mechanism by which TEST HOSTs communicate test results.
Collects the test results passed along via cel_py, and streams them back to
Constructs aggregate test results and streams these via
These results include:
Why not use Chrome infra's Isolate instance or another Isolate instance running inside the lab project for transferring files?
One of the goals of this effort is to make it possible for an individual developer or a team to easily clone the environment for the purpose of developing new tests or running manual tests. In that regard it is desirable to keep the set of dependencies minimal. Running an Isolate server makes sense for distributing binaries at the scale at which Chrome infra in general operates, but isn't feasible for a small scale such as this lab.
The DEPLOYER is a service that takes an ASSET MANIFEST as input and deploys the specified assets to the Google Compute Engine environment. The DEPLOYER runs on a separate VM and is invoked by the SYSTEM TEST RUNNER.
The DEPLOYER does the following:
Parses the ASSET MANIFEST and the HOST ENVIRONMENT, and constructs a dependency graph for all the assets that are required by the test.
For example, an Instance asset depends on a Network asset, and a Windows domain user asset depends on a Windows domain asset. A diagram showing the dependency graph for compute engine assets for a simple test is shown below:
In the diagram above, gce-instance/win-client identifies a Google Compute Engine instance named win-client.
Traverses the dependency graph in topological order and:
The assets that are handled by DEPLOYER are not just assets known to Google Compute Engine. They are also things that exist within the lab environment like Windows domain accounts and software that needs to be installed on the virtual machines.
See the Deployment Details section for full details on how these assets are deployed.
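The traversal itself can be sketched as a standard topological sort over the asset dependencies. This is a simplification of what the DEPLOYER would actually do, and the asset names below are placeholders:

```python
from collections import deque


def deployment_order(deps):
    """Kahn's algorithm over an asset dependency graph.

    deps maps each asset name to the list of assets it depends on.
    Returns an order in which every asset appears after its dependencies.
    """
    dependents = {a: [] for a in deps}
    indegree = {a: 0 for a in deps}
    for asset, requires in deps.items():
        for r in requires:
            dependents[r].append(asset)
            indegree[asset] += 1
    # Assets with no dependencies can be deployed immediately.
    ready = deque(sorted(a for a, d in indegree.items() if d == 0))
    order = []
    while ready:
        asset = ready.popleft()
        order.append(asset)
        for dep in dependents[asset]:
            indegree[dep] -= 1
            if indegree[dep] == 0:
                ready.append(dep)
    if len(order) != len(deps):
        raise ValueError("dependency cycle among assets")
    return order
```

Under this ordering a network is deployed before any instance that sits on it, and a domain before any of its users.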
Upon successful completion of deployment, a results-<hostname> Pub/Sub topic can be assumed to exist for each TEST HOST.
The TEST HOST is a VM that runs a test. These VMs are deployed by the DEPLOYER based on the ASSET MANIFEST included in an ISOLATE.
The TEST HOST must have all the software necessary to run the test with the exception of the code that's included in the ISOLATE. At a minimum, the TEST HOST has:
cel_py for inspecting the TEST HOST environment. The library also has some additional utilities for marking tests as disabled and conditionally skipping tests based on host dimensions.
Different TEST HOST instances can have different characteristics. The different characteristics are exposed as tags in the VM. E.g.: A tag policy-ntlm-disabled can indicate that the GPO disabling the HTTP authentication scheme NTLM was applied to the TEST HOST.
The TEST section describes how these tags are used in tests.
When a TEST HOST instance starts up, it runs a program named cel_bot, which:
Creates a subscription for the Pub/Sub topic tests within the GCE project. The topic is the means by which a SYSTEM TEST RUNNER communicates the start of a test to a TEST HOST.
Listens for incoming tests via the tests Pub/Sub topic.
Upon receipt of a test - which takes the form of a path to an archive hosted on Google Cloud Storage - downloads, extracts and runs the test script contained in the archive. The test runner contained therein is referred to as the HOST TEST RUNNER.
Once the test completes, or times out, the cel_bot is also responsible for collecting logs and other test artifacts and making them available to the SYSTEM TEST RUNNER.
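The receive-extract-run step can be sketched as follows. This is a simplification under stated assumptions: the real cel_bot would fetch the archive from Cloud Storage rather than take a local path, and the runner name and timeout are placeholders.

```python
import subprocess
import sys
import tarfile
import tempfile


def run_test_archive(archive_path, runner_name="host_test_runner.py"):
    """Extract a downloaded test archive and invoke the runner it contains.

    archive_path points at an already-downloaded archive (the real bot
    would first download it from Cloud Storage). Returns the runner's
    exit code.
    """
    workdir = tempfile.mkdtemp(prefix="cel-test-")
    with tarfile.open(archive_path) as tar:
        # The archive comes from the lab's own bucket, so it is trusted.
        tar.extractall(workdir)
    proc = subprocess.run(
        [sys.executable, runner_name],
        cwd=workdir,
        timeout=3600,  # hypothetical per-suite timeout
    )
    return proc.returncode
```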
This script is responsible for running the tests on one of the TEST HOST instances. The script itself lives in /TBD/ and depends on a system_tests directory that's expected to exist alongside it.
Once invoked, the HOST TEST RUNNER:
Uses the load_tests protocol as defined by Python unittest to locate tests.
Announces the list of tests about to be run via the results-<hostname> Pub/Sub topic.
Announces the test host properties including software versions, and Resultant Set of Policy. The software versions including hotfix information for test fixtures are reported during the deployment phase by the DEPLOYER and passed along by the SYSTEM TEST RUNNER. The HOST TEST RUNNER only needs to concern itself with the software on the TEST HOST.
Runs the tests using Python unittest along with a custom TestRunner class which publishes test results via the results-<hostname> Pub/Sub topic.
Tests live under /TBD/ and are structured as Python unittests. They are invoked on a candidate TEST HOST via a HOST TEST RUNNER.
Tests must be designed so that a component on a volatile TEST HOST runs against a set of stateless test fixtures. Tests must be:
Idempotent: Given a single test suite session, the test should be able to run multiple times with the same outcome.
Isolated within the lab: this one needs some unpacking.
A test should only depend on test fixtures that are declared in the accompanying ASSET MANIFEST. Necessarily, tests are run with real host resolvers and run against real services. However, tests should not reach out to services outside those provided in the lab.
Chromium/Chrome currently depend on a smorgasbord of Google services. See the Google Services section for details on how to deal with Google Services that affect the browser including field trials.
Not depend on the absence of a resource: While a test should declare and can depend on specific test fixtures, it should not depend on a test fixture being absent merely because that fixture does not appear in its asset manifest.
What does this mean? A test should not depend on, say, a query for bar.example.com getting an NXDOMAIN. This will break if another test adds a fixture for bar.example.com.
Proper adherence to the above rules means that multiple test suites can occupy and execute within the same GCE project simultaneously. Ref. Scalability.
The test may assume that its containing directory and subdirectories have the same contents as the subtree rooted at /TBD/. It can also assume that the test fixtures exist as per the ASSET MANIFEST and that the test itself is running on one of the TEST HOST instances defined in the ASSET MANIFEST. The cel_py library can be used to determine the name and characteristics of the TEST HOST running the test.
Contents of a hypothetical example test file follows:
import unittest

import cel  # cel_py client library
from selenium import webdriver


class TestFoo(unittest.TestCase):
  def test_basic(self):
    driver = cel.ChromeDriver()  # Connects to Chrome binary under test.
    driver.get("http://www.google.com")
    self.assertIn("google", driver.title)
    driver.close()

  @cel.require(["policy-ntlm-disabled", "policy-never-use-proxy"])
  def test_policy(self):
    driver = cel.ChromeDriver()  # Connects to Chrome binary under test.
    driver.get(cel.NamedUrl("ntlm-only-over-http"))
    self.assertIn("authorization required", driver.title)
    driver.close()
The cel.ChromeDriver() call binds to a WebDriver instance that's invoking the Chrome binary under test. Currently we are not planning on using remote WebDriver instances.
The @cel.require() annotation causes the test to be skipped if either of the two listed tags is not present on the TEST HOST. In addition, the cel.NamedUrl() call looks up a named URL from the test environment. The mapping from a named URL to a URL that can be used within the lab is defined in the ASSET MANIFEST.
The cel_py library (Chrome Enterprise Lab Python library) exists on the GREETER and TEST HOST and encapsulates the logic for…
Talking to the DEPLOY service.
Constructing and uploading the archive containing per-host tests to Google Cloud Storage for consumption on the TEST HOST instances.
Constructing and marshalling a host test via a tests Pub/Sub topic.
Subscribing to and initiating a test run based on a tests Pub/Sub topic.
Downloading and extracting said archive on the TEST HOST.
Constructing and marshalling test results to a results-<hostname> Pub/Sub topic.
Subscribing to and consuming messages from a results-<hostname> Pub/Sub topic.
Exposing instance tags and named assets to TESTs.
Controlling test execution via Python unittest-friendly annotations so that tests can filter themselves based on instance tags. The outcome of these filtering operations is made available in the test results so that results-<hostname> subscribers can clearly tell why a test was skipped.
In other words, the cel_py library encapsulates most of the lab-environment specific logic for test execution both at the GREETER (SYSTEM TEST RUNNER) and TEST HOST (HOST TEST RUNNER) level.
This separation allows the TEST to be written such that it's independent of how the test lab is constructed.