Background

Google Chrome has historically had little end-to-end, system-level continuous integration testing of its support for enterprise authentication, including single sign-on. This resulted in a blind spot where Google Chrome could introduce regressions that wouldn't be discovered until the changes rolled out to the stable channel. Changes to Windows and other enterprise software, including hosted applications like GSuite, could also break Google Chrome behavior.

To minimize these blind spots, the Chrome Network Stack team built an enterprise authentication test lab. The primary goal of the lab was to lower the cost of issue investigation and testing by maintaining a set of ready-to-use test fixtures that replicate a few enterprise environments.

In parallel, the Chrome Enterprise team also built out a set of tests on a third-party cloud computing platform. These tests used virtualized mini enterprise environments for focused manual testing.

Having fixtures in a lab helps with some aspects of developing enterprise features, but doesn't go far enough to ensure stable support. Manual testing is error-prone and has too much overhead to be a reliable defense against regressions. Moreover, developers don't always know a priori whether their changes affect enterprise functionality.

Furthermore, the existing test infrastructure built by these teams isn't suited for reuse by other teams or individual contributors, especially those outside of Google. While folks have been given access to some parts of these labs over time, there is no self-service process, nor is there a well-documented inventory of available fixtures.

This document discusses how we can develop a set of tools for efficiently constructing enterprise labs, and how such a lab can be integrated into the Chromium continuous integration workflow. The use cases and considerations that went into the design are discussed below.

Use Cases

  • Continuous integration tests on the build waterfall. We need a robust set of automated tests that run periodically to catch regressions, whether caused by Chrome or by changes to enterprise software.

  • Developer workflows. Developers working on code that needs to be tested in the lab need a workflow where they can manually push builds, trigger tests, retrieve results and test artifacts for post-mortem debugging, and perform interactive debugging where necessary.

    System-level tests cannot run on developer workstations, nor on an arbitrary host: they depend both on a specific configuration of the host running the test and on real servers and network configurations that exist only within the lab. Hence there must be some mechanism for scheduling tests inside the lab and interacting with them while they run.

  • Batched tests. Developers should be able to trigger a run of the entire enterprise test suite using their own build artifacts, ideally via a try job.

    Where enterprise-specific features are being developed, it would be desirable for a developer to be able to quickly run a subset of tests against a subset of test fixtures. A try job may be overkill in this case, since the developer already has built binaries.

  • Interactive testing. Developers should be able to debug their code interactively when necessary without these interactive sessions interfering with automated tests.

  • Development of test fixtures. Developers should be able to work on new test fixtures to be added to the lab, or try out modifications to the existing lab configuration, without interfering with automated tests.

  • Verifying software updates. Lab maintainers also need to be able to run the entire test suite after patching lab VMs, especially after Microsoft releases Windows patches that could affect Chrome functionality.

  • Bug reproduction and regression range narrowing for Chrome TE. Narrowing a regression range can be done by bisecting continuous builds against a set of tests, as sketched below.
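
To make the bisection workflow concrete, here is a minimal Go sketch. The runTests helper is hypothetical: it stands in for fetching the continuous build with the given number, scheduling the enterprise suite against it in the lab (e.g. via Swarming), and reporting whether the suite passed. The build numbers are illustrative only.

    // bisect.go: a minimal sketch of narrowing a regression range by
    // bisecting continuous builds against the enterprise test suite.
    package main

    import "fmt"

    // runTests is hypothetical: a real implementation would fetch the
    // continuous build identified by buildNumber, trigger a lab run
    // (e.g. via Swarming), and wait for the result.
    func runTests(buildNumber int) bool {
        return buildNumber < 1500 // pretend the regression landed at build 1500
    }

    // bisect returns the first build in (goodBuild, badBuild] that fails,
    // assuming goodBuild passes, badBuild fails, and failures are monotonic.
    func bisect(goodBuild, badBuild int) int {
        for badBuild-goodBuild > 1 {
            mid := goodBuild + (badBuild-goodBuild)/2
            if runTests(mid) {
                goodBuild = mid
            } else {
                badBuild = mid
            }
        }
        return badBuild
    }

    func main() {
        fmt.Println("first bad build:", bisect(1000, 2000))
    }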

Additional Considerations

  • Focus on Windows. Deploying and configuring Active Directory services in their many permutations remains a very expensive part of supporting Microsoft Windows-based technology stacks. The design should focus on simplifying the deployment of a number of such test environments.

  • Support POSIX clients. The lab should support interoperability testing of our POSIX clients against enterprise fixtures. Supporting macOS clients would also be valuable, but it is not a short-term goal.

    Note that POSIX test fixtures are already in scope.

  • Easily Instantiated/Cloned. Automated tests, interactive debugging sessions, and development of new test fixtures or configurations cannot all happen within the same lab. It should be easy and inexpensive to instantiate one's own copy of the lab.

  • Software versions are a part of the test configuration. Security patches and software versions installed on the clients and servers should be discovered and logged during a test run (see the sketch following this list). Given the many factors that can affect a test, test runs can't be assumed to be reproducible.

  • Binary being tested ≈ Candidate to be shipped. When testing browser-level features, it's important to test the unadulterated binary, e.g., without mocking the host resolver or disabling security features.

  • Support physical appliances. Some of the middleware that should be tested for interoperability with Google Chrome is only available in the form of physical appliances that customers are expected to operate within their internal networks; they cannot be run in VMs. As such, the enterprise lab needs to span both virtual and physical environments.

  • Use default directories. The user data directory should be placed in a location where it is subject to the effects of folder redirection policies. Ideally, tests would run with profile data in the default location.
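
As a sketch of how version discovery could look, the Go snippet below records the OS version and installed patches on a Windows host by shelling out to the ver command and the PowerShell Get-HotFix cmdlet. Where and how the output is persisted alongside the test results is left open here.

    // versions.go: a minimal sketch of recording software versions as
    // part of a test run's configuration, assuming a Windows host where
    // the "ver" command and the Get-HotFix cmdlet are available.
    package main

    import (
        "log"
        "os/exec"
    )

    // logCommand runs a command and logs its combined output, so the
    // exact patch level of the host becomes part of the test record.
    func logCommand(name string, arg ...string) {
        out, err := exec.Command(name, arg...).CombinedOutput()
        if err != nil {
            log.Printf("%s failed: %v", name, err)
            return
        }
        log.Printf("%s %v:\n%s", name, arg, out)
    }

    func main() {
        logCommand("cmd", "/c", "ver")                     // OS version string
        logCommand("powershell", "-Command", "Get-HotFix") // installed patches
    }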

Frameworks/Tools Used

In addition to staples like Protocol Buffers, the lab makes use of the following frameworks and tools. Some familiarity with these will be helpful when reading the rest of the document.

  • Google Cloud Platform (website): The cloud platform on which the lab will be built. Google Cloud Platform is abbreviated as GCP, and Google Compute Engine as GCE.

  • Luci Isolate (docs): A service for efficiently storing and distributing a directory tree containing a large number of files. Used as the primary means of getting files from builder and developer machines into the lab.

  • Luci Swarming (docs): A service used for scheduling tests in the lab. This is how tests in the lab are triggered from a builder.

  • Luci CIPD (docs): “CIPD is package deployment infrastructure.” Used for deploying software into VMs.

  • Microsoft PowerShell Desired State Configuration (docs): “A declarative platform used for configuration, deployment, and management of systems.”

    Used for deploying Microsoft Windows Active Directory assets. Desired State Configuration will often be referred to in this document as DSC. This page provides a good overview of how DSC works and what DSC scripts look like.

  • Selenium (website): “Selenium automates browsers. That's it!”

    Used for browser-level tests; a sketch appears at the end of this section.

  • Go (website): A popular open source programming language.
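
To illustrate what a browser-level test might look like, here is a minimal Go sketch using the third-party github.com/tebeka/selenium WebDriver bindings. The WebDriver endpoint and the intranet URL are placeholders for lab-specific values, and the pass/fail heuristic is illustrative only.

    // sso_test.go: a minimal sketch of a browser-level test driving
    // Chrome through Selenium WebDriver.
    package ssotest

    import (
        "strings"
        "testing"

        "github.com/tebeka/selenium"
    )

    func TestSingleSignOn(t *testing.T) {
        caps := selenium.Capabilities{"browserName": "chrome"}
        // ChromeDriver (or a Selenium server) is assumed to be listening here.
        wd, err := selenium.NewRemote(caps, "http://127.0.0.1:4444/wd/hub")
        if err != nil {
            t.Fatalf("connecting to WebDriver: %v", err)
        }
        defer wd.Quit()

        // Navigate to an authentication-protected intranet page (placeholder
        // URL); with single sign-on working, the page should load without a
        // credential prompt interrupting it.
        if err := wd.Get("http://iis.lab.example/protected/"); err != nil {
            t.Fatalf("navigation failed: %v", err)
        }
        title, err := wd.Title()
        if err != nil {
            t.Fatalf("reading title: %v", err)
        }
        if strings.Contains(title, "401") {
            t.Errorf("authentication appears to have failed: title %q", title)
        }
    }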