Unit tests verify the correctness of a small piece of source code (e.g. a class or function) and live alongside the code they‘re testing. When a Portage package with src_test
defined in its ebuild file is compiled with a command like FEATURES=test emerge-<board> <package>
, the package’s unit tests are compiled into one or more executables, which are executed immediately by the system performing the build. If any unit tests fail, the entire build fails.
C++ unit tests are written using Google Test and typically live in files with _unittest.cc
or _test.cc
suffixes (e.g. the unit tests for a source file named foo/bar.cc
should live in foo/bar_unittest.cc
).
The unit testing document has more details about how unit tests are executed.
Functional tests (sometimes also called integration tests) are written in Go using the Tast framework or in Python using the Autotest framework. They run on a live Chrome OS system and verify the end-to-end functionality of multiple components. As a result, they do not run automatically while a build is being performed, and take substantially longer to complete than unit tests.
Functional tests are needed to verify that different components work together. For example, a test that verifies that the user is able to log into Chrome indirectly checks that:
session_manager
.session_manager
runs Chrome.session_manager
and Chrome are able to communicate over D-Bus.cryptohome
mounts user's encrypted home directories.It‘s important to test all of these things, but we don’t want to use functional tests for all of our testing:
When you can get similar coverage using either type of test (i.e. the test exercises a single component), always use unit tests. This requires some planning when writing the code; the rest of this document gives advice for doing that.
Unit tests verify that functions or classes produce correct output when supplied various inputs; “output” may refer to functions' or methods' return values, changes made to the system (e.g. files that are written), or actions that are performed by making calls to other classes. Test the code's output instead of its internal state (i.e. how it produces the output).
Instead of using the friend
keyword and FRIEND_TEST
macro to let tests inspect and modify a class‘s private data members and call its private methods, make unit tests exercise the class’s public interface: test the API rather than the implementation. This helps avoid tests that simply mirror the implementation line-for-line.
If tests have a true need to exercise private implementation details (e.g. to check whether a private timer is running and manually trigger it), add dedicated public methods with ForTest
suffixes (or for_test
, in the case of inlined methods) that can be called by tests. If tests need even more control over a class's internals, consider defining a dedicated TestApi
internal class:
// delayed_sound_player.h: // Plays a sound after a while. class DelayedSoundPlayer { public: // Helps tests interact with DelayedSoundPlayer. class TestApi { public: explicit TestApi(DelayedSoundPlayer* player); ~TestApi() = default; // Triggers and stops |player_|'s timer and returns true. // Does nothing and returns false if the timer wasn't running. bool TriggerTimer() WARN_UNUSED_RESULT; ... private: DelayedSoundPlayer* player_; // Not owned. DISALLOW_COPY_AND_ASSIGN(TestApi); }; ... private: // Used to play a sound after a delay. base::OneShotTimer timer_; ... }; // delayed_sound_player.cc: DelayedSoundPlayer::TestApi::TestApi(DelayedSoundPlayer* player) : player_(player) {} bool DelayedSoundPlayer::TestApi::TriggerTimer() { if (!player_->timer_.IsRunning()) return false; // Copy the task to a local variable so we can stop the timer first (in case // the task wants to restart the timer). base::Closure task = player_->timer_->user_task(); player_->timer_->Stop(); task.Run(); return true; } // delayed_sound_player_unittest.cc: TEST_F(DelayedSoundPlayerTest, Timer) { DelayedSoundPlayer player; TestApi api(&player); ASSERT_TRUE(api.TriggerTimer()); // Check that a sound was played. }
This makes the points of interaction between the tests and the code being tested obvious, and makes it less likely that test code will gain brittle dependencies on the internal implementation that make refactorings difficult in the future. Care must still be taken to ensure that the API limits the scope of implementation details that are exposed to tests, however.
Sometimes the class that's being tested does work that needs to be avoided when running as part of a test instead of on a real device -- imagine code that calls ioctl()
or reboots the system, for instance.
It can be tempting to add additional logic to the code that's being tested to instruct it to avoid actually doing anything (e.g. a running_on_real_system
member), or to mark some methods as virtual and then override them within tests using Google Mock (known as a “partial mock”), but both of these approaches make the code harder to read (“Why is this method virtual? Oh, just for testing”) and make it easy to accidentally overlook dangerous “real” code and execute it within tests.
Instead, try to structure your code so it's easy for tests to inject their own implementations of functionality that can only run on real systems.
If the untestable portion of the code is trivial, it can be moved out of the class and into a delegate interface, which can then be stubbed out in tests. For example, imagine a class that's responsible for rebooting the system:
// power_manager.h: class PowerManager { public: ... // Does a bunch of work to prepare for reboot and then reboots the system. // Returns false on failure. bool Reboot(); ... };
Instead of rebooting the system directly, the class can define a nested Delegate
interface and ask it to do the actual rebooting:
// power_manager.h: class PowerManager { public: // The Delegate interface performs work on behalf of PowerManager. class Delegate { public: virtual ~Delegate() = default; // Asks the system to reboot itself. Returns false on failure. virtual bool RequestReboot() = 0; }; explicit PowerManager(std::unique_ptr<Delegate> delegate); ... private: std::unique_ptr<Delegate> delegate_; ... }; // power_manager.cc: PowerManager::PowerManager(std::unique_ptr<Delegate> delegate) : delegate_(std::move(delegate) {} bool PowerManager::Reboot() { // Do lots of stuff to prepare to reboot. ... return delegate_->RequestReboot(); }
In the shipping version of the code, pass a “real” implementation of the delegate:
// main.cc: namespace { class RealDelegate : public PowerManager::Delegate { public: RealDelegate() = default; ~RealDelegate() override = default; // PowerManager::Delegate: bool RequestReboot() override { return system("/sbin/reboot") == 0; } private: DISALLOW_COPY_AND_ASSIGN(RealDelegate); }; } // namespace int main(int argc, char** argv) { ... PowerManager manager(std::make_unique<RealDelegate>()); ... }
Inside of the test file, define a stub implementation of Delegate
that keeps track of how many times RequestReboot
has been called and returns a canned result:
// power_manager_unittest.cc: namespace { // Stub implementation of PowerManager::Delegate for testing that just keeps // track of what has been requested. class StubDelegate : public PowerManager::Delegate { public: StubDelegate() = default; ~StubDelegate() override = default; int num_reboots() const { return num_reboots_; } void set_reboot_result(bool result) { reboot_result_ = result; } // PowerManager::Delegate: bool RequestReboot() override { num_reboots_++; return reboot_result_; } private: // Number of times RequestReboot has been called. int num_reboots_ = 0; // Result to return from RequestReboot. bool reboot_result_ = true; DISALLOW_COPY_AND_ASSIGN(StubDelegate); }; } // namespace
Now the test can inject a StubDelegate
and use it to verify that RequestReboot
was called:
TEST_F(PowerManagerTest, Reboot) { // |delegate| is owned by |manager|. StubDelegate* delegate = new StubDelegate(); PowerManager manager(base::WrapUnique(delegate)); EXPECT_EQ(0, delegate->num_reboots()); EXPECT_TRUE(manager.Reboot()); EXPECT_EQ(1, delegate->num_reboots()); // Check that PowerManager reports failure. delegate->set_reboot_result(false); EXPECT_FALSE(manager.Reboot()); EXPECT_EQ(2, delegate->num_reboots()); }
If the code being tested reads or writes large amounts of data from files, it's advisable to stub out those interactions in unit tests to improve performance. Stubbing may also be necessary when exercising code that handles some classes of errors that are difficult to trigger within tests (e.g. permission errors).
When the code‘s interaction with the filesystem is limited and fast, it’s often better to let it operate on the real filesystem, but in a controlled way that doesn't have negative effects on the rest of the system where the tests are being run.
If the code being tested needs to read or write multiple files within a directory, try to structure it so the directory is passed into it:
// Reads and writes files within |dir|, a sysfs device directory. bool UpdateDevice(const base::FilePath& dir) { ... }
In the test, use base::ScopedTempDir
to create a temporary directory that will be automatically cleaned up when the test is done:
#include <base/files/scoped_temp_dir.h> ... TEST_F(MyTest, UpdateDevice) { base::ScopedTempDir temp_dir; ASSERT_TRUE(temp_dir.CreateUniqueTempDir()); // Create needed files under |temp_dir|. ... ASSERT_TRUE(UpdateDevice(temp_dir.GetPath())); // Verify that expected files were written under |temp_dir|. ... }
If a class being tested uses hardcoded file paths, consider adding the ability to override them from tests:
// backlight_lister.h: class BacklightLister { public: BacklightLister(); ... // Should be called immediately after the object is created. void set_base_dir_for_test(const base::FilePath& dir) { base_dir_ = path; } ... private: // Base sysfs directory under which backlight devices are located. base::FilePath base_dir_; ... }; // backlight_lister.cc: namespace { // Real sysfs directory under which backlight devices are located. const char kDefaultBaseDir[] = "/sys/class/backlight"; } // namespace BacklightLister::BacklightLister() : base_dir_(kDefaultBaseDir) {} ...
In the test, use set_base_dir_for_test
to inject a temporary directory into the class:
TEST_F(BacklightListerTest, ListBacklights) { base::ScopedTempDir temp_dir; ASSERT_TRUE(temp_dir.CreateUniqueTempDir()); // Create fake device files under |temp_dir|. ... BacklightLister lister; lister.set_base_dir_for_test(temp_dir.GetPath()); // Verify that |lister| lists expected devices. ... }
Sometimes code needs to know the current time. Imagine the following class:
// stopwatch.h: // Measures how long an operation takes. class Stopwatch { ... // Records the start of the operation. void StartOperation(); // Returns the elapsed time since the operation started. base::TimeDelta EndOperation(); private: // The time at which StartOperation() was called. base::TimeTicks start_time_; ... }; // stopwatch.cc: void Stopwatch::StartOperation() { start_time_ = base::TimeTicks::Now(); } base::TimeDelta Stopwatch::EndOperation() { return base::TimeTicks::Now() - start_time_; }
Note that this code uses base::TimeTicks
and base::TimeDelta
to hold time-related values rather than time_t
and integers; the former types are efficient and self-documenting (e.g. no confusion about milliseconds vs. seconds).
It's still difficult to test, though: how does one verify that the class accurately measures time? Resist the temptation to call sleep()
from within tests; short delays add up when many tests are run, and build systems are often overloaded, making it impossible to predict how long an operation will take and avoid flakiness when tests depend on the passage of time.
Instead, use base::Clock
and base::TickClock
to get the current time:
// stopwatch.h: class Stopwatch { public: // Ownership of |clock| remains with the caller. explicit Stopwatch(base::TickClock* clock); ... private: base::Clock* clock_; // Not owned. ... }; // stopwatch.cc: Stopwatch::Stopwatch(base::TickClock* clock) : clock_(clock) {} void Stopwatch::StartOperation() { start_time_ = clock_->NowTicks(); } base::TimeDelta Stopwatch::EndOperation() { return clock_->NowTicks() - start_time_; } // main.cc: int main(int argc, char** argv) { base::DefaultClock clock; Stopwatch stopwatch(&clock); stopwatch.StartOperation(); ... const base::TimeDelta duration = stopwatch.EndOperation(); ... }
Now tests can pass in their own base::SimpleTestTickClock
and adjust the time that it returns as “now”:
TEST_F(StopwatchTest, MeasureTime) { base::SimpleTestTickClock clock; Stopwatch stopwatch(&clock); stopwatch.StartOperation(); constexpr base::TimeDelta kDelay = base::TimeDelta::FromSeconds(5); clock.Advance(kDelay); EXPECT_EQ(kDelay, stopwatch.EndOperation()); }
Sometimes, the code that's being tested does work asynchronously. Imagine a function that posts a task that performs some work and runs a callback:
// Starts work. |callback| is run asynchronously on completion. void Start(const base::Closure& callback) { base::MessageLoop::current()->PostTask( FROM_HERE, base::Bind(&DoWork, callback); } // Performs work on behalf of Start() and runs |callback|. void DoWork(const base::Closure& callback) { ... callback.Run(); }
A test can use base::RunLoop
to run the message loop asynchronously and verify that the callback was executed:
namespace { // Increments the value pointed to by |num_calls|. Used as a callback for // Start(). void RecordCompletion(int* num_calls) { *num_calls++; } } // namespace TEST(CallbackIsRun) { int num_calls = 0; Start(base::Bind(&RecordCompletion, base::Unretained(&num_calls))); base::RunLoop().RunUntilIdle(); EXPECT_EQ(1, num_calls); }
Alternately, the callback can be used to stop the message loop:
TEST(CallbackIsRun) { base::RunLoop run_loop; Start(run_loop.QuitClosure()); run_loop.Run(); }
Also consider forgoing base::MessageLoop
entirely in cases like this and instead injecting a base::TaskRunner
so tests can control where tasks are run.
Multithreaded code can be challenging to test reliably. Avoid sleeping within the test to wait for other threads to complete work; as described above, doing so both makes the test slower than necessary and leads to flakiness.
It's often best to hide threading details by structuring your code to post callbacks back to the calling message loop. See base::TaskRunner::PostTaskAndReply()
and the “Asynchronous tasks” section above.
If that isn't possible, tests can use base::WaitableEvent
or other synchronization types to wait just as long as is necessary.
Chrome OS supports many different hardware configurations. Much of the variability is described at build-time using Portage USE flags. A common pattern is to map these USE flags to C preprocessor macros and then test them within C++ code using #ifdef
or #if defined(...)
.
When this is done, only one specific configuration is compiled and tested when the package is built for a given board. If a test wants to exercise code that‘s only enabled for a subset of configurations, the test must be protected with a similar preprocessor check. If a developer is modifying the code, they must perform builds and run unit tests for multiple boards to make sure they haven’t broken any supported configurations.
It's often better to attempt to keep as much of the code as is possible testable when a build is performed for a single board. This makes development much easier: even if the board that the developer is using corresponds to a Chromebook, they can test Chromebox-specific code, for example.
Instead of sprinkling preprocessor checks across the codebase, store the configuration parameters in a struct or class that can be constructed and injected by unit tests to exercise different configurations. If preprocessor checks are necessary, constrain them to a single location that's not compiled into tests (e.g. the file containing the main
function). If the code requires extensive configuration, consider introducing configuration or preference files (which can be written by tests).
Preprocessor checks can also be made more readable by hiding them within functions instead of distributing them throughout the code:
inline bool HasKeyboard() { #if defined(DEVICE_HAS_KEYBOARD) return true; #else return false; #endif } void DoSomething() { if (HasKeyboard()) { ... } else { ... } }
This pattern also makes it easier to move away from preprocessor-based configuration in the future.
Google Mock (formerly “gmock”, and now part of Google Test) is “an extension to Google Test for writing and using C++ mock classes”. It can be used to create mock objects that return canned values and verify that they‘re called in an expected manner. This can be done with minimal changes to the code that’s being tested -- if a class has virtual methods, you can use Google Mock to create a partial mock of it. To accomplish this, Google Mock introduces what's essentially a new language on top of C++, heavily dependent on preprocessor macros and template magic. As a result, tests that use Google Mock can be very difficult to read, write, and debug. Incorrectly using Google Mock macros can result in very long and confusing compilers errors due to its heavy use of templates.
Google Mock can be useful in some cases. For example, testing that methods across multiple objects are invoked with the right arguments and in the correct order is much easier to do using Google Mock than with a roll-your-own approach.
However, Google-Mock-based tests are often brittle and difficult to maintain. Tests that use Google Mock frequently verify behavior that‘s not directly related to the functionality being tested; when refactoring code later, it then becomes necessary to make changes to unrelated tests. On the other hand, if tests instead underspecify the behavior of mocked objects, Google Mock can produce large amounts of log spam (e.g. Uninteresting mock function call
warnings that are logged when a mock’s return values are not specified).
Don‘t use Google Mock to paper over code that wasn’t designed with testability in mind. Instead, modify the code so that you can add the tests cleanly. Define and use clear, pure-virtual interfaces wherever possible. Feel free to use Google Mock to define mock implementations of those interfaces if doing so makes your tests shorter and more readable, but don‘t use it to mock concrete classes, especially when those classes are what you’re trying to test.
See the 2012 “Should we officially deprecate GMock?” and 2016 “Using NiceMock?” chromium-dev mailing list threads for more discussion. The “Friends You Can Depend On” and “Know Your Test Doubles” posts on the Google Testing Blog also include more information about different types of test objects.
MOCK_METHOD
macros instead of MOCK_METHOD1
, MOCK_METHOD2
, etc.override
in MOCK_METHOD
on a virtual method so that you get a compiler error if the declarations in the class and mock do not match.clang-format
sometimes formats Bar* bar
as Bar * bar
in MOCK_METHOD
. See https://crbug.com/1006323.class Task { public: Task() = default; virtual ~Task() = default; virtual bool IsDone() const = 0; virtual bool DoIt(const std::map<int, int>& arguments, std::string* result, const base::Closure& error_callback) = 0; }; // Old way class MockTask : public Task { public: MOCK_CONST_METHOD0(IsDone, bool()); MOCK_METHOD3(DoIt, bool(const std::map<int, int>&, std::string*, const base::Closure&)); MOCK_METHOD0(OnError, void()); }; // New preferred way class MockTask : public Task { public: MOCK_METHOD(bool, IsDone, (), (const, override)); // NOTE: const and override MOCK_METHOD(bool, DoIt, ((const std::map<int, int>&), std::string*, const base::Closure&), (override)); // NOTE: override specified MOCK_METHOD(void, OnError, ()); };
There is currently no standardized way to obtain code coverage reports for unit tests, so the exact instructions will differ from package to package. Also, note that these instructions assume that your package uses Clang.
The general idea is to run USE=coverage FEATURES="test noclean" emerge-<BOARD> <PKG>
, which runs the unit tests and deposits the following two artifacts:
default.profraw
, which might be found in the package's source directory (the CWD where unit test ran)./build/<BOARD>/tmp/portage/<FULL_PKG_PATH>-9999/work/build/out/Default/
. One way to find the executable would be to determine the unit test executable name from the package's BUILD.gn file and use the find utility.Once you have identified the two artifacts, you can analyze the code coverage using llvm-cov
.
Here is an example of this workflow for the biod package.
export BOARD=nocturne USE=coverage ./build_packages --board=$BOARD --skip_chroot_upgrade biod cros_workon --board=$BOARD start biod LLVM_PROFILE_FILE="/tmp/default.profraw" USE=coverage FEATURES="test noclean" emerge-$BOARD biod cd /build/$BOARD/tmp/ llvm-profdata merge -o default.profdata default.profraw # Now inspect the ~/trunk/src/platform2/biod/BUILD.gn file to # determine the test executable name, which is biod_test_runner in this case. # Next, search /build/$BOARD/tmp/portage for that binary name. # To generate an HTML summary with line-by-line reporting, use the following: llvm-cov show -instr-profile=default.profdata -format=html -output-dir=. -object=/build/$BOARD/tmp/portage/chromeos-base/biod-9999/work/build/out/Default/biod_test_runner # Now, point your web browser to ~/chromiumos/chroot/build/$BOARD/tmp/index.html # To show console based coverage metrics per file, use the following: llvm-cov report -instr-profile=default.profdata -object=/build/$BOARD/tmp/portage/chromeos-base/biod-9999/work/build/out/Default/biod_test_runner