Our current best advice for getting started with fuzzing is to use FuzzTest, which has its own getting started guide. If you're reading this page, it's probably because you've run into limitations of FuzzTest and want to create a libFuzzer fuzzer instead. This is a slightly older approach to fuzzing Chrome, but it still works well, so read on.
This document walks you through the basic steps to start fuzzing and suggestions for improving your fuzz targets. If you're looking for more advanced fuzzing topics, see the main page.
Before writing any code, let us look at a simple example of a test that uses input fuzzing. The test is set up to exercise the CreateFnmatchQuery function. The role of this function is to take a user query and produce a case-insensitive pattern that matches file names containing the query. For example, for the query “1abc” the function generates “*1[aA][bB][cC]*”. Unlike a traditional test, an input fuzzing test does not care about the output of the tested function. Instead, it verifies that no matter what string the user enters, CreateFnmatchQuery does not do something unexpected, such as crashing or overwriting a memory region. The test create_fnmatch_query_fuzzer.cc is shown below:

    #include <stddef.h>
    #include <stdint.h>

    #include <string>

    #include "chrome/browser/ash/extensions/file_manager/search_by_pattern.h"

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
      std::string str = std::string(reinterpret_cast<const char*>(data), size);
      extensions::CreateFnmatchQuery(str);
      return 0;
    }

The code starts by including stddef.h for the size_t definition, stdint.h for the uint8_t definition, string for the std::string definition, and finally the file where the extensions::CreateFnmatchQuery function is defined. Next, it declares and defines the LLVMFuzzerTestOneInput function, which is the function called by the testing framework. The function is supplied with two arguments: a pointer to an array of bytes and the size of the array. These bytes are generated by the fuzzing test harness, and their specific values are irrelevant. The job of the test is to convert those bytes to input parameters of the tested function. In our case, the bytes are converted to a std::string and given to the CreateFnmatchQuery function. If the function completes its job and the code successfully returns, the LLVMFuzzerTestOneInput function returns 0, signaling a successful execution.
The above pattern is typical of fuzzing tests. You create an LLVMFuzzerTestOneInput function. You then write code that uses the provided random bytes to form input parameters to the function you intend to test. Next, you call the function, and if it successfully completes, return 0.
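As a self-contained illustration of this pattern, the sketch below replaces the Chromium function with a simplified, ASCII-only stand-in. `SketchFnmatchQuery` is a hypothetical name and is not the Chromium implementation, which may treat non-ASCII input differently. The optional `assert` is an addition for illustration; the test shown earlier ignores the output entirely.

```cpp
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#include <string>

// Hypothetical, ASCII-only stand-in for extensions::CreateFnmatchQuery.
// It wraps the query in '*' and expands each letter into a two-case
// bracket expression, e.g. 'a' becomes "[aA]".
std::string SketchFnmatchQuery(const std::string& query) {
  std::string pattern = "*";
  for (unsigned char c : query) {
    if (isalpha(c)) {
      pattern += '[';
      pattern += static_cast<char>(tolower(c));
      pattern += static_cast<char>(toupper(c));
      pattern += ']';
    } else {
      // Non-letters are copied through unchanged in this sketch.
      pattern += static_cast<char>(c);
    }
  }
  pattern += '*';
  return pattern;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // 1. Convert the raw bytes into the input type of the function under test.
  const std::string query(reinterpret_cast<const char*>(data), size);
  // 2. Call the function under test.
  const std::string pattern = SketchFnmatchQuery(query);
  // 3. Optional sanity check: the result always carries both wildcards.
  assert(pattern.size() >= size + 2);
  // 4. Report a successful execution.
  return 0;
}
```

Because the stand-in has no Chromium dependencies, the function can also be called directly from an ordinary unit test to confirm the transformation described earlier.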
To run this test, we need to create a fuzzer_test target in the appropriate BUILD.gn file. For the create_fnmatch_query_fuzzer example, the target is defined as:

    fuzzer_test("create_fnmatch_query_fuzzer") {
      sources = [ "extensions/file_manager/create_fnmatch_query_fuzzer.cc" ]
      deps = [
        ":ash",
        "//base",
        "//chrome/browser",
        "//components/exo/wayland:ui_controls_protocol",
      ]
    }

The sources field typically specifies just the file that contains the test. The dependencies are specific to the tested function; we list them here for completeness. In your test, dependencies other than //base are unlikely to be required.
Having seen a concrete example, let us describe the generic flow of steps to create a new fuzzing test.
In the same directory as the code you are going to fuzz (or next to the tests for that code), create a new <my_fuzzer>.cc file.
Note: Do not put new targets in the testing/libfuzzer/fuzzers directory. This directory was used for initial sample fuzz targets but is no longer recommended for landing new targets.
In the new file, define an LLVMFuzzerTestOneInput function:

    #include <stddef.h>
    #include <stdint.h>

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
      // Put your fuzzing code here and use |data| and |size| as input.
      return 0;
    }

In the BUILD.gn file, define a fuzzer_test GN target:

    import("//testing/libfuzzer/fuzzer_test.gni")

    fuzzer_test("my_fuzzer") {
      sources = [ "my_fuzzer.cc" ]
      deps = [ ... ]
    }

Once you have created your first fuzz target, you must set up your build environment in order to run it. This is described next.
Generate build files by using the use_libfuzzer GN argument together with a sanitizer. Rather than generating a GN build configuration by hand, we recommend that you run the meta-builder tool with the GN config that corresponds to the operating system of the device under test (DUT) you're deploying to:

    # AddressSanitizer is the default config we recommend testing with.
    # Linux:
    tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Linux ASan' out/libfuzzer
    # Chrome OS:
    tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Chrome OS ASan' out/libfuzzer
    # Mac:
    tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Upload Mac ASan' out/libfuzzer
    # Windows:
    python tools\mb\mb.py gen -m chromium.fuzz -b "Libfuzzer Upload Windows ASan" out\libfuzzer

If you are testing locally, these are the recommended configurations:

    # AddressSanitizer is the default config we recommend testing with.
    # Linux:
    tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Local Linux ASan' out/libfuzzer
    # Chrome OS:
    tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Local Chrome OS ASan' out/libfuzzer
    # Mac:
    tools/mb/mb.py gen -m chromium.fuzz -b 'Libfuzzer Local Mac ASan' out/libfuzzer
    # Windows:
    python tools\mb\mb.py gen -m chromium.fuzz -b "Libfuzzer Local Windows ASan" out\libfuzzer

tools/mb/mb.py is “a wrapper script for GN that [..] generate[s] build files for sets of canned configurations.” The -m flag selects the builder group, while the -b flag selects a specific builder in that group. out/libfuzzer is the directory to which the GN configuration is written. If you wish, you can inspect the generated config by running gn args out/libfuzzer once the mb.py script is done.
You can also invoke AFL by using the use_afl
GN argument, but we recommend libFuzzer for local development. Running libFuzzer locally doesn't require any special configuration and gives quick, meaningful output for speed, coverage, and other parameters.
It’s possible to run fuzz targets without sanitizers, but not recommended, as sanitizers help to detect errors which may not result in a crash otherwise. use_libfuzzer
is supported in the following sanitizer configurations.
GN Argument | Description | Supported OS |
---|---|---|
is_asan=true | Enables AddressSanitizer to catch problems like buffer overruns. | Linux, Windows, Mac, Chrome OS |
is_msan=true | Enables MemorySanitizer to catch problems like uninitialized reads[*]. | Linux |
is_ubsan_security=true | Enables UndefinedBehaviorSanitizer to catch[*] undefined behavior like integer overflow. | Linux |
For more on builder and sanitizer configurations, see the Integration Reference page.
Note: The amount of debug symbols in the build can be adjusted by setting the symbol_level attribute.
After you create your fuzz target, build it with autoninja and run it locally. To make this example concrete, we are going to use the existing create_fnmatch_query_fuzzer target.

    # Build the fuzz target.
    autoninja -C out/libfuzzer chrome/browser/ash:create_fnmatch_query_fuzzer
    # Run the fuzz target.
    ./out/libfuzzer/create_fnmatch_query_fuzzer

Your fuzz target should produce output like this:

    INFO: Seed: 1511722356
    INFO: Loaded 2 modules (115485 guards): 22572 [0x7fe8acddf560, 0x7fe8acdf5610), 92913 [0xaa05d0, 0xafb194),
    INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
    INFO: A corpus is not provided, starting from an empty corpus
    #2 INITED cov: 961 ft: 48 corp: 1/1b exec/s: 0 rss: 48Mb
    #3 NEW cov: 986 ft: 70 corp: 2/104b exec/s: 0 rss: 48Mb L: 103/103 MS: 1 InsertRepeatedBytes-
    #4 NEW cov: 989 ft: 74 corp: 3/106b exec/s: 0 rss: 48Mb L: 2/103 MS: 1 InsertByte-
    #6 NEW cov: 991 ft: 76 corp: 4/184b exec/s: 0 rss: 48Mb L: 78/103 MS: 2 CopyPart-InsertRepeatedBytes-

A ... NEW ...
line appears when libFuzzer finds new and interesting inputs. If your fuzz target is efficient, it will find a lot of them quickly. A ... pulse ...
line appears periodically to show the current status.
For more information about the output, see libFuzzer's output documentation.
Note: If you see an odr-violation error in the log, try setting the environment variable ASAN_OPTIONS=detect_odr_violation=0 and running the fuzz target again.
If your fuzz target crashes when running locally and you see a non-symbolized stack trace, make sure you add the third_party/llvm-build/Release+Asserts/bin/ directory from Chromium’s Clang package to your $PATH. This directory contains the llvm-symbolizer binary.
Alternatively, you can set an external_symbolizer_path
via the ASAN_OPTIONS
environment variable:

    ASAN_OPTIONS=external_symbolizer_path=/my/local/llvm/build/llvm-symbolizer \
      ./fuzzer ./crash-input

The same approach works with other sanitizers via MSAN_OPTIONS, UBSAN_OPTIONS, etc.
ClusterFuzz and the build infrastructure automatically discover, build and execute all fuzzer_test
targets in the Chromium repository. Once you land your fuzz target, ClusterFuzz will run it at scale. Check the ClusterFuzz status page after a day or two.
If you want to better understand and optimize your fuzz target’s performance, see the Efficient Fuzzing Guide.
Your fuzz target may immediately discover interesting (i.e. crashing) inputs. You can make it more effective with several easy steps:
Create a seed corpus. You can guide the fuzzing engine to generate more relevant inputs by adding the seed_corpus = "src/fuzz-testcases/"
attribute to your fuzz target and adding example files to the appropriate directory. For more, see the Seed Corpus section of the Efficient Fuzzing Guide.
Create a mutation dictionary. You can make mutations more effective by providing the fuzzer with a dict = "protocol.dict" GN attribute and a dictionary file that contains interesting strings / byte sequences for the target API. For more, see the Fuzzer Dictionary section of the Efficient Fuzzing Guide.
Specify testcase length limits. Long inputs can be problematic because the fuzz target processes them more slowly and they increase the search space. If -max_len is not specified, libFuzzer uses -max_len=4096 or takes the longest testcase in the corpus.
ClusterFuzz uses different strategies for different fuzzing sessions, including different random values. ClusterFuzz also uses different fuzzing engines (e.g. AFL, which doesn't have a -max_len option). If your target has an input length limit that you would like to strictly enforce, add a sanity check to the beginning of your LLVMFuzzerTestOneInput function:
if (size < kMinInputLength || size > kMaxInputLength) return 0;
Generate a code coverage report. See which code the fuzzer covered in recent runs, so you can gauge whether it hits the important code parts or not.
Note: Since the code coverage of a fuzz target depends heavily on the corpus provided when running the target, we recommend running the fuzz target built with ASan locally for a little while (several minutes / hours) first. This will produce some corpus, which should be used for generating a code coverage report.
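The length check from the "Specify testcase length limits" step can be sketched as a complete target. The limits, the `ParseRecord` function, and the call counter are hypothetical and exist only to make the effect of the guard observable; choose limits that match the API you are fuzzing.

```cpp
#include <stddef.h>
#include <stdint.h>

#include <string>

namespace {

// Hypothetical limits for illustration only.
constexpr size_t kMinInputLength = 4;
constexpr size_t kMaxInputLength = 1024;

// Hypothetical function under test. The counter records how often the
// guard below lets an input through.
size_t g_parse_calls = 0;
void ParseRecord(const std::string& record) {
  (void)record;
  ++g_parse_calls;
}

}  // namespace

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Reject inputs outside the supported range before doing any work. This
  // holds for every fuzzing engine, including ones without -max_len.
  if (size < kMinInputLength || size > kMaxInputLength)
    return 0;

  ParseRecord(std::string(reinterpret_cast<const char*>(data), size));
  return 0;
}
```

Returning 0 for rejected inputs keeps the run going; the engine simply gains no new coverage from them.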
If the code you’re fuzzing generates a lot of error messages when encountering incorrect or invalid data, the fuzz target will be slow and inefficient.
If the target uses Chromium logging APIs, you can silence errors by overriding the environment used for logging in your fuzz target:
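
    #include <stddef.h>
    #include <stdint.h>

    #include "base/logging.h"

    // Constructed once, before the first input is processed.
    struct Environment {
      Environment() {
        // Suppress all log messages below FATAL.
        logging::SetMinLogLevel(logging::LOGGING_FATAL);
      }
    };

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
      static Environment env;
      // Put your fuzzing code here and use data+size as input.
      return 0;
    }
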
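The reason this works is that a function-local static is constructed only on the first call, so the setup cost is paid once rather than per input. The sketch below shows that behavior without any Chromium dependency; `g_setup_count` is a hypothetical counter standing in for the logging call.

```cpp
#include <stddef.h>
#include <stdint.h>

namespace {

// Hypothetical counter used only to observe how often setup runs.
int g_setup_count = 0;

struct Environment {
  Environment() {
    // In a Chromium target, one-time setup such as the logging call
    // would go here.
    ++g_setup_count;
  }
};

}  // namespace

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Constructed during the first call only; later calls reuse it.
  static Environment env;
  (void)env;
  (void)data;
  (void)size;
  return 0;
}
```

However many inputs the engine feeds to the target within one process, the constructor body runs a single time.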
By default, a fuzzing engine such as libFuzzer mutates a single input (uint8_t* data, size_t size
). However, APIs often accept multiple arguments of various types, rather than a single buffer. You can use three different methods to mutate multiple inputs at once.
If you need to mutate multiple inputs of various types and length, see Getting Started with libprotobuf-mutator in Chromium.
Note: This method requires you to write a .proto definition (unless you fuzz an existing protobuf) and C++ code to pass the proto message to the API you are fuzzing (you'll have a fuzzed protobuf message instead of a data, size buffer).
FuzzedDataProvider (FDP) is a class useful for splitting a fuzz input into multiple parts of various types.
To use FDP, add #include <fuzzer/FuzzedDataProvider.h> to your fuzz target source file.
To learn more about FuzzedDataProvider, check out the upstream documentation on it. It gives an overview of the available methods and links to a few example fuzz targets.
If your API accepts a buffer with data and some integer value (e.g., a bitwise combination of flags), you can calculate a hash value from (data, size) and use it to fuzz an additional integer argument. For example:
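
    #include <stddef.h>
    #include <stdint.h>

    #include <functional>
    #include <string>

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
      std::string str = std::string(reinterpret_cast<const char*>(data), size);
      // Derive a second, input-dependent value from the same bytes.
      std::size_t data_hash = std::hash<std::string>()(str);
      APIToBeFuzzed(data, size, data_hash);
      return 0;
    }
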
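When the integer argument is a set of flags, the hash can be masked down to the bits the API actually defines. The sketch below is self-contained; the flag constants and `DecodeWithFlags` are hypothetical placeholders for the real API.

```cpp
#include <stddef.h>
#include <stdint.h>

#include <functional>
#include <string>

namespace {

// Hypothetical flag bits accepted by the API under test.
constexpr uint32_t kFlagStrict = 1u << 0;
constexpr uint32_t kFlagVerbose = 1u << 1;
constexpr uint32_t kFlagLegacy = 1u << 2;
constexpr uint32_t kAllFlags = kFlagStrict | kFlagVerbose | kFlagLegacy;

// Hypothetical API under test. It records the flags it was given.
uint32_t g_last_flags = 0;
void DecodeWithFlags(const uint8_t* data, size_t size, uint32_t flags) {
  (void)data;
  (void)size;
  g_last_flags = flags;
}

}  // namespace

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  const std::string str(reinterpret_cast<const char*>(data), size);
  const size_t data_hash = std::hash<std::string>()(str);
  // Keep only the defined flag bits so every call uses a valid combination.
  const uint32_t flags = static_cast<uint32_t>(data_hash) & kAllFlags;
  DecodeWithFlags(data, size, flags);
  return 0;
}
```

Note that the C++ standard only guarantees that std::hash gives the same result for the same input within a single program execution, so a different hash may be preferable if flags must be stable across builds or standard libraries.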