Debugging flaky tests with Bazel

Suppose you have a test that is passing… most of the time. When you start debugging it, you might try running the test and, unhelpfully, it passes:

$ bazel test :flaker
INFO: Found 1 test target...
Target //:flaker up-to-date:
  bazel-bin/flaker
INFO: Elapsed time: 0.223s, Critical Path: 0.04s
//:flaker                                                                PASSED
 
Executed 1 out of 1 tests: 1 test passes.

At this point, if you simply run bazel test :flaker again, Bazel knows that no files that affect the test have changed, so it won’t bother re-running it:

$ bazel test :flaker
INFO: Found 1 test target...
Target //:flaker up-to-date:
  bazel-bin/flaker
INFO: Elapsed time: 0.207s, Critical Path: 0.00s
//:flaker                                                   (1/0 cached) PASSED 
Executed 0 out of 1 tests: 1 test passes.

Note the “cached” message: your test wasn’t even run! This is usually a good thing: you don’t want to waste processing power on retesting something that you already know passes. However, it isn’t very convenient if you know that your test is flaky. So what do you do now?

You can always run the test manually (./bazel-bin/flaker) over and over, or write a script that will run it, but neither is very convenient.

Enter Bazel’s --runs_per_test option.

Specifying this option runs your test multiple times, prints a summary of what happened, and (by default) only keeps the logs for the failing tests:

$ bazel test --runs_per_test=10 :flaker
INFO: Found 1 test target...
FAIL: //:flaker (run 10 of 10) (see /private/var/tmp/_bazel_kchodorow/16a1114002542b106523c47d490a1041/test/bazel-out/local_darwin-fastbuild/testlogs/flaker/test_run_10_of_10.log).
FAIL: //:flaker (run 4 of 10) (see /private/var/tmp/_bazel_kchodorow/16a1114002542b106523c47d490a1041/test/bazel-out/local_darwin-fastbuild/testlogs/flaker/test_run_4_of_10.log).
FAIL: //:flaker (run 5 of 10) (see /private/var/tmp/_bazel_kchodorow/16a1114002542b106523c47d490a1041/test/bazel-out/local_darwin-fastbuild/testlogs/flaker/test_run_5_of_10.log).
FAIL: //:flaker (run 9 of 10) (see /private/var/tmp/_bazel_kchodorow/16a1114002542b106523c47d490a1041/test/bazel-out/local_darwin-fastbuild/testlogs/flaker/test_run_9_of_10.log).
FAIL: //:flaker (run 3 of 10) (see /private/var/tmp/_bazel_kchodorow/16a1114002542b106523c47d490a1041/test/bazel-out/local_darwin-fastbuild/testlogs/flaker/test_run_3_of_10.log).
Target //:flaker up-to-date:
  bazel-bin/flaker
INFO: Elapsed time: 0.828s, Critical Path: 0.42s
//:flaker                                                                FAILED
 
Executed 1 out of 1 tests: 1 fails locally.

This ran the test 10 times, 5 runs of which failed. --runs_per_test lets you easily run a flaky test hundreds of times in a row (if necessary) to track down what’s going on. Bazel creates unique log files for each failing run automatically and discards the passing tests’ logs.

Getting more logs

By default, Bazel doesn’t keep logs around for passing tests (since you don’t usually care what happened when a test passed). However, during development, sometimes even passing logs can be handy. You can get logs for your passing tests using the --test_output=all flag:

$ bazel test --test_output=all :flaker
INFO: Found 1 test target...
INFO: From Testing //:flaker:
==================== Test output for //:flaker:
Should I pass or should I fail?
Thinking about it...



Okay, I'll pass.
================================================================================
Target //:flaker up-to-date:
  bazel-bin/flaker
INFO: Elapsed time: 0.873s, Critical Path: 0.71s
//:flaker                                                                PASSED

Executed 1 out of 1 tests: 1 test passes.

This way you can see stdout/stderr from tests that you’re still working on (or ones that are unexpectedly passing).

Turning off caching

Finally, as mentioned above, Bazel tries to cache test runs whenever possible. If you don’t want to use --runs_per_test but still want to rerun a test, you can specify --cache_test_results=no:

$ bazel test :flaker
INFO: Found 1 test target...
Target //:flaker up-to-date:
  bazel-bin/flaker
INFO: Elapsed time: 0.282s, Critical Path: 0.04s
//:flaker                                                                PASSED
 
Executed 1 out of 1 tests: 1 test passes.
$ bazel test --cache_test_results=no :flaker
INFO: Found 1 test target...
FAIL: //:flaker (see /private/var/tmp/_bazel_kchodorow/16a1114002542b106523c47d490a1041/test/bazel-out/local_darwin-fastbuild/testlogs/flaker/test.log).
Target //:flaker up-to-date:
  bazel-bin/flaker
INFO: Elapsed time: 0.178s, Critical Path: 0.03s
//:flaker                                                                FAILED
 
Executed 1 out of 1 tests: 1 fails locally.

It’s a bit redundant (you can get the same behavior from --runs_per_test), but sometimes I’m just in more of a --cache_test_results mood.

There are several other features Bazel has to simplify testing and debugging, but these are a few of the flags that I find most helpful.

Check out the Bazel user manual for more detailed documentation on:

Happy testing!

kristina chodorow's blog