Trimming the (build) tree with Bazel

Jonathan Lange wrote a great blog post about how Bazel caches tests. Basically: if you run a test, change your code, then run the test again, the test will only be rerun if you changed something that could actually change its outcome. Bazel takes this concept pretty far to minimize the work your build needs to do, in some ways that aren’t immediately obvious.

Let’s take an example. Say you’re using Bazel to “build” rigatoni arrabiata, which could be represented as having the following dependencies:

[Figure: dependency graph for the recipe]

Each food is a library which depends on the libraries below it. Suppose you change a dependency, like the garlic:

[Figure: the same graph with the garlic node changed]

Bazel will stat the files of the “garlic” library and notice this change, and then make a note that the things that depend on “garlic” may have also changed:

[Figure: the graph with everything that depends on garlic marked as possibly dirty]

The fancy term for this is “invalidating the upward transitive closure” of the build graph, aka “everything that depends on a thing might be dirty.” Note that Bazel already knows that this change doesn’t affect several of the libraries (rigatoni, tomato-puree, and red-pepper), so they definitely don’t have to be rebuilt.
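
To make the "upward transitive closure" idea concrete, here is a minimal Python sketch. It is a toy model, not Bazel's implementation: the edges in DEPS are my guess at the recipe graph (the figure isn't reproduced here), and reverse_edges and maybe_dirty are hypothetical helper names.

```python
from collections import deque

# Assumed edges: each target maps to the targets it depends on.
DEPS = {
    "rigatoni-arrabiata": ["rigatoni", "sauce"],
    "sauce": ["tomato-puree", "garlic", "red-pepper"],
    "rigatoni": [],
    "tomato-puree": [],
    "garlic": [],
    "red-pepper": [],
}


def reverse_edges(deps):
    """Flip the graph: map each target to the targets that depend on it."""
    rdeps = {name: set() for name in deps}
    for target, children in deps.items():
        for child in children:
            rdeps[child].add(target)
    return rdeps


def maybe_dirty(deps, changed):
    """Return every target that transitively depends on `changed`."""
    rdeps = reverse_edges(deps)
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for parent in rdeps[node]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(maybe_dirty(DEPS, "garlic")))
```

With the assumed edges this prints ['rigatoni-arrabiata', 'sauce']. Rigatoni, tomato-puree, and red-pepper never enter the set, which is why they can be skipped without even being looked at.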

Bazel will then evaluate the “sauce” node and figure out whether its output has changed. This is where the secret sauce (ha!) happens: if the output of the “sauce” node hasn’t changed, Bazel knows that it doesn’t have to recompile rigatoni-arrabiata (the top node), because none of its direct dependencies changed!

The sauce node is no longer “maybe dirty” and so its reverse dependencies (rigatoni-arrabiata) can also be marked as clean.

In general, of course, changing the code for a library will change its compiled form, so the “maybe dirty” node will end up being marked as “yes, dirty” and re-evaluated (and so on up the tree). However, Bazel’s build graph lets you compile the bare minimum for a well-structured library, and in some cases avoid compilations altogether.
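
The "stop climbing as soon as an output is unchanged" behavior can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how Bazel is actually written: the edges in DEPS are my guess at the recipe graph, compile_target is a pretend compiler that ignores comment lines, and the cache is a plain dict keyed by target name.

```python
import hashlib

# Assumed edges: each target maps to the targets it depends on.
DEPS = {
    "rigatoni-arrabiata": ["rigatoni", "sauce"],
    "sauce": ["tomato-puree", "garlic", "red-pepper"],
    "rigatoni": [],
    "tomato-puree": [],
    "garlic": [],
    "red-pepper": [],
}


def digest(*parts):
    """Hash a sequence of strings into one hex digest."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode())
        h.update(b"\0")
    return h.hexdigest()


def compile_target(source, dep_outputs):
    """Toy compiler: drops comment lines, so comment-only edits give the same output."""
    code = "\n".join(
        line for line in source.splitlines() if not line.lstrip().startswith("#")
    )
    return digest(code, *dep_outputs)


def build(deps, sources, cache):
    """Build everything, re-running only targets whose inputs changed.

    `cache` maps target -> (input digest, output digest).
    Returns the targets that were actually re-run, in order.
    """
    rebuilt = []
    outputs = {}

    def visit(target):
        if target not in outputs:
            dep_outputs = [visit(dep) for dep in deps[target]]
            inputs = digest(sources[target], *dep_outputs)
            cached = cache.get(target)
            if cached is None or cached[0] != inputs:
                out = compile_target(sources[target], dep_outputs)
                cache[target] = (inputs, out)
                rebuilt.append(target)
            outputs[target] = cache[target][1]
        return outputs[target]

    for target in deps:
        visit(target)
    return rebuilt


sources = {name: "make " + name for name in DEPS}
cache = {}
build(DEPS, sources, cache)  # first build: everything runs

sources["garlic"] += "\n# minced, not crushed"  # comment-only edit
print(build(DEPS, sources, cache))

sources["garlic"] += "\nroast"  # edit that changes the output
print(build(DEPS, sources, cache))
```

The comment-only edit re-runs just garlic: its output digest comes out the same, so sauce sees identical inputs and is served from the cache, and so is rigatoni-arrabiata. The second edit changes garlic's output, so the rebuild climbs through sauce to the top node.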

  • ittai zeidman

    Thanks!
    I did a bit of experimentation and saw that tests work a bit differently.
    In the above example tests of rigatoni-arrabiata would run, right?

  • kristina1

    If the test depended on rigatoni-arrabiata (rigatoni-arrabiata-test -> rigatoni-arrabiata -> sauce) then no, the test wouldn’t have to be re-run (similarly for sauce, if you had a test that depended on sauce). If the test depended on garlic, it _would_ be rerun. Is that not the behavior you’re seeing?

  • ittai zeidman

    It’s not the behavior I’m seeing.
    I’ve created a repo to share it (https://github.com/ittaiz/bazel-transitive-impact)
    In a nutshell:
    I have
    GreeterTest => greeter (depends BUILD, compile and runtime)
    greeter => before_greeter (depends only BUILD file wise, source code has no relation and that is to create a situation where change in before_greeter outputs the same greeter)
    before_greeter => before_before_greeter (depends only BUILD file wise, source code has no relation and that is to create a situation where change in before_before_greeter outputs the same before_greeter)

    When changing a single character in BeforeBeforeGreeting.java (which nothing depends on source wise) and running “bazel test GreeterTest --explain=foo.log” GreeterTest is run

    Contents of explain:
    Build options: --explain=foo.log
    Executing action 'BazelWorkspaceStatusAction stable-status.txt': unconditional execution is requested.
    Executing action 'Building libbefore_before_greeter.jar (1 source file)': One of the files has changed.
    Executing action 'Extracting interface //:before_before_greeter': One of the files has changed.
    Executing action 'Testing //:GreeterTest': One of the files has changed.

  • kristina1

    Ah, I see the confusion. The problem is that Bazel can’t know that BeforeBeforeGreeter.java doesn’t affect the outcome. libbefore_before_greeter.jar changes and it’s an input to the test, so the test has to be rerun. If you look at bazel-bin/GreeterTest (the test runner shell script), you can see:

    CLASSPATH="${RUNPATH}GreeterTest.jar:${RUNPATH}../bazel_tools/tools/jdk/TestRunner_deploy.jar:${RUNPATH}libgreeter.jar:${RUNPATH}libbefore_greeter.jar:${RUNPATH}libbefore_before_greeter.jar"

    If you make a change to before_before_greeter that doesn’t change the contents of the jar (e.g., add a comment), _then_ the test will be cached. Obviously, most of the time this isn’t super helpful (as I mentioned in the post), but it can help.

  • ittai zeidman

    Thanks! This behavior is actually what I was looking for because a library’s implementation might have changed and will fail the tests of the deployable while not changing the API at all.
    My example was a bit contrived for simplicity…

kristina chodorow's blog