Four alternative debugging techniques

I’ve recently been working on a side project that uses WebGL and a physics engine that was transpiled from C++ into JavaScript, so… printing variables to the console and using the debugger just weren’t cutting it. I started thinking about the other ways I debug things:

  1. Ship of Theseus debugging: the ship of Theseus is a thought experiment: if you have something and you gradually replace every part, at what point does it become a different thing? This is debugging by finding a working example, then gradually mutating it (without breaking it; source control is helpful here) into what you actually want. There are a couple of problems: this often leaves some cruft around from the original program, and I often never figure out why my version wasn’t working in the first place. Which brings us to…
  2. Homeland security debugging: if you see a suspicious variable name or method call, don’t keep it to yourself. Tell a coder or library maintainer. This is where you go back through all of the sketchy parts of your code and make sure they are actually doing what you expect. After working on a program for a while, I’ll usually end up with parts of the codebase that are a bit questionable. Why am I passing a literal “1” as an argument here? Why do I have two names for this variable? Basically, I go through my program, line by line, checking all of my vague suspicions.
  3. Thunderdome debugging: one coder and one bug enter, one coder leaves. This is kind of a meta-technique I use for weird, difficult-to-reproduce issues in integration tests where I have to dig through logs or do a 12-step process every time I need to test a change. Basically, I get a big cup of coffee, sit down at my machine, and try everything while mainlining caffeine. Afterwards, I generally couldn’t even tell you what the bug ended up being, but it is no more. This is the kind of debugging that generally does not happen unless I’m being paid for it.
  4. Wooden nickel debugging: try testing code against the most useless possible inputs. Sometimes I have a very complex chunk of code that doesn’t work. I don’t want to spend the time to get a meaningful test running against it, so I start writing unit tests, passing in the most trivial input possible: the empty string, 0, an empty array. And often, after an input or two, I’ve figured out what it’s doing wrong.
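
For example, a wooden nickel test file might look like this (a sketch in Python, with a hypothetical parse_records function under test):

import unittest

from mymodule import parse_records  # hypothetical function under test

class WoodenNickelTest(unittest.TestCase):
    # The most useless inputs possible: if any of these blow up,
    # the traceback usually points at the bug.
    def test_empty_string(self):
        parse_records("")

    def test_zero(self):
        parse_records(0)

    def test_empty_list(self):
        parse_records([])

if __name__ == "__main__":
    unittest.main()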

Anyone else have any non-traditional ways that they debug?

Compilation à la mode

Sundae from Black Tap, incidentally around the corner from the NYC Google office.

Bazel lets you set up various “modes” of compilation. Several are built in (fast, optimized, debug), and you can define your own. The built-in ones are:

  • Fast: build your program as quickly as possible. This is generally best for development (when you want a tight compile/edit loop) and is the default when you don’t specify anything. Your build’s output is generated in bazel-out/local-fastbuild.
  • Optimized: code is compiled to run fast, but may take longer to build. You can get this by running bazel build -c opt //your:target, and its output will be generated in bazel-out/local-opt. This mode is best for code that will be deployed.
  • Debug: this leaves in symbols and generally optimizes code for running through a debugger. You can get this by running bazel build -c dbg //your:target, and its output will be generated in bazel-out/local-dbg.

You don’t actually have to know where the build’s output is stored: Bazel updates the bazel-bin/bazel-genfiles symlinks for you automatically at the end of your build, so they’ll always point to the right bazel-out subdirectory.

Because each flavor’s output is stored in a different directory, each mode effectively has an entirely separate set of build artifacts, so you can get incremental builds when switching between modes. On the downside: when you build in a new mode for the first time, it’s essentially a clean build.
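
For example, with a hypothetical //my:target:

$ bazel build //my:target           # fastbuild: output under bazel-out/local-fastbuild
$ bazel build -c opt //my:target    # first opt build: essentially a clean build
$ bazel build //my:target           # fastbuild artifacts are still cached: incremental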

Note: -c is short for --compilation_mode, but everyone just says -c.

Defining a new mode

Okay, you can’t really define your own mode without tons of work, but you can create named sets of options (which is probably what you wanted anyway unless you’re writing your own toolchain).

For example, let’s say I generally have several flags I want to run with when I’m trying to debug a failing test. I can create a “gahhh” config in my ~/.bazelrc as follows:

test:gahhh --test_output=all
test:gahhh --nocache_test_results
test:gahhh --verbose_failures
test:gahhh -c dbg

The “test” part indicates which command this applies to (“build”, “query”, and “startup” are other useful ones). The “:gahhh” part names the config, and then I give the option. Now, when I get frustrated, I can run:

$ bazel test --config gahhh //my:target

and I don’t have to remember the four options that I want to use.
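
Configs work for build options, too. For example, here’s a hypothetical config (using the real -c and --copt flags) for an optimized build that keeps debug symbols:

build:optdbg -c opt
build:optdbg --copt=-g

$ bazel build --config optdbg //my:target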

The Mixed-Up Directories of Mrs. Bazel E. Frankweiler

Bazel has several directory trees that it uses during a build.


The most obvious directory is the source tree where your code lives and where you run your builds. This is, by default, what Bazel uses for source files.

However, you can combine several source trees by using the --package_path option. This basically overlays them, from Bazel’s point of view. For instance, if you had:

/home/user/gitroot/my-project/
  WORKSPACE
  BUILD
  foo/...

/usr/local/other-proj/
  BUILD
  bar/...

Then if you ran bazel build --package_path=/home/user/gitroot/my-project:/usr/local/other-proj //..., Bazel would “see” the directory tree:

WORKSPACE
BUILD
foo/...
bar/...

If a package is defined in multiple package paths, the first path “wins” (e.g., the top-level package containing foo/ above).

Finally, source code for external repositories is tucked away in Bazel’s output base. You can see it by running:

$ ls $(bazel info output_base)/external

Execution root

Once Bazel figures out what packages a build is going to use, it creates a symlink tree called the execution root, where the build actually happens.* It basically traverses all of the packages it found and comes up with the most efficient way it can to symlink them together. For example, in the directory tree above, you’d end up with:

    WORKSPACE -> /home/user/gitroot/my-project/WORKSPACE
    BUILD -> /home/user/gitroot/my-project/BUILD
    foo/ -> /home/user/gitroot/my-project/foo
    bar/ -> /usr/local/other-proj/bar
    ... # Tools built into the Bazel binary
    ... # C++ compiler tools

You can check out the execution root by running:

$ ls $(bazel info execution_root)

Sandboxing

Here’s the * from the execution root section above! If you’re on Linux (and, hopefully, OS X soon), your build actually takes place in a sandbox based on the execution root. All of the files your build needs (and hopefully none it doesn’t) are mounted into their own namespace, and the build runs in a hermetically sealed environment: no network or filesystem access beyond what you specified.

You can see what’s being mounted and where by running Bazel with a couple extra flags:

$ bazel build --sandbox_debug --verbose_failures //...

Derived roots

We’re working on improving how configurable Bazel is, but for years the main configuration options have been the platform you’re building for and your compiler options. If you run an optimized build, then a debug build, and then an optimized build again, you’d like your results to be cached from the first run, not overwritten each time. In a somewhat questionable design move, Bazel uses a special set of output directories that are named based on the configuration. So if you build an optimized binary, it’ll create it under execroot/my-project/bazel-out/local-opt/bin/my-binary. If you then build it as a debug binary, it’ll put it under execroot/my-project/bazel-out/local-dbg/bin/my-binary. Then, if you build it optimized again, it’ll be able to switch back to using the local-opt directory. (However, Bazel uses symlinks out the wazoo, so I don’t know why it doesn’t use symlinks to track which configuration is being used. It seems like it’d be a lot easier to have the outputs directly under execroot/my-project.)

Also, Bazel distinguishes between files created by genrules and… everything else. Almost all output ends up under bazel-out/config/bin, but genrule output goes under bazel-out/config/genfiles.

(I’m kind of bitter about these files; I’ve been working on a change for what feels like months because of this stupid directory structure.)

Note that Bazel symlinks execroot/ws/bazel-out/config/bin to bazel-bin and execroot/ws/bazel-out/config/genfiles to bazel-genfiles. These “convenience symlinks” are the outputs shown at the end of your build.
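
You can check where a convenience symlink currently points with readlink; for a fastbuild of the hypothetical project above, you’d see something like this (output-base path abbreviated, and it varies by machine):

$ readlink bazel-bin
/home/user/.cache/bazel/…/execroot/my-project/bazel-out/local-fastbuild/bin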

Runfiles

Suppose you build a binary in your favorite language. When you run that binary, it tries to load a file during runtime. Bazel encourages you to declare these files as runfiles, runtime dependencies of your build. If they change, your binary won’t be recompiled (because they’re runtime, not compile-time dependencies) but they will cause tests to be re-run.

Bazel creates a directory for these files as a sibling to your binary: if you build //foo:my-binary, the runfiles will be under bazel-bin/foo/my-binary.runfiles. You can explore the directory, or see a list of them all in the runfiles manifest, also a sibling of the binary:

$ cat bazel-bin/foo/my-binary.runfiles_manifest

Note that a binary can read files from anywhere on the filesystem (they’re binaries, after all). We just recommend using runfiles so that you can keep them together and express them as build dependencies.
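
For example, a cc_binary with a config file it reads at startup might look like this (hypothetical names):

cc_binary(
    name = "my-binary",
    srcs = [""],
    # config.json is a runfile: the binary reads it at runtime, and
    # editing it won't trigger a recompile.
    data = ["config.json"],
)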

Custom, locally-sourced output filenames

Skylark lets you use templates in your output file name, e.g., this would create a file called target.timestamp:

touch = rule(
    outputs = {"date_and_time": "%{name}.timestamp"},
    implementation = _impl,
)

So if you had touch(name = "foo") in a BUILD file and built :foo, you’d get foo.timestamp.

I’d always used %{name}, but I found out the other day that you can actually use other attributes, too. For example, you could have:

greet = rule(
    attrs = {"my_name": attr.string()},
    outputs = {"greeting": "hi-there-%{my_name}"},
    implementation = _impl,
)

Then if you have greet(name = "a-greeting", my_name = "kristina") and build :a-greeting, you’ll get “hi-there-kristina” as an output file.

The entire source for this example is available as a GitHub gist (all four lines of implementation function not shown above).
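
For reference, here’s a minimal sketch of what the implementation function for the touch rule could look like (my guess, not necessarily the gist’s exact code), using ctx.file_action to create the declared output:

def _impl(ctx):
  # Create the (empty) file declared in the rule's outputs.
  ctx.file_action(
      output = ctx.outputs.date_and_time,
      content = "",
  )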

Using environment variables in Skylark repository rules

If you’ve ever used the AppEngine rules, you know the pain of waiting for all 200 stupid megabytes of API to be downloaded. The pain is doubled because I already have a copy of the SDK on my workstation.

To use the local copy, all I have to do is override the @com_google_appengine_java repository in my WORKSPACE file, like so:

load("//appengine:appengine.bzl", "APPENGINE_BUILD_FILE")

new_local_repository(
    name = "com_google_appengine_java",
    path = "/Users/kchodorow/Downloads",
    build_file_content = APPENGINE_BUILD_FILE,
)

However, this is still imperfect: I don’t really want to maintain changes that basically amount to a performance optimization in my local client.

By using environment variables in the appengine_repository rule, we can do even better. I’m going to create a new rule that checks if the APPENGINE_SDK_PATH environment variable is set. If it is, it will use a local_repository to pull in AppEngine, otherwise it will fall back on downloading the .zip.

So, to start, let’s take a look at the existing rule that pulls in the AppEngine SDK. As of this writing, it looks like this:

      name = "com_google_appengine_java",
      sha256 = "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4",
      build_file_content = APPENGINE_BUILD_FILE,

First, let’s modify this to use a custom repository rule instead of native.new_http_archive:

def _find_locally_or_download_impl(repository_ctx):
  repository_ctx.download_and_extract(
      APPENGINE_SDK_URL,
      ".", "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4", "", "")
  repository_ctx.file("BUILD", APPENGINE_BUILD_FILE)

_find_locally_or_download = repository_rule(
  implementation = _find_locally_or_download_impl,
  local = False,
)

def appengine_repositories():
  _find_locally_or_download(name = "com_google_appengine_java")

This code functions (basically) identically to the original, so now let’s add an option for using a local path. Modify the implementation function to check the environment:

def _find_locally_or_download_impl(repository_ctx):
  if 'APPENGINE_SDK_PATH' in repository_ctx.os.environ:
    path = repository_ctx.os.environ['APPENGINE_SDK_PATH']
    if path == "":
      fail("APPENGINE_SDK_PATH set, but empty")
    repository_ctx.symlink(path, APPENGINE_DIR)
  else:
    repository_ctx.download_and_extract(
        APPENGINE_SDK_URL,
        ".", "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4", "", "")
  repository_ctx.file("BUILD", APPENGINE_BUILD_FILE)

Now we can download a copy of the SDK and try our rule (feel free to use an existing copy, if you have one on your system).

$ APPENGINE_SDK_PATH=/path/to/your/sdk/download bazel build //your/appengine/app

Problems with this:

  • You can’t actually set APPENGINE_SDK_PATH to where Bazel downloaded the SDK the first time around ($(bazel info output_base)/external/com_google_appengine_java), which is suuuuper tempting to do. If you do, Bazel will delete the downloaded copy (because you changed the repository def) and then symlink the empty directory to itself. Never what you want.
  • It caches the environment variable, so if you change your mind you have to run bazel clean to use a different APPENGINE_SDK_PATH. I think this is a bug, although there’s some debate about that.

Resting BUILD face

I am super excited that pmbethe09 and laurentlb just put in a bunch of extra work to open-source Buildifier. Buildifier is a great tool we use at Google to format BUILD files. It automatically organizes attributes, corrects indentation, and generally makes BUILD files more readable and excellent.

To try it out, clone the repo and build it with Bazel:

$ git clone
$ cd buildifier
$ bazel build //buildifier
Extracting Bazel installation...
INFO: Found 1 target...
Target //buildifier:buildifier up-to-date:
  bazel-bin/buildifier/buildifier
INFO: Elapsed time: 203.309s, Critical Path: 7.54s
INFO: Build completed successfully, 8 total actions

Now try it out on an ugly BUILD file:

$ echo 'cc_library(srcs = ["", ""], name = "foo")' > BUILD
$ ~/gitroot/buildifier/bazel-bin/buildifier/buildifier BUILD
$ cat BUILD
    name = "foo",
    srcs = [

Finally, why run commands manually when you can have your editor do it for you? I use emacs, so I can set up a hook like this:

(add-hook 'after-save-hook
          (lambda ()
            (if (string-match "BUILD" (file-name-base (buffer-file-name)))
                (progn
                  (shell-command (concat "/path/to/buildifier/bazel-bin/buildifier/buildifier "
                                         (buffer-file-name)))
                  (find-alternate-file (buffer-file-name))))))

You could also set up a git hook to run this before committing, if that’s more your style. Regardless, give it a try! It’s a quick, easy way to make your BUILD files more readable.
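
For example, a minimal pre-commit hook could look like this (a sketch; adjust the buildifier path, save it as .git/hooks/pre-commit, and set the executable bit):

#!/bin/bash
# Reformat any staged BUILD files, then re-stage them.
for f in $(git diff --cached --name-only | grep -E '(^|/)BUILD$'); do
  /path/to/buildifier/bazel-bin/buildifier/buildifier "$f"
  git add "$f"
done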

Communicating between Bazel rules: how to use Skylark providers

Rules in Bazel often need information from their dependencies. My previous post touched on a special case of this: figuring out what a dependency’s runfiles are. However, Skylark is actually capable of passing arbitrary information between rules using a system known as providers.

Suppose we have a rule, analyze_flavors, that figures out what all of the flavors are in a dish. Our build file looks like:

load(":food.bzl", "analyze_flavors")
    name = "burger",
    ingredients = [
    name = "beef",
    tastes_like = "umame",
    name = "ketchup",
    tastes_like = "sweet",

We want to build up a flavor profile for :burger, based on its ingredients.

To do this, food.bzl looks like:

def _flavor_impl(ctx):
  # Build up a flavor profile from this rule & its ingredients.
  flavor_profile = []
  for ingredient in ctx.attr.ingredients:
    if ingredient.flavor != None:
      flavor_profile += ingredient.flavor
  if ctx.attr.tastes_like != "":
    flavor_profile += [ctx.attr.tastes_like]

  # Write the list of flavors to a file.
  ctx.file_action(
      output = ctx.outputs.out,
      content = "%s tastes like %s\n" % (
, " and ".join(flavor_profile)))

  # Return the list of flavors so it can be used by rules that depend on this.
  return struct(flavor = flavor_profile)

analyze_flavors = rule(
    attrs = {
        "ingredients": attr.label_list(),
        "tastes_like": attr.string(),
    },
    outputs = {"out": "flavors-of-%{name}"},
    implementation = _flavor_impl,
)

The return struct(flavor = flavor_profile) line is where the rule returns a provider, flavor, to be consumed by its reverse dependencies (the targets that depend on it).

Our BUILD file gives us the following build graph:

:burger
  ├── :beef
  └── :ketchup

:burger depends on :beef and :ketchup. :beef and :ketchup each provide :burger with a flavor. Thus, if we build :burger and check its output file, we get:

$ bazel build :burger
INFO: Found 1 target...
Target //:burger up-to-date:
  bazel-bin/flavors-of-burger
INFO: Elapsed time: 0.270s, Critical Path: 0.00s
INFO: Build completed successfully, 2 total actions
$ cat bazel-bin/flavors-of-burger
burger tastes like umame and sweet

This can be used to communicate rich information from rule-to-rule in Skylark. See the Skylark cookbook for another example of providers.

Collecting transitive runfiles with skylark

Bazel has a concept it calls runfiles for files that a binary uses during execution. For example, a binary might need to read in a CSV, an ssh key, or a .json file. These files are generally specified separately from your sources for a couple of reasons:

  • Bazel can understand that it is a runtime, not compile-time, dependency (so if the runfile changes, the binary does not need to be rebuilt).
  • The type is less restrictive: most rules have restrictions on what their sources can “look like” (e.g., Java sources end in .java or .jar, Go sources end in .go, Python sources end in .py or .pyc, etc.).

Thus, these runfiles are often specified in a separate data attribute.

If you’re writing a skylark rule that combines several executables, you will probably want the skylark rule to also combine the runfiles for all of them. Let’s create a rule that can combine several executables and include all of their runfiles. As a toy example, I created a rule below that creates an executable. The rule has one attribute, data, that can be other files or rules. The executable, when run, will just list all of the runfiles it has available.

For example, suppose you had the following BUILD file:

list_runfiles(
    name = 'main-course',
    data = ['lasagna.txt'],
)

If we ran bazel run :main-course, it would print:


However, we can also provide list_runfiles targets as data. For example, our BUILD file could say:

list_runfiles(
    name = 'main-course',
    data = [
        'lasagna.txt',
        ':side-dishes',
    ],
)

list_runfiles(
    name = 'side-dishes',
    data = [
        ':soup',
        ':salad',
        ':drink',
    ],
)

list_runfiles(
    name = 'soup',
    data = ['gazpacho.txt'],
)

list_runfiles(
    name = 'salad',
    data = ['waldorf.txt'],
)

list_runfiles(
    name = 'drink',
    data = ['milk.txt'],
)

Then running bazel run :main-course will print:


:main-course has collected all of its transitive runfiles in its runfiles tree.

Here’s the list_runfiles rule definition:

def _list_runfiles_impl(ctx):
  # Generate a script that lists everything in the binary's runfiles tree.
  ctx.file_action(
      output = ctx.outputs.executable,
      content = '\n'.join([
          "#!/bin/bash",
          "cd $0.runfiles",
          "find .",
      ]),
      executable = True)
  return struct(runfiles = ctx.runfiles(collect_data = True))

list_runfiles = rule(
    attrs = {
        "data": attr.label_list(
            allow_files = True,
            cfg = DATA_CFG,
        ),
    },
    executable = True,
    implementation = _list_runfiles_impl,
)

The key line is runfiles = ctx.runfiles(collect_data = True). collect_data automatically “harvests” the runfiles from data, srcs, and deps attributes. We’ve only defined data for this rule, so that’s what it will use.

Note that this won’t work if you change "data": attr.label_list( to something not covered by collect_data, e.g., "stuff": attr.label_list(. (Theoretically this should be covered by transitive_files, but I couldn’t actually get that working.)


Startup idea #6ec4e42a-28cc-4425-9ebc-61ac8e224580: Adventurer’s gear for geeky hikers

I’m going to start “calling” my startup ideas in the same way Andy Dwyer calls band names.


So, first up: it’s like REI for D&D players.

We’d sell a “basic adventurer’s kit” that came with iron rations, wineskin, torches, 50 feet of rope, etc.

Then you could get “class specialization” kits, for example:

  • Rogue: contains lockpicks, a pack of cards, and invisible ink.
  • Wizard: parchment, ink, a dozen small vials of reagents, and an orb.
  • Cleric: bandages, salves, holy symbol.

We could also offer Tolkien-esque maps of hiking areas and fancy medieval-looking bags/knives/hiking boots. See what carrying 40lbs of gear into the woods actually feels like! Then get it as a gift for a friend.

Gotta get the gear.

Using a generated header file as a dependency

Someone asked me today about how to use a generated header as a C++ dependency in Bazel, so I figured I’d write up a quick example.

Create a BUILD file with a genrule that generates the header and a cc_library that wraps it, say, foo/BUILD:

    name = "header-gen",
    outs = ["my-header.h"],
    # This command would probably actually call whatever tool was generat
    cmd = "echo 'int x();' > $@",
    name = "lib",
    hdrs = ["my-header.h"],
    srcs = [""],
    visibility = ["//visibility:public"]

Now you can depend on //foo:lib as you would a “normal” cc_library:

    name = "bin",
    srcs = [""],
    deps = ["//foo:lib"],

And would look like:

#include "foo/my-header.h"
// ...
int main() {
   x();  // Uses x defined in my-header.h.
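
For completeness, foo/ (not shown above, so this is just a plausible version) only needs to define the function declared in the generated header:

#include "foo/my-header.h"

// Definition of the x() declared in the generated my-header.h.
int x() { return 42; }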