9 years of blogging have totally been worth it

Worth of Web is kind of a neat site:

Oh well. It’s been worth it to me.

Aspects: the fan-fic of build rules

Aspects are a feature of Bazel that are basically like fan-fic, if build rules were stories: aspects let you add features that require intimate knowledge of the build graph, but that the rule maintainer would never want to add.

For example, let’s say we want to be able to generate Makefiles from a Bazel project’s C++ targets. Bazel isn’t going to add support for this to the built-in C++ rules. However, lots of projects might want to support a couple of build systems, so it would be nice to be able to automatically generate build files for Make. So let’s say we have a simple Bazel C++ project with a couple of rules in the BUILD file:

cc_library(
    name = "lib",
    srcs = ["lib.cc"],
    hdrs = ["lib.h"],
)
 
cc_binary(
    name = "bin",
    srcs = ["bin.cc"],
    deps = [":lib"],
)

We can use aspects to piggyback on Bazel’s C++ rules and generate new outputs (Makefiles) from them. It’ll take each Bazel C++ rule and generate a .o-file make target for it. For the cc_binary, it’ll link all of the .o files together. Basically, we’ll end up with a Makefile containing:

bin : bin.o lib.o
	g++ -o bin bin.o lib.o
 
bin.o : bin.cc
	g++ -c bin.cc
 
lib.o : lib.cc
	g++ -c lib.cc

(If you have any suggestions about how to make this better, please let me know in the comments; I’m definitely not an expert on Makefiles and just wanted something super-simple.) I’m assuming a basic knowledge of Bazel and Skylark (e.g., you’ve written a Skylark macro before).

Create a .bzl file to hold your aspect. I’ll call mine make.bzl. Add the aspect definition:

makefile = aspect(
    implementation = _impl,
    attr_aspects = ["deps"],
)

This means that the aspect will follow the “deps” attribute to traverse the build graph. We’ll invoke it on //:bin, and it’ll follow //:bin’s dep to //:lib. The aspect’s implementation will be run on both of these targets.

Add the _impl function. We’ll start by just generating a hard-coded Makefile:

def _impl(target, ctx):
  # If this is a cc_binary, generate the actual Makefile.
  outputs = []
  if ctx.rule.kind == "cc_binary":
    output = ctx.new_file("Makefile")
    content = "bin : bin.cc lib.cc lib.h\n\tg++ -o bin bin.cc lib.cc\n"
    ctx.file_action(content = content, output = output)
    outputs = [output]
 
  return struct(output_groups = {"makefiles" : set(outputs)})

Now we can run this:

$ bazel build //:bin --aspects make.bzl%makefile --output_groups=makefiles
INFO: Found 1 target...
INFO: Elapsed time: 0.901s, Critical Path: 0.00s
$

Bazel doesn’t print anything, but it has generated bazel-bin/Makefile. Let’s create a symlink to it in our main directory, since we’ll keep regenerating it and trying it out:

$ ln -s bazel-bin/Makefile Makefile 
$ make
g++ -o bin bin.cc lib.cc
$

The Makefile works, but is totally hard-coded. To make it more dynamic, first we’ll make the aspect generate a .o target for each Bazel rule. For this, we need to look at the sources and propagate that info up.

The base case is:

  source_list = [f.path for src in ctx.rule.attr.srcs for f in src.files]
  cmd = target.label.name + ".o : {sources}\n\tg++ -c {sources}".format(
      sources = " ".join(source_list)
  )

Basically: run g++ on all of the srcs for a target. You can add a print(cmd) to see what cmd ends up looking like. (Note: We should probably do something with headers and include paths here, too, but I’m trying to keep things simple and it isn’t necessary for this example.)
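If you did want to account for headers, a minimal sketch might look something like the following (this is just an aside under the assumption that listing headers as extra prerequisites is enough; the rest of the post keeps using the simpler cmd above):

  # Sketch: also list header files as prerequisites of the .o target.
  # cc_binary has no hdrs attribute, so guard the access with hasattr().
  header_list = []
  if hasattr(ctx.rule.attr, "hdrs"):
    header_list = [f.path for hdr in ctx.rule.attr.hdrs for f in hdr.files]
  cmd = target.label.name + ".o : {inputs}\n\tg++ -c {sources}".format(
      inputs = " ".join(source_list + header_list),
      sources = " ".join(source_list)
  )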

Now we want to collect this command, plus all of the commands we’ve gotten from any dependencies (since this aspect will have already run on them):

  transitive_cmds = [cmd]
  for dep in ctx.rule.attr.deps:
    transitive_cmds += dep.cmds

Finally, at the end of the function, we’ll return this whole list of commands, so that rules “higher up” in the tree have deps with a “cmds” attribute:

  return struct(
      output_groups = {"makefiles" : set(outputs)},
      cmds = transitive_cmds,
  )

Now we can change our output file to use this list:

    ctx.file_action(
        content = "\n\n".join(transitive_cmds) + "\n",
        output = output
    )

Altogether, our aspect implementation now looks like:

def _impl(target, ctx):
  source_list = [f.path for src in ctx.rule.attr.srcs for f in src.files]
  cmd = target.label.name + ".o : {sources}\n\tg++ -c {sources}".format(
      sources = " ".join(source_list)
  )
 
  # Collect all of the previously generated Makefile targets.
  transitive_cmds = [cmd]
  for dep in ctx.rule.attr.deps:
    transitive_cmds += dep.cmds
 
  # If this is a cc_binary, generate the actual Makefile.
  outputs = []
  if ctx.rule.kind == "cc_binary":
    output = ctx.new_file("Makefile")
    ctx.file_action(
        content = "\n\n".join(transitive_cmds) + "\n",
        output = output
    )
    outputs = [output]
 
  return struct(
      output_groups = {"makefiles" : set(outputs)},
      cmds = transitive_cmds,
  )

If we run this, we get the following Makefile:

bin.o : bin.cc
	g++ -c bin.cc
 
lib.o : lib.cc
	g++ -c lib.cc

Getting closer!

Now we need the last “bin” target to be automatically generated, so we need to keep track of all the intermediate .o files we’re going to link together. To do this, we’ll add a “dotos” list that this aspect propagates up the deps.

This is similar to the transitive_cmds list, so add a couple lines to our deps traversal function:

  # Collect the .o files and Makefile targets generated so far.
  dotos = [ctx.label.name + ".o"]
  transitive_cmds = [cmd]
  for dep in ctx.rule.attr.deps:
    dotos += dep.dotos
    transitive_cmds += dep.cmds

Now propagate them up the tree:

  return struct(
      output_groups = {"makefiles" : set(outputs)},
      cmds = transitive_cmds,
      dotos = dotos,
  )

And finally, add the binary target to the Makefile:

  # If this is a cc_binary, generate the actual Makefile.
  outputs = []
  if ctx.rule.kind == "cc_binary":
    output = ctx.new_file("Makefile")
    content = "{binary} : {dotos}\n\tg++ -o {binary} {dotos}\n\n{deps}\n".format(
        binary = target.label.name,
        dotos = " ".join(dotos),
        deps = "\n\n".join(transitive_cmds)
    )
    ctx.file_action(content = content, output = output)
    outputs = [output]

If we run this, we get:

bin : bin.o lib.o
	g++ -o bin bin.o lib.o
 
bin.o : bin.cc
	g++ -c bin.cc
 
lib.o : lib.cc
	g++ -c lib.cc

Documentation about aspects can be found on bazel.io. Like Skylark rules, aspects can be a little difficult to read because they are inherently recursive, but it helps to break them down (and use lots of prints).

That’s senior programmer to you, buddy

After about a decade of professional programming, I have finally gotten promoted. For the first time. This is a weird industry.

Regardless, I am now a “Senior Software Engineer.” Woo!

Thinking about it, this has been a goal of mine for a long time. Now that I’ve achieved it, I’m not sure what’s next.

“…and Alexander wept, for there were no more worlds to conquer.”

The Haunted Homesteader

Andrew and I have always loved old places. Optimally, we’d like to live in a wizard’s tower on top of a mountain. However, we’d be willing to settle for a castle with a thousand acres of land. More realistically, we’d like an old place with a couple of acres. That you can reach without a car from NYC (I said more realistically, not actually realistically).

So, sometimes I browse the real estate listings, especially near the train stations along Metro North, and one day I noticed something unusual. There was a place being sold right next to the train station. It was 10 acres of property. They were asking less than $1 million.

Okay, those were the good things. There were… a couple of downsides. It was old (good) and had obviously been abandoned for years (not so good). It was missing certain crucial elements like windows. And a roof. It needed a completely new septic system and well replacement (well water in combination with septic tank problems: eww), needed new wiring, and god knows what else. It was a “historical property,” so any repairs we made had to be okay-ed by a historical accuracy board and materials would probably be exorbitantly expensive: no off-the-shelf windows from Home Depot. On top of all that, the walls were literally made of asbestos, so either we’d pay an exorbitant amount to have all of the asbestos removed, or we’d pay an exorbitant amount for each repair because everyone would have to wear spacesuits and take crazy precautions.

So, I said, “Let’s just go up on the train and take a look. So I can get it out of my system.”

Andrew pretended to believe me and off we went. We got off the train and… the property was right there. Like, a 5-minute stroll up the hill from where we got off the train. Note the “up-the-hill” part: this thing was basically invisible from every angle. We’d gone hiking from this train stop a hundred times and never seen it before. It was on a bluff overlooking the Hudson. We carefully circled the property, the ridiculously large property, trying to get a clear view of the place. There was no actual road leading to it; it was just… abandoned by time, on a bluff overlooking the Hudson. I was done for.

We continued to circle it, and the “backyard” (the part not overlooking the Hudson, did I mention it fucking OVERLOOKS THE HUDSON?!) melts into reserved state land, so it’ll never be developed. We’d have a forest in our backyard for perpetuity. We walked down one of the trails through the woods in the “backyard” and sat down on a big rock overlooking a waterfall. We discussed, and negotiated, and fantasized. We started off agreeing that we’d both be comfy offering 1/5 the asking price. A few hours later, our butts were freezing, Domino had crawled into Andrew’s lap, and we had negotiated ourselves up to, “well, the asking price is sort of reasonable…” We’re terrible negotiators.

So, we contacted the real-estate agent so we could see the inside of the house. She took us on a tour and, in some ways, the place was amazing: floor-to-ceiling bay windows with views of the river, fireplaces in every room, and more rooms than we knew what to do with. In other ways, it was… not so amazing. For example, it was built before indoor plumbing, so the original floor plan had not accounted for things like bathrooms. Nor closets, apparently that wasn’t a thing. It was built for rich people in the 1800s, so the kitchen was in the basement (keep the help out of the way). There was obviously no electricity, so none of the ceilings have wiring for light fixtures. The owners built an addition with plumbing & electric in the 20s, but given its size and shape, it’s a bit quirky. For instance, they added an awesome “secret door” bookcase that swings open to reveal… a bathtub in a closet.

After seeing the inside, we were in love, but the love was tempered by the desire to not have to work until we died restoring the place. We talked to some contractors. We talked to our friends. We talked to our parents.

And… we decided against it. I’m a bit heartsick over it, but it’s just 10 years too early for us to be able to dedicate that kind of time to fixing up a place. Doing the mature, responsible thing sucks.

Now we’re mainlining Ask This Old House. When the time comes, we’ll be so ready.

The living room.

Recruiting review

Just got this message from a recruiter:

I know I recently reached out to you, but seriously, you’re the cat’s meow and I can see you being the perfect fit for [company]. Your current experience at Google is spot on with what our Talent Team is looking for.

[Description of company]

If I am way off base in my analysis of your profile, please let me know. However, if there is a small chance that I may have hit the nail on the head – I’d love to discuss the opportunity to join their team.

[Sign off]

deTECHtive | Talent Acquisition Manager

Points for actually describing what the company does. Points off for:

  • Using more analogies than I could swing a dead cat at.
  • deTECHtive
  • Being a Googler == what they’re looking for.

All in all, I rate it two resumes out of five: nothing egregious, but nothing appealing, either.

Using AutoValue with Bazel

AutoValue is a really handy library to eliminate boilerplate in your Java code. Basically, if you have a “plain old Java object” with some fields, there are all sorts of things you need to do to make it work “good,” e.g., implement equals and hashCode to use it in collections, make all of its fields final (and optimally immutable), make the fields private and accessed through getters, etc. AutoValue generates all of that for you.
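To make that concrete, here’s roughly what an AutoValue class looks like (Project is just an illustrative name, matching the BUILD example further down):

import com.google.auto.value.AutoValue;

@AutoValue
public abstract class Project {
  // AutoValue generates the fields, constructor, equals, hashCode, and
  // toString based on these abstract accessors.
  public abstract String name();
  public abstract int id();

  public static Project create(String name, int id) {
    // AutoValue_Project is the implementation that the processor generates.
    return new AutoValue_Project(name, id);
  }
}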

To get AutoValue to work with Bazel, I ended up modifying cushon’s example. There were a couple of things I didn’t like about it, mainly that I didn’t want the AutoValue plugin setup to live in my project’s BUILD files. I set it up so it’s defined in the AutoValue repository instead, so I figured I’d share what I came up with.

In your WORKSPACE file, add a new_http_archive for the AutoValue jar. I’m using the one in Maven, but not using maven_jar because I want to override the BUILD file to provide AutoValue as both a Java library and a Java plugin:

new_http_archive(
    name = "auto_value",
    url = "http://repo1.maven.org/maven2/com/google/auto/value/auto-value/1.3/auto-value-1.3.jar",
    build_file_content = """
java_import(
    name = "jar",
    jars = ["auto-value-1.3.jar"],
)
 
java_plugin(
    name = "autovalue-plugin",
    generates_api = 1,
    processor_class = "com.google.auto.value.processor.AutoValueProcessor",
    deps = [":jar"],
)
 
java_library(
    name = "processor",
    exported_plugins = [":autovalue-plugin"],
    exports = [":jar"],
    visibility = ["//visibility:public"],
)
""",
)

Then you can depend on @auto_value//:processor in any java_library target:

java_library(
    name = "project",
    srcs = ["Project.java"],
    deps = ["@auto_value//:processor"],
)

…and Bob’s your uncle.

You do you

I’m a little tired and depressed this week. However, this was a very inspiring speech Neil Gaiman gave to new grads of an art school:

Neil Gaiman Addresses the University of the Arts Class of 2012 from The University of the Arts (Phl) on Vimeo.

I think that his point about doing what you love, regardless of the money, holds doubly true for programmers. We are extremely lucky in that, unlike artists, we can make an okay salary nearly anywhere. We might as well work on things that make us happy.

Snail Spam

When I started blogging, I called my blog “Snail in a Turtleneck,” a cute image that Andrew & I came up with. I drew up my mascot:

A bemused snail, wearing a turtleneck.

and I began posting cartoons I had drawn. I quickly became bored of doing cartoons, and found I was more motivated to put up technical blog posts. Most of my initial readers were coworkers and MongoDB users. When Andrew and I got married, I told my teammates the day before that I’d be out the next day, as I was getting married (we got married at the city clerk’s, so it wasn’t a big production). When I got back to work, I found this at my desk:

A stuffed snail, a snail tape dispenser, and a very lovely bouquet (that was entirely free of snails).

I was very touched by their thoughtfulness: Andrew and I still have the stuffed snail and I brought the tape dispenser along to Google (where, unfortunately, it was later lost during an intra-office move).

However, as MongoDB gained popularity, some of my posts became very popular and I began to regret the name: customers seemed a little embarrassed to mention they had gotten advice from it and a lot of people didn’t realize that I was actually behind it. I purchased kchodorow.com, set up permanent redirects, and basically stopped referencing “Snail in a Turtleneck.” After a couple of years, I let the domain name lapse.

Last week, someone told me that my site had been hacked. I was confused, until they told me it was snailinaturtleneck.com. I took a look and, bizarrely, someone seems to have taken a dump of my site circa 2011, put spam on the index, and put it up at snailinaturtleneck.com, complete with my artwork, cartoons, etc. The domain was registered through a privacy protection service, so I guess the next step is sending a DMCA takedown notice to the registrar.

Who does this? (I mean, spammers, but… so annoying.)

Four alternative debugging techniques

I’ve recently been working on a side project that uses WebGL and a physics engine that was transpiled from C++ into JavaScript, so… printing variables to the console and using the debugger just weren’t cutting it. I started thinking about the other ways I debug things:

  1. Ship of Theseus debugging: the ship of Theseus is a thought experiment: if you have something and you gradually replace every part, at what point is it a different thing? This is debugging via finding a working example, then gradually mutating it (without breaking it; source control is helpful here) into what you actually want to do. There are a couple of problems: this often leaves some cruft around from the original program and I often never figure out why it wasn’t working in the first place. Which brings us to…
  2. Homeland security debugging: if you see a suspicious variable name or method call, don’t keep it to yourself. Tell a coder or library maintainer. This is where you go back through all of the sketchy parts of your code and make sure they are actually doing what you expect. After working on a program for a while, I’ll usually end up with parts of codebase that are a bit questionable. Why am I passing a literal “1” as an argument in here? Why do I have two names for this variable? Basically, I’m going through my program, line by line, checking all of my vague suspicions.
  3. Thunderdome debugging: one coder and one bug enter, one coder leaves. This is kind of a meta-technique I use for weird, difficult-to-reproduce issues in integration tests where I have to dig through logs or do a 12-step process every time I need to test a change. Basically, I get a big cup of coffee, sit down at my machine, and try everything while mainlining caffeine. Afterwards, I generally couldn’t even tell you what the bug ended up being, but it is no more. This is the kind of debugging that generally does not happen unless I’m being paid for it.
  4. Wooden nickel debugging: try testing code against the most useless possible inputs. Sometimes I have a very complex chunk of code that doesn’t work. I don’t want to spend the time to get a meaningful test running against it, so I start writing unit tests, passing in the most trivial input possible: the empty string, 0, an empty array. And often, after an input or two, I’ve figured out what it’s doing wrong.

Anyone else have any non-traditional ways that they debug?

Compilation à la mode

Sundae from Black Tap, incidentally around the corner from the NYC Google office.

Bazel lets you set up various “modes” of compilation. There are several built-in modes (fast, optimized, debug) and you can define your own. The built-in ones are:

  • Fast: build your program as quickly as possible. This is generally best for development (when you want a tight compile/edit loop) and is the default when you don’t specify anything. Your build’s output is generated in bazel-out/local-fastbuild.
  • Optimized: code is compiled to run fast, but may take longer to build. You can get this by running bazel build -c opt //your:target and its output will be generated in bazel-out/local-opt. This mode is best for code that will be deployed.
  • Debug: this leaves in symbols and generally optimizes code for running through a debugger. You can get this by running bazel build -c dbg //your:target and its output will be generated in bazel-out/local-dbg.

You don’t have to actually know where the build’s output is stored: Bazel will update the bazel-bin and bazel-genfiles symlinks for you automatically at the end of your build, so they’ll always point to the right bazel-out subdirectory.

Because each flavor’s output is stored in a different directory, each mode effectively has an entirely separate set of build artifacts, so you can get incremental builds when switching between modes. On the downside: when you build in a new mode for the first time, it’s essentially a clean build.
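For example, switching back and forth between modes doesn’t throw away the other mode’s artifacts (//your:target below is just a placeholder):

$ bazel build -c opt //your:target   # artifacts go under bazel-out/local-opt
$ bazel build -c dbg //your:target   # first dbg build is essentially from scratch
$ bazel build -c opt //your:target   # mostly incremental: the opt artifacts are still there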

Note: -c is short for --compilation_mode, but everyone just says -c.

Defining a new mode

Okay, you can’t really define your own mode without tons of work, but you can create named sets of options (which is probably what you wanted anyway unless you’re writing your own toolchain).

For example, let’s say I generally have several flags I want to run with when I’m trying to debug a failing test. I can create a “gahhh” config in my ~/.bazelrc as follows:

test:gahhh --test_output=all
test:gahhh --nocache_test_results
test:gahhh --verbose_failures
test:gahhh -c dbg

The “test” part indicates what command this applies to (“build”, “query” or “startup” are other useful ones). The “:gahhh” names the config, and then I give the option. Now, when I get frustrated, I can run:

$ bazel test --config gahhh //my:target

and I don’t have to remember the four options that I want to use.
