What’s next?

I joined Google for three main reasons:

  1. To prove I could. They rejected me when I was a college senior, so I got a lot of personal satisfaction from just getting an offer.
  2. To learn what “good programming” was.  A startup isn’t the best place to learn how to write readable, maintainable code.  Google was superb for this: I learned so much about how you should program.
  3. To get promoted.  A personal goal of mine, which I finally accomplished last year.

However, once I got promoted, there was nothing left that was really driving me.  I began looking around for alternatives.

I thought it might be interesting for people to see my process here: I think of myself as pretty successful, but when you drill down I actually fail ~99% of the time; I just try a lot of crap.  For instance, this process of finding something else to do began early this year:

  • March:
    • Considered becoming a general contractor. Dissuaded by Andrew (for now).
  • April:
    • Talked to a friend about founding a startup, but haven’t gotten an MVP together (yet).
    • Released a side project (which I was hoping would make me a bazillion dollars: it did not).
    • Applied for a transfer to Google Ventures (interviewed with every member of the team and didn’t hear back).
  • June:
    • Applied to a “startup within Google” program (rejected).
    • Talked to a manager in Google that I knew was good, who had no headcount.
    • Emailed TravisCI about a job (rejected pre-interview, although I took forever to get back to them, so I’m going to blame it on that).
    • Emailed GitHub, they emailed me back, I lost track of things and never followed up.  Whoops.
    • Applied to Compass (tech-focused real estate company in NYC), didn’t hear anything.
    • Got an email inviting me to WhiteTruffle, which looked interesting. None of the jobs I was matched with were right, though.

Keep in mind that, at this point, I have been job searching (casually, but still) for a couple of months with 0% success rate. Then July rolled around.

  • July:
    • Saw the AVC article on Flip, realized that tied in very well with my real estate interests*, applied through AngelList. I didn’t hear anything from them for a while, which made me sad.  It turns out that they tried to contact me through AngelList, but I never got the message and they ended up tracking down my blog and finding my email and contacting me directly.  Cool!  Met with one of the founders for coffee.
    • Compass emailed me back! Turned out the recruiter was on vacation.  Came back and set up a call.
    • AngelList contacted me about their A-List program for top candidates and I opted in.  I got 16 requests in the first 24 hours, very awesomely tailored to my interests.  Interesting requests continued to trickle in over the next week, until I got overwhelmed and asked them to take me off.
    • AngelList asked me if I wanted to apply to AngelList itself (very flattering). I was pretty impressed by the experience so far so I agreed to talk to them.

On July 12, I had intro calls with Morty, Waverly Labs, Compass, and AngelList. Then Google Ventures emailed me: after three months, they were actually making a decision! And then I got an email from Flip: they wanted to make an offer!  It was a big day.

I decided that, with offers from Flip and Google Ventures I was excited about, I’d cancel the rest of my job search.  I was having a hard time deciding between the two, so I made a pros and cons list:

Staying at Google:

  • Pros:
    • Lunch/breakfast with Andrew
    • Money
    • Friends
    • Known evil
    • Short commute
    • Prestige
    • Pasta every day
    • Scurvy less likely
    • Boxing classes
  • Cons:
    • Corporate
    • Decisions take forever
    • People are slackers (certainly not everyone, but some)
    • Doesn’t matter what I do
    • MTV is always #1
    • Travel!
    • Port Authority building is depressing

Working at Flip:

  • Pros:
    • Tons of stock.
    • Awesome office that makes me happy.
    • Cool work
    • Can take Domino every day
  • Cons
    • Unknown evil
    • So much less money
    • What if I suck?
    • Unknown stability
    • Coworkers might suck
    • Founders might be asses
    • I might get scurvy

Through this process, I realized that I should probably be eating more fruits and vegetables.  But aside from that, looking down the list, I’m mostly staying at Google because I’m afraid of the unknown.  Considered that way, I’d rather do something I’m passionate about, even if it’s risky. So, I’ve given my notice and will be starting at Flip this fall. I’m so excited!

Although I will miss having pasta for lunch every day.


* After seeing the house, I got super into real estate and ended up getting licensed as a NY State real estate agent.

The next great frontier in ML: dogs in hats

I’ve been messing around with Keras, an ML library that makes it pretty easy to do AI experiments. I decided to try out image recognition, so I found a picture of Domino in a fez:

Then I wrote the following Python program, which uses the pretrained ResNet50 image recognition model.

import numpy as np
from keras.preprocessing import image
from keras.applications import resnet50
# Load the image
img = image.load_img('fez.jpg', target_size=(224, 224))
x = image.img_to_array(img)
# Keras processes batches of images, so wrap our single image in a one-element batch.
x = np.expand_dims(x, axis=0)
# Preprocess the RGB values the way ResNet50 expects (the model was trained on mean-centered channels).
x = resnet50.preprocess_input(x)
# Predict what's in the photo.
model = resnet50.ResNet50()
predictions = model.predict(x)
predicted_classes = resnet50.decode_predictions(predictions, top=10)
for _, name, likelihood in predicted_classes[0]:
    print("this is an image of {} {}".format(name, likelihood))

In addition to the obvious prereqs, you have to sudo apt-get install libhdf5-dev; pip install pillow certifi h5py.

This prints:

this is an image of Bouvier_des_Flandres 0.4082675576210022
this is an image of briard 0.3710797429084778
this is an image of Newfoundland 0.10781265050172806
this is an image of giant_schnauzer 0.04042242094874382
this is an image of Scotch_terrier 0.038422249257564545
this is an image of komondor 0.012891216203570366
this is an image of Tibetan_terrier 0.0026010528672486544
this is an image of affenpinscher 0.0024157813750207424
this is an image of standard_poodle 0.0021669857669621706
this is an image of Kerry_blue_terrier 0.002110496861860156

It’s pretty solid on “it’s a dog.” I’m disappointed in the lack of fez-related IDs. However, I like some of these as possible breeds for Domino:

Tibetan terrier is pretty close, they are relatives.

Newfies are adorable, but the size is a little off.

I have no idea where Komondor came from. They’re amazing looking, but pretty distinct.

So, still needs some work. But not bad for less than 30 lines of code.

Keeping your deps tidy

My coworker Carmi just published a blog post on the Bazel blog about how Java tracks dependencies. Bazel has some nice facilities built in to let you know when you need to add dependencies:

ERROR: /home/kchodorow/test/a/BUILD:24:1: Building libA.jar (1 source file) failed: Worker process sent response with exit code: 1.
A.java:6: error: [strict] Using type C from an indirect dependency (TOOL_INFO: "//:C"). See command below **  C getC() {
** Please add the following dependencies:
  //:C  to //:A

He mentioned unused_deps, a great tool to go the other way: what if your BUILD file declares dependencies you’re not using? unused_deps lets you quickly clean up your BUILD files.

To get unused_deps, clone the buildtools repository and build it:

$ git clone git@github.com:bazelbuild/buildtools.git
$ cd buildtools
$ bazel build //unused_deps

Now go to your project and run it:

$ cd ~/my-project
$ ~/gitroot/buildtools/bazel-bin/unused_deps/unused_deps //... > buildozer-cmds.sh

This will print a bunch of info to stderr as it runs but, when it’s done, you should have a list of buildozer commands in buildozer-cmds.sh. For example, running this on the Bazel codebase yields:

buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:auth_and_tls_options' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:build-base' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:concurrent' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:events' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote

This is a list of shell commands, so now you need to execute this file. The buildozer tool also lives in the buildtools repository, so you just have to build that and then add it to your path:

$ cd ~/gitroot/buildtools
$ bazel build //buildozer
$ cd ~/my-project
$ chmod +x buildozer-cmds.sh
$ PATH=$HOME/gitroot/buildtools/bazel-bin/buildozer:$PATH ./buildozer-cmds.sh

This will run all of the buildozer commands and then you can commit the changes, e.g.,

$ git diff
diff --git a/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD b/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD
index 022a4037d..5d5cdf8d0 100644
--- a/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD
+++ b/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD
@@ -6,7 +6,6 @@ java_test(
     deps = [
-        "//third_party:junit4",
@@ -17,7 +16,6 @@ java_test(
     deps = [
-        "//third_party:junit4",
@@ -39,7 +37,6 @@ java_test(
     deps = [
-        "//third_party:junit4",
@@ -50,7 +47,6 @@ java_test(
     deps = [
-        "//third_party:junit4",

It’s a good idea to run unused_deps regularly to keep things tidy. For example, the Bazel project does not run it automatically and has nearly 1000 unneeded deps (oops). You might want to add a git hook or something to your CI to run for every change.
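If you want to wire this into CI, here is one minimal sketch (the script, the binary path, and the exit-code convention are my assumptions, not Bazel tooling — adjust for your checkout): fail the build whenever unused_deps emits any buildozer commands.

```python
import os
import subprocess
import sys

def has_unused_deps(output):
    """True if unused_deps produced any buildozer commands."""
    return any(line.strip().startswith("buildozer") for line in output.splitlines())

def main():
    # The path to the unused_deps binary is an assumption; point it at your checkout.
    binary = os.path.expanduser("~/gitroot/buildtools/bazel-bin/unused_deps/unused_deps")
    result = subprocess.run([binary, "//..."], capture_output=True, text=True)
    if has_unused_deps(result.stdout):
        print("Unneeded deps found; run the emitted buildozer commands:")
        print(result.stdout)
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

A nonzero exit makes most CI systems mark the change red, which is enough of a nudge to keep deps tidy.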

Messy closet vs. clean closet

unused_deps: the Container Store of build tools.

GitHub notification… notifier

Here is what my inbox looks like each morning:

All those pink-tagged messages are emails from GitHub. Gmail cannot figure out which ones are important and GitHub’s notification stream, in my experience, is useless. It’s noisy and doesn’t clear the way I’d expect. The problem is, people actually do mention me on bugs. And I often have no idea.

So, I made my own notification system. It’s a Chrome extension that is a little ear that sits next to your location bar. If you have an unread GitHub notification, it turns red. If you’re all caught up, it’s green.
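Under the hood, the logic is tiny. Here’s a hedged sketch of it in Python (the real extension is JavaScript, and the function names and token handling here are mine, not the extension’s): poll GitHub’s notifications API, which returns only unread threads by default, and color the ear accordingly.

```python
import json
import urllib.request

def ear_color(unread_count):
    """Red ear when there are unread notifications, green when caught up."""
    return "red" if unread_count > 0 else "green"

def fetch_unread_count(token):
    # GET /notifications returns the authenticated user's unread notification threads.
    req = urllib.request.Request(
        "https://api.github.com/notifications",
        headers={"Authorization": "token " + token})
    with urllib.request.urlopen(req) as resp:
        return len(json.load(resp))

if __name__ == "__main__":
    # "YOUR_TOKEN" is a placeholder for a GitHub personal access token.
    print(ear_color(fetch_unread_count("YOUR_TOKEN")))
```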

If this would be useful to you, please give it a try!

Download the Chrome extension here.

Feedback and suggestions welcome!

Life hacks

I was thinking about a couple of little things that have made my life a lot better in the last year and I figured I’d share:

Buying cheese powder
I love mac & cheese, particularly Annie’s sharp cheddar. However, 1) they always give too many noodles and not enough cheese and 2) Annie’s switched over to only selling either gluten free (which has a glue-like consistency) or organic (and I’m philosophically, or at least stubbornly, against organic food). However, it turns out that you can get cheese powder on Amazon. I can use as much as I want (I discovered that there is such a thing as “too much cheese powder”) and on any pasta I want.
Hemming my jeans
I took a jeans-making class last year and, although I doubt I’ll ever make another full pair of jeans (it was a lot of work), I am now pretty comfortable with modifying existing pairs. I recently got a pile of $10-a-pop jeans from Goodwill and spent 20 minutes hemming up the bottoms and now have jeans that fit me much better than the $80-a-pair jeans I used to get.

To hem jeans you need a sewing machine, an iron, and the ability to sew a straight line, but other than that, it’s pretty straightforward. Put them on, pin where you want the hem to fall, and measure. Say it’s 5″ from the existing hem (shut up, I’m short). Draw a line with a Sharpie (or tailor’s chalk, but I’m assuming most people reading this don’t have a well-stocked sewing room), giving yourself 1.5″ for the new hem (so a line 3.5″ from the existing hem). Cut off just within that line. Fold over the hem .5″ and sew it down. Iron the shit out of it. No one likes to switch modes and iron, but it’ll look super janky and handmade if you don’t. Then fold over the hem again, which shaves off the last .5″, and sew it down ~7/16″ from the edge (as far away from the edge as possible, while still catching your folded-over part).

If you want the overstitching to be visible, use some sort of triple stitch, but I usually just use the normal stitch and it looks fine. You will probably have to hand-crank the machine across the seams, since it’ll be trying to stitch through 12 layers of denim at that point. Then iron again and you’re done.

Cooking breakfast on weekends
I have discovered that scones and dutch babies are very easy to make. It’s very luxurious having hot pastries and good coffee while the Google Home plays jazz.

Scones are great because they need cold, even frozen, butter (pro tip: get a 4-pack and stick it in your freezer, it’ll last forever) and don’t even require eggs. Just throw together flour, sugar, baking powder, butter, and salt, then add cranberries, chocolate chips, maple syrup and walnuts, or sprinkle with cinnamon and sugar. Always put a little extra sugar on top before baking, because it’ll form a delicious crust.

Dutch babies are great because, again, it doesn’t matter what temperature the butter is: you put it in a pan to start with and stick it in the oven anyway. Also, the taste reminds me of something from my childhood, but I haven’t figured out what, yet. I’ll have to keep eating them until I figure it out.

How to Skylark – the class

I’ve heard a lot of users say they want a more comprehensive introduction to writing build extensions for Bazel (aka, Skylark). One of my friends has been working on Google Classroom and they just launched, so I created a build extensions crash course. I haven’t written much content yet (and I don’t understand exactly how Classroom works), but we can learn together! If you’re interested:

It’s free and you can get in on the ground floor of… whatever this is. If you’ve enjoyed/found useful my posts on Skylark, this should be a more serious business and well-structured look at the subject.

I’ll try to release content at least once a week until we’ve gotten through all the material that seems sensible, I get bored, or people stop “attending.”

Stamping your builds

By default, Bazel tries not to include anything about the system state in build outputs. However, released binaries and libraries often want to include something like the version they were built at or the branch or tag they came from.

To reconcile this, Bazel has an option called the workspace status command. This command is run outside of any sandboxes on the local machine, so it can access anything about your source control, OS, or anything else you might want to include. It then dumps its output into bazel-out/volatile-status.txt, which you can use (and certain language rulesets provide support for accessing from code).

For our example, let’s suppose we’re creating fortune cookie binaries. We want each binary to be “stamped” with a different fortune, so we’ll use the fortune command as our status. Then, our build will add “in bed” to the end of the fortune, since we’re all super mature here.

Let’s create a genrule:

genrule(
    name = "cookie",
    srcs = [],
    outs = ["fortune-cookie"],
    cmd = "cat bazel-out/volatile-status.txt | grep -v BUILD_TIMESTAMP > $@; echo '...in bed.' >> $@",
    stamp = True,
)

The most important part here is the stamp attribute: this tells the genrule that it depends on the volatile-status.txt file. Without this attribute, the volatile-status.txt file is not a dependency of this rule, so it might not exist when the genrule is run.

cmd prints out this status. The status has a default “BUILD_TIMESTAMP” field as well, so we strip that out.

Now create a fortune-teller.sh script (and make it executable):

#!/bin/bash
/usr/games/fortune -s | tr '\n' ' '

This generates short-ish fortunes and removes all of the newlines (Bazel status files are line-based: each line is an independent entry, written to volatile-status.txt in whatever order the status generator feels like).
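Since status files are line-based, with each line a “KEY value” pair, consuming one from a script is just string splitting. A minimal sketch (the sample contents, including the FORTUNE key, are made up for illustration):

```python
def parse_status(text):
    """Parse a Bazel status file: one 'KEY value' pair per line."""
    status = {}
    for line in text.splitlines():
        if line.strip():
            # The first space separates the key from the (possibly space-containing) value.
            key, _, value = line.partition(" ")
            status[key] = value
    return status

sample = "BUILD_TIMESTAMP 1499875200\nFORTUNE A watched pot never boils.\n"
print(parse_status(sample)["FORTUNE"])
```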

Now we can build our fortune cookie by supplying the fortune teller to the build:

$ bazel build --stamp --workspace_status_command=$PWD/fortune-teller.sh //:cookie
INFO: Found 1 target...
Target //:cookie up-to-date:
INFO: Elapsed time: 0.467s, Critical Path: 0.09s

Things to note:

  • You must enable stamping on the command line.
  • You also must pass the full path to the script, otherwise Bazel won’t find it.

Now if you take a look at bazel-genfiles/fortune-cookie:

$ cat bazel-genfiles/fortune-cookie
Truth is the most valuable thing we have -- so let us economize it. 		-- Mark Twain
...in bed.


Using secrets with Google AppEngine

For side project #4323194 (implement a Chrome extension that looks like this: 👂 and turns red when someone mentions you on GitHub), I needed to implement OAuth from AppEngine to GitHub. As I’ve mentioned before, OAuth is my nemesis, but for this project there didn’t seem to be a great way around it. It actually wasn’t as bad as I remember… maybe I know more about HTTP now? Either way, I only messed up ~16 times before I got it authenticating properly.

When you want an app to work with the GitHub API, you go to GitHub and set up a new application, tell it what URL it should send people to after login, and it gives you a “secret key” that no one else should know. Then you simply implement OAuth’s easy, intuitive flow:

  1. Redirect a user to https://github.com/login/oauth/authorize when you want them to log in.
  2. GitHub will ask the person to log in, then redirect back to the URL you gave it when you set up your app, with a temporary code attached.
  3. You POST that code, along with the secret key you got, to https://github.com/login/oauth/access_token.
  4. GitHub replies with an access token, which you can then use in the header of subsequent requests to access the API.
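Sketched in Python (hedged: the payload fields follow GitHub’s OAuth documentation, but the helper functions are mine), step 3 amounts to trading the temporary code GitHub attached to the redirect, plus your secret key, for an access token:

```python
import json
import urllib.parse
import urllib.request

def build_token_request(client_id, client_secret, code):
    """Payload for step 3: the app's ID, its secret key, and the temporary code."""
    return {"client_id": client_id, "client_secret": client_secret, "code": code}

def exchange_code(client_id, client_secret, code):
    # POST the payload; the Accept header asks GitHub to reply with JSON.
    data = urllib.parse.urlencode(
        build_token_request(client_id, client_secret, code)).encode()
    req = urllib.request.Request(
        "https://github.com/login/oauth/access_token", data=data,
        headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]
```

The returned token then goes in the Authorization header of subsequent API requests (step 4).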

The problem here is #3: the secret key. In ye olde world of “I have a server box, I shall SSH into it and poke things,” I would simply set an environment variable, SOOPER_SECRET=<shhh>, then get that from my Java code. However, AppEngine prevents that sort of (convenient) nonsense.

So, I poked around and the thing I saw people recommending was to store the value in the database. This is an interesting idea. On the downside, it’s approximately a zillion times slower than accessing an environment variable. On the other hand, this is a login flow that makes three separate HTTP requests, I don’t think a database lookup is going to make a huge difference. On the plus side, it “automatically propagates” to new machines as you scale.

So I began working on a class to store the secret. Requirements:

  • Easy to set: I want to be able to visit a URL (e.g., /secrets) to set the value.
  • Difficult for others to set: I don’t want users to be able to override keys I’ve created, or create their own keys.
  • Perhaps most importantly: difficult for me to unintentionally commit to a public GitHub repo. I am super bad at this, so I need a completely brain-dead way to never, ever have this touch my local code, otherwise it will end up on GitHub.

What I decided on:

Create a servlet (/secrets) that takes the key/value to set as a query parameter. The servlet will only set the secret key for keys I’ve defined in code (so visitors can’t set up their own secret keys) and will only set the secret key if it doesn’t exist in the database, yet. Thus, after the first time I visit /secrets, it’ll be a no-op (and actually can be disabled entirely in production). Because the secret is given as a query parameter, it never hits my code base. It will appear in request logs, but I’m willing to live with that.

What this looks like in an AppEngine app:

<!-- web.xml - add handling for this URI -->

And the Java code does some URI parsing and then:

  private void findOrInsert(String key, String value) {
    Entity entity = getEntity(key);
    if (entity != null) {
      // Already set; no need to insert.
      return;
    }
    entity = new Entity(ENTITY_TYPE);
    entity.setProperty("key", key);
    entity.setProperty("value", value);
    datastore.put(entity);
  }

And the nice thing about using Google’s AppEngine datastore is that it’s easy (relatively) to write tests for all this.

You can check out the sources & tests at my git repo. (Note that the extension doesn’t actually work yet, right now it just logs in. I’ll write a followup post once it’s functional, since I think this might be relevant to some of my readers’ interests.)

Low-fat Skylark rules – saving memory with depsets

In my previous post on aspects, I used a Bazel aspect to generate a simple Makefile for a project. In particular, I passed a list of .o files up the tree like so:

  dotos = [ctx.label.name + ".o"]
  for dep in ctx.rule.attr.deps:
    # Create a new array by concatenating this .o with all previous .o's.
    dotos += dep.dotos
  return struct(dotos = dotos)

In a toy example, this works fine. However, in a real project, we might have tens of thousands of .o files across the build tree. Every cc_library would create a new array and copy every .o file into it, only to move up the tree and make another copy. It’s very inefficient.

Enter nested sets. Basically, you can create a set that holds pointers to other sets, and it isn’t inflated until it’s needed. Thus, you can build up a set of dependencies using minimal memory.

To use nested sets instead of arrays in the previous example, replace the lists in the code with depset and |:

  dotos = depset([ctx.label.name + ".o"])
  for dep in ctx.rule.attr.deps:
    dotos = dotos | dep.dotos

Nested sets use | for union-ing two sets together.

“Set” isn’t a great name for this structure (IMO), since they’re actually trees and, if you think of them as sets, you’ll be very confused about their ordering if you try to iterate over them.

For example, let’s say you have the following macro in a .bzl file:

def order_test():
  srcs = depset(["src1", "src2"])
  first_deps = depset(["dep1", "dep2"])
  second_deps = depset(["dep3", "dep4"])
  src_and_deps = srcs | first_deps
  everything = second_deps | src_and_deps
  for item in everything:
    print(item)

Now call this from a BUILD file:

load('//:playground.bzl', 'order_test')

And “build” the BUILD file to run the function:

$ bazel build //:BUILD
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep1.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep2.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: src1.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: src2.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep3.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep4.

How did that code end up generating that ordering? We start off with one set containing src1 and src2:

Add the first deps:

And then create a deps set and add the tree we previously created to it:

Then the iterator does a postorder traversal.

This is just the default ordering; you can specify a different one. See the docs for more info on depset.

9 years of blogging have totally been worth it

Worth of Web is kind of a neat site:

Oh well. It’s been worth it to me.

kristina chodorow's blog