Tutorial: how to write Scala rules for Bazel

Bazel comes with built-in support for several languages and lets you write your own support for other languages in Skylark, Bazel’s Python-like extension language.

Although you could probably get more abstract, let’s define a rule as something that takes some files, does something to them, and then gives you some output files. Specifically, for this example, we want a scala_binary rule where we can give it a Scala source file and it turns it into an executable binary.

Part 1: Creating a Scala source file

Let’s create a simple example of a Scala source file (mercilessly ripped from the Scala hello world example):

// HelloWorld.scala
object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
  }
}

Note that I’ve never used Scala before today, so please let me know in the comments if I’ve made any mistakes.

Before proceeding, I think it’s a good idea to try building this without bazel (especially if you’re not too familiar with the language’s build tool… ahem) as a sanity check:

$ scalac HelloWorld.scala
$ scala HelloWorld
Hello, world!

Looking good! Now, let’s try to get bazel building that.

Adding a BUILD file and dummy scala_binary rule

We’ll create a BUILD file that references our (currently non-existent) scala_binary rule. This lets us plan out what we’ll need our rule to look like:

# BUILD
load('/scala', 'scala_binary')
 
scala_binary(
    name = "hello-world",
    src = "HelloWorld.scala",
)

The load() statement means that we’ll declare the scala_binary rule in a file called scala.bzl in the root of the workspace (due to the ‘/’ prefix on ‘/scala’). Let’s create that file now:

# scala.bzl
def impl(ctx):
    pass
 
scala_binary = rule(
    attrs = {
        'src': attr.label(
            allow_files=True,
            single_file=True),
    },
    outputs = {'sh': "%{name}.sh"},
    implementation = impl,
)

scala_binary’s definition says that the rule has one attribute (other than name), src, which is a single file. The rule is supposed to output a file called <name>.sh, so for our example we should end up with hello-world.sh. The rule’s implementation lives in the function impl. It doesn’t do anything yet, but we can at least try building now:
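To make the %{name} placeholder concrete, here is a minimal sketch in plain Python (not Bazel’s actual implementation) of how an output template could be turned into a file name. The helper expand_output_template is hypothetical and exists only for illustration:

```python
def expand_output_template(template, name):
    """Substitute the target name for the %{name} placeholder (illustrative only)."""
    return template.replace("%{name}", name)


# The same outputs dict as in scala.bzl, expanded for the hello-world target.
outputs = {"sh": "%{name}.sh"}
expanded = {key: expand_output_template(tmpl, "hello-world")
            for key, tmpl in outputs.items()}
print(expanded)
```

The "sh" key is what later shows up as ctx.outputs.sh inside the implementation function.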

$ touch WORKSPACE # if you haven't already...
$ bazel build :hello-world
ERROR: /Users/kchodorow/blerg/BUILD:4:1: in scala_binary rule //:hello-world: 
: The following files have no generating action:
hello-world.sh
.
ERROR: Analysis of target '//:hello-world' failed; build aborted.
INFO: Elapsed time: 0.286s

The error is expected: our rule definition says that hello-world.sh should be an output, but there’s no code creating it yet. Let’s add some functionality by replacing the existing impl function with the following:

def impl(ctx):
    ctx.action(
        inputs = [ctx.file.src],
        command = "echo %s > %s" % (ctx.file.src.path, ctx.outputs.sh.path),
        outputs = [ctx.outputs.sh]
    )

This adds an action to the build. It says that whenever the inputs (here, the src file) have changed, Bazel should run the command, which for now just echoes src’s path into the output file. Note that ctx.action(...) doesn’t actually run the action; it just adds that action to the “things that need to be run in the future” for the rule.
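If the “register now, run later” idea feels abstract, here is a toy model of it in plain Python. This is a sketch of the concept only, not how Bazel is implemented: the ActionRegistry class, its methods, and the in-memory dict standing in for the file system are all made up for illustration.

```python
import hashlib


class ActionRegistry:
    """Toy model of deferred build actions; not Bazel's real machinery."""

    def __init__(self):
        self.actions = []        # registered, but not yet run
        self.seen_digests = {}   # output name -> digest of inputs at last run

    def register(self, inputs, output, command):
        # Like ctx.action(...): this only records what should happen later.
        self.actions.append((inputs, output, command))

    def execute(self, files):
        """Run every action whose inputs changed since it last ran."""
        ran = []
        for inputs, output, command in self.actions:
            digest = hashlib.sha256(
                b"\0".join(files[name] for name in inputs)).hexdigest()
            if self.seen_digests.get(output) != digest:
                files[output] = command(files)
                self.seen_digests[output] = digest
                ran.append(output)
        return ran


registry = ActionRegistry()
registry.register(
    inputs=["HelloWorld.scala"],
    output="hello-world.sh",
    command=lambda files: b"HelloWorld.scala\n",  # stands in for the echo
)

files = {"HelloWorld.scala": b"object HelloWorld {}"}
registered_only = "hello-world.sh" not in files  # nothing has run yet

first = registry.execute(files)    # inputs are new, so the action runs
second = registry.execute(files)   # inputs unchanged, so nothing runs
files["HelloWorld.scala"] = b"object HelloWorld { }"
third = registry.execute(files)    # input changed, so the action runs again
```

The point is the split between register (cheap, done while the rule is analyzed) and execute (done later, and only when something the action depends on has changed).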

Now if we build :hello-world again, we get:

$ bazel build -s :hello-world
INFO: Found 1 target...
>>>>> # //:hello-world [action 'Unknown hello-world.sh']
(cd /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg && \
  exec env - \
  /bin/bash -c 'echo HelloWorld.scala > bazel-out/local_darwin-fastbuild/bin/hello-world.sh')
Target //:hello-world up-to-date:
  bazel-bin/hello-world.sh
INFO: Elapsed time: 0.605s, Critical Path: 0.02s

I used bazel’s -s option here, which is very helpful for debugging what your rule is doing. It prints all of the subcommands a build is running. As you can see, our rule now has an action (>>>>> # //:hello-world [action 'Unknown hello-world.sh']) that creates bazel-bin/hello-world.sh by echoing the source file name. You can verify this by running cat on bazel-bin/hello-world.sh.
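The command string is built with ordinary %-formatting, which behaves the same way in plain Python. As a quick check, plugging in the two paths from the -s output above (hard-coded here; in the rule they come from ctx.file.src.path and ctx.outputs.sh.path) reproduces the command that bazel printed:

```python
# Paths copied from the -s output; in the rule they come from ctx.
src_path = "HelloWorld.scala"
out_path = "bazel-out/local_darwin-fastbuild/bin/hello-world.sh"

command = "echo %s > %s" % (src_path, out_path)
print(command)
```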

Adding a dependency on the scala compiler

We want our rule to actually call scalac. Even if you have the scala compiler installed on your system, you cannot simply create an action with a command = 'scalac MySourceFile.scala' line, as actions are run in a “clean room” environment: nothing* is there that you don’t specify.
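You can get a feel for the “clean room” outside of bazel with a few lines of plain Python. This is only a rough analogy, assuming a POSIX system with /bin/sh: it runs a child process with an empty environment, roughly what env - does in the -s output. The DEMO_TOOL_HOME variable and the run_shell helper are made up for the example.

```python
import os
import subprocess

# A made-up variable standing in for "something set up on my machine".
os.environ["DEMO_TOOL_HOME"] = "/opt/demo"


def run_shell(shell_command, env=None):
    """Run a command via /bin/sh; env=None inherits, env={} starts empty."""
    result = subprocess.run(
        ["/bin/sh", "-c", shell_command],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


probe = 'echo "${DEMO_TOOL_HOME:-unset}"'
inherited = run_shell(probe)          # the child sees the parent's variable
clean_room = run_shell(probe, env={}) # the variable is gone
print(inherited, clean_room)
```

Anything your action relies on from your own shell setup disappears in the same way, which is why the compiler has to be declared explicitly.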

As you probably don’t want to add the scala compiler to your workspace, open up your WORKSPACE file and add it as an external dependency:

# WORKSPACE
new_http_archive(
    name = "scala",
    url = "http://downloads.typesafe.com/scala/2.11.7/scala-2.11.7.tgz",
    sha256 = "ffe4196f13ee98a66cf54baffb0940d29432b2bd820bd0781a8316eec22926d0",
    build_file = "scala.BUILD",
)
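The sha256 attribute is there so that the downloaded archive can be checked against a known hash. If you want to compute or double-check a hash yourself, a small sketch in plain Python looks like this. The helper names are hypothetical, and the demo hashes a throwaway temp file containing "abc" rather than the real Scala tarball:

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_sha256):
    """Compare a file's digest with the value you would put in WORKSPACE."""
    return sha256_of_file(path) == expected_sha256.lower()


# Demo on a temp file; SHA-256 of b"abc" is a well-known test vector.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name
demo_ok = matches_expected(
    demo_path,
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")
demo_bad = matches_expected(demo_path, "0" * 64)
os.remove(demo_path)
```

To use it for real, you would point sha256_of_file at a locally downloaded copy of the archive and compare the result with the value in WORKSPACE.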

Also create the scala.BUILD file in the root of your workspace:

# scala.BUILD
exports_files([
    "bin/scala",
    "bin/scalac",
    "lib/scala-library.jar"
])

Now add a dependency on scalac to your scala_binary rule by adding a “hidden attribute” (an attribute whose name starts with an underscore, so it can’t be set from a BUILD file). Then call scalac from your impl, so that your scala.bzl file looks something like this:

def impl(ctx):
    ctx.action(
        inputs = [ctx.file.src, ctx.file._scalac],  # the compiler is an input too
        command = "%s %s; echo 'blah' > %s" % (
            ctx.file._scalac.path, ctx.file.src.path, ctx.outputs.sh.path),
        outputs = [ctx.outputs.sh]
    )
 
scala_binary = rule(
    attrs = {
        'src': attr.label(
            allow_files=True,
            single_file=True),
        '_scalac': attr.label(
            default=Label("@scala//:bin/scalac"),
            executable=True,
            allow_files=True,
            single_file=True),
    },
    outputs = {'sh': "%{name}.sh"},
    implementation = impl,
)

Building now shows that scalac is successfully being run on our source file!

$ bazel build -s :hello-world
INFO: Found 1 target...
>>>>> # //:hello-world [action 'Unknown hello-world.sh']
(cd /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg && \
  exec env - \
  /bin/bash -c 'external/scala/bin/scalac HelloWorld.scala; echo '\''blah'\'' > bazel-out/local_darwin-fastbuild/bin/hello-world.sh')
Target //:hello-world up-to-date:
  bazel-bin/hello-world.sh
INFO: Elapsed time: 4.634s, Critical Path: 4.11s

There are still many issues with this implementation:

  • The output from calling scalac doesn’t actually go anywhere; hello-world.sh is still a dummy file.
  • No support for multiple source files, never mind dependencies.
  • [action 'Unknown hello-world.sh'] is pretty ugly.
  • You can’t call bazel run :hello-world, even though the output should be executable.

However, this post is already running long, so let’s wrap it up here and get to some of these issues in the next post. Until next time!

References

* Obviously there are some commands there (our original rule uses echo, for instance) and you can see what’s in the empty environment by writing env to an output file in an action. This can actually cause issues: sometimes commands in the default PATH have different behavior on different systems. To get a completely hermetic build, you should really provide every command your rule uses. However, we’re just using echo for debugging here anyway, so we’ll let it slide.

kristina chodorow's blog