Building a Monorepo with Bazel
We’re Earthly. We make building software simpler and therefore faster using containerization. This article discusses some of the benefits of using a Monorepo. Earthly is particularly useful if you’re working with a Monorepo. Check us out.
Update: September, 2022
Read our interview series with Bazel experts on when to use Bazel.
A monorepo is perhaps what you would expect from the name: a single code repository for your entire codebase.
Wikipedia describes it as a decade-old software development strategy for storing all your code in a single repository, but you can also think of it as a higher-level architecture pattern for governing loosely tied applications. For instance, if you have a full-stack web application stored in one repository and an Android client in another, a monorepo would essentially wrap them in the same repository codebase.
Who Uses Monorepos and Why?
Google is one of the most notable adopters of the monorepo pattern, and companies like Dropbox, LinkedIn, and Uber use monorepos to manage their large codebases. This is because large-scale projects that have little or no dependency on each other can be developed, tested, and built together without splitting them into smaller projects.
If you’re from a JavaScript or npm background, you can think of a monorepo as a project having a single package.json file for managing all your project dependencies. It also allows you to easily share code between multiple environments using isolated modules as published packages. You can configure a single bundler for performing unit tests, integration tests, and other configurations without worrying about language and ecosystem-specific configurations.
The Efficiency of Building a Monorepo with Bazel
Bazel is an open-source build tool developed by Google to give power and life to your monorepo. It’s similar to other build tools like Maven, Gradle, and Buck, but it has a number of advantages:
- Bazel supports multiple languages (Java, JavaScript, Go, C++, to name a few) and platforms (Linux, macOS, and Windows).
- It’s configured with Starlark, a high-level language similar to Python, which lets build rules perform complex operations on binaries, scripts, and data sets.
- Even for large codebases, Bazel builds quickly because it caches previous work and rebuilds only the code that has changed.
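The caching idea in the last bullet can be illustrated with a small Node.js sketch. This is not how Bazel is implemented; the names actionKey, build, and cache are hypothetical. Each build step is keyed by a hash of its inputs and is rerun only when that hash changes.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory cache: action key -> previously produced output.
const cache = new Map();

// Hash the action name together with the name and content of every input.
function actionKey(name, inputs) {
  const hash = crypto.createHash('sha256');
  hash.update(name).update('\0');
  for (const [file, content] of Object.entries(inputs).sort()) {
    hash.update(file).update('\0').update(content).update('\0');
  }
  return hash.digest('hex');
}

// Run `action` only if no output is cached for these exact inputs.
function build(name, inputs, action) {
  const key = actionKey(name, inputs);
  if (cache.has(key)) {
    return { output: cache.get(key), cached: true };
  }
  const output = action(inputs);
  cache.set(key, output);
  return { output, cached: false };
}

// A stand-in "compiler" that just upper-cases the source.
const upper = (inputs) => inputs['app.js'].toUpperCase();

const first = build('compile', { 'app.js': 'console.log(1);' }, upper);
const second = build('compile', { 'app.js': 'console.log(1);' }, upper);
const third = build('compile', { 'app.js': 'console.log(2);' }, upper);
console.log(first.cached, second.cached, third.cached); // false true false
```

The second call reuses the cached output because its inputs are identical, while the third call reruns the step because the file content changed.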
This article will walk you through the core concepts of Bazel and set you up for building and compiling your own monorepo in JavaScript.
Bazel Basics
First, some vocabulary.
Workspace
Bazel calls the directory tree that holds your source files a workspace. Its root is marked by a WORKSPACE file, and the remaining source files are organized beneath it in a nested fashion. Bazel builds your entire software from the workspace by taking a set of inputs and generating the desired outputs.
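Bazel recognizes the workspace root by the WORKSPACE (or WORKSPACE.bazel) file it contains. As a rough sketch of that lookup, the hypothetical helper findWorkspaceRoot below walks up from a directory until it finds one of those files. The file-existence check is passed in as a function so the example runs without touching the disk, and only POSIX-style paths are handled.

```javascript
const path = require('path');

// Marker files considered in this sketch.
const MARKERS = ['WORKSPACE', 'WORKSPACE.bazel'];

// Hypothetical helper: returns the nearest directory, starting at startDir
// and moving upward, that contains a marker file, or null if none does.
function findWorkspaceRoot(startDir, exists) {
  let dir = path.posix.normalize(startDir);
  while (true) {
    if (MARKERS.some((name) => exists(path.posix.join(dir, name)))) {
      return dir;
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) return null; // reached the top without a match
    dir = parent;
  }
}

// A fake file system for illustration.
const files = new Set(['/repo/WORKSPACE', '/repo/src/app/BUILD']);
const root = findWorkspaceRoot('/repo/src/app', (p) => files.has(p));
console.log(root); // /repo
```

On a POSIX system you could pass fs.existsSync as the second argument to check the real file system instead.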
Packages
A package is a directory that contains a file named BUILD along with all your related files and dependencies. Subdirectories of a package that contain their own BUILD file are called subpackages.
Consider the following directory tree:
src/app/BUILD
src/app/core/input.txt
src/app/tests/BUILD
It has two packages: app and a subpackage, app/tests, since both contain their own BUILD files. Note that app/core is not a package but a regular directory inside app.
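The package rule can be sketched in a few lines of Node.js. The helpers findPackages and owningPackage are hypothetical and only illustrate the idea: a directory is a package if it holds a BUILD (or BUILD.bazel) file, and any other file belongs to its nearest enclosing package. Paths are printed as given, so the packages show up with the src/ prefix.

```javascript
const path = require('path');

// Returns the directories that directly contain a BUILD or BUILD.bazel file.
function findPackages(files) {
  const packages = new Set();
  for (const file of files) {
    const base = path.posix.basename(file);
    if (base === 'BUILD' || base === 'BUILD.bazel') {
      packages.add(path.posix.dirname(file));
    }
  }
  return [...packages].sort();
}

// Returns the nearest ancestor directory of `file` that is a package,
// or null if the file is not inside any package.
function owningPackage(file, packages) {
  let dir = path.posix.dirname(file);
  while (dir !== '.' && dir !== '/') {
    if (packages.includes(dir)) return dir;
    dir = path.posix.dirname(dir);
  }
  return null;
}

const tree = ['src/app/BUILD', 'src/app/core/input.txt', 'src/app/tests/BUILD'];
const packages = findPackages(tree);
console.log(packages.join(', ')); // src/app, src/app/tests
console.log(owningPackage('src/app/core/input.txt', packages)); // src/app
```

The file under core is attributed to the app package because core has no BUILD file of its own.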
Targets
Elements of a package are called targets, which can be categorized as files and rules.
Files can be either source files written by a developer or generated files that Bazel produces according to a specific set of rules. A rule specifies the relationship between a set of inputs and outputs along with the necessary steps to derive the latter from the former.
Labels
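The name of a target is its label, which uniquely identifies it. A label for a target in the current workspace starts with //, and a label for a target in another repository adds an @repo prefix in front of it.
@repo//app/main:app_binary
Besides the optional repository name, each label has two parts: a package name (app/main) and a target name (app_binary).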
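To make the anatomy of a label concrete, here is a minimal sketch of a label parser in Node.js. The function parseLabel is hypothetical and handles only absolute labels (those containing //). It also applies Bazel's shorthand in which an omitted target name defaults to the last component of the package path.

```javascript
// Parses labels of the form [@repo]//package/path[:target].
function parseLabel(label) {
  const match = /^(?:@([^/]*))?\/\/([^:]*)(?::(.+))?$/.exec(label);
  if (!match) {
    throw new Error(`Not an absolute label: ${label}`);
  }
  const [, repo = null, pkg, target] = match;
  // When the target is omitted, use the last component of the package path.
  const name = target !== undefined ? target : pkg.split('/').pop();
  return { repo, package: pkg, target: name };
}

const full = parseLabel('@repo//app/main:app_binary');
console.log(full.repo, full.package, full.target); // repo app/main app_binary

const short = parseLabel('//app/main');
console.log(short.repo, short.package, short.target); // null app/main main
```

A label such as //:package.json parses to an empty package name, which refers to the package at the workspace root.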
Dependencies
Target X is a dependency of target Y if Y needs X at build or execution time. The dependency relation forms a directed acyclic graph (DAG) called the dependency graph, which is used to classify these dependencies further. You can read more about these types and their definitions.
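Because the dependency graph is acyclic, a build tool can always find an order in which every target is built after its dependencies. The following Node.js sketch shows one way to compute such an order with a depth-first topological sort. The function buildOrder and the labels in the example are hypothetical, and this is an illustration rather than Bazel's actual scheduler.

```javascript
// deps maps each target to the list of targets it depends on.
function buildOrder(deps) {
  const order = [];
  const state = new Map(); // target -> 'visiting' | 'done'

  function visit(target, trail) {
    if (state.get(target) === 'done') return;
    if (state.get(target) === 'visiting') {
      // Seeing a target again while it is still being visited means a cycle.
      throw new Error(`Dependency cycle: ${[...trail, target].join(' -> ')}`);
    }
    state.set(target, 'visiting');
    for (const dep of deps[target] || []) {
      visit(dep, [...trail, target]);
    }
    state.set(target, 'done');
    order.push(target); // all dependencies are already in `order`
  }

  for (const target of Object.keys(deps)) visit(target, []);
  return order;
}

const deps = {
  '//app:bin': ['//app:lib', '//util:log'],
  '//app:lib': ['//util:log'],
  '//util:log': [],
};
console.log(buildOrder(deps).join(', '));
// //util:log, //app:lib, //app:bin
```

If two targets depended on each other, the graph would no longer be a DAG and the function would throw instead of returning an order.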
Build Files
Build files contain the top-level program that declares the rules of a package. You update these files whenever the dependencies they describe change.
Bazel Basic Commands
You can check if you have Bazel installed on your system by running the version command.
$ bazel --version
bazel 4.0.0
The primary build command builds your project’s targets, and analyze-profile can analyze your builds. You can also remove output files and optionally stop the server using the clean command. You can get a list of all these basic commands along with their use cases by running:
$ bazel help
Usage: bazel <command> <options> ...
Available commands:
analyze-profile Analyzes build profile data.
...
How to Build a Monorepo With Bazel
Once you have Node.js installed on your system, you can run the following command to install Bazel globally through Bazelisk, a launcher that downloads and runs the appropriate Bazel version:
npm install -g @bazel/bazelisk
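You can also install iBazel, a file watcher that reruns Bazel when your sources change, to enable hot reloading. This lets you see your changes live. Install it either as a development dependency of your project or globally: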
npm install --save-dev @bazel/ibazel
npm install --global @bazel/ibazel
Configuring the bazel.rc File
You can write build options in a bazel.rc file to apply them on every build. You can share the same settings across your project by creating a tools/bazel.rc file in your Bazel workspace and checking it in. Note that recent Bazel versions read a .bazelrc file at the workspace root instead.
If you don’t want to share these settings, you can move the file to the root directory and add it to your .gitignore list instead. You can also personalize these settings locally by moving it to your home directory.
The following is a generic bazel.rc file that you can modify according to your needs:
###############################
# Directory structure #
###############################
# Artifacts are typically placed in a directory called "dist"
# Be aware that this setup will still create a bazel-out symlink in
# your project directory, which you must exclude from version control and your
# editor's search path.
build --symlink_prefix=dist/
###############################
# Output #
###############################
# A more useful default output mode for bazel query, which
# prints "ng_module rule //foo:bar" instead of just "//foo:bar".
query --output=label_kind
# By default, failing tests don't print any output, it's logged to a
# file instead.
test --test_output=errors
###############################
# Typescript / Angular / Sass #
###############################
# Make TypeScript and Angular compilation fast by keeping a few
# copies of the compiler running as daemons, and cache SourceFile
# ASTs to reduce parse time.
build --strategy=TypeScriptCompile=worker --strategy=AngularTemplateCompile=worker
# Enable debugging tests with --config=debug
test:debug --test_arg=--node_options=--inspect-brk --test_output=streamed --test_strategy=exclusive --test_timeout=9999 --nocache_test_results
Adding the buildifier Dependency to Your Project
Buildifier is a formatting tool that ensures all BUILD files are formatted in a similar fashion. It creates a standardized formatting for all your BUILD and .bzl files. It also has a linter out of the box to help you detect issues in your code and automatically fix them. You can add the buildifier dependency to your project either using npm:
npm install --save-dev @bazel/buildifier
or Yarn:
yarn add -D @bazel/buildifier
You will need the following scripts inside your package.json file to run buildifier:
"scripts": {
"bazel:format": "find . -type f \\( -name \"*.bzl\" -or -name WORKSPACE -or -name BUILD -or -name BUILD.bazel \\) ! -path \"*/node_modules/*\" | xargs buildifier -v --warnings=attr-cfg,attr-license,attr-non-empty,attr-output-default,attr-single-file,constant-glob,ctx-actions,ctx-args,depset-iteration,depset-union,dict-concatenation,duplicated-name,filetype,git-repository,http-archive,integer-division,load,load-on-top,native-build,native-package,out-of-order-load,output-group,package-name,package-on-top,positional-args,redefined-variable,repository-name,same-origin-load,string-iteration,unsorted-dict-items,unused-variable",
"bazel:lint": "yarn bazel:format --lint=warn",
"bazel:lint-fix": "yarn bazel:format --lint=fix"
}
Building/Compiling Code
In this example, you’ll start from a new empty directory and build and compile a simple Node.js application using Bazel. You’ll end up with the following structure:
WORKSPACE
BUILD.bazel
es5.babelrc
app.js
package-lock.json
package.json
Instead of manually configuring everything, you can use the following commands to get started:
npm init @bazel bazel_build_nodejs
Or if you’re using Yarn:
yarn create @bazel bazel_build_nodejs
The previous commands use @bazel/create under the hood to set up your monorepo with some minimal configurations. This means that it automatically creates package.json, WORKSPACE, and BUILD.bazel files for you.
The package.json looks just like the one created when you initialize any Node.js project using the npm init command. It contains some development-time dependencies and some scripts through which you can run your build.
Also notice how it automatically adds buildifier to your project so you can avoid manually setting it up. This is only a starting point though, and you may still need to configure buildifier yourself depending on the requirements of your project.
{
"name": "bazel_build_nodejs",
"version": "0.1.0",
"private": true,
"devDependencies": {
"@bazel/bazelisk": "latest",
"@bazel/ibazel": "latest",
"@bazel/buildifier": "latest"
},
"scripts": {
"build": "bazel build //...",
"test": "bazel test //..."
}
}
Let’s install these packages and a few more like Babel to transpile your JavaScript code.
npm install @babel/core @babel/cli @babel/preset-env
A package-lock.json file will also be automatically created for you. Your WORKSPACE.bazel file should look like this:
# Bazel workspace created by @bazel/create 3.4.1
# Declares that this directory is the root of a Bazel workspace.
# See https://docs.bazel.build/versions/master/build-ref.html#workspace
workspace(
# How this workspace would be referenced with absolute labels from another workspace
name = "bazel_build_nodejs",
# Map the @npm bazel workspace to the node_modules directory.
# This lets Bazel use the same node_modules as other local tooling.
managed_directories = {"@npm": ["node_modules"]},
)
# Install the nodejs "bootstrap" package
# This provides the basic tools for running and packaging Node.js programs in Bazel
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
name = "build_bazel_rules_nodejs",
sha256 = "a160d9ac88f2aebda2aa995de3fa3171300c076f06ad1d7c2e1385728b8442fa",
urls = ["https://github.com/bazelbuild/rules_nodejs/releases/download/3.4.1/rules_nodejs-3.4.1.tar.gz"],
)
# The npm_install rule runs npm anytime the package.json or package-lock.json file changes.
# It also extracts any Bazel rules distributed in an npm package.
load("@build_bazel_rules_nodejs//:index.bzl", "npm_install")
npm_install(
# Name this npm so that Bazel Label references look like @npm//package
name = "npm",
package_json = "//:package.json",
package_lock_json = "//:package-lock.json",
)
It basically tells Bazel where to pull the tools for running your project and fetches all the required rules to create a build. You also need to tell Bazel to use auto-generated rules. Add the following line to the top of your BUILD.bazel:
load("@npm//@babel/cli:index.bzl", "babel")
Let’s add a simple console statement inside app.js:
console.log('NodeJS Built using Bazel!');
Next, add the following code inside es5.babelrc to configure Babel for transpiling JavaScript code:
{
"sourceMaps": "inline",
"presets": [
[
"@babel/preset-env",
{
"modules": "systemjs"
}
]
]
}
Finally, you need to tell Bazel how to take JavaScript inputs and convert them to transpiled ES5 output. Add the following code inside the BUILD.bazel file after the previous load statement:
babel(
name = "compile",
data = [
"app.js",
"es5.babelrc",
"@npm//@babel/preset-env",
],
outs = ["app.es5.js"],
args = [
"app.js",
"--config-file",
"./$(execpath es5.babelrc)",
"--out-file",
"$(execpath app.es5.js)",
],
)
Run the following command to build and compile your JavaScript code:
npm run build
In case you run into an error, try renaming your WORKSPACE.bazel file to simply WORKSPACE. If all goes well, Bazel should report a successful build in your terminal.
You will see bazel-out and a dist directory where your output files will be present.
If you check inside dist/bin/app.es5.js, you should see your transpiled ES5 JavaScript code as shown:
System.register([], function (_export, _context) {
  "use strict";
  return {
    setters: [],
    execute: function () {
      console.log('NodeJS Built using Bazel!');
    }
  };
});
Setting Up Continuous Integration
Bazel recommends using container environments like the ngcontainer Docker image for continuous integration (CI). You can easily add specific CI settings using the build:ci or test:ci prefixes to your bazel.rc file.
If you’re using CircleCI, you can use this example as a reference. If you’re using GitLab, you can set up CI in minutes using the following scripts:
variables:
  BAZEL_DIGEST_VERSION: "f670e9aec235aa23a5f068566352c5850a67eb93de8d7a2350240c68fcec3b25" # Bazel 3.4.1

build:
  image:
    name: gcr.io/cloud-marketplace-containers/google/bazel@sha256:$BAZEL_DIGEST_VERSION
    entrypoint: [""]
  stage: build
  script:
    - bazel --output_base output build //main/...
  artifacts:
    paths:
      - bazel-bin/main/hello-world
  cache:
    key: $BAZEL_DIGEST_VERSION
    paths:
      - output
The above script defines the build outputs and cache directory, and it ensures immutability by pinning the Bazel image to a digest. The GitLab team has a dedicated article on this that serves as a good reference.
Downsides of Bazel and the Monorepo Pattern
The monorepo pattern is trendy these days, but there are some trade-offs you should be aware of. For a large and diverse team working on a monorepo, it might not be a great idea to expose every ounce of that codebase to novice developers.
Besides someone messing things up accidentally, keeping open access to all your config files, API keys, and so on might pose an issue from a security standpoint. On similar lines, you can understand why open-source projects aren’t living inside monorepos yet.
While Bazel definitely does some magic to ease this pain for developers, it doesn’t have a large open-source community backing it yet. Having all your source code in one place could also slow down the general process of approving pull requests and running the build scripts.
Bazel also promotes a strict demarcation between your dependencies and source code, while modern languages and frameworks have dedicated directories for bookkeeping dependencies. For instance, an npm project will always have its dependencies in a node_modules directory inside the root directory. Diverging from that pattern can present a steep learning curve, or at minimum an uncomfortable change.
Conclusion
Thanks to its well-structured configuration files and multi-language support, Bazel is a viable option for a large multi-language project deployed on multiple platforms. It’s fast, and you can even optimize your slow builds using your own build cache. Google has tried and tested Bazel’s core features to validate its stability, and its extensive documentation is some compensation for the small community.
If you’d like to explore further, you can build your own React or Angular app using Bazel to see how it treats different environments of the same language. You can also try out their tutorials for different languages to get a bigger picture of how Bazel works. And if today’s the day you’re welcoming Bazel into your project, definitely take a moment to familiarize yourself with its documented best practices.
If the benefits of Bazel look promising but the downsides prevent you from adopting it, then take a look at Earthly. It supports monorepo and polyrepos and has a gentler learning curve.