Byte-Sized Design

Knowing When to Stop Engineering: Airbnb’s Hardest Lesson

Tens of millions of lines of code. 700 services. 450 data pipelines. 4.5 years of migration. And the thing that could have cut the timeline in half was knowing when to stop engineering.

Byte-Sized Design
Feb 01, 2026

5x faster local builds. 3x faster IntelliJ syncs. 3x faster deploys to dev. Build satisfaction jumping from 38% to 68%.

Those are the numbers. They’re impressive. And it took Airbnb 4.5 years to get there.

With hindsight, they could have gotten there a lot sooner. Not by being smarter about Bazel. By being smarter about when to optimize.

Let’s get into it.


🚨 Why Gradle Was Killing Them

Gradle’s single-threaded configuration was a ticking clock. Large projects took minutes just to configure before a single line of code compiled. On CI, they were already vertically scaling to the biggest machines AWS offered. The sharding heuristics they built to split work across machines were leaking efficiency everywhere: machines sat half-idle while shared tasks were duplicated across nodes.

But speed was only half the problem.

Gradle tasks had full access to the file system. Sounds fine until one engineer writes a cleanup task that wipes recent files in /tmp/. That task races with every other Gradle task using /tmp/. CI starts failing at scale. Thousands of tasks have to rerun. Nobody catches it until it’s already in production.

This was not a one-off. It was structural. Gradle gave tasks too much trust, and at the scale of tens of millions of lines of code, trust becomes a liability.


🔍 What Bazel Actually Fixed

Sandboxing killed the ghost dependencies. If a file isn’t declared as an input to a build action, it doesn’t exist. Period. That /tmp/ race condition? Can’t happen. Undeclared dependencies that work on your laptop but fail in CI? Gone.
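To make the "if it isn't declared, it doesn't exist" idea concrete, here is a minimal sketch in Python. It is not how Bazel implements sandboxing (this version simply copies files into a scratch directory), and `run_sandboxed` and `concat_action` are hypothetical names invented for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def run_sandboxed(workspace: Path, declared_inputs: list[str], action):
    """Run `action` in a scratch directory that holds only the declared inputs.

    Illustrative assumption: isolation is done by copying files. The point is
    only that undeclared files are not visible to the action.
    """
    with tempfile.TemporaryDirectory() as scratch:
        sandbox = Path(scratch)
        for rel in declared_inputs:
            dest = sandbox / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(workspace / rel, dest)
        # The action only ever sees paths under the sandbox root.
        return action(sandbox)


def concat_action(sandbox: Path) -> str:
    # Hypothetical build step that needs both a.txt and b.txt.
    return (sandbox / "a.txt").read_text() + (sandbox / "b.txt").read_text()
```

If `concat_action` runs with only `a.txt` declared, it fails with `FileNotFoundError` on every machine, not just in CI. The missing dependency surfaces the first time the action runs instead of hiding behind whatever happens to be on a laptop.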

Remote execution changed the math entirely. Instead of sharding builds across a handful of machines with heuristics, Bazel fanned out to thousands of parallel actions. RBE workers are short-lived: they spin up, do work, and die. No machine sits idle. No duplicated shared tasks. And Build without the Bytes meant only downloading the subset of outputs you actually need, not every cached artifact.
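The difference between sharding and fanning out is easier to see in code. The sketch below is a rough illustration, not Bazel's scheduler: a local thread pool stands in for remote workers, and `run_actions` and the action names are hypothetical. Each action starts as soon as its dependencies finish, and a shared dependency runs exactly once.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_actions(deps: dict[str, set[str]], run, max_workers: int = 8) -> list[str]:
    """Run every action once, starting each as soon as its dependencies finish.

    `deps` maps an action name to the names it depends on. Returns the names
    in the order they completed.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    finished: list[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        in_flight = {}
        while sorter.is_active():
            # Submit everything whose dependencies are already satisfied.
            for name in sorter.get_ready():
                in_flight[pool.submit(run, name)] = name
            done, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any failure from the action
                name = in_flight.pop(future)
                sorter.done(name)  # unblocks dependents
                finished.append(name)
    return finished
```

With a graph where `lib_a` and `lib_b` both depend on `core`, `core` runs once, then both libraries run in parallel. There is no heuristic deciding which machine owns which slice of the build; the dependency graph itself drives the schedule.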

Starlark forced discipline. Bazel’s configuration language is constrained to be side-effect-free. That’s not a limitation; it’s what makes parallel analysis possible. Gradle’s configuration phase ran on a single thread. Starlark’s constraints are what make it safe to run the equivalent step in parallel.
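Why does side-effect-free matter so much? A small Python sketch shows the property. This is not Starlark and not Bazel's analysis phase; `analyze_package` and the package names are made up for illustration. Because the function depends only on its arguments, evaluating packages one by one or all at once gives the same answer.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_package(name: str, sources: tuple[str, ...]) -> dict:
    """Pure function: the result depends only on the arguments.

    It reads no files, touches no globals, and mutates nothing shared.
    """
    return {"target": f"//{name}:lib", "srcs": sorted(sources)}


# Hypothetical packages, used only for this example.
PACKAGES = {
    "search": ("b.java", "a.java"),
    "payments": ("x.java",),
}


def analyze_sequential() -> dict:
    return {name: analyze_package(name, srcs) for name, srcs in PACKAGES.items()}


def analyze_parallel() -> dict:
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(analyze_package, name, srcs)
            for name, srcs in PACKAGES.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

The moment `analyze_package` is allowed to write to a shared variable or the file system, the order of evaluation starts to matter and the parallel version is no longer safe. Banning side effects removes that whole class of problem.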

The results landed hard: 3–5x faster local builds, build satisfaction scores jumping from 38% to 68%, and CI times that actually made developers feel productive again.


🏗️ How They Actually Did It (The Parts That Matter)
