Doubling the Speed of Windows Firefox Builds using sccache-dist

I’m one of the many users but few developers of Firefox on Windows. One of the biggest obstacles stopping me from doing more development on Windows instead of this beefy Linux desktop I have sitting under my table is how slow builds are.

Luckily, distributed compilation (and caching) using sccache is here to help. This post is a step-by-step version of the rather-more-scattered docs I found on the github repo and in Firefox’s documentation. Those guides are excellent and have all of the same information (though they forgot to remind me to put the ports on the url config variables), but they have to satisfy many audiences with many platforms and many use cases so I found myself having to switch between all three to get myself set up.

To synthesize what I learned all in one place, I’m writing my Home Office Version to be specific to “using a Linux machine to help your Windows machine compile Firefox on a local network”. Here’s how it goes:

  1. Ensure the Build Scheduler (Linux-only), Build Servers (Linux-only), and Build Clients (any of Linux, MacOS, Windows) all have sccache-dist.
    • If you have a Firefox Build present, ./mach bootstrap already gave you a copy at .mozbuild/sccache/bin
    • My Build Scheduler and solitary Build Server are both the same Linux machine.
  2. Configure how the pieces all talk together by configuring the Scheduler.
    • Make a file someplace (I put mine in ~/sccache-dist/scheduler.conf) and put in the public-facing IP address of the scheduler (better be static), the method and secret that Clients use to authenticate themselves, and the method and secret that Servers use to authenticate themselves.
    • Keep the tokens and secret keys, y’know, secret.
# Don't forget the port, and don't use an internal iface address like 127.0.0.1.
# This is where the Clients and Servers should find the Scheduler
public_addr = "192.168.1.1:10600"

[client_auth]
type = "token"
# You can use whatever source of random, long, hard-to-guess token you'd like.
# But chances are you have openssl anyway, and it's good enough unless you're in
# a VM or other restrained-entropy situation.
token = "<whatever the output of `openssl rand -hex 64` gives you>"

[server_auth]
type = "jwt_hs256"
secret_key = "<whatever the output of `sccache-dist auth generate-jwt-hs256-key` is>"
  1. Start the Scheduler to see if it complains about your configuration.
    • ~/.mozconfig/sccache/sccache-dist scheduler –config ~/sccache-dist/scheduler.conf
    • If it fails fatally, it’ll let you know. But you might also want to have `–syslog trace` while we’re setting things up so you can follow the verbose logging with `tail -f /var/log/syslog`
  2. Configure the Build Server.
    • Ensure you have bubblewrap >= 0.3.0 to sandbox your build jobs away from the rest of your computer
    • Make a file someplace (I put mine in ~/sccache-dist/server.conf) and put in the public-facing IP address of the server (better be static) and things like where and how big the toolchain cache should be, where the Scheduler is, and how you authenticate the Server with the Scheduler.
# Toolchains are how a Linux Server can build for a Windows Client.
# The Server needs a place to cache these so Clients don’t have to send them along each time.
cache_dir = "/tmp/toolchains"
# You can also config the cache size with toolchain_cache_size, but the default of 10GB is fine.

# This is where the Scheduler can find the Server. Don’t forget the port.
public_addr = "192.168.1.1:10501"

# This is where the Server can find the Scheduler. Don’t forget http. Don’t forget the port.
# Ideally you’d have an https server in front that’d add a layer of TLS and
# redirect to the port for you, but this is Home Office Edition.
scheduler_url = "http://192.168.1.1:10600"

[builder]
type = "overlay" # I don’t know what this means
build_dir = "/tmp/build" # Where on the fs you want that sandbox of build jobs to live
bwrap_path = "/usr/bin/bwrap" # Where the bubblewrap 0.3.0+ binary lives

[scheduler_auth]
type = "jwt_token"
token = "<what sccache-dist auth generate-jwt-hs256-server-token --secret-key <that key from scheduler.conf> --server <the value in public_addr including port>"
  1. Start the Build Server
    • `sudo` is necessary for this part to satisfy bubblewrap
    • sudo ~/.mozbuild/sccache/sccache-dist server –config ~/sccache-dist/server.conf
    • I’m not sure if it’s just me, but the build server runs in foreground without logs. Personally, I’d prefer a daemon.
    • If your scheduler’s tracelogging to syslog, you should see something in /var/log about the server authenticating successfully. If you aren’t, we can query the whole build network’s status in Step 7.
  2. Configure the Build Client.
    • This config file needs to have a specific name and location to be picked up by sccache. On Windows it’s `%APPDATA%\Mozilla\sccache\config\config`.
    • In it you need to write down how the Client can find and authenticate itself with the Scheduler. On not-Linux you also need to specify the toolchains you’ll be asking your Build Servers to use to compile your code.
[dist]
scheduler_url = "http://192.168.1.1:10600" # Don’t forget the protocol or port
toolchain_cache_size = 5368709120 # The default of 10GB is at least twice as big as you need.

# Gonna need two toolchains, one for C++ and one for Rust
# Remember to replace all <user> with your user name on disk
[[dist.toolchains]]
type = "path_override"
compiler_executable = "C:/Users/<user>/.mozbuild/clang/bin/clang-cl.exe"
archive = "C:/Users/<user>/.mozbuild/clang-dist-toolchain.tar.xz"
archive_compiler_executable = "/builds/worker/toolchains/clang/bin/clang"

[[dist.toolchains]]
type = "path_override"
compiler_executable = "C:/Users/<user>/.rustup/toolchains/stable-x86_64-pc-windows-msvc/bin/rustc.exe"
archive = "C:/Users/<user>/.mozbuild/rustc-dist-toolchain.tar.xz"
archive_compiler_executable = "/builds/worker/toolchains/rustc/bin/rustc"

# Near as I can tell, these dist.toolchains blocks tell sccache
# that if a job requires a tool at `compiler_executable` then it should instead
# distribute the job to be compiled using the tool present in `archive` at
# the path within the archive of `archive_compiler_executable`.
# You’ll notice that the `archive_compiler_executable` binaries do not end in `.exe`.

[dist.auth]
type = "token"
token = "<the value of scheduler.conf’s client_auth.token>"
  1. Perform a status check from the Client.
    • With the Scheduler and Server both running, go to the Client and run `.mozbuild/sccache/sccache.exe –dist-status`
    • It will start a sccache “client server” (ugh) in the background and try to connect. Ideally you’re looking for a non-0 “num_servers” and non-0 “num_cpus”
  2. Configure mach to use sccache
    • You need to tell it that it has a ccache and to configure clang to use `cl` driver mode (because when executing compiles on the Build Server it will see it’s called `clang` not `clang-cl` and thus forget to use `cl` mode unless you remind it to)
# Remember to replace all <user> with your user name on disk
ac_add_options CCACHE="C:/Users/<user>/.mozbuild/sccache/sccache.exe"

export CC="C:/Users/<user>/.mozbuild/clang/bin/clang-cl.exe --driver-mode=cl"
export CXX="C:/Users/<user>/.mozbuild/clang/bin/clang-cl.exe --driver-mode=cl"
export HOST_CC="C:/Users/<user>/.mozbuild/clang/bin/clang-cl.exe --driver-mode=cl"
export HOST_CXX="C:/Users/<user>/.mozbuild/clang/bin/clang-cl.exe --driver-mode=cl"
  1. Run a test build
    • Using the value of “num_cpus” from Step 7’s `–dist-status`, run `./mach build -j<num_cpus>`
    • To monitor if everything’s working, you have some choices
      • You can look at network traffic (expect your network to be swamped with jobs going out and artefacts coming back)
      • You can look at resource-using processes on the Build Server (you can use `top` to watch the number of `clang` processes)
      • If your Scheduler or Server is logging, you can `tail -f /var/log/syslog` to watch the requests and responses in real time

Oh, dang, I should manufacture a final step so it’s How To Speed Up Windows Firefox Builds In Ten Easy Steps (if you have a fast Linux machine and network). Oh well.

Anyhoo, I’m not sure if this is useful to anyone else, but I hope it is. No doubt your setup is less weird than mine somehow so you’ll be better off reading the general docs instead. Happy Firefox developing!

:chutten

List of 14 Project FOG Proposals including C++ API, Documentation Design, and Glean API Frontend for Firefox Telemetry among others.

This Week in Glean: Proposals for Asynchronous Design

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)

At last count there are 14 proposals for Firefox on Glean, the effort that, last year, brought the Glean SDK to Firefox Desktop. What in the world is a small, scrappy team in a small, scrappy company like Mozilla doing wasting so much time with old-school Waterfall Model overhead?!

Because it’s cheaper than the alternative.

Design is crucial before tackling difficult technological problems that affect multiple teams. At the very least you’re writing an API and you need to know what people want to do with it. So how do you get agreement? How do you reach the least bad design in the shortest time?

We in the Data Org use a Proposal Process. It’s a very lightweight thing. You write down in a (sigh) Google Doc what it is you’re proposing (we have a snazzy template), attach it to a bug, then needinfo folks who should look at it. They use Google Docs’ commenting and suggested changes features to improve the proposal in small ways and discuss it, and use Bugzilla’s comments and flags to provide overall feedback on the proposal itself (like, should it even exist) and to ensure they keep getting reminded to look at the proposal until the reviewer’s done reviewing. All in all, it’ll take a week or two of part-time effort to write the proposal, find the right people to review it, and then incorporate the feedback and consider it approved.

(( Full disclosure, the parts involving Bugzilla are my spin on the Proposal Process. It just says you should get feedback, not how. ))

Proposals vs Meetings

Why not use a meeting? Wouldn’t that be faster?

Think about who gets to review things in a meeting as a series of filters. First and foremost, only those who attend can review. I’ve talked before about how distributed across the globe my org is, and a lot of the proposals in Project FOG also needed feedback from subject matter experts across Mozilla as a whole (we are not jumping into the XPIDL swamp without a guide). No way could I find a space in all those calendars, assuming that any of them even overlap due to time zones.

Secondly, with a defensive Proposer, feedback will be limited to those reviewers they can’t overpower in a meeting. So if someone wants to voice a subtle flaw in the C++ Metrics API Design (like how I forgot to include any details about how to handle Labeled Metrics), they have to first get me to stop talking. And even though I’m getting better at that (still a ways to go), if you are someone who doesn’t feel comfortable providing feedback in a meeting (perhaps you’re new and hesitant, or you only kinda know about the topic and are worried about looking foolish, or you are generally averse to speaking in front of others) it won’t matter how quiet I am. The proposal won’t be able to benefit from your input.

Thirdly, some feedback can’t be thought of in a meeting. There’s a rough-and-readiness, an immediacy, to feedback in a meeting setting. You’re thinking on your feet, even if the Proposal and meeting agenda are set well in advance. Some critiques need time to percolate, or additional critical voices to bounce off of. Meetings aren’t great for that unless you can get everyone in a room for a day. Pandemic aside, when was the last time you all had that much time?

Proposal documents are just so much more inclusive than design meetings. You probably still want to have a meeting for early prototyping with a small group of insiders, and another at the end to coax out any lingering doubts… but having the main review stages be done asynchronously to your reviewers’ schedules allows you to include a wider variety of voices. You wouldn’t feel comfortable asking a VP to an hour-long design meeting, but you might feel comfortable sending the doc in an email for visibility.

Asynchronicity For You and Me

On top of being more inclusive, proposals are also more respectful. I don’t know what your schedule is today. I don’t know what life you’re living. But I can safely assume that, unless you’re on vacation, you’ll have enough time between now and, say, next Friday to skim a doc and see if there’s anything foolish in it you need to stop me from doing. Or think of someone else who I didn’t think of who should really take a look.

And by setting a feedback deadline, you the Proposer are setting yourself free. You’ll be getting emails as feedback comes in. You’ll be responding to questions, accepting and rejecting changes, and having short little chats. But you can handle that in bite sized chunks on your own schedule, asynchronously, and give yourself the freedom to schedule synchronous work and meetings in the meantime.

Proposal Evolution

Name a Design that was implemented exactly as written. Go on, I’ll wait.

No? Can’t think of one? Neither can I.

Designs (and thus Proposals) are always incomplete. They can’t take into consideration everything. They’re necessarily at a higher level than the implementation. So in some way, the implementation is the evolution of the Design. But implementations lose the valuable information about Why and How that was so important to set down in the Design. When someone new comes to the project and asks you why we implemented it this way, will you have to rely on the foggy remembrance of oral organizational history? Or will you find some way of keeping an objective record?

Only now have we started to develop the habit of indexing and archiving Proposals internally. That’s how I know there’s been fourteen Project FOG proposals (so far). But I don’t think a dusty wiki is the correct place for them.

I think, once accepted, Proposals should evolve into Documentation. Documentation is a Design adjusted by the realities encountered during implementation and maintained by users asking questions. Documentation is a living document explaining Why and How, kept in sync with the implementation’s explanation of What.

But Documentation is a discussion for another time. Reference Documentation vs User Guides vs Design Documentation vs Marketing Copy vs… so much variety, so little time. And I’ve already written too much.

:chutten