Anatomy of a Firefox Update

Alessio (:Dexter) recently landed a new ping for Firefox 56: the “update” ping with reason “ready”. It lets us know when a client’s Firefox has downloaded and installed an update and is only waiting for the user to restart the browser for the update to take effect.

In Firefox 57 he added a second reason for the “update” ping: reason “success”. This lets us know when the user’s started their newly-updated Firefox.

I thought I might as well see what sort of information we could glean from this new data, using the recent shipping of the new Firefox Quantum Beta as a case study.

This is exploratory work and you know what that means[citation needed]: Lots of pretty graphs!

First: the data we knew before the “update” ping: Nothing.

Well, nothing specific. We would know when a given client would use a newly-released build because their Telemetry pings would suddenly have the new version number in them. Whenever the user got around to sending them to us.

We do have data about installs, though. Our stub installer lets us know how and when installs are downloaded and applied. We compile those notifications into a dataset called download_stats. (For anyone who’s interested: this particular data collection isn’t Telemetry. These data points are packaged and sent in different ways.) Its data looks like this:

[Figure: Recent Beta Downloads (screenshot, 2017-9-29)]

Whoops. Well that ain’t good.

On the left we have the trailing edge of users continuing to download installs for Firefox Beta 56 at the rate of 50-150 per hour… and then only a trace level of Firefox Beta 57 after the build was pushed.

It turns out that the stub installer notifications were being rejected as malformed. Luckily we kept the malformed reports around so that after we fixed the problem we could backfill the dataset:

[Figure: Recent Beta Downloads (screenshot, 2017-10-4)]

Now that’s better. We can see up to 4000 installs per hour of users migrating to Beta 57, with distinct time-of-day effects. Perfectly cromulent, though the volume seems a little low.

But that’s installs, not updates.

What do we get with “update” pings? Well, for one, we can run queries rather quickly. Querying “main” pings to find the one where a user switched versions requires sifting through terabytes of data. The query below took two minutes to run:

[Figure: Users Updating to Firefox Quantum Beta 57, per hour (screenshot, 2017-10-3)]

The red line is update/ready: the number of pings we received in that hour telling us that the user had downloaded an update to Beta 57 and it was ready to go. The blue line is update/success: the number of pings we received that hour telling us the user had started their new Firefox Quantum Beta instance.
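For readers who want to see what sits behind those two lines, here is a minimal sketch of the bucketing, not the query that actually produced the graph. It assumes each ping has been reduced to a dict with a "received" timestamp and a "reason" field; those field names, the `hourly_counts` helper, and the sample records are all made up for illustration.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(pings):
    """Count pings per (hour, reason) bucket.

    Each ping is assumed to be a dict with an ISO-8601 "received"
    timestamp and a "reason" of "ready" or "success". These field
    names are hypothetical, not the real "update" ping schema.
    """
    counts = Counter()
    for ping in pings:
        received = datetime.fromisoformat(ping["received"])
        # Truncate the timestamp to the start of its hour.
        hour = received.replace(minute=0, second=0, microsecond=0)
        counts[(hour, ping["reason"])] += 1
    return counts


# Made-up sample data, only to show the shape of the result.
sample = [
    {"received": "2017-09-29T22:05:00", "reason": "ready"},
    {"received": "2017-09-29T22:40:00", "reason": "ready"},
    {"received": "2017-09-29T22:55:00", "reason": "success"},
    {"received": "2017-09-29T23:10:00", "reason": "success"},
]

for (hour, reason), n in sorted(hourly_counts(sample).items()):
    print(hour.isoformat(), reason, n)
```

Plotting the "ready" buckets and the "success" buckets as separate series gives the two lines. Truncating to the minute instead of the hour would give a per-minute view.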

And here it is per-minute, just because we can:

[Figure: Users Updating to Firefox Quantum Beta 57, per minute (screenshot, 2017-10-3)]

September 30 and October 1 were the weekend. As such, we’d expect their volumes to be lower than the weekdays surrounding them. However, looking at the per-minute graph for update/ready (red), why is Friday the 29th the same height as Saturday the 30th? Fridays are usually noticeably busier than Saturdays.

Friday was Navaratri in India (one of our largest markets for Beta) but that’s a multi-day festival that started on the Wednesday (and other sources for client data show only a 15% or so dip in user activity on that date in India), so it’s unlikely to have caused a single day’s dip. Friday wasn’t a holiday at all in any of our other larger markets. There weren’t any problems with the updater or “update” ping ingestion. There haven’t been any dataset failures that would explain it. So what gives?

It turns out that Friday’s numbers weren’t low: Saturday’s were high. In order to improve the stability of what was going to become the Firefox 56 release we began on the 26th to offer updates to the new Firefox Quantum Beta to only half of updating Firefox Beta users. To the other half we offered an update to the Firefox 56 Release Candidate.

What is a Release Candidate? Well, for Firefox it is the stabilized, optimized, rebuilt, rebranded version of Firefox that is just about ready to ship to our release population. It is the last chance we have to catch things before it reaches hundreds of millions of users.

It wasn’t until late on the 29th that we opened the floodgates and let the rest of the Beta users update to Beta 57. This contributed to a higher than expected update volume on the 30th, allowing the Saturday numbers to be nearly as voluminous as the Friday ones. You can actually see exactly when we made the change: there’s a sharp jump in the red line late on September 29 that you can see clearly on both “update”-ping-derived plots.

That’s something we wouldn’t see in “main” pings: they only report what version the user is running, not what version they downloaded and when. And that’s not all we get.

The “update”-ping-fueled graphs have two lines. This rather abruptly piques my curiosity about how they might relate to each other. Visually, the update/ready line (red) is almost always higher than the update/success line (blue). This means that we have more clients downloading and installing updates than we have clients restarting into the updated browser in those intervals. We can count these clients by subtracting the blue line from the red and summing over time:

[Figure: Outstanding Updates for Users Updating to Firefox Quantum Beta 57 (screenshot, 2017-10-3)]
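The subtract-and-sum step can be sketched in a few lines. This is an illustration under assumptions, not the query behind the graph: it takes two equal-length lists of per-interval counts, and it treats each client as sending at most one ping per reason, so ping counts stand in for client counts. The `outstanding_updates` name and the numbers are invented.

```python
from itertools import accumulate


def outstanding_updates(ready, success):
    """Running total of (ready - success) over aligned intervals.

    `ready` and `success` are per-interval ping counts covering the
    same intervals in the same order. The result at index i estimates
    how many clients hold a downloaded update but have not yet
    restarted into it by the end of interval i.
    """
    if len(ready) != len(success):
        raise ValueError("ready and success must cover the same intervals")
    return list(accumulate(r - s for r, s in zip(ready, success)))


# Made-up hourly counts, only to show the arithmetic.
ready_per_hour = [120, 300, 250, 90]
success_per_hour = [20, 180, 260, 110]

print(outstanding_updates(ready_per_hour, success_per_hour))
# → [100, 220, 210, 190]
```

The running total falls in any interval where more clients restart than download, which is why the curve can dip even while updates are still rolling out.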

There are, as of the time I was drafting this post, about one half of one million Beta clients who have the new Firefox Quantum Beta… but haven’t run it yet.

Given the delicious quantity of improvements in the new Firefox Quantum Beta, they’re in for a pleasant surprise when they do.

And you can join in, if you’d like.


(NOTE: earlier revisions of this post erroneously said download_stats counted updater notifications. It counts stub installer notifications. I have reworded the post to correct for this error. Many thanks to :ddurst for catching that)