Three-Year Moziversary

Another year at Mozilla. They certainly don’t slow down the more you have of them.

For once a year of stability, organization-wise. The two biggest team changes were the addition of Jan-Erik back on March 1, and the loss of our traditional team name “Browser Measurement II” for a more punchy and descriptive “Firefox Telemetry Team.”

I will miss good ol’ BM2, though it is fun signing off notification emails with “Your Friendly Neighbourhood Firefox Telemetry Team (:gfritzsche, :janerik, :Dexter, :chutten)”

We’re actually in the market for a Mobile Telemetry Engineer, so if you or someone you know might be interested in hanging out with us and having their username added to the above, please take a look right here.

In blogging velocity I think I kept up my resolution to blog more. I’m up to 32 posts so far in 2018 (compared to year totals of 15, 26, and 27 in 2015-2017) and I have a few drafts kicking in the bin that ought to be published before the end of the year. Part of this is due to two new blogging efforts: So I’ve Finished (a series of posts about video games I’ve completed), and Ford The Lesser (a series summarizing the deeds and tone of the new Ontario Provincial Government). Neither are particularly mozilla-related, though, so I’m not sure if the count of work blogposts has changed.

Thinking back to work stuff, let’s go chronologically. Last November we released Firefox Quantum. It was and is a big deal. Then in December all hands went to Austin, Texas.

Electives happened again so I did a reprise of Death-Defying Stats (where I stand up and solve data questions, Live On Stage). We saw Star Wars: The Last Jedi (I’m not sure why the internet didn’t like it. I thought it was grand. Though the theatre ruined the impact of That One Scene by letting us know that no, the sound didn’t actually cut out it was deliberate. Yeesh). We partied at a pseudo-historical southwestern US town, drinking warm beverages out of gigantic branded mugs we got to take home afterwards.

Then we launched straight into 2018. Interviews increased to a crushing density for the role that was to become Jan-Erik’s and for two interns: one (Erin Comerford) working on redesigns for the venerable Telemetry Measurement Dashboards, and another (Agnieszka Ciepielewska) working on automatic change detection and explanation for Telemetry metrics.

In June we met again in San Francisco, but this time without Georg who was suffering a cold. Sunah and I gave a talk about Event Telemetry, Steak Club met again, and we got to mess around with science stuff at the Exploratorium.

This year’s Summer Big Project… y’know, there were a few of them. The first was the Event Telemetry “event” ping. Then there was the Measurement Dashboard redesign project where I ended up mentoring more than I expected due to PTO and timezones.

Also in the summer I was organizing and then going on a trip to celebrate a different anniversary (my tenth wedding anniversary) for nearly the entire month of July.

In August the team met in Berlin, and this time I was able to join in. That was a fun and productive time where we settled matters of team identity, ownership, process, and developed some delightful in-jokes to perplex anyone not in the in-group. I acted as an arm of Ontario Craft Beer Tourism by importing a few local cans (Waterloo Dark and Mad & Noisy Lagered Ale) while asking (well-intentioned but numerous and likely ignorant) questions about European life and politics and food and history and …

And that brings us more or less to now. September was busy. October is busy. I’m helping :frank put authentication on the old Measurement Dashboards so we can put release-channel data back up there without someone taking it and misinterpreting it. (As an org we’ve made the conscious decision to use our public data in a deliberate fashion to support truthful narratives about our products and our users. Like on the Firefox Public Data Report.) I’m looking into how we might take what we learned with Erin’s redesign prototype and productionize it with real data. I’m also improving documentation and consulting with a variety of teams on a variety of data things.

So, resolutions for the next twelve months? Keep on keeping on, I guess. I’m happy with the progress I have made this past year. I’m pleased with the direction our team and the broader org is headed. I’m interested to see where time and effort will take us.

:chutten

 

Advertisements

Going from New Laptop to Productive Mozillian

laptopStickers

My old laptop had so many great stickers on it I didn’t want to say goodbye. So I put off my hardware refresh cycle from the recommended 2 years to almost 3.

To speak the truth it wasn’t only the stickers that made me wary of switching. I had a workflow that worked. The system wasn’t slow. It was only three years old.

But then Windows started crashing on me during video calls. And my Firefox build times became long enough that I ported changes to my Linux desktop before building them. It was time to move on.

Of course this opened up a can of worms. Questions, in order that they presented themselves, included:

Should I move to Mac, or stick with Windows? My lingering dislike for Apple products and complete unfamiliarity with OSX made that choice easy.

Of the Windows laptops, which should I go for? Microsoft’s Surface lineup keeps improving. I had no complaints from my previous Lenovo X1 Carbon. And the Dell XPS 15 and 13 were enjoyed by several of my coworkers.

The Dells I nixed because I didn’t want anything bigger than the X1 I was retiring, and because the webcam is positioned at knuckle-height. I felt wary of the Surfacebooks due to the number that mhoye had put in the ground due to manufacturing defects. Yes, I know he has an outsized effect on hardware and software. It really only served to highlight how much importance I put on familiarity and habit.

X1 Carbon 6th Generation it is, then.

So I initiated the purchase order. It would be sent to Mozilla Toronto, the location charged with providing my IT support, where it would be configured and given an asset number. Then it would be sent to me. And only then would the work begin in setting it up so that I could actually get work done on it.

First, not being a fan of sending keypresses over the network, I disabled Bing search from the Start Menu by setting the following registry keys:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Search
BingSearchEnabled dword:00000000
AllowSearchToUseLocation dword:00000000
CortanaConsent dword:00000000

Then I fixed some odd defaults in Lenovo’s hardware. Middle-click should middle-click, not enter into a scroll. Fn should need to be pressed to perform special functions on the F keys (it’s like FnLock was default-enabled).

I installed all editions of Firefox. Firefox Beta installed over the release-channel that came pre-installed. Firefox Developer Edition and Nightly came next and added their own icons. I had to edit the shortcuts for each of these individually on the Desktop and in the Quick Launch bar to have -P --no-remote arguments so I wouldn’t accidentally start the wrong edition with the wrong profile and lose all of my data. (This should soon be addressed)

In Firefox Beta I logged in to sync to my work Firefox Account. This brought me 60% of the way to being useful right there. So much of my work is done in the browser, and so much of my browsing experience can be brought to life by logging in to Firefox Sync.

The other 40% took the most effort and the most time. This is because I want to be able to compile Firefox on Windows, for my sins, and this isn’t the most pleasant of experiences. Luckily we have “Building Firefox for Windows” instructions on MDN. Unluckily, I want to use git instead of mercurial for version control.

  1. Install mozilla-build
  2. Install Microsoft Visual Studio Community Edition (needed for Win10 SDKs)
  3. Copy over my .vimrc, .bashrc, .gitconfig, and my ssh keys into the mozilla-build shell environment
  4. Add exclusions to Windows Defender for my entire development directory in an effort to speed up Windows’ notoriously-slow filesystem speeds
  5. Install Git for Windows
  6. Clone and configure git-cinnabar for working with Mozilla’s mercurial repositories
  7. Clone mozilla-unified
    • This takes hours to complete. The download is pretty quick, but turning all of the mercurial changesets into git commits requires a lot of filesystem operations.
  8. Download git-prompt.sh so I can see the current branch in my mozilla-build prompt
  9.  ./mach bootstrap
    • This takes dozens of minutes and can’t be left alone as it has questions that need answers at various points in the process.
  10. ​./mach build
    • This originally failed because when I checked out mozilla-unified in Step 7 my git used the wrong line-endings. (core.eol should be set to lf and core.autocrlf to false)
    • Then it failed because ./mach bootstrap downloaded the wrong rust std library. I managed to find rustup in ~/.cargo/bin which allowed me to follow the build system’s error message and fix things
  11. Just under 50min later I have a Firefox build

And that’s not all. I haven’t installed the necessary tools for uploading patches to Mozilla’s Phabricator instance so they can undergo code review. I haven’t installed Chrome so I can check if things are broken for everyone or just for Firefox. I haven’t cloned and configured the frankly-daunting number of github repositories in use by my team and the wider org.

Only with all this done can I be a productive mozillian. It takes hours, and knowledge gained over my nearly-3 years of employment here.

Could it be automated? Technologically, almost certainly yes. The latest mozilla-build can be fetched from a central location. mozilla-unified can be cloned using the version control setup of choice. The correct version of Visual Studio Community can be installed (but maybe not usably given its reliance on Microsoft Accounts). We might be able to get all the way to a working Firefox build from a recent checkout of the source tree before the laptop leaves IT’s hands.

It might not be worth it. How many mozillians even need a working Firefox build, anyway? And how often are they requesting new hardware?

Ignoring the requirement to build Firefox, then, why was the laptop furnished with a release-channel version of Firefox? Shouldn’t it at least have been Beta?

And could this process of setup be better documented? The parts common to multiple teams appear well documented to begin with. The “Building Firefox on Windows” documentation on MDN is exceedingly clear to work with despite the frightening complexity of its underpinnings. And my team has onboarding docs focused on getting new employees connected and confident.

Ultimately I believe this is probably as simple and as efficient as this process will get. Maybe it’s a good thing that I only undertook this after three years. That seems like a nice length of time to amortize the hours of cost it took to get back to productive.

Oh, and as for the stickers… well, Mozilla has a program for buying your own old laptop. I splurged and am using it to replace my 2009 Aspire Revo to connect to my TV and provide living room computing. It is working out just swell.

:chutten

Canadian Holiday Inbound! Thanksgiving 2018 (Monday, October 8)

Monday is Thanksgiving in Canada[1], so please excuse your Canadian colleagues for not being in the office.

We’ll likely be spending the day wondering. We’ll be wondering how family could make such a mess, wondering why we ate so much pie, wondering if it’s okay to eat turkey for breakfast, wondering if pie can be a meal and dessert at the same time, wondering how we fit the leftovers in the fridge, wondering why we bothered hosting this year, wondering whose sock that is by the stairs, wondering when the snow will melt[2] or start to fall[3].

We’ll also be wondering who started the family tradition of having cornbread instead of buttered rolls, wondering where the harvest tradition began, wondering about what all goes into harvesting our food, wondering what it means to be thankful, wondering what we are thankful for, wondering why we ate the evening meal at 4pm, wondering whether 4pm is too late to have a nap.

With heads full of wondering and bellies full of food, we wish you a wonderful Thanksgiving. We’ll be back to work, if not our normal shapes, on Tuesday.

:chutten

PS: Canadian Pro-tip: Leftover food often turns into regret – but this regret can turn back into food if you leave it in the fridge for a little while!

[1]: https://mana.mozilla.org/wiki/display/PR/Holidays%3A+Canada
[2]: Calgary had a (record) snowfall of 32.8cm (1’1″) on Oct 2: https://www.cbc.ca/news/canada/calgary/calgary-october-snow-day-two-1.4848394
[3]: Snow’s a-coming, already or eventually: https://weather.gc.ca/canada_e.html

Distributed Teams: Regional Holidays

Today is German Unity Day, Germany’s National Day. Half of my team live in Berlin, so I vaguely knew they wouldn’t be around… but I’d likely have forgotten if not for a lovely tradition of “Holiday Inbound” emails at Mozilla.

Mozilla is a broadly-distributed organization with employees in dozens of countries worldwide. Each of these countries have multiple days off to rest or celebrate. It’s tough to know across so many nations and religions and cultures exactly who will be unable to respond to emails on exactly which days.

So on the cusp of a holiday it is tradition in Mozilla to send a Holiday Inbound email to all Mozilla employees noting that the country you’re trying to reach can’t come to the phone right now, so please leave a message at the tone.

More than just being a bland notification some Mozillians take the opportunity to explain the history and current significance of the event being celebrated. I’ve taken a crack at explaining the peculiarly-Canadian holiday of Christmas (pronounced [kris-muhs]) in the past.

Sometimes you even get some wonderful piece of alternate history like :mhoye’s delightful, 50% factual exploration of the origins of Canadian Labour Day 2016.

I delight in getting these notifications from our remotees and offices worldwide. It really brings us closer together through understanding, while simultaneously highlighting just how different we all are.

Maybe I should pen a Holiday Inbound email about Holiday Inbound emails. It would detail the long and fraught history of the tradition in a narrative full of villains and heroes and misspellings and misunderstandings…

Or maybe I should just try to get some work done while my German colleagues are out.

:chutten

The End of Firefox Windows XP Support

Firefox 62 has been released. Go give it a try!

At the same time, on the Extended Support Release channel, we released Firefox ESR 60.2 and stopped supporting Firefox ESR 52: the final version of Firefox with Windows XP support.

Now, we don’t publish all-channel user proportions grouped by operating system, but as part of the Firefox Public Data Report we do have data from the release channel back before we switched our XP users to the ESR channel. At the end of February 2016, XP users made up 12% of release Firefox. By the end of February 2017, XP users made up 8% of release Firefox.

If this trend continued without much change after we switched XP users to ESR, XP Firefox users would presently amount to about 2% of release users.

That’s millions of users we kept safe on the Internet despite running a nearly-17-year-old operating system whose last patch was over 4 years ago. That’s a year and a half of extra support for users who probably don’t feel they have much ability to protect themselves online.

It required effort, and it required devoting resources to supporting XP well after Microsoft stopped doing so. It meant we couldn’t do other things, since we were busy with XP.

I think we did a good thing for these users. I think we did the right thing for these users. And now we’re wishing these users the very best of luck.

…and that they please oh please upgrade so we can go on protecting them into the future.

:chutten

 

Data Science is Hard: Counting Users

Screenshot_2018-08-29 User Activity Firefox Public Data Report

Counting is harder than you think. No, really!

Intuitively, as you look around you, you think this can’t be true. If you see a parking lot you can count the cars, right?

But do cars that have left the parking lot count? What about cars driving through it without stopping? What about cars driving through looking for a space? (And can you tell the difference between those two kinds from a distance?)

These cars all count if you’re interested in usage. It’s all well and good to know the number of cars using your parking lot right now… but is it lower on weekends? Holidays? Are you measuring on a rainy day when fewer people take bicycles, or in the Summer when more people are on vacation? Do you need better signs or more amenities to get more drivers to stop? Are you going to have expand capacity this year, or next?

Yesterday we released the Firefox Public Data Report. Go take a look! It is the culmination of months of work of many mozillians (not me, I only contributed some early bug reports). In it you can find out how many users Firefox has, the most popular addons, and how quickly Firefox users update to the latest version. And you can choose whether to look at how these plots look for the worldwide user base or for one of the top ten (by number of Firefox users) countries individually.

It’s really cool.

The first two plots are a little strange, though. They count the number of Firefox users over time… and they don’t agree. They don’t even come close!

For the week including August 17, 2018 the Yearly Active User (YAU) count is 861884770 (or about 862M)… but the Monthly Active User (MAU) count is 256092920 (or about 256M)!

That’s over 600M difference! Which one is right?

Well, they both are.

Returning to our parking lot analogy, MAU is about counting how many cars use the parking lot over a 28-day period. So, starting Feb 1, count cars. If someone you saw earlier returns the next day or after a week, don’t count them again: we only want unique cars. Then, at the end of the 28-day period, that was the MAU for Feb 28. The MAU for Mar 1 (on non-leap-years) is the same thing, but you start counting on Feb 2.

Similarly for YAU, but you count over the past 365 days.

It stands to reason that you’ll see more unique cars over the year than you will over the month: you’ll see visitors, tourists, people using the lot just once, and people who have changed jobs and haven’t been back in four months.

So how many of these 600M who are in the YAU but not in the MAU are gone forever? How many are coming back? We don’t know.

Well, we don’t know _precisely_.

We’ve been at the browser game for long enough to see patterns in the data. We’re in the Summer slump for MAU numbers, and we have a model for how much higher the numbers are likely to be come October. We have surveyed people of varied backgrounds and have some ideas of why people change browsers to or away from Firefox.

We have the no-longer users, the lapsed users, the lost-and-regained users, the tried-us-once users, the non-human users, … we have categories and rough proportions on what we think we know about our population, and how that influences how we can better make the internet better for them.

Ultimately, to me, it doesn’t matter too much. I work on Firefox, a product that hundreds of millions of people use. How many hundreds of millions doesn’t matter: we’re above the threshold that makes me feel like I’m making the world better.

(( Well… I say that, but it is actually my job to understand the mechanisms behind these  numbers and why they can’t be exact, so I do have a bit of a vested interest. And there are a myriad of technological and behavioural considerations to account for in code and in documentation and in analysis which makes it an interesting job. But, you know. Hundreds of millions is precise enough for my job satisfaction index. ))

But once again we reach the inescapable return to the central thesis. Counting is harder than you think: one of the leading candidates for the Data Team’s motto. (Others include “Well, it depends.” and “¯\_(ツ)_/¯”). And now we’re counting in the open, so you get to experience its difficulty firsthand. Go have another look.

:chutten

 

When do All Firefox Users Update?

Last time we talked about updates I wrote about all of what goes into an individual Firefox user’s ability to update to a new release. We looked into how often Firefox checks for updates, and how we sometimes lie and say that there isn’t an update even after release day.

But how does that translate to a population?

Well, let’s look at some pictures. First, the number of “update” pings we received from users during the recent Firefox 61 release:update_ping_volume_release61

This is a real-time look at how many updates were received or installed by Firefox users each minute. There is a plateau’s edge on June 26th shortly after the update went live (around 10am PDT), and then a drop almost exactly 24 hours later when we turned updates off. This plot isn’t the best for looking into the mechanics of how this works since it shows volume of all types of “update” pings from all versions, so it includes users finally installing Firefox Quantum 57 from last November as well as users being granted the fresh update for Firefox 61.

Now that it’s been a week we can look at our derived dataset of “update” pings and get a more nuanced view (actually, latency on this dataset is much lower than a week, but it’s been at least a week). First, here’s the same graph, but filtered to look at only clients who are letting us know they have the Firefox 61 update (“update” ping, reason: “ready”) or they have already updated and are running Firefox 61 for the first time after update (“update” ping, reason: “success”):update_ping_volume_61only_wide

First thing to notice is how closely the two graphs line up. This shows how, during an update, the volume of “update” pings is dominated by those users who are updating to the most recent version.

And it’s also nice validation that we’re looking at the same data historically that we were in real time.

To step into the mechanics, let’s break the graph into its two constituent parts: the users reporting that they’ve received the update (reason: “ready”) and the users reporting that they’re now running the updated Firefox (reason: “success”).

update_ping_volume_stackedupdate_ping_volume_unstacked

The first graph shows the two lines stacked for maximum similarity to the graphs above. The second unstacks the two so we can examine them individually.

It is now much clearer to see how and when we turned updates on and off during this release. We turned them on June 26, off June 27, then on again June 28. The blue line also shows us some other features: the Canada Day weekend of lower activity June 30 and July 1, and even time-of-day effects where our sharpest peaks are (EDT) 6-8am, 12-2pm, and a noticeable hook at 1am.

(( That first peak is mostly made up of countries in the Central European Timezone UTC+1 (e.g. Germany, France, Poland). Central Europe’s peak is so sharp because Firefox users, like the populations of European countries, are concentrated mostly in that one timezone. The second peak is North and South America (e.g. United States, Brazil). It is broader because of how many timezones the Americas span (6) and how populations are dense in the Eastern Timezone and Pacific Timezone which are 3 hours apart (UTC-5 to UTC-8). The noticeable hook is China and Indonesia. China has one timezone for its entire population (UTC+8), and Indonesia has three. ))

This blue line shows us how far we got in the last post: delivering the updates to the user.

The red line shows why that’s only part of the story, and why we need to look at populations in addition to just an individual user.

Ultimately we want to know how quickly we can reach our users with updated code. We want, on release day, to get our fancy new bells and whistles out to a population of users small enough that if something goes wrong we’re impacting the fewest number of them, but big enough that we can consider their experience representative of what the whole Firefox user population would experience were we to release to all of them.

To do this we have a two levers: we can change how frequently Firefox asks for updates, and we can change how many Firefox installs that ask for updates actually get it right away. That’s about it.

So what happens if we change Firefox to check for updates every 6 hours instead of every 12? Well, that would ensure more users will check for updates during periods they’re offered. It would also increase the likelihood of a given user being offered the update when their Firefox asks for one. It would raise the blue line a bit in those first hours.

What if we change the %ge of update requests that are offers? We could tune up or down the number of users who are offered the updates. That would also raise the blue line in those first hours. We could offer the update to more users faster.

But neither of these things would necessarily increase the speed at which we hit that Goldilocks number of users that is both big enough to be representative, and small enough to be prudent. Why? The red line is why. There is a delay between a Firefox having an update and a user restarting it to take advantage of it.

Users who have an update won’t install it immediately (see how the red line is lower than the blue except when we turn updates off completely), and even if we turn updates off it doesn’t stop users who have the update from installing it (see how the red line continues even when the blue line is floored).

Even with our current practices of serving updates, more users receive the update than install it for at least the first week after release.

If we want to accelerate users getting update code, we need to control the red line. Which we don’t. And likely can’t.

I mean, we can try. When an update has been pending for long enough, you get a little arrow on your Firefox menu: update_arrow.png

If you leave it even longer, we provide a larger piece of UI: a doorhanger.

restartDoorhanger

We can tune how long it takes to show these. We can show the doorhanger immediately when the update is ready, asking the user to stop what they’re doing and–

windows_update_bluewindows_updatewindows_update_blue_lg

…maybe we should just wait until users update their own selves, and just offer some mild encouragement if they’ve put it off for, say, four days? We’ll give the user eight days before we show them anything as invasive as the doorhanger. And if they dismiss that doorhanger, we’ll just not show it again and trust they’ll (eventually) restart their browser.

…if only because Windows restarted their whole machines when they weren’t looking.

(( If you want your updates faster and more frequent, may I suggest installing Firefox Beta (updates every week) or Firefox Nightly (updates twice a day)? ))

This means that if the question is “When do all Firefox users update?” the answer is, essentially, “When they want to.” We can try and serve the updates faster, or try to encourage them to restart their browsers sooner… but when all is said and done our users will restart their browsers when they want to restart their browsers, and not a moment before.

Maybe in the future we’ll be able to smoothly update Firefox when we detect the user isn’t using it and when all the information they have entered can be restored. Maybe in the future we’ll be able to update seamlessly by porting the running state from instance to instance. But for now, given our current capabilities, we don’t want to dictate when a user has to restart their browser.

…but what if we could ship new features more quickly to an even more controlled segment of the release population… and do it without restarting the user’s browser? Wouldn’t that be even better than updates?

Well, we might answers for that… but that’s a subject for another time.

:chutten