Mozilla All-Hands Tips

All Hands Austin, December 2017, Mitchell Baker presenting. (Photo used under CC BY-NC-SA 2.0)

Twice a year, Mozilla gathers employees, volunteers, and assorted hangers-on in a single place to have a week of planning, working, and socializing. Being as distributed an organization as we are, it’s a bit rare to get enough of us in a single place to generate the kind of cross-talk and beneficial synergistic happenstances that help us work smarter and move in more-or-less the same direction. These are our All Hands events.

They’re a Pretty Big Deal(tm).

So here you are, individual contributor or manager, staff or volunteer, veteran or first-timer. With all these Big Plans, what are we littler folk to do to not become overwhelmed?

I have some tips.

Before You Go

Set up a mail folder/label for relevant email: You’ll be getting some email with details about where you should be, what you should be doing, and when. Organizing these into one place is helpful for reference, so come up with a label (maybe “201807-sanfran” or “mozsf2” or “fogzilla” or something) and organize those emails as they come in.

Act on those emails immediately: If they contain instructions or an announcement that bookings or registration is now open… then do that thing right then. Do not file the email and forget. Do the thing while you are looking at that email. Only then should you file that email and get back to where you were in your brain. If you absolutely can’t just then (have to synchronize with family or what-have-you), put a calendar reminder in that repeats every weekday until you handle it.

Do not upgrade Nightly: You’re running Nightly, right? You’ll be travelling through a land of uncertain connectivity, and the last thing you want is to spend it downloading a multi-MB Nightly update that might have accidentally disabled Captive Portal Detection. If it works, keep your Nightly build until you’re certain you have the bandwidth to download a new one. If all else fails, keep it until you get back.

Make sure your laptop is in shape: My laptop is often neglected in favour of its Desktop comrade: updates may be pending, credentials may have expired, the source code checkouts might be weeks old, and there may have even been a new version of Mozilla Build released since the last time I tried to compile Firefox. With luck, while at an All Hands you won’t have to compile Gecko on a laptop in your hotel… but we make our own luck, we who are prepared. Prepare your laptop.

Prepare your family: If you don’t live alone, you’ll have non-Mozilla prepwork to do. Spouse and kids or roommates and pets, there are lifeforms who normally expect to see you that won’t. Clear the family schedule for the week you’re gone, and do as much preparation ahead of time as you can. Laundry, meal planning, groceries, sitters, dog walkers, even lawn services are things you can arrange to lighten the load that your absence will place on those around you. Even if you’re bringing them with you.

While You’re There

Do not fear missing out: You will not be able to attend both Boardgame Night and your team dinner. There will be karaoke parties you won’t get to, or be invited to. This is fine. This is expected. This is unavoidable when you have so many people disorganizing so many things simultaneously. So don’t fret about it. Prioritize.

Say no: Speaking of prioritizing: prioritize for yourself. You may very well be operating as a Level 100 You for hours at a time. So many people to talk to, so many talks and social events to organize, deliver, and attend… No. You don’t have to stay the entire length of the party. You don’t even have to go. If you feel yourself fading, get out while you have the strength. Regroup. Find a quiet corner or go to sleep early… At my first All Hands, I napped on both Wednesday and Thursday. And I wasn’t even in a different timezone. It really helped.

Wash your hands: Lots. Before meals. After meals. You’ll be talking, working, eating, and otherwise hanging out with a thousand of your closest coworkers. It’s probably your best bet for not catching mozflu, and it’s definitely your best bet to not transmit it.

After You’re Back

Consider taking a day: Generally speaking you’ll be flying back on Saturday and returning to work on Monday. Depending on distance to travel, available flight times, and cancellations, this may result in only a few hours between stumbling through your door and stumbling back to work. Consider booking that Monday off (or, honestly, if your trip back was heinous, don’t even book it off. Just take it. Get some sleep. Work can wait until Tuesday.)

Check in: If you live with family, you haven’t seen them for a week. Even if you brought them with you, you’ve been in meetings and talks and stuff most hours. Check in with them. Get up to speed on what’s been happening in their lives while you’ve been away.

Get excited for the next one: Even immediately back from an All Hands, it’s still only six months to the next one. Take stock of what you liked and what you didn’t like about this one. Rest up, and try not to get impatient :)

:chutten

(( Great minds think alike, because Seburo recently wrote a Wiki article covering even more excellent tips for All Hands events. Check that out, too! ))


Data Science is Hard – Part 1: Data

You’d think that categorizing and measuring populations would be pretty simple. You count them all up, divide them into groups… simple, arithmetic-sounding stuff.

To a certain extent, that’s all it is. You want to know how many people contribute to Firefox in a day? Add ’em up. Want to know what fraction of them are from Europe? Compare the subset from Europe against the entire population.

But that’s where it gets squishy:

  • “in a day?” Which day? Did you choose a weekend day? A statutory holiday? A religious holiday? That’ll change the data. Which 24 hours are you counting? From midnight-to-midnight, sure, but which timezone?
  • “from Europe?” What is Europe? Just the EU? How do you tell if a contributor is from Europe? Are you running a geolocation query against their IP? What if their IP changes over the day, are we going to double-count that user? Are we asking contributors where they are from? What if they lie?
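
The "which 24 hours?" problem is easy to see in a small sketch. The timestamps and fixed UTC offsets below are made up for illustration (they are not Telemetry data), and `count_by_day` is a hypothetical helper. The only point is that the very same events land in different "days" depending on which timezone you pick.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical contribution timestamps, all recorded in UTC.
events_utc = [
    datetime(2016, 3, 13, 23, 30, tzinfo=timezone.utc),
    datetime(2016, 3, 14, 2, 0, tzinfo=timezone.utc),
    datetime(2016, 3, 14, 12, 0, tzinfo=timezone.utc),
    datetime(2016, 3, 14, 23, 45, tzinfo=timezone.utc),
]


def count_by_day(events, tz):
    """Count events per calendar day as seen from timezone `tz`."""
    return Counter(e.astimezone(tz).date().isoformat() for e in events)


# Fixed offsets keep the sketch simple; they deliberately ignore DST.
utc = timezone.utc
pacific = timezone(timedelta(hours=-8))
tokyo = timezone(timedelta(hours=9))

by_utc = count_by_day(events_utc, utc)
by_pacific = count_by_day(events_utc, pacific)
by_tokyo = count_by_day(events_utc, tokyo)

for name, counts in [("UTC", by_utc), ("UTC-8", by_pacific), ("UTC+9", by_tokyo)]:
    print(name, dict(sorted(counts.items())))
```

Four events, three different answers to "how many on March 14?", and nobody has even lied about where they are from yet.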

So that leads us to Part 1 of “Data Science is Hard”: Data is Hard.

In a recent 1-on-1, my manager :bsmedberg and I thought that it could be interesting to look into Firefox users whose Telemetry reports come from different parts of the world at different times. Maybe we could identify users who travel (Firefox Users Who Travel: Where do they travel to/from?). Maybe they can help us understand the differing needs of Firefox users who are on vacation as opposed to being at home. Maybe they’ll show us Tor Browser users, or users using other anonymizing techniques and technologies: and maybe we should see if there’s some special handling we could provide for them and their data.

I used this topic as a way to learn how to use our new re:dash dashboard, which sits on top of the prestodb instance of the Longitudinal Dataset (and lets me run SQL queries against a 1% random sample of Firefox users’ Telemetry data from the past 180 days).

Immediately I ran into problems. First, with remembering all the SQL I had forgotten in the *mumblesomething* years since I last had to write interesting queries.

But then I quickly ran into problems with the data. I ran a query to boil down how many (and which) unique countries each client had reported Telemetry from:

SELECT
    cardinality(array_distinct(geo_country)) AS country_count
    , array_distinct(geo_country) AS countries
FROM longitudinal_v20160314
ORDER BY country_count DESC
LIMIT 5

Country_count  Countries
35             ["CN","MX","GB","HU","JP","US","RU","IN","HK","??","CA","KR","TW","CM","DK","CH","ZA","PH","DE","VN","NL","CO","KZ","MA","TR","FR","AU","GR","IE","AR","BY","AT","TN","BR","AM"]
34             ["DE","RU","LT","UA","MA","GB","GI","AE","FR","CN","AM","NG","NL","PT","TH","PL","ES","NO","CH","IL","ZA","BY","US","UZ","HK","TW","JP","PK","LU","SG","FI","EU","IN","ID"]
34             ["US","BR","KR","NZ","RO","JP","ES","GB","TW","CN","UA","AU","NL","FR","FI","??","NO","CA","ZA","CL","IT","SE","SG","CH","RU","DE","MY","IN","ID","VN","PL","PH","KE","EG"]
34             ["GB","CN","??","DE","US","RU","AL","ES","NL","FR","KR","FI","IR","CA","JP","HK","AU","CH","RO","CO","IE","BR","SE","GR","IN","MX","RS","AR","TW","IT","SA","ID","VN","TN"]
34             ["US","GI","??","GB","DE","SA","KR","AR","ZA","CN","IN","AT","CA","KE","IQ","VN","TR","KZ","JP","BR","FR","TW","IT","ID","SG","RU","CL","BA","NL","AU","BE","LT","PT","ES"]

35 unique countries visited? Wow.

The “Countries” column is in order of when they first appeared in the data, so we know that the first user was reporting from China then Mexico then Great Britain then Hungary then Japan then the US then Russia…

Either this is a globetrotting super spy, or we’re looking at some sort of VPN/Tor/anonymizing framework at play here.

( Either way I think it best to say, “Thank you for using Firefox, Ms. Super Spy!” )

Or maybe this is a sign that the geolocation service is unreliable, or that the data intake services are buggy, or something else that would be less than awesome.

Regardless: this data is hugely messy. But, 35 countries over 180 days? That’s just about doable in real life… except that it wasn’t over 180 days, but 2:

SELECT
    cardinality(array_distinct(geo_country)) AS country_count
    , cardinality(geo_country) AS subsession_count
    , cardinality(geo_country) / (date_diff('DAY', from_iso8601_timestamp(array_min(subsession_start_date)), from_iso8601_timestamp(array_max(subsession_start_date))) + 1) AS subsessions_per_day
    , date_diff('DAY', from_iso8601_timestamp(array_min(subsession_start_date)), from_iso8601_timestamp(array_max(subsession_start_date))) + 1 AS duration
FROM longitudinal_v20160314
ORDER BY country_count DESC
LIMIT 1

Country_count  Subsession_count  Subsessions_per_day  Duration
35             169               84                   2

This client reported from 35 countries over 2 days. At least 17 countries per day (we’re skipping duplicates).

Also of note to Telemetry devs, this client was reporting 84 subsessions per day.

(Subsessions happen at a user’s local midnight and whenever some aspect of the Environment block of Telemetry changes (your locale, your multiprocess setting, how many addons you have installed). If your Firefox is registering that many subsession edges per day, there might be something wrong with your install. Or there might be something wrong with our data intake or aggregation.)
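
For the curious, the per-day figures in that result row are plain arithmetic. Here is a minimal sketch in Python, assuming the query's division of two integer values truncates (which would explain 84 rather than 84.5). The variable names are mine; the three input numbers come straight from the row above.

```python
# Values copied from the single result row above.
country_count = 35
subsession_count = 169
duration_days = 2

# Integer division, mirroring how the query appears to compute the column.
subsessions_per_day = subsession_count // duration_days

# Plain division, to show where "at least 17 countries per day" comes from.
countries_per_day = country_count / duration_days

print(subsessions_per_day)  # 84
print(countries_per_day)    # 17.5
```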

I still plan on poking around this idea of Firefox Users Who Travel. As I do so I need to remember that the data we collect is only useful for looking at Populations. Knowing that there’s one user visiting 35 countries in 2 days doesn’t help us decide whether or not we should release a special Globetrotter Edition of Firefox… since that’s just 1 of 4 million clients of a dataset representing only 1% of Firefox users.

Knowing that about a dozen users reported days with over 250 subsessions might result in some evaluation of that code, but without something linking these high-subsession-rate users together into a Population (maybe they’re machines running automated testing?), there’s nothing much we can do about it.

Instead I should focus on how, in a 4M user dataset, 112k (2.7%) users report from exactly 2 countries over the duration of the dataset. There are only 44k that report from more than 2, and the other 3.9M or so report exactly 1.
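
As a rough sanity check on that breakdown, here is a sketch using only the rounded figures quoted above (3.9M, 112k, 44k). Because the inputs are rounded, the shares it prints only approximate the real ones, so don't expect it to reproduce the 2.7% to the decimal. The dictionary name and labels are mine.

```python
# Rounded client counts as quoted in the text, not the exact query results.
clients_by_country_count = {
    "exactly 1": 3_900_000,
    "exactly 2": 112_000,
    "more than 2": 44_000,
}

total = sum(clients_by_country_count.values())

# Percentage share of each group within this (rounded) total.
shares = {
    label: 100 * count / total
    for label, count in clients_by_country_count.items()
}

for label, share in shares.items():
    print(f"{label}: {share:.1f}% of about {total / 1e6:.1f}M clients")
```

The overwhelming majority report from a single country, which is exactly why the two-country group is the interesting Population rather than the lone 35-country outlier.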

2.7% is a sliver of 1% of the Firefox population, but it is a Population. A Population is something we can analyse and speak meaningfully about, as the noise and mess of individual points of data have been smoothed out by the sheer weight of the Firefox user base.

It’s nice having a user base large enough to speak meaningfully about.

:chutten