Distributed Teams: A Test Failing Because It’s Run West of Newfoundland and Labrador

(( Not quite 500 mile email-level of nonsense, but might be the closest I get. ))

A test was failing.

Not really unusual, that. Tests fail all the time. It’s how we know they’re good tests: protecting us developers from ourselves.

But this one was failing unusually. Y’see, it was failing on my machine.

(Yes, har har, it is a common-enough occurrence given my obvious lack of quality as a developer how did you guess.)

The unusual part was that it was failing only for me… and I hadn’t even touched anything yet. It wasn’t failing on our test infrastructure “try”, and it wasn’t failing on the machine of :raphael, the fellow responsible for the integration test harness itself. We were building Firefox the same way, running telemetry-tests-client the same way… but I was getting different results.

I fairly quickly locked down the problem to be an extra “main” ping with reason “environment-change” being sent during the initial phases of the test. By dumping some logging into Firefox, rebuilding it, and then routing its logs to console with --gecko-log "-" I learned that we were sending a ping because a monitored user preference had been changed: browser.search.region.

When Firefox starts up the first time, it doesn’t know where it is. And it needs to know where it is to properly make a first guess at what language you want and what search engines would work best. Google’s results are pretty bad in Canada unless you use “google.ca”, after all.

But while Firefox doesn’t know where it is, it does know is what timezone it’s in from the settings in your OS’s clock. On top of that it knows what language your OS is set to. So we make a first guess at which search region we’re in based on whether or not the timezone overlaps a US timezone and if your OS’ locale is `en-US` (United States English).

What this fails to take into account is that United States English is the “default” locale reported by many OSes even if you aren’t in the US. And how even if you are in a timezone that overlaps with the US, you might not be there.

So to account for that, Mozilla operates a location service to double-check that the search region is appropriately set. This takes a little time to get back with the correct information, if it gets back to us at all. So if you happen to be in a US-overlapping timezone with an English-language OS Firefox assumes you’re in the US. Then if the location service request gets back with something that isn’t “US”, browser.search.region has to be updated.

And when it updates, it changes the Telemetry Environment.

And when the Environment changes, we send a “main” ping.

And when we send a “main” ping, the test breaks.

…all because my timezone overlaps the OS and my language is “Default” English.

I feel a bit persecuted, but this shows another strength of Distributed Teams. No one else on my team would be able to find this failure. They’re in Germany, Italy, and the US. None of them have that combination of “Not in the US, but in a US timezone” needed to manifest the bug.

So on one hand this sucks. I’m going to have to find a way around this.

But on the other hand… I feel like my Canadianness is a bit of a bug-finding superpower. I’m no Brok Windsor or Captain Canuck, but I can get the job done in a way no one else on my team can.

Not too bad, eh?

:chutten

Advertisements