Data Science is Interesting: Why are there so many Canadians in India?

Any time India comes up in the context of Firefox and Data I know it’s going to be an interesting day.

They’re our largest Beta population:

pie chart showing India by far the largest at 33.2%

They’re our second-largest English user base (after the US):

pie chart showing US as largest with 37.8% then India with 10.8%

 

But this is the interesting stuff about India that you just take for granted in Firefox Data. You come across these factoids for the first time and your mind is all blown and you hear the perhaps-apocryphal stories about Indian ISPs distributing Firefox Beta on CDs to their customers back in the Firefox 4 days… and then you move on. But every so often something new comes up and you’re reminded that no matter how much you think you’re prepared, there’s always something new you learn and go “Huh? What? Wait, what?!”

Especially when it’s India.

One of the facts I like to trot out to catch folks’ interest is how, when we first released the Canadian English localization of Firefox, India had more Canadians than Canada. Even today India is, after Canada and the US, the third largest user base of Canadian English Firefox:

pie chart of en-CA using Firefox clients by country. Canada at 75.5%, US at 8.35%, then India at 5.41%

 

Back in September 2018 Mozilla released the official Canadian English-localized Firefox. You can try it yourself by selecting it from the drop down menu in Firefox’s Preferences/Options in the “Language” section. You may have to click ‘Search for More Languages’ to be able to add it to the list first, but a few clicks later and you’ll be good to go, eh?

(( Or, if you don’t already have Firefox installed, you can select which language and dialect of Firefox you want from this download page. ))

Anyhoo, the Canadian English locale quickly gained a chunk of our install base:

uptake chart for en-CA users in Firefox in September 2018. Shows a sharp uptake followed by a weekly seasonal pattern with weekends lower than week days

…actually, it very quickly gained an overlarge chunk of our install base. Within a week we’d reached over three quarters of the entire Canadian user base?! Say we have one million Canadian users, that first peak in the chart was over 750k!

Now, we Canadian Mozillians suspected that there was some latent demand for the localized edition (they were just too polite to bring it up, y’know)… but not to this order of magnitude.

So back around that time a group of us including :flod, :mconnor, :catlee, :Aryx, :callek (and possibly others) fell down the rabbit hole trying to figure out where these Canadians were coming from. We ran down the obvious possibilities first: errors in data, errors in queries, errors in visualization… who knows, maybe I was counting some clients more than once a day? Maybe I was counting other Englishes (like South African and Great Britain) as well? Nothing panned out.

Then we guessed that maybe Canadians in Canada weren’t the only ones interested in the Canadian English localization. Originally I think we made a joke about how much Canadians love to travel, but then the query stopped running and showed us just how many Canadians there must be in India.

We were expecting a fair number of Canadians in the US. It is, after all, home to Firefox’s largest user base. But India? Why would India have so many Canadians? Or, if it’s not Canadians, why would Indians have such a preference for the English spoken in ten provinces and three territories? What is it about one of two official languages spoken from sea to sea to sea that could draw their attention?

Another thing that was puzzling was the raw speed of the uptake. If users were choosing the new localization themselves, we’d have seen a shallow curve with spikes as various news media made announcements or as we started promoting it ourselves. But this was far sharper an incline. This spoke to some automated process.

And the final curiosity (or clue, depending on your point of view) was discovered when we overlaid British English (en-GB) on top of the Canadian English (en-CA) uptake and noticed that (after accounting for some seasonality at the time due to the start of the school year) this suddenly-large number of Canadian English Firefoxes was drawn almost entirely from the number previously using British English:

chart showing use of British and Canadian English in Firefox in September 2018. The rise in use of Canadian English is matched by a fall in the use of British English.

It was with all this put together that day that lead us to our Best Guess. I’ll give you a little space to make your own guess. If you think yours is a better fit for the evidence, or simply want to help out with Firefox in Canadian English, drop by the Canadian English (en-CA) Localization matrix room and let us know! We’re a fairly quiet bunch who are always happy to have folks help us keep on top of the new strings added or changed in Mozilla projects or just chat about language stuff.

Okay, got your guess made? Here’s ours:

en-CA is alphabetically before en-GB.

Which is to say that the Canadian English Firefox, when put in a list with all the other Firefox builds (like this one which lists all the locales Firefox 88 comes in for Windows 64-bit), comes before the British English Firefox. We assume there is a population of Firefoxes, heavily represented in India (and somewhat in the US and elsewhere), that are installed automatically from a list like this one. This automatic installation is looking for the first English build in this list, and it doesn’t care which dialect. Starting September of 2018, instead of grabbing British English like it’s been doing for who knows how long, it had a new English higher in the list: Canadian English.

But who can say! All I know is that any time India comes up in the data, it’s going to be an interesting day.

:chutten