Data Science is Hard: Counting Users

[Screenshot: User Activity, Firefox Public Data Report, 2018-08-29]

Counting is harder than you think. No, really!

Intuitively, as you look around you, you think this can’t be true. If you see a parking lot you can count the cars, right?

But do cars that have left the parking lot count? What about cars driving through it without stopping? What about cars driving through looking for a space? (And can you tell the difference between those two kinds from a distance?)

These cars all count if you’re interested in usage. It’s all well and good to know the number of cars using your parking lot right now… but is it lower on weekends? Holidays? Are you measuring on a rainy day when fewer people take bicycles, or in the Summer when more people are on vacation? Do you need better signs or more amenities to get more drivers to stop? Are you going to have to expand capacity this year, or next?

Yesterday we released the Firefox Public Data Report. Go take a look! It is the culmination of months of work of many mozillians (not me, I only contributed some early bug reports). In it you can find out how many users Firefox has, the most popular addons, and how quickly Firefox users update to the latest version. And you can choose whether to look at how these plots look for the worldwide user base or for one of the top ten (by number of Firefox users) countries individually.

It’s really cool.

The first two plots are a little strange, though. They count the number of Firefox users over time… and they don’t agree. They don’t even come close!

For the week including August 17, 2018 the Yearly Active User (YAU) count is 861,884,770 (or about 862M)… but the Monthly Active User (MAU) count is 256,092,920 (or about 256M)!

That’s over 600M difference! Which one is right?

Well, they both are.

Returning to our parking lot analogy, MAU is about counting how many cars use the parking lot over a 28-day period. So, starting Feb 1, count cars. If someone you saw earlier returns the next day or after a week, don’t count them again: we only want unique cars. The number you have at the end of the 28-day period is the MAU for Feb 28. The MAU for Mar 1 (in non-leap years) is the same thing, but you start counting on Feb 2.

Similarly for YAU, but you count over the past 365 days.
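To make the windowing concrete, here is a minimal sketch of that counting in Python. The licence plates, the dates, and the `active_users` helper are all made up for illustration; this is not how the Public Data Report is actually computed, just the parking lot version of it.

```python
from datetime import date, timedelta

def active_users(sightings, end_day, window_days):
    """Count unique cars seen in the window_days-day window ending on end_day (inclusive)."""
    start_day = end_day - timedelta(days=window_days - 1)
    # A set keeps each plate once, no matter how many times the car came back.
    return len({plate for day, plate in sightings if start_day <= day <= end_day})

# Hypothetical sightings: (day, licence plate)
sightings = [
    (date(2018, 2, 1), "AAA"),
    (date(2018, 2, 2), "AAA"),   # same car returning the next day
    (date(2018, 2, 1), "BBB"),   # only ever seen on Feb 1
    (date(2018, 2, 15), "CCC"),
    (date(2018, 3, 1), "DDD"),
    (date(2017, 9, 10), "EEE"),  # hasn't been back in months
]

mau_feb28 = active_users(sightings, date(2018, 2, 28), 28)   # window: Feb 1 to Feb 28
mau_mar1 = active_users(sightings, date(2018, 3, 1), 28)     # window: Feb 2 to Mar 1
yau_mar1 = active_users(sightings, date(2018, 3, 1), 365)    # window: the past 365 days

print(mau_feb28, mau_mar1, yau_mar1)  # prints: 3 3 5
```

Notice that "BBB" drops out of the Mar 1 MAU because its only visit was on Feb 1, while "EEE" shows up only in the yearly count. Same lot, same cars, different answers.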

It stands to reason that you’ll see more unique cars over the year than you will over the month: you’ll see visitors, tourists, people using the lot just once, and people who have changed jobs and haven’t been back in four months.

So how many of these 600M who are in the YAU but not in the MAU are gone forever? How many are coming back? We don’t know.

Well, we don’t know _precisely_.

We’ve been at the browser game for long enough to see patterns in the data. We’re in the Summer slump for MAU numbers, and we have a model for how much higher the numbers are likely to be come October. We have surveyed people of varied backgrounds and have some ideas of why people change browsers to or away from Firefox.

We have the no-longer users, the lapsed users, the lost-and-regained users, the tried-us-once users, the non-human users, … we have categories and rough proportions for what we think we know about our population, and how that influences how we can make the internet better for them.

Ultimately, to me, it doesn’t matter too much. I work on Firefox, a product that hundreds of millions of people use. How many hundreds of millions doesn’t matter: we’re above the threshold that makes me feel like I’m making the world better.

(( Well… I say that, but it is actually my job to understand the mechanisms behind these numbers and why they can’t be exact, so I do have a bit of a vested interest. And there are a myriad of technological and behavioural considerations to account for in code and in documentation and in analysis, which make it an interesting job. But, you know. Hundreds of millions is precise enough for my job satisfaction index. ))

But once again we reach the inescapable return to the central thesis. Counting is harder than you think: one of the leading candidates for the Data Team’s motto. (Others include “Well, it depends.” and “¯\_(ツ)_/¯”). And now we’re counting in the open, so you get to experience its difficulty firsthand. Go have another look.



Ontario’s 42nd Parliament, 1st Session: July 2018

Logo used for education and illustrative purposes. This is not an official publication and I am not an agent of the OLA.

In an effort to keep informed and politically active I’m watching the proceedings of our new provincial government here in Ontario. This is new for me. The closest I got to reading about governance was skimming a copy of Robert’s Rules of Order to help me chair meetings during my three years as President of the anime club in University.

Yes, your nerd alert is working just fine.

Anyhoo, given my newness to all this please forgive me as I belabour explanations or express confusion about long-held parliamentary weirdness. I’ll try to confine them to procedural notes so they aren’t too bothersome.

So, following the Ontario General Election in June, we ended up with the 42nd Parliament comprising a majority government of 76 Members of Provincial Parliament (MPPs) from the Progressive Conservative Party of Ontario (PC), the official opposition of 40 MPPs of the New Democratic Party of Ontario (NDP), and eight independents: seven from the Liberal Party of Ontario and one from the Green Party of Ontario.

Procedural Note: How are the Liberal MPPs and the Green MPP at the same time members of political parties (and having amongst them the heads of their parties, no less) but listed as independent? Governing the Legislative Assembly of Ontario are rules and regulations including Standing Orders. Standing Orders govern Members of the House differently if they are part of a “recognized party” (defined as a caucus of eight or more members). Since neither the Liberals nor the Greens have eight members, they are not “recognized parties” and are thus independent. This means they have some restrictions placed on their participation in legislative business. For instance they don’t have a right to ask as many questions as they’d like during Question Period (current practice is to give one member one question and one supplemental question each period), and they have much less time to do things like debate the Throne Speech and make inaugural remarks and other Statements.

As the party with the most MPPs, the PC Party formed the government. Since they have a majority and we have strong party discipline in Ontario, they will not be defeated by a lack of confidence and instead will (barring unusual events) remain in power for their full term of four years and will likely implement their party’s platform citing a strong mandate from the people.

Procedural Note: With 76 of 124 seats it would seem that the PC mandate is strong, but Ontario is a First Past the Post system of representative democracy (each district elects a member to represent that district using a plurality vote). This results in some disproportionality. The Ontario General Election in 2014 had a Gallagher Index of 12.46 (the Canadian Government’s 2016 Special Committee on Electoral Reform recommended the reform of the federal electoral system to choose a method with a typical index less than 5). The election this past June had a Gallagher Index of 17.75, so it isn’t exactly proportional, but the PC Party did receive the highest proportion of the votes.
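For the curious, the Gallagher Index is the square root of half the sum of squared differences between each party's vote percentage and seat percentage. Here is a minimal Python sketch of that formula. The three-party numbers are invented purely to show the arithmetic; they are not Ontario's results, and the `gallagher_index` helper is my own illustration.

```python
from math import sqrt

def gallagher_index(vote_pct, seat_pct):
    """Gallagher (least squares) index from per-party vote and seat percentages."""
    parties = set(vote_pct) | set(seat_pct)
    squared_gaps = sum(
        (vote_pct.get(party, 0.0) - seat_pct.get(party, 0.0)) ** 2
        for party in parties
    )
    return sqrt(0.5 * squared_gaps)

# Hypothetical election, NOT real data: party A turns 40% of the votes into 60% of the seats.
toy_votes = {"A": 40.0, "B": 35.0, "C": 25.0}
toy_seats = {"A": 60.0, "B": 30.0, "C": 10.0}

toy_index = gallagher_index(toy_votes, toy_seats)
print(round(toy_index, 2))  # prints: 18.03
```

A perfectly proportional result (seat shares equal to vote shares) gives an index of 0, and the number climbs as seats drift away from votes.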

Even before the 42nd Legislature sat on July 12, the new government took action. Controversially they hired for $1M an ex-head of the PC Party known for underfunding and then shutting down hospitals to oversee an advisory panel charged with helping improve healthcare province-wide. This has featured in many Statements of Members and Questions, which is why I bring it up.

Eventually on July 12 the new government gathered to sit in Queen’s Park in Toronto to start official business. There’s technically a first Bill read and passed by the legislature, the pro forma “Bill 1” that shows we don’t need the Crown in order to legislate, but things really start going with the Throne Speech.

In Monarchies the Throne Speech is the Monarch handing down the priorities for their government to focus on in the coming years. In a Constitutional Monarchy like Canada, the Throne Speech is written by the Cabinet Ministers of the Government and is just read by the representative of the Crown. This was particularly interesting as Ontario’s representative of the Crown, Lieutenant Governor the Honourable Elizabeth Dowdeswell, is a staunch environmental conservationist (has been since the 80s) and she was given a speech to read that announced the end of carbon taxes and cap-and-trade without any particular replacement to keep polluting industries in check. This was commented upon by the Opposition.

The Throne Speech was just about what was expected from the Election. The priorities of the Government will be to: “reduce gas prices”, “lower your hydro bills”, “provide… tax relief to parents, small businesses and the working poor”, “scrapping the cap-and-trade carbon tax”, “make sure Ontario’s best interests are reflected in the NAFTA negotiations”, “reducing the regulatory burden”, “[call] a commission of inquiry into the financial practices of government”, “[perform] a thorough line-by-line audit of all government spending”, “return Ontario to a balanced budget”, “15,000 new long-term-care beds over the next five years and a historic new $3.8B investment in mental health and addictions”, “[replace] failed ideological experiments in the classroom with tried and true methods that work” (here meaning “discovery math” and the current sex-ed curriculum), “build a world-class transit system [in the Greater Toronto Area]”, “[cancelling] green energy contracts [imposed on rural municipalities]”, “freeing [police officers] from onerous restrictions that treat [them] as subjects of suspicion and scorn”, “expand the sale of beer and wine to convenience stores, grocery stores and big box stores”.

Notable absence: Reconciliation with Indigenous Peoples.

Some of these priorities can be enacted through general Government business, but most of the big stuff will require legislation. And that’s where I expect to focus most of my time unless something from the Hansard (the parliamentary transcripts) sticks out.

So, Bill 1 was pro forma and we can ignore it.

Bill 2 is titled “Urgent Priorities Act, 2018” and is an omnibus Bill (a Bill that deals with more than one thing) of three parts:

1. “Hydro One Accountability Act, 2018” will require Hydro One to reform and publish executive compensation amounts, subject to the will of the Treasury Board (which is now chaired by Peter Bethlenfalvy, who sharp-eyed readers might remember as co-president of DBRS Ltd. when it decided to downgrade Ontario’s debt ratings in 2009). These provisions expire at the beginning of 2023 for some reason.

2. “White Pines Wind Project Termination Act, 2018” requires that the nine-turbine wind generation project in Prince Edward County be scrapped mid-construction. This’ll cost around $100M, and may result in lawsuits (despite clauses in the legislation that hope to curtail legal action).

3. “Back to Class Act (York University), 2018” appears to be standard back-to-work legislation for the teachers at York who have been striking since March.

Bill 2 received Royal Assent on July 25 (with division, meaning that not everyone was happy with this), so these things are happening. Schedule 1 is a big ol’ shrug from me… it will make it harder for Hydro One to hire competent leadership to replace the ones that are being ousted, but I don’t really have a problem with the publication requirements. Honestly I think we’d be better served with a return to public ownership. Schedule 2 is a sad waste of money and will cost us in contractor trust (would you accept a contract knowing the last one was scrapped without consultation?) and in advancing our green energy plans (we need something to replace the 18% of our electricity generated by burning gas/oil). Schedule 3 isn’t my circus, so I don’t have an opinion on it.

Bill 3 is titled “Compassionate Care Act, 2018” and is general government business instructing the Ministry of Health and Long-Term Care to set out a framework for hospice care with reporting requirements. No members dissented on the readings and it has been referred to committee for implementation.

Bill 4 is titled “Cap and Trade Cancellation Act, 2018” and is a straight-up repeal of “Climate Change Mitigation and Low-carbon Economy Act, 2016”. The only nod to environmental caution is that the government must now set new targets for greenhouse gas reduction, and must develop a plan to meet them. There are no timelines on those requirements. The outstanding cap and trade credits will be bought by the government at some cost. First reading was completed July 25 (with division).

Bill 5 is titled “Better Local Government Act, 2018” and messes with the wards of the City of Toronto and how the heads of councils of Regional Municipalities are selected: the heads for Muskoka, Niagara, Peel, and York will now be appointed, while those for Durham, Halton, and Waterloo will be chosen by general vote. This confuses me. Why mess with these things? Is Premier Ford still tied up in how he couldn’t win the Toronto City election? Why bother with this at all? First reading was completed July 30 (with division).

Bill 6 is titled “Poet Laureate of Ontario Act (In Memory of Gord Downie), 2018” and establishes the post, selection criteria, and responsibilities for a Poet Laureate of Ontario. First reading was July 30 (no division this time).

In addition to the big-ticket Bills, we also have Motions. They can be of the Government, or they can be of private Members.

The Government Motions of the month are fairly boring. Most are procedural (including the creation of Standing Committees), there’s one fast-forwarding the acceptance of Bill 2 by limiting debate and division (which may give political leverage later if these things blow up, which White Pines may do), and there’s one expressing the opinion of the House that it has a clear mandate of the people (grandstanding on the Party Platform).

Strangely of more interest are the Private Members’ Motions. Many of these are filed and ignored, some of them are passed. The ignored ones fell into three camps. There was a flurry of them asking that Bill 2 be paused until the extent of financial and legal liability under the bill could be assessed. A few asked that Bill 4 be sent back until it can be found in compliance with the Environmental Bill of Rights, 1993 (which provides many rights to citizens to challenge bills like Bill 4). We’ll see if any of those stick before Second Reading. Similarly, Bill 5 was challenged on the grounds of needing public consultation. To me this seems a weaker line of challenge, but maybe the Opposition is building a case that the “Government For the People” really isn’t as interested in the people as they claim.

Of passed motions there were two. One was from Mrs. Fee (PC of Kitchener South — Hespeler) expressing the opinion of the House that the Federal Government owes $200M related to the costs of illegal border crossers (that choice of language is atrocious both morally and grammatically). The other was from Jill Dunlop (PC of Simcoe North) moving that the Government of Ontario should “expedite the creation of sufficient skilled trades people to make skilled labour a competitive advantage for Ontario”, which carried without division (and I can see why as this is the reality we face).

The most circus-like aspect of the Legislative Assembly of Ontario is the Question Period where the Opposition is given an opportunity to call the Government to account, and the Government answers the questions by speechifying, reiterating talking points, and failing to answer the question. The Government also asks itself questions as an opportunity to do the same without having to think on its feet. All this grandstanding results in several standing ovations and lots of failure to keep to temperate language, remember to address comments through the Speaker, and maintain the decorum of the House. A recent illustration of this came on July 31 when the Speaker had to call a Recess early in Question Period. The relevant portion of the recording runs from 25:50 to 35:28.

At 27:57 the sound mix was adjusted so we could only hear the Speaker, but what happened (as the Speaker tells us at 34:43) is that something may have been said that caused general uproar in the House including the Premier and the Leader of the Opposition. The Speaker couldn’t get the House to order by asking, so called a five-minute Recess (from 29:00 to 34:43). And from then on the Government refused to answer the Opposition’s questions under Standing Order 37(h) which simply states “A minister may, in his or her discretion, decline to answer any question.”

Later answers by the Government suggest that they heard a comment mocking the Member for Mississauga East–Cooksville (the one asking the question) from Gilles Bisson, the Official Opposition’s House Leader. Specifically I think the comment heard was to do with diversity or the Member’s Pakistani heritage since the Government’s House Leader keeps reiterating the Member’s ethnicity and the PC’s diversity of Members.

The Opposition continue on with Question Period gamely. Mr. Bisson denies the charge and as a point of order references Standing Order 23(h) and (i), which state you can’t make “allegations against another member” or impute “false or unavowed motives to another member”.

We’ll see if this continues in August 1st’s Question Period.

And that’s the extent of government business this month. The Opposition’s current points are that extremist views are behind repealing the sex-ed curriculum, short-sightedness is behind Hydro One meddling, cronyism is behind everything, and the Premier’s insecurities are behind the municipal election meddling. The Government’s current points are respect for the taxpayer, respect for the ratepayer, Toronto’s need for provincial meddling to break its political deadlocks, the cost of cap-and-trade to taxpayers, smaller government at all levels, and a little bit of gloating that they have a majority and the NDP doesn’t.

Some general governing, some truly annoying failure to answer questions, some truly peculiar nonsense, and just enough bad ideas to remind you why 60% of Ontarians voted against the Progressive Conservatives.