Data Science is Hard: What’s in a Dashboard

1920x1200-4-COUPLE-WEEKS-AFTER
The data is fake, don’t get excited.

Firefox Quantum is here! Please do give it a go. We have been working really hard on it for quite some time, now. We’re very proud of what we’ve achieved.

To show Mozillians how the release is progressing, and show off a little about what cool things we can learn from the data Telemetry collects, we’ve built a few internal dashboards. The Data Team dashboard shows new user count, uptake, usage, install success, pages visited, and session hours (as seen above, with faked data). If you visit one of our Mozilla Offices, you may see it on the big monitors in the common areas.

The dashboard doesn’t look like much: six plots and a little writing. What’s the big deal?

Well, doing things right involved quite a lot more than just one person whipping something together overnight:

1. Meetings for this dashboard started on Hallowe’en, two weeks before launch. Each meeting had between eight and fourteen attendees and ran for its full half-hour allotment each time.

2. In addition there were several one-off meetings: with Comms (internal and external) to make sure we weren’t putting our foot in our mouth, with Data ops to make sure we weren’t depending on datasets that would go down at the wrong moment, with other teams with other dashboards to make sure we weren’t stealing anyone’s thunder, and with SVPs and C-levels to make sure we had a final sign-off.

3. Outside of meetings we spent hours and hours on dashboard design and development, query construction and review, discussion after discussion after discussion…

4. To say nothing of all the bikeshedding.

It’s hard to do things right. It’s hard to do even the simplest things, sometimes. But that’s the job. And Mozilla seems to be pretty good at it.

One last plug: if you want to nudge these graphs a little higher, download and install and use and enjoy the new Firefox Quantum. And maybe encourage others to do the same?

:chutten

Advertisements

An Unofficial Guide to Unofficial Swag: Stickers

Mozillians like stickers.

laptopStickers

However! Mozilla doesn’t print as many stickers as you might think it does. Firefox iconography, moz://a wordmarks, All Hands-specific rounds, and Mozilla office designs are the limit of official stickers I’ve seen come from official sources.

The vast majority of sticker designs are unofficial, made by humans like you! This guide contains tips that should help you create and share your own unofficial stickers.

Plan 9
(original poster by Tom Jung, modifications by :Yoric and myself. Use under CC-BY-SA 3.0)

Design

I’m not a designer. Luckily for my most recent printing project I was simply updating the existing design you see above. If you are adapting someone else’s design, ensure you either have permission or are staying within the terms of the design’s license. Basic Firefox product identity assets are released under generous terms for remixing, for instance.

Size

The bigger they are, the harder they are to fit in a pocket or on the back of a laptop screen. Or in carry-on. The most successful stickers I’ve encountered have been at most 7cm on the longest side (or in diameter, for rounds), and many have been much smaller. With regards to size, less may in fact be more, but you have to balance this with any included text which must be legible. The design I used wouldn’t work much smaller than 7cm in height, and the text is already a little hard to read.

Distribution

How will you distribute these? If your design is team-specific, a work week is a good chance to hand them out individually. If the design is for a location, then pick a good gathering point in that location (lunchrooms are a traditional and effective choice), fan out some dozen or two stickers, and distribution should take care of itself. All Hands are another excellent opportunity for individual and bulk distribution. If the timing doesn’t work out well to align with a work week or an All Hands, you may have to resort to mailing them over the globe yourself. In this case, foster contacts in Mozilla spaces around the world to help your stickers make it the last mile into the hands and onto the laptops of your appreciative audience.

Volume

50 is a bare minimum both in what you’ll be permitted to purchase by the printer and in how many you’ll want to have on hand to give away. If your design is timeless (i.e. doesn’t have a year on it, doesn’t refer to a current event), consider making enough leftovers for the future. If your design is generic enough that there will be interest outside of your team, consider increasing supply for this demand. Generally the second 50 stickers cost a pittance compared to the first 50, so don’t be afraid to go for a few extra.

Funding

You’ll be paying for this yourself. If your design is team-specific and you have an amenable expense approver you might be able to gain reimbursement under team-building expenses… But don’t depend on this. Don’t spend any money you can’t afford. You’re looking at between 50 to 100 USD for just about any number of any kind of sticker, at current prices.

Location

I’m in Canada. The sticker printer I chose most recently (stickermule) was in the US. Unsurprisingly, it was cheaper and faster to deliver the stickers to the US. Luckily, :kparlante was willing to mule the result to me at the San Francisco All Hands, so I was able to save both time and money. Consider these logistical challenges when planning your swag.

Timing

Two weeks before an All Hands is probably too late to start the process of generating stickers. I’ve made it happen, but I was lucky. Be more prepared than I was and start at least a month ahead. (As of publication time you ought to have time to take care of it all before Austin).

Printing

After putting a little thought into the above areas it’s simply a matter of choosing a printing company (local to your region, or near your distribution venue) and sending in the design. They will likely generate a proof which will show you what their interpretation of your design on their printing hardware will look like. You then approve the proof to start printing, or make changes to the design and have the printer regenerate a proof until you are satisfied. Then you arrange for delivery and payment. Expect this part of the process to take at least a week.

And that’s all I have for now. I’ll compile any feedback I receive into an edit or a Part 2, as I’ve no doubt forgotten something essential that some member of the Mozilla Sticker Royalty will only too happily point out to me soonish.

Oh, and consider following or occasionally perusing the mozsticker Instagram account to see a sample of the variety in the Mozilla sticker ecosystem.

Happy Stickering!

:chutten

Two-Year Moziversary

Today marks two years since I became a Mozillian and MoCo Staff.

What did I do this year… well, my team was switched out from under me again. This time it was during the large Firefox + Platform reorg, and basically means my team (Telemetry Client Engineering) now has a name that more closely matches what I do: writing client-side Telemetry code, performing ad hoc data analysis, and reading a lot of email. I still lurk on #fce and answer questions for :ddurst about data matters from time to time, so it’s not a clean break by any means.

This means my work has been a little more client-focused. I completed my annual summer Big Refactor or Design Thing That Takes The Whole Summer For Some Reason. Last year it was bug 1218576 (whose bug number is lodged in my long-term memory and just won’t leave). This year it was bug 1366294 and its friends where, in support of Quantum, we reduced our storage overhead per-process by quite a fair margin. At the same time we removed excessive string hashes, fast-pathing most operations.

Ah, yes: Quantum. Every aspect of Firefox was under scrutiny… and from a data perspective. I’ve lost count of the number of times I’ve been called in to consult on data matters in support of the quickening of the new Firefox Quantum (coming this November to an Internet Near You!). I even spent a couple days in Toronto as part of a Quantum work week to nail down exactly what we could and should measure before and after shipping each build.

A pity I didn’t leave myself more time to just hang out with the MoCoTo folks.

In All Hands news we hit Hawai’i last December. Well, some of us did. With the unrest in the United States and the remoteness of the location this was a bit more of a Most Hands. Regardless, it was a productive time. Not sure how we managed to find so much rain and snow in a tropical desert, but we’re a special bunch I guess?

In June we were in San Francisco. There I ate some very spicy lunch and helped nail down some Telemetry Health metrics I’ve done some work on this autumn. Hopefully we’ll be able to get those metrics into Mission Control next year with proper thresholds for alerting if things go wrong.

This summer I mentored :flyingrub for Google Summer of Code. That was an interesting experience that ended up taking up quite a lot more time than I imagined it would when I started. I mean, sure, you can write it down on paper how many hours a week you’ll spend mentoring an intern through a project, and how many hours beforehand you’ll spend setting it up… but it’s another thing to actually put in the work. It was well worth it, and :flyingrub was an excellent contributor.

In last year’s Moziversary post I resolved to blog more, think more, and mentor more. I’ve certainly mentored more, with handfuls of mentored bugs contributed by first-time community members and that whole GSoC thing. I haven’t blogged more, though, as though I’ve written 23 posts with only April and July going without a single writing on this here blog, last year I posted 27. I also am not sure I have thought more, as simple and stupid mistakes still cast long shadows in my mind when I let them.

So I guess that makes two New MozYear Resolutions (New Year Mozolutions?) easy:

  • actually blog more, even if they are self-indulgent vanity posts. (Let’s be honest, though: they’re all self-indulgent vanity posts).
  • actually think more. Make fewer stupid mistakes, or if that’s not feasible at least reduce the size of their influence on the world and my mind after I make them.

That might be enough to think about for a year, right?

:chutten

Anatomy of a Firefox Update

Alessio (:Dexter) recently landed a new ping for Firefox 56: the “update” ping with reason “ready”. It lets us know when a client’s Firefox has downloaded and installed an update and is only waiting for the user to restart the browser for the update to take effect.

In Firefox 57 he added a second reason for the “update” ping: reason “success”. This lets us know when the user’s started their newly-updated Firefox.

I thought I might as well see what sort of information we could glean from this new data, using the recent shipping of the new Firefox Quantum Beta as a case study.

This is exploratory work and you know what that means[citation needed]: Lots of pretty graphs!

First: the data we knew before the “update” ping: Nothing.

Well, nothing specific. We would know when a given client would use a newly-released build because their Telemetry pings would suddenly have the new version number in them. Whenever the user got around to sending them to us.

We do have data about installs, though. Our stub installer lets us know how and when installs are downloaded and applied. We compile those notifications into a dataset called download_stats. (for anyone who’s interested: this particular data collection isn’t Telemetry. These data points are packaged and sent in different ways.) Its data looks like this:Screenshot-2017-9-29 Recent Beta Downloads.png

Whoops. Well that ain’t good.

On the left we have the tailing edge of users continuing to download installs for Firefox Beta 56 at the rate of 50-150 per hour… and then only a trace level of Firefox Beta 57 after the build was pushed.

It turns out that the stub installer notifications were being rejected as malformed. Luckily we kept the malformed reports around so that after we fixed the problem we could backfill the dataset:Screenshot-2017-10-4 Recent Beta Downloads

Now that’s better. We can see up to 4000 installs per hour of users migrating to Beta 57, with distinct time-of-day effects. Perfectly cromulent, though the volume seems a little low.

But that’s installs, not updates.

What do we get with “update” pings? Well, for one, we can run queries rather quickly. Querying “main” pings to find the one where a user switched versions requires sifting through terabytes of data. The query below took two minutes to run:

Screenshot-2017-10-3 Users Updating to Firefox Quantum Beta 57(1)

The red line is update/ready: the number of pings we received in that hour telling us that the user had downloaded an update to Beta 57 and it was ready to go. The blue line is update/success: the number of pings we received that hour telling us the user had started their new Firefox Quantum Beta instance.

And here it is per-minute, just because we can:Screenshot-2017-10-3 Users Updating to Firefox Quantum Beta 57(2).png

September 30 and October 1 were the weekend. As such, we’d expect their volumes to be lower than the weekdays surrounding them. However, looking at the per-minute graph for update/ready (red), why is Friday the 29th the same height as Saturday the 30th? Fridays are usually noticeably busier than Saturdays.

Friday was Navarati in India (one of our largest market for Beta) but that’s a multi-day festival that started on the Wednesday (and other sources for client data show only a 15% or so dip in user activity on that date in India), so it’s unlikely to have caused a single day’s dip. Friday wasn’t a holiday at all in any of our other larger markets. There weren’t any problems with the updater or “update” ping ingestion. There haven’t been any dataset failures that would explain it. So what gives?

It turns out that Friday’s numbers weren’t low: Saturday’s were high. In order to improve the stability of what was going to become the Firefox 56 release we began on the 26th to offer updates to the new Firefox Quantum Beta to only half of updating Firefox Beta users. To the other half we offered an update to the Firefox 56 Release Candidate.

What is a Release Candidate? Well, for Firefox it is the stabilized, optimized, rebuilt, rebranded version of Firefox that is just about ready to ship to our release population. It is the last chance we have to catch things before it reaches hundreds of millions of users.

It wasn’t until late on the 29th that we opened the floodgates and let the rest of the Beta users update to Beta 57. This contributed to a higher than expected update volume on the 30th, allowing the Saturday numbers to be nearly as voluminous as the Friday ones. You can actually see exactly when we made the change: there’s a sharp jump in the red line late on September 29 that you can see clearly on both “update”-ping-derived plots.

That’s something we wouldn’t see in “main” pings: they only report what version the user is running, not what version they downloaded and when. And that’s not all we get.

The “update”-ping-fueled graphs have two lines. This rather abruptly piques my curiosity about how they might relate to each other. Visually, the update/ready line (red) is almost always higher than the update/success line (blue). This means that we have more clients downloading and installing updates than we have clients restarting into the updated browser in those intervals. We can count these clients by subtracting the blue line from the red and summing over time:Screenshot-2017-10-3 Outstanding Updates for Users Updating to Firefox Quantum Beta 57

There are, as of the time I was drafting this post, about one half of one million Beta clients who have the new Firefox Quantum Beta… but haven’t run it yet.

Given the delicious quantity of improvements in the new Firefox Quantum Beta, they’re in for a pleasant surprise when they do.

And you can join in, if you’d like.

:chutten

(NOTE: earlier revisions of this post erroneously said download_stats counted updater notifications. It counts stub installer notifications. I have reworded the post to correct for this error. Many thanks to :ddurst for catching that)

Data Science is Hard: Dangerous Data

I sit next to a developer at my coworking location (I’m one of the many Mozilla staff who work remotely) who recently installed the new Firefox Quantum Beta on his home and work machines. I showed him what I was working on at the time (that graph below showing how nicely our Nightly population has increased in the past six months), and we talked about how we count users.

Screenshot-2017-9-28 Desktop Nightly DAU MAU for the Last Six Months by Version

=> “But of course we’ll be counting you twice, since you started a fresh profile on each Beta you installed. Actually four times, since you used Nightly to download and install those builds.” This, among other reasons, is why counting users is hard.

<= “Well, you just have to link it to my Firefox Account and then I’ll only count as one.” He figured it’d be a quick join and then we’d have better numbers for some users.

=> “Are you nuts?! We don’t link your Firefox Account to Telemetry! Imagine what an attacker could do with that!”

In a world with adversarial trackers, advertising trackers, and ever more additional trackers, it was novel to this pseudo-coworker of mine that Mozilla would specifically not integrate its systems.

Wouldn’t it be helpful to ourselves and our partners to know more about our users? About their Firefox Accounts? About their browsing history…

Mozilla doesn’t play that game. And our mission, our policies, and our practices help keep us from accidentally providing “value” of this kind for anyone else.

We know the size of users’ history databases, but not what’s in them.

We know you’re the same user when you close and reopen Firefox, but not who you are.

We know whether users have a Firefox Account, but not which ones they are.

We know how many bookmarks users have, but not what they’re for.

We know how many tabs users have open, but not why. (And for those users reporting over 1000 tabs: WHY?!)

And even this much we only know when you let us:

firefoxDataCollection

Why? Why do we hamstring our revenue stream like this? Why do we compromise on the certainty that having complete information would provide? Why do we allow ourselves to wonder and move cautiously into the unknown when we could measure and react with surety?

Why do we make Data Science even harder by doing this?

Because we care about our users. We think about what a Bad Actor could do if they had access to the data we collect. Before we okay a new data collection we think of all the ways it could be abused: Can it identify the user? Does it link to another dataset? Might it reveal something sensitive?

Yes, we have confidence in our security, our defenses in depth, our privacy policies, and our motivations to work for users and their interests.

But we are also confident that others have motivations and processes and policies that don’t align with ours… and might be given either the authority or the opportunity to gain access in the future.

This is why Firefox Send doesn’t know your encryption key for the files you share with your friends. This is why Firefox Accounts only knows six things (two of them optional) about you, and why Firefox Sync cannot read the data it’s storing for you.

And this is why Telemetry doesn’t know your Firefox Account id.

:chutten

Two Days, or How Long Until The Data Is In

Two days.

It doesn’t seem like long, but that is how long you need to wait before looking at a day’s Firefox data and being sure than 95% of it has been received.

There are some caveats, of course. This only applies to current versions of Firefox (55 and later). This will very occasionally be wrong (like, say, immediately after Labour Day when people finally get around to waking up their computers that have been sleeping for quite some time). And if you have a special case (like trying to count nearly everything instead of just 95% of it) you might want to wait a bit longer.

But for most cases: Two Days.

As part of my 2017 Q3 Deliverables I looked into how long it takes clients to send their anonymous usage statistics to us using Telemetry. This was a culmination of earlier ponderings on client delay, previous work in establishing Telemetry client health, and an eighteen-month (or more!) push to actually look at our data from a data perspective (meta-data).

This led to a meeting in San Francisco where :mreid, :kparlante, :frank, :gfritzsche, and I settled upon a list of metrics that we ought to measure to determine how healthy our Telemetry system is.

Number one on that list: latency.

It turns out there’s a delay between a user doing something (opening a tab, for instance) and them sending that information to us. This is client delay and is broken into two smaller pieces: recording delay (how long from when the user does something until when we’ve put it in a ping for transport), and submission delay (how long it takes that ready-for-transport ping to get to Mozilla).

If you want to know how many tabs were opened on Tuesday, September the 5th, 2017, you couldn’t tell on the day itself. All the tabs people open late at night won’t even be in pings, and anyone who puts their computer to sleep won’t send their pings until they wake their computer in the morning of the 6th.

This is where “Two Days” comes in: On Thursday the 7th you can be reasonably sure that we have received 95% of all pings containing data from the 5th. In fact, by the 7th, you should even have that data in some scheduled datasets like main_summary.

How do we know this? We measured it:

Screenshot-2017-9-12 Client "main" Ping Delay for Latest Version(1).png(Remember what I said about Labour Day? That’s the exceptional case on beta 56)

Most data, most days, comes in within a single day. Add a day to get it into your favourite dataset, and there you have it: Two Days.

Why is this such a big deal? Currently the only information circulating in Mozilla about how long you need to wait for data is received wisdom from a pre-Firefox-55 (pre-pingsender) world. Some teams wait up to ten full days (!!) before trusting that the data they see is complete enough to make decisions about.

This slows Mozilla down. If we are making decisions on data, our data needs to be fast and reliably so.

It just so happens that, since Firefox 55, it has been.

Now comes the hard part: communicating that it has changed and changing those long-held rules of thumb and idées fixes to adhere to our new, speedy reality.

Which brings us to this blog post. Consider this your notice that we have looked into the latency of Telemetry Data and is looks pretty darn quick these days. If you want to know about what happened on a particular day, you don’t need to wait for ten days any more.

Just Two Days. Then you can have your answers.

:chutten

(Much thanks to :gsvelto and :Dexter’s work on pingsender and using it for shutdown pings, :Dexter’s analyses on ping delay that first showed these amazing improvements, and everyone in the data teams for keeping the data flowing while I poked at SQL and rearranged words in documents.)

 

The Photonization of about:telemetry

This summer I mentored :flyingrub for a Google Summer of Code project to redesign about:telemetry. You can read his Project Submission Document here.

Background

Google Summer of Code is a program funded by Google to pay students worldwide to contribute in meaningful ways to open source projects.

about:telemetry is a piece of Firefox’s UI that allows users to inspect the anonymous usage data we collect to improve Firefox. For instance, we look at the maximum number of tabs our users have open during a session (someone or several someones have more than one thousand tabs open!). If you open up a tab in Firefox and type in about:telemetry (then press Enter), you’ll see the interface we provide for users to examine their own data.

Mozilla is committed to putting users in control of their data. about:telemetry is a part of that.

Then

When :flyingrub started work on about:telemetry, it looked like this (Firefox 55):

oldAboutTelemetry

It was… functional. Mostly it was intended to be used by developers to ensure that data collection changes to Firefox actually changed the data that was collected. It didn’t look like part of Firefox. It didn’t look like any other about: page (browse to about:about to see a list of about: pages). It didn’t look like much of anything.

Now

After a few months of polishing and tweaking and input from UX, it looks like this (Firefox Nightly 57):

newAboutTelemetry

Well that’s different, isn’t it?

It has been redesigned to follow the Photon Design System so that it matches how Firefox 57 looks. It has been reorganized into more functional groups, has a new top-level search, and dozens of small tweaks to usability and visibility so you can see more of your data at once and get to it faster.

newAboutTelemetry-histograms.png

Soon

Just because Google Summer of Code is done doesn’t mean about:telemetry is done. Work on about:telemetry continues… and if you know some HTML, CSS, and JavaScript you can help out! Just pick a bug from the “Depends on” list here, and post a comment asking if you can help out. We’ll be right with you to help get you started. (Though you may wish to read this first, since it is more comprehensive than this blog post.)

Even if you can’t or don’t want to help out, you can take sneak a peek at the new design by downloading and using Firefox Nightly. It is blazing fast with a slick new design and comes with excellent new features to help be your agent on the Web.

We expect :flyingrub will continue to contribute to Firefox (as his studies allow, of course. He is a student, and his studies should be first priority now that GSoC is done), and we thank him very much for all of his good work this Summer.

:chutten