Private Browsing instrumentation

Ehsan Akhgari via governance Fri, 14 Sep 2018 08:48:21 -0700

Hi everyone,

For those who don’t know me, I’m a Principal Engineer at Mozilla, and the
module owner of Private Browsing in Firefox.  I’m also responsible for the
original design and implementation of the feature many years ago.


Introduction

=========

TL;DR: Historically, we have considered the fact that the user may have
used a private window as sensitive data, preventing us from collecting data
such as how much usage PBM receives in Firefox. Going forward, we'd like to
modify this policy and will no longer take steps to conceal, from ourselves
or from local adversaries, the fact that the user was in private browsing.
Of course, the user’s browsing history and any indicators revealing what
websites they’ve visited in a private window will remain sensitive data and
won’t be subject to our data collection.

This is a complex topic that I’ve been involved with for a few years now,
and it has taken me some time to change my own viewpoint as we have
discovered more aspects of the issue and have had more time to reflect on
them. I’d like to spend some time describing the background, what we’re
changing, and why we’re making this change now.

Background

=========

Historically[1], the threat model we have had for private browsing focused
on a local adversary - someone with direct access to Firefox. We wanted to
prevent that local adversary from learning about the user’s activity while
in private browsing. The initial design of private browsing mode (PBM)
therefore concentrated on isolating that mode from regular browsing.
Over time, we’ve added anti-tracking features to private browsing that also
protect against online adversaries.  The landscape of our anti-tracking
features is now slowly expanding outside of private browsing mode.

Because of this initial focus on the local adversary, our policy has been
to avoid leaving data on the device that would reveal the users’ activity
while in private browsing, specifically the type of activity that leaks
details about a user’s browsing. So for example we avoid storing users’
history and cookies during private browsing sessions. We interpreted this
policy to also include avoiding leaving data on the device that would
reveal that private browsing was used.

We write different types of data to disk in Firefox.  Some of this data
would clearly cross the line by revealing something about a user’s
browsing, e.g. the URL of a page they have visited.  In such cases it has
been very easy to decide we should not write that data to disk during a
private session. In other cases we have data that is much less clearly
revealing.  This announcement is about one particular type of data: data
which reveals whether a user has used any private windows in a session at
all (obviously without revealing anything about what the user has done in
that window).  Many years ago we decided to treat this class of data as
sensitive for private browsing, and have since gone above and
beyond to ensure that Firefox won’t leave any trace on the disk to reveal
whether the user has used a private window in a session.

It’s worth mentioning that this decision wasn’t originally made because we
had a clear threat model around the discovery of a user having used a
private window. Rather, I believe it followed from the general principle of
going for the more conservative choice at the time.

Because our telemetry system writes data to disk prior to sending that data
to our servers, we have tried to avoid directly instrumenting private
browsing in order to not leave that data on disk.  However, our telemetry
has historically included data from the users’ private browsing sessions.
It does this by aggregating together data across private and non-private
sessions. What it has NOT deliberately included is information that would
segment the user’s activity based on private browsing or reveal that the
user was in private browsing at all, since that would violate the policy
mentioned above. We have therefore not instrumented private browsing
directly and do not know with confidence to what extent this feature is
used today. So we might for example know that a user had ten tabs open in a
day but we don’t know if any of those tabs were opened in private browsing.

Problems with this approach

=====================

We have encountered several problems that result from special-casing private
browsing instrumentation in this way:


   1. This policy is confusing to anybody who isn’t steeped in the design
   considerations for private browsing. While that by itself isn’t sufficient
   to motivate changing the policy, the practical result of this confusion is
   uneven compliance from the teams responsible for instrumenting the browser.
   In some cases our telemetry aggregates private and non-private sessions, as
   described above. In some cases it only includes activity from non-private
   sessions.


That confusion also creates challenges for our product management,
marketing, and business teams. Mozilla as an organization is working to
make more informed, data-driven decisions about the direction of our
product. To the extent that there is confusion about the policy and about
what our data does and does not cover, that may result in the wrong
decisions. So for example, as a result of this confusion, we recently
determined that our DAU (daily active users) number - the number we
are using to measure the topline health of our desktop product - is
inaccurate and is undercounting our users by an unknown margin.

Also, I would like to mention that in the past few years, several
different teams have raised this problem with me on a number of
occasions and have asked me (as the module owner for Private
Browsing) to consider changing our policy here.  I’ve repeatedly turned
down these requests, sticking to the decision we made back in the day.


   2. Resulting from this confusion, special-casing private browsing
   measurement while we more fully instrument the browser has proven to be a
   brittle approach. We have found, despite our intention, that the fact of
   the user’s private browsing usage can be inferred to some extent from
   telemetry and from information available on disk. Essentially, we are
   already unintentionally leaking information about private browsing usage.


When some measurements include private and non-private sessions while
others only include non-private sessions, the difference between the two
approaches allows us to infer information about users’ private browsing.
While we can fix this on a case-by-case basis as we identify instances of
non-compliance, our expectation is that this leakage will continue so long
as we continue to special-case PBM instrumentation.
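
To make the inference concrete, here is a minimal sketch in Python. The
probe names and numbers are hypothetical and do not correspond to real
Firefox telemetry fields; the only point is that subtracting a count that
skips private windows from a count that aggregates everything exposes
private-window activity.

```python
from dataclasses import dataclass


@dataclass
class DailyPing:
    # Hypothetical probes, invented for illustration only.
    tabs_opened_all_sessions: int  # aggregates private and non-private sessions
    tabs_opened_non_private: int   # only counts non-private sessions


def inferred_private_tabs(ping: DailyPing) -> int:
    """Return the private-window tab count implied by the two probes."""
    return ping.tabs_opened_all_sessions - ping.tabs_opened_non_private


def used_private_browsing(ping: DailyPing) -> bool:
    """A non-zero difference reveals that a private window was used."""
    return inferred_private_tabs(ping) > 0


ping = DailyPing(tabs_opened_all_sessions=10, tabs_opened_non_private=7)
print(inferred_private_tabs(ping))   # 3
print(used_private_browsing(ping))   # True
```

Neither probe mentions private browsing on its own, yet together they
reveal it, which is why fixing individual probes does not remove the
underlying problem.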

This means that our policy on instrumenting whether a private window has
been used isn’t enforceable in practical terms.


   3. Over the years, nobody has managed to put forth a convincing threat
   model that actually requires us to avoid instrumenting the usage of a
   private window.  Every scenario we have looked at has made contrived
   assumptions about what the local attacker is willing/able to do (e.g. they
   have full access to the user’s computer over an extended period of time,
   but they are unable to install any spying software on their machine).  In
   other words, it has always been unclear what we gain from persevering in
   maintaining this part of our policies around private browsing.

   4. Finally, even when we have clarity and consistency regarding this
   policy, this still results in a large gap in knowledge about our own user
   base. Privacy is a key part of our mission and our business strategy and we
   think it is a key reason users come to Firefox.  But we don’t actually
   know and we have no insight into the usage patterns of PBM. This makes it
   difficult for us to know whether we are actually being successful and for
   us to make informed resource decisions for privacy and security features
   like the Facebook Container[2], Firefox Monitor[3], and our upcoming
   series of anti-tracking features[4].  It also makes it hard to argue for
   more investment in privacy features as we typically lack important data to
   demonstrate that people find one of our largest existing privacy features
   useful.


Change to the policy

================

As a policy, we are going to stop special-casing private browsing
measurement and plan to instrument it directly. This means that going
forward, we will no longer avoid instrumenting the fact that the user has
used a private window in a session. This is a minor change that will allow
us to know whether private browsing is used. It does not
include changes to the key properties of private browsing - that the user’s
browsing activity is hidden from the local adversary, and the anti-tracking
features that come with it. We do not collect user browsing history in
regular mode or in private browsing, so that property will be maintained
and private browsing history will still not be available to local
adversaries. As always, users will be able to turn off our data collection
through the existing controls exposed in Preferences.

If you read this far, thanks for your attention.  I hope this change
enables us to measure private browsing more effectively and enables Mozilla
to bring more features and improvements to this area in the years ahead!

Cheers,

Ehsan


[1] https://wiki.mozilla.org/Private_Browsing

[2] https://addons.mozilla.org/en-US/firefox/addon/facebook-container/

[3] https://blog.mozilla.org/futurereleases/2018/06/25/testing-firefox-monitor-a-new-security-tool/

[4] https://blog.mozilla.org/futurereleases/2018/08/30/changing-our-approach-to-anti-tracking/
_______________________________________________
governance mailing list
governance@lists.mozilla.org
https://lists.mozilla.org/listinfo/governance
