I think the premise that you need to collect data on the top sites that a user 
visits may be flawed. Won't you be contributing to the dominance of 
(already-dominant) top sites by optimizing for them specifically?

It also seems that you could get a reasonably accurate idea of what sites are 
most popular among Firefox users by looking at the most popular sites overall 
and optimizing for those. Do you expect that Firefox users are so wildly 
different that their top sites don't look more or less the same as the top 
sites overall?

Further, as has been shown again and again, data thought to be untraceable to 
any particular user has been deanonymized through correlations with other data 
sets. A list of top visited sites is actually a pretty juicy target as well 
for state actors, blackmailers, etc.

Finally, the mere act of doing random (from the user's perspective) telemetry 
is problematic. First, users on limited connections don't need to be using more 
data than they already are. Second, the mere act of making a request with IP 
endpoints, even if it sends only a ping, can expose an unprepared user who 
needs privacy. I understand that Firefox already does some of this, but that's 
not really a reason to do more.

From a business perspective, a major differentiating factor (arguably the only 
differentiating factor) of Firefox is that Mozilla isn't Google. The closer you 
get to that line, the more damage you'll do to the trust users have in Mozilla.

I recommend that you take the high road on this one. I'm not sure what the 
motivator is here (does having more data give you leverage with partners?), but 
the stated justification (improving speeds on particular websites) seems too 
weak to outweigh the valid privacy concerns.

Mozilla: we want to trust you. We do trust you. We know it's tough out there. 
You're playing with the big kids, and they have intel that, admittedly, 
probably helps them improve their products. But the way you can improve your 
product is by NOT collecting that intel. Do the Mozilla thing, not the Google 
thing.

On Monday, August 21, 2017 at 11:56:44 AM UTC-4, Georg Fritzsche wrote:
> Hi,
> 
> for Firefox we want to better understand how people use our product to
> improve their experience. To do that, we are planning to run a new SHIELD
> study that tests how we can collect additional data in a privacy preserving
> way. Check out the details below and send me your thoughts.
> 
> The problem.
> 
> One recurring ask from the Firefox product teams is the ability to collect
> more sensitive data, like top sites users visit and how features perform on
> specific sites.
> 
> Currently we can collect this data when the user opts in, but we don't
> have a way to collect unbiased data, without explicit consent (opt-out).
> 
> Asks for sensitive data center most commonly around knowing something in
> relation to which sites a user visits:
> 
>    - "Which top sites are users visiting?"
>    - "Which sites using Flash does a user encounter?"
>    - "Which sites does a user see heavy Jank on?"
> 
> In summary most asks are for occurrences of an event X per domain (more
> specifically eTLD+1 [1], e.g. facebook.com or google.co.uk).
> 
> The solution.
> 
> One solution is the use of differential privacy [2] [3], which allows us to
> collect sensitive data without being able to make conclusions about
> individual users, thus preserving their privacy.
> 
> An attacker that has access to the data a single user submits is not able
> to tell whether a specific site was visited by that user or not.
> 
> The Google Open Source project called RAPPOR [4] [5] is the most widely
> known and deployed implementation of differential privacy.
> 
> We have been investigating the use of RAPPOR for these kinds of use cases,
> with initial simulation results being promising.
> 
> Our plan.
> 
> What we plan to do now is run an opt-out SHIELD study [6] to validate our
> implementation of RAPPOR. This study will collect the value for users’ home
> page (eTLD+1) for a randomly selected group of our release population. We
> are hoping to launch this in mid-September.
> 
> This is not the type of data we have collected as opt-out in the past and
> is a new approach for Mozilla. As such, we are still experimenting with the
> project and wanted to reach out for feedback.
> 
> Georg
> 
> References:
> 
> 1: https://en.wikipedia.org/wiki/Public_Suffix_List
> 
> 2: https://en.wikipedia.org/wiki/Differential_privacy
> 
> 3: https://robertovitillo.com/2016/07/29/differential-privacy-for-dummies/
> 
> 4: https://github.com/google/rappor
> 
> 5: https://arxiv.org/abs/1407.6981
> 
> 6: https://wiki.mozilla.org/Firefox/Shield/Shield_Studies
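
For anyone following along, the guarantee quoted above ("not able to tell 
whether a specific site was visited by that user or not") can be made concrete 
with classic randomized response, the building block that RAPPOR extends. What 
follows is only a minimal sketch under my own assumptions: it is not RAPPOR 
itself (which adds Bloom filters and two layers of randomization) and it is not 
Mozilla's implementation. The function names and the p_truth parameter are 
hypothetical, chosen for illustration.

```python
import math
import random


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports: list, p_truth: float) -> float:
    """Invert the noise, using E[observed] = p_truth * rate + (1 - p_truth) / 2."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) / 2) / p_truth


def epsilon(p_truth: float) -> float:
    """Privacy loss of one report: log of the worst-case likelihood ratio."""
    p_yes_given_yes = p_truth + (1 - p_truth) / 2
    p_yes_given_no = (1 - p_truth) / 2
    return math.log(p_yes_given_yes / p_yes_given_no)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical population: 30% of 100,000 users have the sensitive bit set.
    truths = [i < 30_000 for i in range(100_000)]
    reports = [randomized_response(t, 0.5, rng) for t in truths]
    print("estimated rate:", estimate_true_rate(reports, 0.5))
    print("epsilon per report:", epsilon(0.5))
```

Any single report is deniable, since a "yes" may just be the coin flip, yet 
the aggregate rate is still recoverable from many reports. The guarantee 
covers one report with a fixed epsilon; it says nothing by itself about 
repeated collection or about the network metadata that accompanies each 
submission.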

_______________________________________________
governance mailing list
governance@lists.mozilla.org
https://lists.mozilla.org/listinfo/governance
