Hi all,
We wanted to let you know about new data collection that we will be doing for Firefox Hello starting with the FF46 launch on April 19th, and the steps we took to prevent it from collecting personal identification.

We want to collect more data about the websites that people share with Hello, to help optimize the product UX, understand what people use our new tab sharing feature for, and prioritize features accordingly. The product features and UX can be very different if we decide to optimize for “Shopping together” use cases as opposed to “Playing online games together”, just as examples.

We did a lot of diligence for this and explored several options for getting the data. The approach described below is the one we settled on. It prevents personal identification and gets us the data we need to build the best tool we can while being sensitive to our users. This involves collecting the domain names for tabs shared on Firefox Hello on our own servers.

How we collect the data

We plan to put in place a data collection solution that prevents personal identification. The technical approach to doing this through the use of client-side whitelisting is outlined here:

- Data will go to our servers and will be stored with our other server metrics. We are aggregating domain names, and are not storing session histories. The domain names are submitted at the end of the session, so exact timestamps of any visit are not included.
- Users who have disabled Health Reports will also not submit this data.
- We would use a whitelist client-side to only collect domains that are part of the top 2000 domains (Alexa list of top domains). This prevents personal identification based on obscure domain usage.
  We would subtract the sites from the Adult <http://www.alexa.com/topsites/category/Top/Adult> category and add all the subdomains of:
  - google.com <http://www.labnol.org/internet/popular-google-subdomains/5888/> (e.g., drive.google.com)
  - yahoo.com (e.g., games.yahoo.com)
  - developer.mozilla.org, bugzilla.mozilla.org, wiki.mozilla.org (this helps us understand how much of our user base is Mozillians)
  - tunes.apple.com
- You can see the exact list here: DomainWhitelist.jsm <https://github.com/mozilla/loop/blob/master/add-on/chrome/modules/DomainWhitelist.jsm>
- The data will only be kept for 6 months and we plan to revisit this collection in 6 months. At the end of this period we’ll evaluate whether we should carry on collecting the data (if it is still useful and will help further shape the product) or just stop.

This e-mail is intended to make everyone aware of the data we’re collecting in Hello in an effort to be as transparent as possible. We want to make sure people get the full picture of what we are trying to achieve and what we’re putting in place to protect our users.

Let me know if you have any questions.

Implementation bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1211542
Technical documentation: https://github.com/mozilla/loop/blob/master/docs/DataCollection.md

-Romain
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform