You all know Telemetry, our Firefox data collection system that provides us with real-world data about performance, hardware, usage and customizations.
Spark is an open-source, in-memory cluster computing framework for data analytics, originally developed in the AMPLab at UC Berkeley. In contrast to Hadoop's two-stage, disk-based MapReduce paradigm, Spark's in-memory primitives provide performance up to 100 times faster. That makes it possible to run distributed operations interactively from a shell and to analyze millions of Telemetry submissions in a matter of seconds, which is a great fit for exploratory data analysis.

I will briefly go over the data layout of Telemetry and how Spark works under the hood. Finally, I will jump into a hands-on interactive analysis session with real data. No prior knowledge of Telemetry, Spark or distributed computing is required. This is a great opportunity to learn how to quickly extract actionable insight from our data.

Time: Friday, December 5, 2014, 4:00 PM - 5:30 PM GMT-05:00
Location: Belmont Room, Marriott Waterfront 2nd, Mozlandia