This is a high-level overview of the Perf team's efforts so far to certify e10s performance & stability -- these metrics will have to be as good as those of single-process Firefox before e10s can be released as the default configuration on the Release channel. This post is addressed to the general Firefox developer audience and skimps on details, so please feel free to ask for clarification on the mozilla.dev.platform thread or talk to us on #perf.
1. PERFORMANCE

1.1) Overview

We have been using Talos & Telemetry data to understand Firefox performance in its e10s & non-e10s configurations. Talos numbers have mostly improved with e10s, but some Telemetry responsiveness measures *seem* to suggest a non-negligible e10s performance regression.

Our approach is to run A/B experiments on the Aurora & Beta channels using the TelemetryExperiments infrastructure. For each profile in the experiment, the experiment code randomly configures Firefox to run in e10s or non-e10s mode. We have run these experiments on Aurora 43 and Beta 44.

Currently, we are mostly focused on studying general measures of responsiveness in these experiments:

* Frequency of main-thread events lasting longer than 127ms: this is tracked by the Background Hang Reporter code (aka BHR) and reported via Telemetry
* UI event processing lag: the EVENTLOOP_UI_ACTIVITY_EXP_MS histogram probe
* Frame painting delay: the FX_REFRESH_DRIVER_CHROME_FRAME_DELAY_MS, FX_REFRESH_DRIVER_CONTENT_FRAME_DELAY_MS, and REFRESH_DRIVER_TICK probes, among other histogram probes

1.2) Background Hang Reporter data

Of these general measures of responsiveness, the BHR measurement is the most useful, as it also captures pseudo-stacks from janky events, allowing us to attribute jank to various sources (extensions, plugins, various Firefox features, web page scripts, etc.).

Unfortunately, the BHR Telemetry data from both the Aurora 43 & Beta 44 experiments suggests that e10s is jankier than non-e10s. This holds true for profiles with & without extensions. However, we have identified bugs causing inaccuracies in BHR reporting, and we are working to improve BHR as well as other Telemetry performance measurements. We have even built an extension to visualize BHR's jank detection: https://github.com/chutten/statuser

In general, as we evaluate e10s performance using A/B experiments, we also validate and improve the performance probes in parallel.
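As a rough illustration of the thresholding idea behind BHR's hang counting (not the actual BHR implementation, which instruments the event loop in C++), here is a toy Python sketch. The 127 ms cutoff comes from the description above; the sample event durations, session lengths, and rate definition are made up for illustration:

```python
# Toy sketch of BHR-style hang counting. Only the 127 ms threshold is
# taken from the post; all data below is invented sample data.

HANG_THRESHOLD_MS = 127

def count_hangs(event_durations_ms):
    """Count main-thread events that exceed the hang threshold."""
    return sum(1 for d in event_durations_ms if d > HANG_THRESHOLD_MS)

def hang_rate(event_durations_ms, session_minutes):
    """Hangs per minute of session time -- a simple comparable rate."""
    return count_hangs(event_durations_ms) / session_minutes

# Hypothetical per-cohort event durations (ms) from an A/B experiment:
e10s_events = [12, 340, 95, 210, 64, 130, 45]
non_e10s_events = [20, 88, 150, 40, 33, 129, 70]

print(hang_rate(e10s_events, session_minutes=5))      # e10s cohort
print(hang_rate(non_e10s_events, session_minutes=5))  # non-e10s cohort
```

The real BHR additionally records a pseudo-stack for each hang, which is what lets the analyses attribute jank to extensions, plugins, or specific Firefox features rather than just counting events.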
1.3) Other measures of performance

We are also analyzing data from many other performance probes: startup & shutdown time, page-load time, scrolling smoothness, tab animation smoothness, memory usage, shutdown hangs, and many others. We will also use the BHR Telemetry data to generate a whitelist/blacklist/graylist of addons based on how often they jank e10s Firefox.

1.4) Findings

You can see all our analyses of experiment data here:

* Beta 44 experiment analyses: https://github.com/vitillo/e10s_analyses/tree/master/beta
* Aurora 43 experiment analyses: https://github.com/vitillo/e10s_analyses/tree/master/aurora

If you're interested in these analyses, I recommend starting with this analysis of e10s vs non-e10s performance on Beta 44 experiment profiles *without* any extensions installed: https://github.com/vitillo/e10s_analyses/blob/master/beta/addons/e10s_without_addons_experiment.ipynb

Start at the section "1. Generic stuff"; the code at the beginning of the analysis is just boilerplate.

2. STABILITY

Socorro does not allow us to easily compare e10s vs non-e10s crash rates from experiments. Luckily, Telemetry now reports Firefox crash events as well, and Telemetry data can be analyzed fairly easily using Spark. Our analyses of the Telemetry crash reports from the Beta 44 A/B experiment showed that e10s is significantly crashier than non-e10s. This was true for profiles both with & without extensions: https://github.com/poiru/e10s_analyses/blob/beta/beta/e10s_crash_rate_without_extensions.ipynb

There were known issues with a11y blacklisting in the e10s code, so we expect the stability measures to improve during the next A/B experiment.
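The per-branch crash-rate comparison can be sketched roughly as follows. This is not the actual notebook code; the record fields (`branch`, `crashes`, `usage_hours`) are hypothetical stand-ins for the real Telemetry ping schema, and the numbers are invented:

```python
# Illustrative sketch: crashes per 1000 usage hours, grouped by
# experiment branch. Field names and data are hypothetical, not the
# real Telemetry ping schema.
from collections import defaultdict

def crash_rates(pings):
    """Return crashes per 1000 usage hours, keyed by experiment branch."""
    crashes = defaultdict(int)
    hours = defaultdict(float)
    for p in pings:
        crashes[p["branch"]] += p["crashes"]
        hours[p["branch"]] += p["usage_hours"]
    return {b: 1000.0 * crashes[b] / hours[b] for b in crashes}

pings = [
    {"branch": "e10s", "crashes": 2, "usage_hours": 40.0},
    {"branch": "e10s", "crashes": 1, "usage_hours": 60.0},
    {"branch": "non-e10s", "crashes": 1, "usage_hours": 100.0},
]
print(crash_rates(pings))  # {'e10s': 30.0, 'non-e10s': 10.0}
```

In the real analyses the aggregation runs over the full experiment population with Spark, but the comparison reduces to the same normalized rate per branch.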
3. FUTURE WORK

* There will be another A/B experiment on Beta 45
* BHR and other responsiveness measures are still being improved
* We will invest more in other measures of performance after the BHR responsiveness deficit in e10s is better understood
* We will soon generate a preliminary extension whitelist/blacklist/graylist for e10s based on experiment data from Beta 45 or 46

You can follow our progress at this meta bug: https://bugzilla.mozilla.org/show_bug.cgi?id=e10s-measurement

You can see the full plan for evaluating e10s performance & stability here: https://docs.google.com/document/d/1TyE0BehzYhii3qfmcrfjXlRJL64CcJk0B4Voup4Q0Pg/edit#

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform