> Unfortunately, the BHR Telemetry data from both Aurora 43 & Beta 44
> experiments suggests that e10s is jankier than non-e10s. This holds true
> for profiles with & without extensions.
>
> However, we have identified bugs causing inaccuracies in BHR reporting and
> we are working to improve BHR as well as other Telemetry performance
> measurements. We have even built an extension to visualize BHR's jank
> detection: https://github.com/chutten/statuser
>
> In general, as we evaluate e10s performance using A/B experiments, we also
> validate and improve the performance probes in parallel.

I have been told this part of my post is misleading, so I'll go into more 
detail about what we know and don't know about the reliability of the BHR 
responsiveness measurement and consequently e10s's performance.

1) We know BHR over-reported jank for e10s Firefox during both A/B experiments. 
This was fixed & uplifted in bug 1234618, but the uplifts didn't make it into 
either the Aurora 43 or the Beta 44 experiment. This means that the BHR 
experiment analyses linked above are not reliable. The upcoming Beta 45 A/B 
experiment will have the issue fixed.

2) Bill McCloskey looked at BHR e10s vs non-e10s performance on the general 
Aurora 45 population (not as part of an A/B experiment) and found that in these 
(not randomly selected) e10s & non-e10s populations, e10s is more responsive.
Bill's analysis: http://people.mozilla.org/~wmccloskey/aurora-analysis.html

3) We know that up to now, BHR did not report jank from the e10s child process 
at all (bug 1228437). This would have caused under-reporting of e10s jank 
during the A/B experiments as well as in billm's Aurora analysis. This is now 
fixed in bug 1228437 and pending uplift.

4) BHR uses "pseudostacks" to report the sources of hangs. The C++ pseudostack 
is lacking in coverage, reducing the usefulness of the collected stacks. This 
is now addressed in bug 1224374 and pending uplift.

5) There were additional issues with the most recent Beta 44 A/B experiment 
(e.g. bug 1236754) which reduced the quality of the collected data.

To summarize, validating & fixing BHR is an ongoing effort, and we don't have a 
basis for determining whether e10s performance is better or worse than non-e10s 
performance based on the BHR numbers yet. I am very optimistic about the 
quality of BHR data we will obtain from the upcoming Beta 45 A/B experiment.

P.S. Note that we have much higher confidence in the e10s Telemetry stability 
numbers. 
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform