On Thursday, January 31, 2013 9:17:44 AM UTC-8, Joshua Cranmer wrote:
> On 1/31/2013 10:51 AM, Ehsan Akhgari wrote:
> > On 2013-01-31 11:43 AM, Kyle Huey wrote:
> >> On Wed, Jan 30, 2013 at 8:03 PM, Ehsan Akhgari <ehsan.akhg...@gmail.com
> >> <mailto:ehsan.akhg...@gmail.com>> wrote:
> >>
> >> We then tried to get a sense of how much of a win the PGO
> >> optimizations are. Thanks to a series of measurements by dmandelin,
> >> we know that disabling PGO/LTCG will result in a regression of about
> >> 10-20% on benchmarks which examine DOM and layout performance such
> >> as Dromaeo and guimark2 (and 40% in one case), but no significant
> >> regressions in the startup time, and gmail interactions. Thanks to
> >> a series of telemetry measurements performed by Vladan on a Nightly
> >> build we did last week which had PGO/LTCG disabled, there are no
> >> telemetry probes which show a significant regression on builds
> >> without PGO/LTCG. Vladan is going to try to get this data out of a
> >> Tp5 run tomorrow as well, but we don't have any evidence to believe
> >> that the results of that experiments will be any different.
> >>
> >> Isn't PGO worth something like 15% on Ts?
>
> > That was what I thought, but local measurements performed by dmandelin
> > proved otherwise.
>
> For what it's worth, reading
> <https://bugzilla.mozilla.org/show_bug.cgi?id=833890>, I do not get the
> impression that dmandelin "proved" otherwise. His startup tests have
> very low statistical confidence (n=2, n=3), and someone who disclaims
> his own findings. It may be evidence that PGO is not a Ts win, but it is
> weak evidence at best.

I could certainly run a larger number of trials to see what happens. In that case, I stopped because the min values for warm startup were about equal (and also happened to be about equal to other warm startup times I had measured recently). For many timed benchmarks, "base value + positive random noise" seems like a good model, in which case mins seem like good things to compare.

> Our Talos results may be measuring imperfect things, but we have
> enough datapoints that we can draw statistical conclusions from
> them confidently.

Statistics doesn't help if you're measuring the wrong things. Whether Ts is measuring the wrong thing, I don't know. It would be possible to learn something about that question by measuring startup with a camera, Telemetry simple measures, and Talos on the same machine and seeing how they compare.

By the way, there is a project (in a very early phase now) to do accurate measurements of startup time, both cold and warm, on machines that model user hardware, etc.

Dave
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
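
A note on the "base value + positive random noise" model mentioned above: what follows is only a minimal sketch, not anything taken from the measurements in this thread. It assumes the noise is exponentially distributed, and the names simulate_trials and summarize, the 1000 ms base time, and the 50 ms noise scale are all made up for illustration. Under that assumption, the minimum of a small number of trials tends to sit closer to the base value than the mean does, which is the intuition behind comparing mins.

```python
import random
import statistics


def simulate_trials(base_ms, noise_scale_ms, n, rng):
    """Draw n timings as base + non-negative noise.

    The exponential noise distribution is an assumption made for
    illustration; real startup noise may look quite different.
    """
    return [base_ms + rng.expovariate(1.0 / noise_scale_ms) for _ in range(n)]


def summarize(samples):
    """Return the min and mean of a list of timings."""
    return {"min": min(samples), "mean": statistics.mean(samples)}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    base_ms = 1000.0        # hypothetical "true" warm startup time
    noise_scale_ms = 50.0   # hypothetical mean of the added noise

    # Compare how far min and mean land from the base for small and large n.
    for n in (3, 30, 300):
        stats = summarize(simulate_trials(base_ms, noise_scale_ms, n, rng))
        print(
            f"n={n:3d}  "
            f"min-base={stats['min'] - base_ms:7.2f} ms  "
            f"mean-base={stats['mean'] - base_ms:7.2f} ms"
        )
```

If the noise can also be negative (so that the base value is no longer the floor of the distribution), the min stops estimating the base value and this argument does not carry over.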