Re: The future of PGO on Windows

Joshua Cranmer Thu, 31 Jan 2013 11:35:22 -0800

On 1/31/2013 12:05 PM, Dave Mandelin wrote:

On Thursday, January 31, 2013 9:17:44 AM UTC-8, Joshua Cranmer wrote:
For what it's worth, reading <https://bugzilla.mozilla.org/show_bug.cgi?id=833890>, I do not get the impression that dmandelin "proved" otherwise. His startup tests have very low statistical confidence (n=2, n=3), and he disclaims his own findings. It may be evidence that PGO is not a Ts win, but it is weak evidence at best.
I could certainly run a larger number of trials to see what happens. In that case, I 
stopped because the min values for warm startup were about equal (and also happened to be 
about equal to other warm startup times I had measured recently). For many timed 
benchmarks, "base value + positive random noise" seems like a good model, in 
which case mins seem like good things to compare.
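
To make the "base value + positive random noise" model concrete, here is a minimal sketch. The base time, the noise scale, and the choice of exponential noise are all hypothetical assumptions for illustration, not measurements from the bug. Under this model no sample can fall below the base value, so the minimum approaches the base from above while the mean stays inflated by the average noise.

```python
import random


def simulate_runs(base_ms, noise_scale_ms, n, rng):
    """Draw n timings as base + non-negative noise (exponential is an assumption)."""
    return [base_ms + rng.expovariate(1.0 / noise_scale_ms) for _ in range(n)]


def summarize(samples):
    """Return the two summary statistics being compared in the thread."""
    return {"min": min(samples), "mean": sum(samples) / len(samples)}


rng = random.Random(0)   # fixed seed so the sketch is repeatable
base_ms = 500.0          # hypothetical "true" warm startup time
runs = simulate_runs(base_ms, noise_scale_ms=50.0, n=3, rng=rng)
stats = summarize(runs)

print("runs:", [round(r, 1) for r in runs])
print("min: ", round(stats["min"], 1))
print("mean:", round(stats["mean"], 1))
```

If the noise can also be negative (for example, a cache that is sometimes unusually warm), the minimum is no longer a safe estimator, so the conclusion depends on the model holding.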

From a statistical hypothesis testing perspective, I think (I haven't actually done the math) that the given data is unable to reject either the hypothesis that PGO gives a benefit on startup time or the hypothesis that it does not. Mostly, I was cringing at ehsan's statement that your results "proved" the hypothesis. About what the best statistical criteria are, I don't wish to argue here.
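
One way to see why samples of size 2 and 3 cannot reject much is an exact permutation test, sketched below. The timings are hypothetical and deliberately as well separated as possible; they are not the numbers from the bug, and the permutation test is only one possible choice of test. With 5 pooled values there are only C(5,2) = 10 ways to pick a group of 2, so the smallest one-sided p-value such a test can ever report is 1/10.

```python
from itertools import combinations


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(a, b):
    """One-sided exact permutation p-value for 'group a is faster than group b'.

    Counts the fraction of relabelings whose mean difference (a - b) is at
    least as small as the observed one.
    """
    pooled = list(a) + list(b)
    observed = mean(a) - mean(b)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        grp_a = [pooled[i] for i in chosen]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if mean(grp_a) - mean(grp_b) <= observed + 1e-12:
            hits += 1
        total += 1
    return hits / total


# Hypothetical timings in ms: the most favorable split possible for n=2 vs n=3.
fast = [1.0, 2.0]
slow = [10.0, 11.0, 12.0]
p = permutation_p_value(fast, slow)
print(p)
```

Even in this best case the p-value is 0.1, above the conventional 0.05 threshold, so the test cannot reject "no difference" no matter how large the gap looks.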

Our Talos results may be measuring imperfect things, but we have
enough datapoints that we can draw statistical conclusions from
them confidently.

Statistics doesn't help if you're measuring the wrong things. Whether Ts is 
measuring the wrong thing, I don't know. It would be possible to learn 
something about that question by measuring startup with a camera, Telemetry 
simple measures, and Talos on the same machine and seeing how they compare.

I should clarify my previous statement: I want to avoid confirmation bias in this decision. The proper way to do that is to lay out all the criteria for acceptance or rejection before you run experiments and measure the results. This, obviously, is impossible at this point, since we have a mountain of data which has already biased our thought processes.

By the way, there is a project (in a very early phase now) to do accurate 
measurements of startup time, both cold and warm, on machines that model user 
hardware, etc.

This is really starting to get off-topic, but I do think we need clear guidelines on evaluating performance results, which includes things like ensuring proper statistical testing on results, etc.

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
