On Tue, Apr 8, 2014 at 2:48 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> I think the point here is what matters is that that gain from the
> strxfrm part of the patch is large, regardless of what the baseline is
> (right?). If there's a small loss in an uncommon worst case, that's
> probably acceptable, as long as the worst case is uncommon and the loss
> is small. But if the loss is large, or the case is not uncommon, then a
> fix for the regression is going to be a necessity.
That all seems reasonable. I just don't understand why you'd want to break out the fmgr-elision part, given that that was already discussed at length two years ago.

> You seem to be assuming that a fix for whatever regression is found is
> going to be impossible to find.

I think that a fix that is actually worth it on balance will be elusive. Heikki's worst case is extremely narrow. I think he'd acknowledge that himself. I've already fixed some plausible regressions. For example, the opportunistic early "len1 == len2 && memcmp() == 0?" test covers the common case where two leading keys are equal. I think we're very much into chasing diminishing returns past this point. I think that my figure of a 5% regression is much more realistic, even though it is itself quite unlikely.

I think that the greater point is that we don't want to take worrying about worst case performance to extremes. Calling Heikki's 50% regression the worst case is almost unfair, since it involves very carefully crafted input. You could probably also carefully craft input that made our quicksort implementation itself go quadratic, a behavior not necessarily exhibited by an inferior implementation for the same input. Yes, let's consider a pathological worst case, but let's put it in the context of being one end of a spectrum of behaviors, on the extreme fringes. In reality, only a tiny number of individual sort operations will experience any kind of regression at all. In simple terms, I'd be very surprised if anyone complained about a regression at all. If anyone does, it's almost certainly not going to be a 50% regression.

There is a reason why many other systems have representative workloads that they target (i.e. a variety of TPC benchmarks). I think that a tyranny of the majority is a bad thing myself, but I'm concerned that we sometimes take that too far.

I wonder, why did Heikki not add more padding to the end of the strings in his example, in order to give strxfrm() more wasted work?
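(As an aside, the opportunistic equality test mentioned above amounts to something like the following sketch. This is a hypothetical standalone illustration, not the actual PostgreSQL code — the real comparator deals in Datums, abbreviated keys, and palloc'd buffers; the function name varstr_cmp_sketch is made up for this example. The point it shows is just that when the raw bytes of two strings are identical, we can report equality immediately and skip the comparatively expensive strxfrm() transformation entirely.)

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch of a locale-aware string comparator with an
 * opportunistic fast path: if both strings have identical raw bytes,
 * they must compare equal under any sane collation, so the expensive
 * strxfrm() transform-and-compare step can be skipped.
 *
 * Assumes s1 and s2 are NUL-terminated; error handling is omitted
 * for brevity.
 */
static int
varstr_cmp_sketch(const char *s1, size_t len1, const char *s2, size_t len2)
{
    /* Fast path: equal lengths and identical bytes imply equality */
    if (len1 == len2 && memcmp(s1, s2, len1) == 0)
        return 0;

    /* Slow path: transform both strings, then compare the blobs */
    size_t n1 = strxfrm(NULL, s1, 0) + 1;
    size_t n2 = strxfrm(NULL, s2, 0) + 1;
    char *t1 = malloc(n1);
    char *t2 = malloc(n2);

    strxfrm(t1, s1, n1);
    strxfrm(t2, s2, n2);

    int cmp = strcmp(t1, t2);

    free(t1);
    free(t2);
    return cmp;
}
```

The fast path matters for sorting because quicksort compares many pairs of keys, and real-world data commonly contains runs of duplicate leading keys; each such comparison is satisfied by a cheap memcmp() rather than two transforms plus two allocations.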
Didn't he want to make his worst case even worse? Or was it to control for TOASTing noise?

--
Peter Geoghegan