>>>>> "David" == David Rowley <david.row...@2ndquadrant.com> writes:
 David> I went and had a few adventures with this patch to see if I
 David> could figure out why the small ~1% regression exists.

Just changing the number of instructions (even in a completely
unrelated place that's not called during the test) can generate
performance variations of this size, even when there's no real
difference.

To get a reliable measurement of timing changes of less than around 3%,
what you have to do is this: pick some irrelevant function, add
something like an asm directive that inserts a variable number of NOPs,
and do a series of test runs with different values (a sketch of this
follows below).

See http://tinyurl.com/op9qg8a for an example of the kind of variation
that one can get. That plot records timing runs where each different
padding size was tested 3 times (non-consecutively, so you can see how
repeatable the result is for each size); each timing is itself the
average of the last 10 of 11 consecutive runs of the test.

To establish a 1% performance benefit or regression, you need to show
that there's still a difference _AFTER_ taking this kind of
spooky-action-at-a-distance into account. For example, in the test
shown at the link, if a substantive change to the code moved the upper
and lower bounds of the output from (6091,6289) to (6030,6236), then
one would be justified in claiming it as a 1% improvement.

Such is the reality of modern CPUs.
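As a concrete illustration, here is a minimal sketch of the NOP-padding
trick (the file name, macro name, and function name are all made up for
the example; it assumes GCC or Clang with the GNU assembler, where the
.rept/.endr directives can be used inside inline asm):

/* pad.c -- a function that is never called during the benchmark; its
 * only job is to shift the addresses of the code linked after it.
 * Rebuild with -DNOP_PAD=0, -DNOP_PAD=1, ... and rerun the benchmark
 * for each value; the scatter across values is the alignment noise
 * floor against which any claimed ~1% change must be judged. */

#ifndef NOP_PAD
#define NOP_PAD 0
#endif

#define STR_(x) #x
#define STR(x) STR_(x)

void
never_called_padding(void)
{
    /* the assembler repeats the one-byte nop NOP_PAD times */
    __asm__ (".rept " STR(NOP_PAD) "\n\tnop\n\t.endr");
}

Each padding value then gets its own timing series (say, the average of
the last 10 of 11 consecutive runs, as above), and revisiting the
values non-consecutively shows how repeatable each point is.

--
Andrew (irc:RhodiumToad)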