I wrote:
> The most likely theory, I think, is that that compiler is generating
> slightly different floating-point code causing different plans to
> be costed slightly differently than what the test case is expecting.
> Probably, the different orderings of the keys in this test case have
> exactly the same cost, or almost exactly, so that different roundoff
> error could be enough to change the selected plan.
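
To make that concrete, here is a minimal standalone sketch of the failure
mode, not the actual planner code: the pick_cheapest() helper and the cost
numbers are invented for illustration.  A pick-the-first-strictly-cheaper
loop keeps the first of two exactly-equal estimates, and a roundoff-sized
difference in either estimate is enough to flip the choice.

/*
 * Standalone illustration (not PostgreSQL code): two candidate GROUP BY
 * key orderings whose estimated sort costs are exactly equal.  The loop
 * keeps the first ordering unless a later one is strictly cheaper, so a
 * tiny perturbation, such as different floating-point code generation
 * might produce, changes which ordering wins.
 */
#include <stdio.h>

static int
pick_cheapest(const double *costs, int n)
{
	int		best = 0;
	int		i;

	/* keep the first ordering unless a later one is strictly cheaper */
	for (i = 1; i < n; i++)
	{
		if (costs[i] < costs[best])
			best = i;
	}
	return best;
}

int
main(void)
{
	/* hypothetical estimates for two key orderings: exactly equal */
	double	costs[2] = {1234.5678, 1234.5678};

	printf("tie: ordering %d wins\n", pick_cheapest(costs, 2));

	/* a roundoff-sized change, a few ulps at this magnitude */
	costs[1] -= 1e-12;
	printf("after roundoff-sized change: ordering %d wins\n",
		   pick_cheapest(costs, 2));

	return 0;
}

With these literal values the first printf reports ordering 0 and the
second reports ordering 1, even though the two estimates differ only at
the level of floating-point noise.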
I added some debug printouts to get_cheapest_group_keys_order() and
verified that in the two problematic queries, there are two different
orderings that have (on my machine) exactly equal lowest cost.  So the
code picks the first of those and ignores the second.  Different
roundoff error would be enough to make it do something else.

I find this problematic because "exactly equal" costs are not going to
be unusual.  That's because the values that cost_sort_estimate relies
on are, sadly, just about completely fictional.  It's expecting that it
can get a good cost estimate based on:

* procost.  In case you hadn't noticed, this is going to be 1 for just
about every function we might be considering here.

* column width.  This is either going to be a constant (e.g. 4 for
integers) or, again, largely fictional.  The logic for converting
widths to cost multipliers adds yet another layer of debatability.

* numdistinct estimates.  Sometimes we know what we're talking about
there, but often we don't.

So what I'm afraid we are dealing with here is usually going to be
garbage in, garbage out.  And we're expending an awful lot of code and
cycles to arrive at these highly questionable choices.

Given the previous complaints about db0d67db2, I wonder if it's not
most prudent to revert it.  I doubt we are going to get satisfactory
behavior out of it until there are fairly substantial improvements in
all these underlying estimates.

			regards, tom lane