>>>>> "Peter" == Peter Geoghegan <p...@heroku.com> writes:
Peter> Attached is a revision of this patch, that I'm calling v2. What Peter> do you think, Andrew? "No." is I think the best summary of my response. I strongly suggest whichever committer ends up looking at this consider my original version unchanged in preference to this. The cost/benefit decision of supporting abbreviation on 32bit platforms is a point that can be debated (I strongly support retaining the 32bit code, obviously), but the substantive changes here are actively wrong. Peter> Other than that, I've tried to keep things closer to the text Peter> opclass. For example, the cost model now has a few debugging Peter> traces (disabled by default). I have altered the ad-hoc cost Peter> model so that it no longer concerns itself with NULL inputs, Peter> which seemed questionable (not least since the abbreviation Peter> conversion function isn't actually called for NULL inputs. Any Peter> attempt to track the full count within numeric code therefore Peter> cannot work.). This is simply wrong. The reason why the cost model (in my version) tracks non-null values by having its own counter is precisely BECAUSE the passed-in memtupcount includes nulls, and therefore the code will UNDERESTIMATE the fraction of successfully abbreviated values if the comparison is based on memtupcount. In your version, if there are 9999 null values at the start of the input, then on the first non-null value after that, memtupcount will be 10000 and there will be only 1 distinct abbreviated value, causing abbreviation to spuriously abort. The test to clamp the estimate to 1.0 is just nonsensical and serves no purpose whatever, and the comment for it is wrong. You should fix the text abbreviation code, not propagate your mistakes further. (BTW, there's an outright typo in your code, ';;' for ';' at the end of a line. Sloppy.) Peter> I also now allocate a buffer of scratch memory separately from Peter> the main sortsupport object - doing one allocation for all Peter> sortsupport state, bunched together as a buffer seemed like a Peter> questionable micro-optimization. It's yet another cache line... I admit I did not benchmark that choice, but then neither did you. Peter> It seemed unwise to silently disable abbreviation when someone Peter> happened to build with DEC_DIGITS != 4. A static assertion now Peter> gives these unusual cases the opportunity to make an informed Peter> decision about either disabling abbreviation or not changing Peter> DEC_DIGITS in light of the performance penalty, in a Peter> self-documenting way. A) Nobody in their right minds is ever going to do that anyway B) Anybody who does that is either not concerned about performance or is concerned only about performance of the low-level numeric ops, and abbreviation is the last thing they're going to be worried about in either case. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers