On Tue, Jan 20, 2015 at 8:39 PM, Peter Geoghegan <p...@heroku.com> wrote:
> On Tue, Jan 20, 2015 at 5:32 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> I was assuming we were going to fix this by undoing the abbreviation
>> (as in the abort case) when we spill to disk, and not bothering with
>> it thereafter.
>
> The spill-to-disk case is at least as compelling as the internal sort
> case. The overhead of comparisons is much higher for tapesort.
>
> Attached patch serializes keys. On reflection, I'm inclined to go with
> this approach. Even if the CPU overhead of reconstructing strxfrm()
> blobs is acceptable for text, it might be much more expensive for
> other types. I'm loath to throw away those abbreviated keys
> unnecessarily.
>
> We don't have to worry about having aborted abbreviation, since once
> we spill to disk we've effectively committed to abbreviation. This
> patch formalizes the idea that there is strictly a pass-by-value
> representation required for such cases (but not that the original
> Datums must be of a pass-by-reference type, which is another thing
> entirely). I've tested it some, obviously with Andrew's testcase and
> the regression tests, but also with my B-Tree verification tool.
> Please review it.
>
> Sorry about this.

I don't want to change the on-disk format for tapes without a lot more
discussion. Can you come up with a fix that avoids that for now?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers