On Fri, 11 Sep 2020 at 01:45, David Rowley <dgrowle...@gmail.com> wrote:
> I've attached v4b (b is for backwards since the traditional backwards
> tuple order is maintained). v4b seems to be able to run my benchmark
> in 63 seconds. I did 10 runs today of yesterday's v3 patch and got an
> average of 72.8 seconds, so quite a big improvement from yesterday.
After reading the patch back again, I realised there are a few more things that can be done to make it a bit faster:

1. When filling the backup buffer, skip over the tuples at the end of the page that don't need to be moved and only memcpy() the tuples earlier than that.

2. The position determined in #1 can be used to start the memcpy() loop at the first tuple that needs to be moved.

3. In the memmove() code for the preorder check, we can do a similar skip of the tuples at the end of the page that don't need to be moved.

I also ditched the #ifdef'd-out code, as I'm pretty sure #1 and #2 are a much better way of doing the backup buffer, given how many tuples are likely to be skipped due to maintaining the traditional tuple order.

That gets my benchmark down to 60.8 seconds, so 2.2 seconds better than v4b. I've attached v6b and an updated chart showing the results of the 10 runs I did of it.

David
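To make the idea in #1 and #2 concrete, here's a rough standalone sketch, not the actual patch: `PAGE_SIZE`, `ItemIdSketch`, and `compactify_sketch()` are hypothetical stand-ins for the real PostgreSQL page structures, and the items array is assumed to be sorted by descending offset (the traditional backwards tuple order).

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 64            /* toy page size; real pages are 8 kB */

/* Hypothetical simplified line pointer: just an offset and a length. */
typedef struct
{
    int offset;
    int len;
} ItemIdSketch;

/*
 * Compact tuples toward the end of the page.  "items" is sorted by
 * descending offset, so the first entries are the tuples nearest the
 * end of the page.
 */
static void
compactify_sketch(char *page, ItemIdSketch *items, int nitems)
{
    char backup[PAGE_SIZE];
    int  upper = PAGE_SIZE;
    int  i;

    /*
     * #1: skip the run of tuples at the end of the page that are
     * already in their final position; they need no copy at all.
     */
    for (i = 0; i < nitems; i++)
    {
        if (items[i].offset != upper - items[i].len)
            break;              /* first tuple that has to move */
        upper -= items[i].len;
    }

    if (i == nitems)
        return;                 /* everything was already compact */

    /*
     * #2: back up only the region holding tuples that still have to
     * move, starting at the lowest tuple offset rather than byte 0.
     */
    {
        int copy_start = items[nitems - 1].offset;

        memcpy(backup + copy_start, page + copy_start,
               upper - copy_start);
    }

    /* Place the remaining tuples from the backup buffer. */
    for (; i < nitems; i++)
    {
        upper -= items[i].len;
        memcpy(page + upper, backup + items[i].offset, items[i].len);
        items[i].offset = upper;
    }
}
```

With tuples kept in backwards order, the skipped run at the end of the page is often long, which is what makes copying only the remainder pay off.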
compactify_tuples_dgr_v6b.patch
Description: Binary data