On Tue, 4 Apr 2023 at 07:47, Gregory Stark (as CFM) <stark....@gmail.com> wrote:
> The referenced patch was committed March 19th but there's been no
> comment here. Is this patch likely to go ahead this release or should
> I move it forward again?

Thanks for the reminder on this.

I have done some work on it but didn't post it here as I didn't have good news. The problem I'm facing is that after Melanie's recent refactoring work around heapgettup() [1], I can no longer get the same speedup as before from pg_prefetch_mem(). While testing Melanie's patches, I ran some performance tests and did see a good increase in performance from them. I really don't know why the prefetching no longer shows the gains it did before. Perhaps the rearranged code is better able to take advantage of hardware prefetching of cache lines.

I am, however, inclined not to drop the pg_prefetch_mem() macro altogether just because I can no longer demonstrate any performance gains during sequential scans. So I decided to try what Thomas mentioned in [2]: use the prefetching macro to fetch the required tuples in PageRepairFragmentation() so that they're already in CPU cache by the time we get to compactify_tuples().

I tried this using the same test as I described in [3], after adjusting the following line to use PANIC instead of LOG:

    ereport(LOG,
            (errmsg("redo done at %X/%X system usage: %s",
                    LSN_FORMAT_ARGS(xlogreader->ReadRecPtr),
                    pg_rusage_show(&ru0))));

Doing that allows me to repeat the test using the same WAL each time, since the server stops right after redo finishes and replays the same WAL on the next start.

Test machine: AMD 3990x CPU on Ubuntu 22.10 with 64GB RAM.

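To make the idea concrete, here is a minimal standalone sketch of the two-pass pattern: issue a prefetch hint for every tuple while the item list is being walked, then do the copying afterwards. None of this is taken from the attached patches. demo_prefetch_mem(), DemoItem, DEMO_PAGE_SIZE and demo_compact() are hypothetical names, and the macro is assumed to be a thin wrapper around the GCC/Clang __builtin_prefetch() builtin; the real pg_prefetch_mem() definition may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the proposed macro (an assumption, not the
 * patch's definition).  Falls back to a no-op on other compilers.
 */
#if defined(__GNUC__) || defined(__clang__)
#define demo_prefetch_mem(addr) __builtin_prefetch((addr))
#else
#define demo_prefetch_mem(addr) ((void) 0)
#endif

#define DEMO_PAGE_SIZE 8192

/* Simplified item descriptor; not PostgreSQL's real line pointer struct. */
typedef struct DemoItem
{
    size_t      offset;     /* current start of the tuple in the page */
    size_t      len;        /* tuple length in bytes */
} DemoItem;

/*
 * Pack all tuples against the end of the page and return the new start of
 * tuple space.  Item offsets are updated in place.
 */
static size_t
demo_compact(char *page, DemoItem *items, int nitems)
{
    char        scratch[DEMO_PAGE_SIZE];
    size_t      upper = DEMO_PAGE_SIZE;
    int         i;

    assert(nitems >= 0);

    /* Pass 1: only hint the CPU; nothing is read or written here. */
    for (i = 0; i < nitems; i++)
        demo_prefetch_mem(page + items[i].offset);

    /* Pass 2: copy each tuple to its final position in a scratch buffer. */
    for (i = 0; i < nitems; i++)
    {
        assert(items[i].offset + items[i].len <= DEMO_PAGE_SIZE);
        assert(items[i].len <= upper);
        upper -= items[i].len;
        memcpy(scratch + upper, page + items[i].offset, items[i].len);
        items[i].offset = upper;
    }

    /* Copy the packed tuple area back into the page. */
    memcpy(page + upper, scratch + upper, DEMO_PAGE_SIZE - upper);
    return upper;
}
```

The point of the split is that the hints in the first pass are cheap and non-blocking, so with luck the cache lines have arrived by the time the second pass touches them. Whether that actually helps depends on the hardware and on how much work sits between the two passes, which is what the timings in this mail are trying to measure.
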
    shared_buffers = 10GB
    checkpoint_timeout = '1 h'
    max_wal_size = 100GB
    max_connections = 300

Master:

2023-04-04 15:54:55.635 NZST [15958] PANIC: redo done at 0/DC447610 system usage: CPU: user: 44.46 s, system: 0.97 s, elapsed: 45.45 s
2023-04-04 15:56:33.380 NZST [16109] PANIC: redo done at 0/DC447610 system usage: CPU: user: 43.80 s, system: 0.86 s, elapsed: 44.69 s
2023-04-04 15:57:25.968 NZST [16134] PANIC: redo done at 0/DC447610 system usage: CPU: user: 44.08 s, system: 0.74 s, elapsed: 44.84 s
2023-04-04 15:58:53.820 NZST [16158] PANIC: redo done at 0/DC447610 system usage: CPU: user: 44.20 s, system: 0.72 s, elapsed: 44.94 s

Prefetch Memory in PageRepairFragmentation():

2023-04-04 16:03:16.296 NZST [25921] PANIC: redo done at 0/DC447610 system usage: CPU: user: 41.73 s, system: 0.77 s, elapsed: 42.52 s
2023-04-04 16:04:07.384 NZST [25945] PANIC: redo done at 0/DC447610 system usage: CPU: user: 40.87 s, system: 0.86 s, elapsed: 41.74 s
2023-04-04 16:05:01.090 NZST [25968] PANIC: redo done at 0/DC447610 system usage: CPU: user: 41.20 s, system: 0.72 s, elapsed: 41.94 s
2023-04-04 16:05:49.235 NZST [25996] PANIC: redo done at 0/DC447610 system usage: CPU: user: 41.56 s, system: 0.66 s, elapsed: 42.24 s

About 6.7% performance increase over master.

Since I really only did the seqscan patch as a means to get the pg_prefetch_mem() patch in, I wonder if it's ok to scrap that in favour of the PageRepairFragmentation patch.

Updated patches attached.

David

[1] https://postgr.es/m/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ%40mail.gmail.com
[2] https://postgr.es/m/CA%2BhUKGJRtzbbhVmb83vbCiMRZ4piOAi7HWLCqs%3DGQ74mUPrP_w%40mail.gmail.com
[3] https://postgr.es/m/CAApHDvoKwqAzhiuxEt8jSquPJKDpH8DNUZDFUSX9P7DXrJdc3Q%40mail.gmail.com

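As a rough cross-check of the timings reported above, the snippet below averages the four runs of each build and expresses the difference as a speedup percentage. The mail does not say how the quoted percentage was derived, so treating it as a ratio of mean run times is an assumption, and mean() and speedup_pct() are helper names made up for this sketch. Computed this way, both user CPU time and elapsed time come out a little under 7%, which is in line with the figure quoted.

```c
#include <assert.h>
#include <stddef.h>

/* Timings in seconds, copied from the "redo done" log lines in this mail. */
static const double master_user[4] = {44.46, 43.80, 44.08, 44.20};
static const double patched_user[4] = {41.73, 40.87, 41.20, 41.56};
static const double master_elapsed[4] = {45.45, 44.69, 44.84, 44.94};
static const double patched_elapsed[4] = {42.52, 41.74, 41.94, 42.24};

/* Arithmetic mean of n samples. */
static double
mean(const double *v, size_t n)
{
    double      sum = 0.0;
    size_t      i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum / (double) n;
}

/*
 * Speedup of "patched" over "base" as a percentage, defined here as
 * (mean base time / mean patched time - 1) * 100.
 */
static double
speedup_pct(const double *base, const double *patched, size_t n)
{
    return (mean(base, n) / mean(patched, n) - 1.0) * 100.0;
}
```

Calling speedup_pct(master_user, patched_user, 4) and speedup_pct(master_elapsed, patched_elapsed, 4) gives the two percentages. Note that measuring the reduction in run time instead (1 - patched / base) would give a slightly smaller number for the same data.
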
Attachments:
  v1-0001-Add-pg_prefetch_mem-macro-to-load-cache-lines.patch
  prefetch_in_PageRepairFragmentation.patch