On Thu, 4 Apr 2024 at 06:03, Melanie Plageman <melanieplage...@gmail.com> wrote:
> Attached v8 is rebased over current master (which now has the
> streaming read API).
I've looked at the v8-0001 patch.

I wasn't too keen on heapbuildvis() as the name for an external function. Since it also does pruning work, it seemed weird to make it sound like it only does visibility work. Per our offline discussion about names, I've changed it to what you suggested, which is heap_page_prep().

Aside from that, there was an outdated comment: "In pageatatime mode, heapgetpage() already did visibility checks," which was no longer true as that's now done in heapbuildvis() (now heap_page_prep()).

I also did a round of comment adjustments as there were a few things I didn't like, e.g.:

+ * heapfetchbuf - subroutine for heapgettup()

This is also used in heapgettup_pagemode(), so I thought it was a bad idea for a function's header comment to list the places it thinks it's being called from. I also thought it redundant to write "This routine" in the function header comment; "this routine" is implied by the context. I ended up with:

/*
 * heapfetchbuf - read and pin the given MAIN_FORKNUM block number.
 *
 * Read the specified block of the scan relation into a buffer and pin that
 * buffer before saving it in the scan descriptor.
 */

I'm happy with your changes to heapam_scan_sample_next_block(). I did adjust the comment above CHECK_FOR_INTERRUPTS() so it's effectively the same as the seqscan version, just with "seqscan" swapped for "sample scan".

The only other thing I adjusted there was to use "blockno" in some places where you were using hscan->rs_cblock. These all come after the "hscan->rs_cblock = blockno;" line. My thinking here is that this makes it more likely we avoid re-reading the value from the struct when the compiler can't prove the two values are still equivalent. Even if the compiler is happy today, it would only take a code change that passes hscan to some external function for the compiler to become uncertain whether that function has adjusted rs_cblock and go with the safe option of pulling the value out of the struct again, which is a little more expensive as it requires some maths to figure out the offset. (A small standalone sketch of this is below, after my sign-off.)

I've attached v9-0001 and a delta of just my changes from v8.

David
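PS: to make the rs_cblock/blockno point a bit more concrete, here is a minimal standalone sketch. It is not part of the patch; the struct, field sizes and function names below are made up purely for illustration.

#include <stdint.h>

typedef struct HypotheticalScan
{
    uint32_t    rs_cblock;      /* current block number */
} HypotheticalScan;

/*
 * Defined in another translation unit; for all the compiler knows, it
 * could modify *scan.
 */
extern void some_external_function(HypotheticalScan *scan);

uint32_t
uses_struct_member(HypotheticalScan *scan, uint32_t blockno)
{
    scan->rs_cblock = blockno;
    some_external_function(scan);

    /*
     * The compiler must reload scan->rs_cblock from memory here, since
     * the callee might have changed it.
     */
    return scan->rs_cblock + 1;
}

uint32_t
uses_local_variable(HypotheticalScan *scan, uint32_t blockno)
{
    scan->rs_cblock = blockno;
    some_external_function(scan);

    /*
     * "blockno" can stay in a register; no reload from the struct.
     * (Equivalent to the above only if the callee leaves rs_cblock
     * alone, which is the assumption being made here.)
     */
    return blockno + 1;
}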
From 90bfc63097c556d0921d8f9165731fb07ec04167 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sat, 27 Jan 2024 18:39:37 -0500
Subject: [PATCH v9] Split heapgetpage into two parts

heapgetpage(), a per-block utility function used in heap scans, read a
passed-in block into a buffer and then, if page-at-a-time processing was
enabled, pruned the page and built an array of its visible tuples. This
was used for sequential and sample scans.

Future commits will add support for read streams. The streaming read API
will read in the blocks specified by a callback, but any significant
per-page processing should be done synchronously on the buffer yielded by
the streaming read API. To support this, separate the logic in
heapgetpage() to read a block into a buffer from that which prunes the
page and builds an array of the visible tuples. The former is now
heapfetchbuf() and the latter is now heap_page_prep().

Future commits will push the logic for selecting the next block into
heapfetchbuf() in cases when streaming reads are not supported (such as
backwards sequential scans). Because this logic differs for sample scans
and sequential scans, inline the code to read the block into a buffer for
sample scans. This has the added benefit of allowing for a bit of
refactoring in heapam_scan_sample_next_block(), including unpinning the
previous buffer before invoking the callback to select the next block.
---
 src/backend/access/heap/heapam.c          | 78 ++++++++++++++----------
 src/backend/access/heap/heapam_handler.c  | 38 ++++++++----
 src/include/access/heapam.h               |  2 +-
 3 files changed, 74 insertions(+), 44 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a9d5b109a5..6524fc44a5 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -360,17 +360,17 @@ heap_setscanlimits(TableScanDesc sscan, BlockNumber startBlk, BlockNumber numBlk
 }
 
 /*
- * heapgetpage - subroutine for heapgettup()
+ * heap_page_prep - Prepare the current scan page to be scanned in pagemode
  *
- * This routine reads and pins the specified page of the relation.
- * In page-at-a-time mode it performs additional work, namely determining
- * which tuples on the page are visible.
+ * Preparation currently consists of 1. prune the scan's rs_cbuf page, and 2.
+ * fill the rs_vistuples array with the OffsetNumbers of visible tuples.
  */
 void
-heapgetpage(TableScanDesc sscan, BlockNumber block)
+heap_page_prep(TableScanDesc sscan)
 {
     HeapScanDesc scan = (HeapScanDesc) sscan;
-    Buffer      buffer;
+    Buffer      buffer = scan->rs_cbuf;
+    BlockNumber block = scan->rs_cblock;
     Snapshot    snapshot;
     Page        page;
     int         lines;
@@ -378,31 +378,10 @@ heapgetpage(TableScanDesc sscan, BlockNumber block)
     OffsetNumber lineoff;
     bool        all_visible;
 
-    Assert(block < scan->rs_nblocks);
-
-    /* release previous scan buffer, if any */
-    if (BufferIsValid(scan->rs_cbuf))
-    {
-        ReleaseBuffer(scan->rs_cbuf);
-        scan->rs_cbuf = InvalidBuffer;
-    }
-
-    /*
-     * Be sure to check for interrupts at least once per page.  Checks at
-     * higher code levels won't be able to stop a seqscan that encounters many
-     * pages' worth of consecutive dead tuples.
-     */
-    CHECK_FOR_INTERRUPTS();
-
-    /* read page using selected strategy */
-    scan->rs_cbuf = ReadBufferExtended(scan->rs_base.rs_rd, MAIN_FORKNUM, block,
-                                       RBM_NORMAL, scan->rs_strategy);
-    scan->rs_cblock = block;
-
-    if (!(scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE))
-        return;
+    Assert(BufferGetBlockNumber(buffer) == block);
 
-    buffer = scan->rs_cbuf;
+    /* ensure we're not accidentally being used when not in pagemode */
+    Assert(scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE);
     snapshot = scan->rs_base.rs_snapshot;
 
     /*
@@ -475,6 +454,37 @@ heapgetpage(TableScanDesc sscan, BlockNumber block)
     scan->rs_ntuples = ntup;
 }
 
+/*
+ * heapfetchbuf - read and pin the given MAIN_FORKNUM block number.
+ *
+ * Read the specified block of the scan relation into a buffer and pin that
+ * buffer before saving it in the scan descriptor.
+ */
+static inline void
+heapfetchbuf(HeapScanDesc scan, BlockNumber block)
+{
+    Assert(block < scan->rs_nblocks);
+
+    /* release previous scan buffer, if any */
+    if (BufferIsValid(scan->rs_cbuf))
+    {
+        ReleaseBuffer(scan->rs_cbuf);
+        scan->rs_cbuf = InvalidBuffer;
+    }
+
+    /*
+     * Be sure to check for interrupts at least once per page.  Checks at
+     * higher code levels won't be able to stop a seqscan that encounters many
+     * pages' worth of consecutive dead tuples.
+     */
+    CHECK_FOR_INTERRUPTS();
+
+    /* read page using selected strategy */
+    scan->rs_cbuf = ReadBufferExtended(scan->rs_base.rs_rd, MAIN_FORKNUM, block,
+                                       RBM_NORMAL, scan->rs_strategy);
+    scan->rs_cblock = block;
+}
+
 /*
  * heapgettup_initial_block - return the first BlockNumber to scan
  *
@@ -748,7 +758,7 @@ heapgettup(HeapScanDesc scan,
      */
     while (block != InvalidBlockNumber)
     {
-        heapgetpage((TableScanDesc) scan, block);
+        heapfetchbuf(scan, block);
         LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
         page = heapgettup_start_page(scan, dir, &linesleft, &lineoff);
 continue_page:
@@ -869,7 +879,11 @@ heapgettup_pagemode(HeapScanDesc scan,
      */
     while (block != InvalidBlockNumber)
     {
-        heapgetpage((TableScanDesc) scan, block);
+        /* read the page */
+        heapfetchbuf(scan, block);
+
+        /* prune the page and determine visible tuple offsets */
+        heap_page_prep((TableScanDesc) scan);
         page = BufferGetPage(scan->rs_cbuf);
         linesleft = scan->rs_ntuples;
         lineindex = ScanDirectionIsForward(dir) ? 0 : linesleft - 1;
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 0952d4a98e..a4451c24b4 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2352,11 +2352,15 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
     if (hscan->rs_nblocks == 0)
         return false;
 
-    if (tsm->NextSampleBlock)
+    /* release previous scan buffer, if any */
+    if (BufferIsValid(hscan->rs_cbuf))
     {
-        blockno = tsm->NextSampleBlock(scanstate, hscan->rs_nblocks);
-        hscan->rs_cblock = blockno;
+        ReleaseBuffer(hscan->rs_cbuf);
+        hscan->rs_cbuf = InvalidBuffer;
     }
+
+    if (tsm->NextSampleBlock)
+        blockno = tsm->NextSampleBlock(scanstate, hscan->rs_nblocks);
     else
     {
         /* scanning table sequentially */
@@ -2398,20 +2402,32 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
         }
     }
 
+    hscan->rs_cblock = blockno;
+
     if (!BlockNumberIsValid(blockno))
     {
-        if (BufferIsValid(hscan->rs_cbuf))
-            ReleaseBuffer(hscan->rs_cbuf);
-        hscan->rs_cbuf = InvalidBuffer;
-        hscan->rs_cblock = InvalidBlockNumber;
         hscan->rs_inited = false;
-
         return false;
     }
 
-    heapgetpage(scan, blockno);
-    hscan->rs_inited = true;
+    Assert(hscan->rs_cblock < hscan->rs_nblocks);
+
+    /*
+     * Be sure to check for interrupts at least once per page.  Checks at
+     * higher code levels won't be able to stop a sample scan that encounters
+     * many pages' worth of consecutive dead tuples.
+     */
+    CHECK_FOR_INTERRUPTS();
+
+    /* Read page using selected strategy */
+    hscan->rs_cbuf = ReadBufferExtended(hscan->rs_base.rs_rd, MAIN_FORKNUM,
+                                        blockno, RBM_NORMAL, hscan->rs_strategy);
+
+    /* in pagemode, prune the page and determine visible tuple offsets */
+    if (hscan->rs_base.rs_flags & SO_ALLOW_PAGEMODE)
+        heap_page_prep(scan);
+
+    hscan->rs_inited = true;
     return true;
 }
 
@@ -2572,7 +2588,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
     if (scan->rs_flags & SO_ALLOW_PAGEMODE)
     {
         /*
-         * In pageatatime mode, heapgetpage() already did visibility checks,
+         * In pageatatime mode, heap_page_prep() already did visibility checks,
          * so just look at the info it left in rs_vistuples[].
          *
         * We use a binary search over the known-sorted array.  Note: we could
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index a307fb5f24..e8a211cf1b 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -267,7 +267,7 @@ extern TableScanDesc heap_beginscan(Relation relation, Snapshot snapshot,
                                     uint32 flags);
 extern void heap_setscanlimits(TableScanDesc sscan, BlockNumber startBlk,
                                BlockNumber numBlks);
-extern void heapgetpage(TableScanDesc sscan, BlockNumber block);
+extern void heap_page_prep(TableScanDesc sscan);
 extern void heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params,
                         bool allow_strat, bool allow_sync, bool allow_pagemode);
 extern void heap_endscan(TableScanDesc sscan);
-- 
2.40.1.windows.1
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index bf2b9b19e7..6524fc44a5 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -360,14 +360,13 @@ heap_setscanlimits(TableScanDesc sscan, BlockNumber startBlk, BlockNumber numBlk
 }
 
 /*
- * heapbuildvis - Utility function for heap scans.
+ * heap_page_prep - Prepare the current scan page to be scanned in pagemode
  *
- * Given a page residing in a buffer saved in the scan descriptor, prune the
- * page and determine which of its tuples are all visible, saving their offsets
- * in an array in the scan descriptor.
+ * Preparation currently consists of 1. prune the scan's rs_cbuf page, and 2.
+ * fill the rs_vistuples array with the OffsetNumbers of visible tuples.
  */
 void
-heapbuildvis(TableScanDesc sscan)
+heap_page_prep(TableScanDesc sscan)
 {
     HeapScanDesc scan = (HeapScanDesc) sscan;
     Buffer      buffer = scan->rs_cbuf;
@@ -381,6 +380,8 @@ heapbuildvis(TableScanDesc sscan)
 
     Assert(BufferGetBlockNumber(buffer) == block);
 
+    /* ensure we're not accidentally being used when not in pagemode */
+    Assert(scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE);
     snapshot = scan->rs_base.rs_snapshot;
 
     /*
@@ -454,10 +455,10 @@ heapbuildvis(TableScanDesc sscan)
 }
 
 /*
- * heapfetchbuf - subroutine for heapgettup()
+ * heapfetchbuf - read and pin the given MAIN_FORKNUM block number.
  *
- * This routine reads the specified block of the relation into a buffer and
- * returns with that pinned buffer saved in the scan descriptor.
+ * Read the specified block of the scan relation into a buffer and pin that
+ * buffer before saving it in the scan descriptor.
  */
 static inline void
 heapfetchbuf(HeapScanDesc scan, BlockNumber block)
@@ -878,8 +879,11 @@ heapgettup_pagemode(HeapScanDesc scan,
      */
     while (block != InvalidBlockNumber)
     {
+        /* read the page */
         heapfetchbuf(scan, block);
-        heapbuildvis((TableScanDesc) scan);
+
+        /* prune the page and determine visible tuple offsets */
+        heap_page_prep((TableScanDesc) scan);
         page = BufferGetPage(scan->rs_cbuf);
         linesleft = scan->rs_ntuples;
         lineindex = ScanDirectionIsForward(dir) ? 0 : linesleft - 1;
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index f4f670e9b2..a4451c24b4 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2352,6 +2352,7 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
     if (hscan->rs_nblocks == 0)
         return false;
 
+    /* release previous scan buffer, if any */
     if (BufferIsValid(hscan->rs_cbuf))
     {
         ReleaseBuffer(hscan->rs_cbuf);
@@ -2403,7 +2404,7 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
 
     hscan->rs_cblock = blockno;
 
-    if (!BlockNumberIsValid(hscan->rs_cblock))
+    if (!BlockNumberIsValid(blockno))
     {
         hscan->rs_inited = false;
         return false;
@@ -2412,22 +2413,19 @@ heapam_scan_sample_next_block(TableScanDesc scan, SampleScanState *scanstate)
     Assert(hscan->rs_cblock < hscan->rs_nblocks);
 
     /*
-     * We may scan multiple pages before finding tuples to yield or finishing
-     * the scan. Since we want to check for interrupts at least once per page,
-     * do so here.
+     * Be sure to check for interrupts at least once per page.  Checks at
+     * higher code levels won't be able to stop a sample scan that encounters
+     * many pages' worth of consecutive dead tuples.
      */
     CHECK_FOR_INTERRUPTS();
 
     /* Read page using selected strategy */
     hscan->rs_cbuf = ReadBufferExtended(hscan->rs_base.rs_rd, MAIN_FORKNUM,
-                                        hscan->rs_cblock, RBM_NORMAL, hscan->rs_strategy);
+                                        blockno, RBM_NORMAL, hscan->rs_strategy);
 
-    /*
-     * If pagemode is allowed, prune the page and build an array of visible
-     * tuple offsets.
-     */
+    /* in pagemode, prune the page and determine visible tuple offsets */
     if (hscan->rs_base.rs_flags & SO_ALLOW_PAGEMODE)
-        heapbuildvis(scan);
+        heap_page_prep(scan);
 
     hscan->rs_inited = true;
     return true;
@@ -2590,7 +2588,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
     if (scan->rs_flags & SO_ALLOW_PAGEMODE)
     {
         /*
-         * In pageatatime mode, heapgetpage() already did visibility checks,
+         * In pageatatime mode, heap_page_prep() already did visibility checks,
          * so just look at the info it left in rs_vistuples[].
          *
          * We use a binary search over the known-sorted array.  Note: we could
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4d324c78e5..e8a211cf1b 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -267,7 +267,7 @@ extern TableScanDesc heap_beginscan(Relation relation, Snapshot snapshot,
                                     uint32 flags);
 extern void heap_setscanlimits(TableScanDesc sscan, BlockNumber startBlk,
                                BlockNumber numBlks);
-extern void heapbuildvis(TableScanDesc sscan);
+extern void heap_page_prep(TableScanDesc sscan);
 extern void heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params,
                         bool allow_strat, bool allow_sync, bool allow_pagemode);
 extern void heap_endscan(TableScanDesc sscan);