On Thu, Feb 02, 2023 at 12:21:20AM +1300, David Rowley wrote:
> On Tue, 31 Jan 2023 at 12:18, Melanie Plageman
> <melanieplage...@gmail.com> wrote:
> > v7 attached
> 
> I've been looking over the v7-0002 patch today and I did make a few
> adjustments to heapgettup_initial_block() as I would prefer to see the
> branching of each of all these helper functions follow the pattern of:
> 
> if (<forward scan>)
> {
>     if (<parallel scan>)
>         <parallel stuff>
>     else
>         <serial stuff>
> }
> else
> {
>     <backwards serial stuff>
> }
> 
> which wasn't quite what the function was doing.

I'm fine with this. One code comment about the new version inline.

> Along the way, I noticed that 0002 has a subtle bug that does not seem
> to be present once the remaining patches are applied.  I think I'm
> happier to push these along the lines of how you have them in the
> patches, so I've held off pushing for now due to the bug and the
> change I had to make to fix it.
> 
> The problem is around the setting of scan->rs_inited = true; you've
> moved that into heapgettup_initial_block() and you've correctly not
> initialised the scan for empty tables when you return
> InvalidBlockNumber, however, you've not correctly considered the fact
> that table_block_parallelscan_nextpage() could also return
> InvalidBlockNumber if the parallel workers manage to grab all of the
> blocks before the current process gets the first block. I don't know
> for sure, but it looks like this could cause problems when
> heapgettup() or heapgettup_pagemode() got called again for a rescan.
> We'd have returned the NULL tuple to indicate that no further tuples
> exist, but we'll have left rs_inited set to true which looks like
> it'll cause issues.

Ah, yes. In the later patches in the series, I handle all end of scan
cases (regardless of whether or not there was a beginning) in a single
place at the end of the function. There I release the buffer and reset
all state -- including setting rs_inited to false. So, that made it okay
to set rs_inited to true in heapgettup_initial_block().

When splitting it up, I made a mistake and missed the case you
mentioned. Thanks for catching that!

FWIW, I like setting rs_inited in heapgettup_initial_block() better in
the final refactor, but I agree with you that in this patch on its own
it is better in the body of heapgettup() and heapgettup_pagemode().
 
> I wondered if it might be better to do the scan->rs_inited = true; in
> heapgettup() and heapgettup_pagemode() instead. The attached v8 patch
> does it this way. Despite this fixing that bug, I think this might be
> a slightly better division of duties.

LGTM.

> From cbd37463bdaa96afed4c7c739c8e91b770a9f8a7 Mon Sep 17 00:00:00 2001
> From: David Rowley <dgrow...@gmail.com>
> Date: Wed, 1 Feb 2023 19:35:16 +1300
> Subject: [PATCH v8] Refactor heapam.c adding heapgettup_initial_block function
> 
> Here we adjust heapgettup() and heapgettup_pagemode() to move the code
> that fetches the first block out into a helper function.  This removes
> some code duplication.
> 
> Author: Melanie Plageman
> Reviewed-by: David Rowley
> Discussion: 
> https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_t8qeqzqol02rpi...@mail.gmail.com
> ---
>  src/backend/access/heap/heapam.c | 225 ++++++++++++++-----------------
>  1 file changed, 103 insertions(+), 122 deletions(-)
> 
> diff --git a/src/backend/access/heap/heapam.c 
> b/src/backend/access/heap/heapam.c
> index 0a8bac25f5..40168cc9ca 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -483,6 +483,67 @@ heapgetpage(TableScanDesc sscan, BlockNumber block)
>       scan->rs_ntuples = ntup;
>  }
>  
> +/*
> + * heapgettup_initial_block - return the first BlockNumber to scan
> + *
> + * Returns InvalidBlockNumber when there are no blocks to scan.  This can
> + * occur with empty tables and in parallel scans when parallel workers get 
> all
> + * of the pages before we can get a chance to get our first page.
> + */
> +static BlockNumber
> +heapgettup_initial_block(HeapScanDesc scan, ScanDirection dir)
> +{
> +     Assert(!scan->rs_inited);
> +
> +     /* When there are no pages to scan, return InvalidBlockNumber */
> +     if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0)
> +             return InvalidBlockNumber;
> +
> +     if (ScanDirectionIsForward(dir))
> +     {
> +             /* serial scan */
> +             if (scan->rs_base.rs_parallel == NULL)
> +                     return scan->rs_startblock;

I believe this else is superfluous since we returned above.

> +             else
> +             {
> +                     /* parallel scan */
> +                     
> table_block_parallelscan_startblock_init(scan->rs_base.rs_rd,
> +                                                                             
>                          scan->rs_parallelworkerdata,
> +                                                                             
>                          (ParallelBlockTableScanDesc) 
> scan->rs_base.rs_parallel);
> +
> +                     /* may return InvalidBlockNumber if there are no more 
> blocks */
> +                     return 
> table_block_parallelscan_nextpage(scan->rs_base.rs_rd,
> +                                                                             
>                          scan->rs_parallelworkerdata,
> +                                                                             
>                          (ParallelBlockTableScanDesc) 
> scan->rs_base.rs_parallel);
> +             }
> +     }
...
> @@ -889,62 +892,40 @@ heapgettup_pagemode(HeapScanDesc scan,
> -             if (!scan->rs_inited)
> -             {
> -                     lineindex = lines - 1;
> -                     scan->rs_inited = true;
> -             }
> -             else
> -             {
> +                     page = BufferGetPage(scan->rs_cbuf);
> +                     TestForOldSnapshot(scan->rs_base.rs_snapshot, 
> scan->rs_base.rs_rd, page);
> +                     lines = scan->rs_ntuples;
>                       lineindex = scan->rs_cindex - 1;
>               }
> -             /* block and lineindex now reference the previous visible tid */

I think this is an unintentional diff.

>  
> +             /* block and lineindex now reference the previous visible tid */
>               linesleft = lineindex + 1;
>       }

- Melanie


Reply via email to