On 19/06/2024 18:55, Melanie Plageman wrote:
On Tue, Jun 18, 2024 at 6:02 PM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:

I went through v22 to remind myself of what the patches do and do some
basic review - I have some simple questions / comments for now, nothing
major. I've kept the comments in separate 'review' patches, it does not
seem worth copying here.

Thanks so much for the review!

I've implemented your feedback in attached v23 except for what I
mention below. I have not gone through each patch in the new set very
carefully after making the changes because I think we should resolve
the question of adding the new table scan descriptor before I do that.
A change there will require a big rebase. Then I can go through each
patch very carefully.
Had a quick look at this after a long pause. I only looked at the first few, hoping that reviewing them would allow you to commit at least some of them, making the rest easier.

v23-0001-table_scan_bitmap_next_block-counts-lossy-and-ex.patch

Looks good to me. (I'm not sure if this would be a net positive change on its own, but it's needed by the later patch so OK)

v23-0002-Remove-table_scan_bitmap_next_tuple-parameter-tb.patch

LGTM

v23-0003-Make-table_scan_bitmap_next_block-async-friendly.patch

@@ -1955,19 +1954,26 @@ table_relation_estimate_size(Relation rel, int32 
*attr_widths,
  */
/*
- * Prepare to fetch / check / return tuples from `tbmres->blockno` as part of a
- * bitmap table scan. `scan` needs to have been started via
- * table_beginscan_bm(). Returns false if there are no tuples to be found on
- * the page, true otherwise. `lossy_pages` is incremented if the block's
- * representation in the bitmap is lossy; otherwise, `exact_pages` is
- * incremented.
+ * Prepare to fetch / check / return tuples as part of a bitmap table scan.
+ * `scan` needs to have been started via table_beginscan_bm(). Returns false if
+ * there are no more blocks in the bitmap, true otherwise. `lossy_pages` is
+ * incremented if the block's representation in the bitmap is lossy; otherwise,
+ * `exact_pages` is incremented.
+ *
+ * `recheck` is set by the table AM to indicate whether or not the tuples
+ * from this block should be rechecked. Tuples from lossy pages will always
+ * need to be rechecked, but some non-lossy pages' tuples may also require
+ * recheck.
+ *
+ * `blockno` is only used in bitmap table scan code to validate that the
+ * prefetch block is staying ahead of the current block.
  *
  * Note, this is an optionally implemented function, therefore should only be
  * used after verifying the presence (at plan time or such).
  */
 static inline bool
 table_scan_bitmap_next_block(TableScanDesc scan,
-                                                        struct 
TBMIterateResult *tbmres,
+                                                        BlockNumber *blockno, 
bool *recheck,
                                                         long *lossy_pages,
                                                         long *exact_pages)
 {

The new comment doesn't really explain what *blockno means. Is it an input or output parameter? How is it used with the prefetching?

v23-0004-Add-common-interface-for-TBMIterators.patch

+/*
+ * Start iteration on a shared or private bitmap iterator. Note that tbm will
+ * only be provided by private BitmapHeapScan callers. dsa and dsp will only be
+ * provided by parallel BitmapHeapScan callers. For shared callers, one process
+ * must already have called tbm_prepare_shared_iterate() to create and set up
+ * the TBMSharedIteratorState. The TBMIterator is passed by reference to
+ * accommodate callers who would like to allocate it inside an existing struct.
+ */
+void
+tbm_begin_iterate(TBMIterator *iterator, TIDBitmap *tbm,
+                                 dsa_area *dsa, dsa_pointer dsp)
+{
+       Assert(iterator);
+
+       iterator->private_iterator = NULL;
+       iterator->shared_iterator = NULL;
+       iterator->exhausted = false;
+
+       /* Allocate a private iterator and attach the shared state to it */
+       if (DsaPointerIsValid(dsp))
+               iterator->shared_iterator = tbm_attach_shared_iterate(dsa, dsp);
+       else
+               iterator->private_iterator = tbm_begin_private_iterate(tbm);
+}

Hmm, I haven't looked at how this is used the later patches, but a function signature where some parameters are used or not depending on the situation seems a bit awkward. Perhaps it would be better to let the caller call tbm_attach_shared_iterate(dsa, dsp) or tbm_begin_private_iterate(tbm), and provide a function to turn that into a TBMIterator? Something like:

TBMIterator *tbm_iterator_from_shared_iterator(TBMSharedIterator *);
TBMIterator *tbm_iterator_from_private_iterator(TBMPrivateIterator *);

Does tbm_iterator() really need the 'exhausted' flag? The private and shared variants work without one.

+1 on this patch in general, and I have no objections to its current form either, the above are just suggestions to consider.

v23-0006-BitmapHeapScan-uses-unified-iterator.patch

Makes sense. (Might be better to squash this with the previous patch)

v23-0007-BitmapHeapScan-Make-prefetch-sync-error-more-det.patch

LGTM

v23-0008-Push-current-scan-descriptor-into-specialized-sc.patch
v23-0009-Remove-ss_current-prefix-from-ss_currentScanDesc.patch

LGTM. I would squash these together.

v23-0010-Add-scan_in_progress-to-BitmapHeapScanState.patch

--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1804,6 +1804,7 @@ typedef struct ParallelBitmapHeapState
  *             prefetch_target    current target prefetch distance
  *             prefetch_maximum   maximum value for prefetch_target
  *             initialized                is node is ready to iterate
+ *             scan_in_progress   is this a rescan
  *             pstate                     shared state for parallel bitmap scan
  *             recheck                    do current page's tuples need recheck
  *             blockno                    used to validate pf and current 
block in sync
@@ -1824,6 +1825,7 @@ typedef struct BitmapHeapScanState
        int                     prefetch_target;
        int                     prefetch_maximum;
        bool            initialized;
+       bool            scan_in_progress;
        ParallelBitmapHeapState *pstate;
        bool            recheck;
        BlockNumber blockno;

Hmm, the "is this a rescan" comment sounds inaccurate, because it is set as soon as the scan is started, not only when rescanning. Other than that, LGTM.

--
Heikki Linnakangas
Neon (https://neon.tech)



Reply via email to