On 19/06/2024 18:55, Melanie Plageman wrote:
On Tue, Jun 18, 2024 at 6:02 PM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:
I went through v22 to remind myself of what the patches do and do some
basic review - I have some simple questions / comments for now, nothing
major. I've kept the comments in separate 'review' patches, it does not
seem worth copying here.
Thanks so much for the review!
I've implemented your feedback in attached v23 except for what I
mention below. I have not gone through each patch in the new set very
carefully after making the changes because I think we should resolve
the question of adding the new table scan descriptor before I do that.
A change there will require a big rebase. Then I can go through each
patch very carefully.
Had a quick look at this after a long pause. I only looked at the first
few, hoping that reviewing them would allow you to commit at least some
of them, making the rest easier.
v23-0001-table_scan_bitmap_next_block-counts-lossy-and-ex.patch
Looks good to me. (I'm not sure if this would be a net positive change
on its own, but it's needed by the later patch so OK)
v23-0002-Remove-table_scan_bitmap_next_tuple-parameter-tb.patch
LGTM
v23-0003-Make-table_scan_bitmap_next_block-async-friendly.patch
@@ -1955,19 +1954,26 @@ table_relation_estimate_size(Relation rel, int32
*attr_widths,
*/
/*
- * Prepare to fetch / check / return tuples from `tbmres->blockno` as part of a
- * bitmap table scan. `scan` needs to have been started via
- * table_beginscan_bm(). Returns false if there are no tuples to be found on
- * the page, true otherwise. `lossy_pages` is incremented if the block's
- * representation in the bitmap is lossy; otherwise, `exact_pages` is
- * incremented.
+ * Prepare to fetch / check / return tuples as part of a bitmap table scan.
+ * `scan` needs to have been started via table_beginscan_bm(). Returns false if
+ * there are no more blocks in the bitmap, true otherwise. `lossy_pages` is
+ * incremented if the block's representation in the bitmap is lossy; otherwise,
+ * `exact_pages` is incremented.
+ *
+ * `recheck` is set by the table AM to indicate whether or not the tuples
+ * from this block should be rechecked. Tuples from lossy pages will always
+ * need to be rechecked, but some non-lossy pages' tuples may also require
+ * recheck.
+ *
+ * `blockno` is only used in bitmap table scan code to validate that the
+ * prefetch block is staying ahead of the current block.
*
* Note, this is an optionally implemented function, therefore should only be
* used after verifying the presence (at plan time or such).
*/
static inline bool
table_scan_bitmap_next_block(TableScanDesc scan,
- struct
TBMIterateResult *tbmres,
+ BlockNumber *blockno,
bool *recheck,
long *lossy_pages,
long *exact_pages)
{
The new comment doesn't really explain what *blockno means. Is it an
input or output parameter? How is it used with the prefetching?
v23-0004-Add-common-interface-for-TBMIterators.patch
+/*
+ * Start iteration on a shared or private bitmap iterator. Note that tbm will
+ * only be provided by private BitmapHeapScan callers. dsa and dsp will only be
+ * provided by parallel BitmapHeapScan callers. For shared callers, one process
+ * must already have called tbm_prepare_shared_iterate() to create and set up
+ * the TBMSharedIteratorState. The TBMIterator is passed by reference to
+ * accommodate callers who would like to allocate it inside an existing struct.
+ */
+void
+tbm_begin_iterate(TBMIterator *iterator, TIDBitmap *tbm,
+ dsa_area *dsa, dsa_pointer dsp)
+{
+ Assert(iterator);
+
+ iterator->private_iterator = NULL;
+ iterator->shared_iterator = NULL;
+ iterator->exhausted = false;
+
+ /* Allocate a private iterator and attach the shared state to it */
+ if (DsaPointerIsValid(dsp))
+ iterator->shared_iterator = tbm_attach_shared_iterate(dsa, dsp);
+ else
+ iterator->private_iterator = tbm_begin_private_iterate(tbm);
+}
Hmm, I haven't looked at how this is used the later patches, but a
function signature where some parameters are used or not depending on
the situation seems a bit awkward. Perhaps it would be better to let the
caller call tbm_attach_shared_iterate(dsa, dsp) or
tbm_begin_private_iterate(tbm), and provide a function to turn that into
a TBMIterator? Something like:
TBMIterator *tbm_iterator_from_shared_iterator(TBMSharedIterator *);
TBMIterator *tbm_iterator_from_private_iterator(TBMPrivateIterator *);
Does tbm_iterator() really need the 'exhausted' flag? The private and
shared variants work without one.
+1 on this patch in general, and I have no objections to its current
form either, the above are just suggestions to consider.
v23-0006-BitmapHeapScan-uses-unified-iterator.patch
Makes sense. (Might be better to squash this with the previous patch)
v23-0007-BitmapHeapScan-Make-prefetch-sync-error-more-det.patch
LGTM
v23-0008-Push-current-scan-descriptor-into-specialized-sc.patch
v23-0009-Remove-ss_current-prefix-from-ss_currentScanDesc.patch
LGTM. I would squash these together.
v23-0010-Add-scan_in_progress-to-BitmapHeapScanState.patch
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1804,6 +1804,7 @@ typedef struct ParallelBitmapHeapState
* prefetch_target current target prefetch distance
* prefetch_maximum maximum value for prefetch_target
* initialized is node is ready to iterate
+ * scan_in_progress is this a rescan
* pstate shared state for parallel bitmap scan
* recheck do current page's tuples need recheck
* blockno used to validate pf and current
block in sync
@@ -1824,6 +1825,7 @@ typedef struct BitmapHeapScanState
int prefetch_target;
int prefetch_maximum;
bool initialized;
+ bool scan_in_progress;
ParallelBitmapHeapState *pstate;
bool recheck;
BlockNumber blockno;
Hmm, the "is this a rescan" comment sounds inaccurate, because it is set
as soon as the scan is started, not only when rescanning. Other than
that, LGTM.
--
Heikki Linnakangas
Neon (https://neon.tech)