On Tue, Feb 23, 2021 at 5:04 AM Andres Freund <and...@anarazel.de> wrote:
>
> ## AIO API overview
>
> The main steps to use AIO (without higher level helpers) are:
>
> 1) acquire an "unused" AIO: pgaio_io_get()
>
> 2) start some IO; this is done by functions like
>    pgaio_io_start_(read|write|fsync|flush_range)_(smgr|sb|raw|wal)
>
>    The (read|write|fsync|flush_range) part indicates the operation, whereas
>    (smgr|sb|raw|wal) determines how IO completions, errors, ... are handled.
>
>    (see below for more details about this design choice - it might or might
>    not be right)
>
> 3) optionally: assign a backend-local completion callback to the IO
>    (pgaio_io_on_completion_local())
>
> 4) 2) alone does *not* cause the IO to be submitted to the kernel, but to be
>    put on a per-backend list of pending IOs. The pending IOs can explicitly
>    be flushed with pgaio_submit_pending(), but will also be submitted if the
>    pending list gets too large, or if the current backend waits for the IO.
>
>    There are two main reasons not to submit the IO immediately:
>    - If adjacent, we can merge several IOs into one "kernel level" IO during
>      submission. Larger IOs are considerably more efficient.
>    - Several AIO APIs allow submitting a batch of IOs in one system call.
>
> 5) wait for the IO: pgaio_io_wait() waits for an IO "owned" by the current
>    backend. When other backends may need to wait for an IO to finish,
>    pgaio_io_ref() can put a reference to that AIO in shared memory (e.g. a
>    BufferDesc), which can be waited for using pgaio_io_wait_ref().
>
> 6) Process the results of the request. If a callback was registered in 3),
>    this isn't always necessary. The results of AIO can be accessed using
>    pgaio_io_result(), which returns an integer where negative numbers are
>    -errno, and positive numbers are the [partial] success conditions
>    (e.g. potentially indicating a short read).
>
> 7) release ownership of the IO (pgaio_io_release()) or reuse the IO for
>    another operation (pgaio_io_recycle())
>
>
> Most places that want to use AIO shouldn't themselves need to care about
> managing the number of writes in flight, or the readahead distance. To help
> with that there are two helper utilities, a "streaming read" and a
> "streaming write".
>
> The "streaming read" helper uses a callback to determine which blocks to
> prefetch - that allows readahead to be done in a sequential fashion but,
> importantly, also allows "reading ahead" non-sequential blocks
> asynchronously.
>
> E.g. for vacuum, lazy_scan_heap() has a callback that uses the visibility
> map to figure out which block needs to be read next. Similarly,
> lazy_vacuum_heap() uses the tids in LVDeadTuples to figure out which blocks
> are going to be needed. Here's the latter as an example:
> https://github.com/anarazel/postgres/commit/a244baa36bfb252d451a017a273a6da1c09f15a3#diff-3198152613d9a28963266427b380e3d4fbbfabe96a221039c6b1f37bc575b965R1906
>
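To make that flow concrete, here it is as a rough sketch in code. Fair
warning: apart from pgaio_submit_pending(true), which the patch below calls
with exactly that signature, the argument lists here are assumptions for
illustration, not the branch's actual signatures:

    PgAioInProgress *aio;
    int         result;

    /* 1) acquire an unused AIO */
    aio = pgaio_io_get();

    /* 2) stage a read -- this only puts the IO on the backend-local
     *    pending list, nothing is submitted to the kernel yet
     *    (argument list assumed for illustration) */
    pgaio_io_start_read_smgr(aio, smgr, MAIN_FORKNUM, blockno, buffer);

    /* 3) optionally register a backend-local completion callback
     *    (signature assumed for illustration) */
    pgaio_io_on_completion_local(aio, my_completion_cb);

    /* 4) submit pending IOs -- adjacent IOs can be merged, and several
     *    IOs can go to the kernel in one system call */
    pgaio_submit_pending(true);

    /* 5) wait for an IO owned by this backend; other backends would
     *    pgaio_io_ref() the AIO into shared memory and use
     *    pgaio_io_wait_ref() instead */
    pgaio_io_wait(aio);

    /* 6) negative values are -errno, positive values are [partial]
     *    success, e.g. a short read */
    result = pgaio_io_result(aio);
    if (result < 0)
        elog(ERROR, "IO failed: %s", strerror(-result));

    /* 7) release ownership (or pgaio_io_recycle(aio) to reuse it) */
    pgaio_io_release(aio);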
Attached is a patch on top of the AIO branch which does bitmap heap scan
prefetching using the PgStreamingRead helper already used by sequential scan
and vacuum on the AIO branch. The prefetch iterator is removed, and the main
iterator in the BitmapHeapScanState node is now used by the PgStreamingRead
helper.

Some notes about the code:

Each IO now has its own TBMIterateResult, allocated and returned by the
PgStreamingRead helper and freed later by heapam_scan_bitmap_next_block()
before requesting the next block. Previously a single one was allocated in
the TBMIterator in the BitmapHeapScanState node and reused. Because of this,
the table AM API routine table_scan_bitmap_next_block() now defines the
TBMIterateResult as an output parameter.

The PgStreamingRead helper's pgsr_private parameter for BitmapHeapScan is now
the actual BitmapHeapScanState node. It needs access to the iterator, the
heap scan descriptor, and a few fields in the BitmapHeapScanState node that
could be moved elsewhere or duplicated (the visibility map buffer and
can_skip_fetch, for example). So it is possible either to create a new struct
or to move fields around to avoid this -- but I'm not sure that would
actually be better.

Because the PgStreamingReadHelper needs to be set up with the
BitmapHeapScanState node but also needs some table-AM-specific functions, I
thought it made more sense to initialize it using a new table AM API routine.
Instead of fully implementing that, I just wrote a wrapper function,
table_bitmap_scan_setup(), which simply calls bitmapheap_pgsr_alloc(), to
socialize the idea before implementing it.

I haven't made the GIN code reasonable yet either (it uses the TID bitmap
functions that I've changed).

There are various TODOs in the code posing questions, both to the reviewer
and to myself, for future versions of the patch.

Oh, also, I haven't updated the failing partition_prune regression test:
I haven't yet had a chance to look at the EXPLAIN code that adds the text
which is no longer being produced, to see whether its absence is actually a
bug in my code.

Oh, and I haven't done testing to see how effective the prefetching is --
that is a larger project that I have yet to tackle.
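For reference, the core of the streaming read setup, simplified from
bitmapheap_pgsr_alloc() in the attached patch:

    /* simplified from bitmapheap_pgsr_alloc() in the patch below */
    int iodepth = Max(Min(128, NBuffers / 128), 1);

    hscan->pgsr = pg_streaming_read_alloc(iodepth, (uintptr_t) scanstate,
                                          bitmapheapscan_pgsr_next_single,
                                          bitmapheapscan_pgsr_release);

heapam_scan_bitmap_next_block() then just pulls completed reads off the
helper with pg_streaming_read_get_next(hscan->pgsr), and the next-block
callback drives the main iterator.

- Melanie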
From 1413a391a656f9e2dd3fdaa6ef9d4f3242d7f998 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Tue, 22 Jun 2021 16:14:58 -0400 Subject: [PATCH v1] Use pgsr for AIO bitmapheapscan --- src/backend/access/gin/ginget.c | 18 +- src/backend/access/gin/ginscan.c | 4 + src/backend/access/heap/heapam_handler.c | 190 ++++++++- src/backend/executor/nodeBitmapHeapscan.c | 475 ++-------------------- src/backend/nodes/tidbitmap.c | 55 ++- src/include/access/gin_private.h | 5 + src/include/access/heapam.h | 2 + src/include/access/tableam.h | 4 +- src/include/executor/nodeBitmapHeapscan.h | 1 + src/include/nodes/execnodes.h | 4 - src/include/nodes/tidbitmap.h | 7 +- src/include/storage/aio.h | 2 +- 12 files changed, 272 insertions(+), 495 deletions(-) diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c index 03191e016c..ef7c284cd0 100644 --- a/src/backend/access/gin/ginget.c +++ b/src/backend/access/gin/ginget.c @@ -311,6 +311,8 @@ collectMatchBitmap(GinBtreeData *btree, GinBtreeStack *stack, } } +#define MAX_TUPLES_PER_PAGE MaxHeapTuplesPerPage + /* * Start* functions setup beginning state of searches: finds correct buffer and pins it. */ @@ -332,6 +334,7 @@ restartScanEntry: entry->nlist = 0; entry->matchBitmap = NULL; entry->matchResult = NULL; + entry->savedMatchResult = NULL; entry->reduceResult = false; entry->predictNumberResult = 0; @@ -372,7 +375,10 @@ restartScanEntry: if (entry->matchBitmap) { if (entry->matchIterator) + { tbm_end_iterate(entry->matchIterator); + pfree(entry->savedMatchResult); + } entry->matchIterator = NULL; tbm_free(entry->matchBitmap); entry->matchBitmap = NULL; @@ -385,6 +391,8 @@ restartScanEntry: if (entry->matchBitmap && !tbm_is_empty(entry->matchBitmap)) { entry->matchIterator = tbm_begin_iterate(entry->matchBitmap); + entry->savedMatchResult = (TBMIterateResult *) palloc0(sizeof(TBMIterateResult) + + MAX_TUPLES_PER_PAGE * sizeof(OffsetNumber)); entry->isFinished = false; } } @@ -790,6 +798,7 @@ entryLoadMoreItems(GinState *ginstate, GinScanEntry entry, #define gin_rand() (((double) random()) / ((double) MAX_RANDOM_VALUE)) #define dropItem(e) ( gin_rand() > ((double)GinFuzzySearchLimit)/((double)((e)->predictNumberResult)) ) + /* * Sets entry->curItem to next heap item pointer > advancePast, for one entry * of one scan key, or sets entry->isFinished to true if there are no more. 
@@ -817,7 +826,6 @@ entryGetItem(GinState *ginstate, GinScanEntry entry, /* A bitmap result */ BlockNumber advancePastBlk = GinItemPointerGetBlockNumber(&advancePast); OffsetNumber advancePastOff = GinItemPointerGetOffsetNumber(&advancePast); - for (;;) { /* @@ -831,12 +839,18 @@ entryGetItem(GinState *ginstate, GinScanEntry entry, (ItemPointerIsLossyPage(&advancePast) && entry->matchResult->blockno == advancePastBlk)) { - entry->matchResult = tbm_iterate(entry->matchIterator); + + tbm_iterate(entry->matchIterator, entry->savedMatchResult); + if (!BlockNumberIsValid(entry->savedMatchResult->blockno)) + entry->matchResult = NULL; + else + entry->matchResult = entry->savedMatchResult; if (entry->matchResult == NULL) { ItemPointerSetInvalid(&entry->curItem); tbm_end_iterate(entry->matchIterator); + pfree(entry->savedMatchResult); entry->matchIterator = NULL; entry->isFinished = true; break; diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c index 55e2d49fd7..3fd9310887 100644 --- a/src/backend/access/gin/ginscan.c +++ b/src/backend/access/gin/ginscan.c @@ -107,6 +107,7 @@ ginFillScanEntry(GinScanOpaque so, OffsetNumber attnum, scanEntry->matchBitmap = NULL; scanEntry->matchIterator = NULL; scanEntry->matchResult = NULL; + scanEntry->savedMatchResult = NULL; scanEntry->list = NULL; scanEntry->nlist = 0; scanEntry->offset = InvalidOffsetNumber; @@ -246,7 +247,10 @@ ginFreeScanKeys(GinScanOpaque so) if (entry->list) pfree(entry->list); if (entry->matchIterator) + { tbm_end_iterate(entry->matchIterator); + pfree(entry->savedMatchResult); + } if (entry->matchBitmap) tbm_free(entry->matchBitmap); } diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c index 9c65741c41..b531ad641e 100644 --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -27,6 +27,7 @@ #include "access/syncscan.h" #include "access/tableam.h" #include "access/tsmapi.h" +#include "access/visibilitymap.h" #include "access/xact.h" #include "catalog/catalog.h" #include "catalog/index.h" @@ -36,6 +37,7 @@ #include "executor/executor.h" #include "miscadmin.h" #include "pgstat.h" +#include "storage/aio.h" #include "storage/bufmgr.h" #include "storage/bufpage.h" #include "storage/lmgr.h" @@ -57,6 +59,11 @@ static BlockNumber heapam_scan_get_blocks_done(HeapScanDesc hscan); static const TableAmRoutine heapam_methods; +static PgStreamingReadNextStatus +bitmapheapscan_pgsr_next_single(uintptr_t pgsr_private, PgAioInProgress *aio, uintptr_t *read_private); +static void +bitmapheapscan_pgsr_release(uintptr_t pgsr_private, uintptr_t read_private); + /* ------------------------------------------------------------------------ * Slot related callbacks for heap AM @@ -2106,13 +2113,128 @@ heapam_estimate_rel_size(Relation rel, int32 *attr_widths, * Executor related callbacks for the heap AM * ------------------------------------------------------------------------ */ +#define MAX_TUPLES_PER_PAGE MaxHeapTuplesPerPage + +// TODO: for heap, these are in heapam.c instead of heapam_handler.c +// but, heap may move where it does the setup of pgsr +static void +bitmapheapscan_pgsr_release(uintptr_t pgsr_private, uintptr_t read_private) +{ + BitmapHeapScanState *bhs_state = (BitmapHeapScanState *) pgsr_private; + HeapScanDesc hdesc = (HeapScanDesc ) bhs_state->ss.ss_currentScanDesc; + TBMIterateResult *tbmres = (TBMIterateResult *) read_private; + + ereport(WARNING, + errmsg("pgsr %s: releasing buf %d", + 
NameStr(hdesc->rs_base.rs_rd->rd_rel->relname), + tbmres->buffer), + errhidestmt(true), + errhidecontext(true)); + + Assert(BufferIsValid(tbmres->buffer)); + ReleaseBuffer(tbmres->buffer); +} + +static PgStreamingReadNextStatus +bitmapheapscan_pgsr_next_single(uintptr_t pgsr_private, PgAioInProgress *aio, uintptr_t *read_private) +{ + bool already_valid; + bool skip_fetch; + BitmapHeapScanState *bhs_state = (BitmapHeapScanState *) pgsr_private; + /* + * TODO: instead of passing the BitmapHeapScanState node when setting up + * and ultimately using it here as pgsr_private, perhaps I can pass only the + * iterator by adding a pointer to the HeapScanDesc to the iterator and + * moving the vmbuffer into the heapscandesc and also add can_skip_fetch to + * the iterator and then pass the iterator as the private state. + * If doing this, will need a separate bitmapheapscan_pgsr_next_parallel() in + * addition to the bitmapheapscan_pgsr_next_single() which would use the + * shared_tbmiterator instead of the tbmiterator() (and then would need separate + * alloc functions for setup and potentially different release functions). + */ + ParallelBitmapHeapState *pstate = bhs_state->pstate; + HeapScanDesc hdesc = (HeapScanDesc ) bhs_state->ss.ss_currentScanDesc; + TBMIterateResult *tbmres = (TBMIterateResult *) palloc0(sizeof(TBMIterateResult) + + MAX_TUPLES_PER_PAGE * sizeof(OffsetNumber)); + Assert(bhs_state->initialized); + if (pstate == NULL) + tbm_iterate(bhs_state->tbmiterator, tbmres); + else + tbm_shared_iterate(bhs_state->shared_tbmiterator, tbmres); + + // TODO: could this be invalid for another reason than hit_end? + if (!BlockNumberIsValid(tbmres->blockno)) + { + pfree(tbmres); + tbmres = NULL; + *read_private = 0; + return PGSR_NEXT_END; + } + /* + * Ignore any claimed entries past what we think is the end of the + * relation. It may have been extended after the start of our scan (we + * only hold an AccessShareLock, and it could be inserts from this + * backend). + */ + if (tbmres->blockno >= hdesc->rs_nblocks) + { + tbmres->blockno = InvalidBlockNumber; + *read_private = (uintptr_t) tbmres; + return PGSR_NEXT_NO_IO; + } + + /* + * We can skip fetching the heap page if we don't need any fields + * from the heap, and the bitmap entries don't need rechecking, + * and all tuples on the page are visible to our transaction. + */ + skip_fetch = (bhs_state->can_skip_fetch && !tbmres->recheck && + VM_ALL_VISIBLE(hdesc->rs_base.rs_rd, tbmres->blockno, + &bhs_state->vmbuffer)); + + if (skip_fetch) + { + /* + * The number of tuples on this page is put into + * node->return_empty_tuples. 
+ */ + tbmres->buffer = InvalidBuffer; + *read_private = (uintptr_t) tbmres; + return PGSR_NEXT_NO_IO; + } + tbmres->buffer = ReadBufferAsync(hdesc->rs_base.rs_rd, + MAIN_FORKNUM, + tbmres->blockno, + RBM_NORMAL, hdesc->rs_strategy, &already_valid, + &aio); + *read_private = (uintptr_t) tbmres; + + if (already_valid) + return PGSR_NEXT_NO_IO; + else + return PGSR_NEXT_IO; +} + +// TODO: put this in the right place +void bitmapheap_pgsr_alloc(BitmapHeapScanState *scanstate) +{ + HeapScanDesc hscan = (HeapScanDesc ) scanstate->ss.ss_currentScanDesc; + if (!hscan->rs_inited) + { + int iodepth = Max(Min(128, NBuffers / 128), 1); + hscan->pgsr = pg_streaming_read_alloc(iodepth, (uintptr_t) scanstate, + bitmapheapscan_pgsr_next_single, + bitmapheapscan_pgsr_release); + + hscan->rs_inited = true; + } +} static bool heapam_scan_bitmap_next_block(TableScanDesc scan, - TBMIterateResult *tbmres) + TBMIterateResult **tbmres) { HeapScanDesc hscan = (HeapScanDesc) scan; - BlockNumber page = tbmres->blockno; Buffer buffer; Snapshot snapshot; int ntup; @@ -2120,22 +2242,35 @@ heapam_scan_bitmap_next_block(TableScanDesc scan, hscan->rs_cindex = 0; hscan->rs_ntuples = 0; - /* - * Ignore any claimed entries past what we think is the end of the - * relation. It may have been extended after the start of our scan (we - * only hold an AccessShareLock, and it could be inserts from this - * backend). - */ - if (page >= hscan->rs_nblocks) + Assert(hscan->pgsr); + if (*tbmres) + { + if (BufferIsValid((*tbmres)->buffer)) + ReleaseBuffer((*tbmres)->buffer); + hscan->rs_cbuf = InvalidBuffer; + pfree(*tbmres); + } + + *tbmres = (TBMIterateResult *) pg_streaming_read_get_next(hscan->pgsr); + /* hit the end */ + if (*tbmres == NULL) + return true; + + /* Invalid due to past the end of the relation */ + if (!BlockNumberIsValid((*tbmres)->blockno)) + { + pfree(*tbmres); + *tbmres = NULL; return false; + } + + hscan->rs_cblock = (*tbmres)->blockno; + hscan->rs_cbuf = (*tbmres)->buffer; + + /* Skipped fetching, we'll still use ntuples though */ + if (!(BufferIsValid(hscan->rs_cbuf))) + return true; - /* - * Acquire pin on the target heap page, trading in any pin we held before. - */ - hscan->rs_cbuf = ReleaseAndReadBuffer(hscan->rs_cbuf, - scan->rs_rd, - page); - hscan->rs_cblock = page; buffer = hscan->rs_cbuf; snapshot = scan->rs_snapshot; @@ -2156,7 +2291,7 @@ heapam_scan_bitmap_next_block(TableScanDesc scan, /* * We need two separate strategies for lossy and non-lossy cases. 
*/ - if (tbmres->ntuples >= 0) + if ((*tbmres)->ntuples >= 0) { /* * Bitmap is non-lossy, so we just look through the offsets listed in @@ -2165,13 +2300,13 @@ heapam_scan_bitmap_next_block(TableScanDesc scan, */ int curslot; - for (curslot = 0; curslot < tbmres->ntuples; curslot++) + for (curslot = 0; curslot < (*tbmres)->ntuples; curslot++) { - OffsetNumber offnum = tbmres->offsets[curslot]; + OffsetNumber offnum = (*tbmres)->offsets[curslot]; ItemPointerData tid; HeapTupleData heapTuple; - ItemPointerSet(&tid, page, offnum); + ItemPointerSet(&tid, (*tbmres)->blockno, offnum); if (heap_hot_search_buffer(&tid, scan->rs_rd, buffer, snapshot, &heapTuple, NULL, true)) hscan->rs_vistuples[ntup++] = ItemPointerGetOffsetNumber(&tid); @@ -2199,7 +2334,7 @@ heapam_scan_bitmap_next_block(TableScanDesc scan, loctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lp); loctup.t_len = ItemIdGetLength(lp); loctup.t_tableOid = scan->rs_rd->rd_id; - ItemPointerSet(&loctup.t_self, page, offnum); + ItemPointerSet(&loctup.t_self, (*tbmres)->blockno, offnum); valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer); if (valid) { @@ -2230,6 +2365,19 @@ heapam_scan_bitmap_next_tuple(TableScanDesc scan, Page dp; ItemId lp; + /* we skipped fetching */ + if (BufferIsInvalid(tbmres->buffer)) + { + Assert(tbmres->ntuples >= 0); + if (tbmres->ntuples > 0) + { + ExecStoreAllNullTuple(slot); + tbmres->ntuples--; + return true; + } + Assert(tbmres->ntuples == 0); + return false; + } /* * Out of range? If so, nothing more to look at on this page */ diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c index 3861bd8a24..41f5434080 100644 --- a/src/backend/executor/nodeBitmapHeapscan.c +++ b/src/backend/executor/nodeBitmapHeapscan.c @@ -36,6 +36,8 @@ #include "postgres.h" #include <math.h> +// TODO: delete me after moving scan setup function +#include "access/heapam.h" #include "access/relscan.h" #include "access/tableam.h" @@ -55,13 +57,13 @@ static TupleTableSlot *BitmapHeapNext(BitmapHeapScanState *node); static inline void BitmapDoneInitializingSharedState(ParallelBitmapHeapState *pstate); -static inline void BitmapAdjustPrefetchIterator(BitmapHeapScanState *node, - TBMIterateResult *tbmres); -static inline void BitmapAdjustPrefetchTarget(BitmapHeapScanState *node); -static inline void BitmapPrefetch(BitmapHeapScanState *node, - TableScanDesc scan); static bool BitmapShouldInitializeSharedState(ParallelBitmapHeapState *pstate); +// TODO: add to tableam +void table_bitmap_scan_setup(BitmapHeapScanState *scanstate) +{ + bitmapheap_pgsr_alloc(scanstate); +} /* ---------------------------------------------------------------- * BitmapHeapNext @@ -75,10 +77,7 @@ BitmapHeapNext(BitmapHeapScanState *node) ExprContext *econtext; TableScanDesc scan; TIDBitmap *tbm; - TBMIterator *tbmiterator = NULL; - TBMSharedIterator *shared_tbmiterator = NULL; - TBMIterateResult *tbmres; - TupleTableSlot *slot; + TupleTableSlot *slot = node->ss.ss_ScanTupleSlot; ParallelBitmapHeapState *pstate = node->pstate; dsa_area *dsa = node->ss.ps.state->es_query_dsa; @@ -89,11 +88,6 @@ BitmapHeapNext(BitmapHeapScanState *node) slot = node->ss.ss_ScanTupleSlot; scan = node->ss.ss_currentScanDesc; tbm = node->tbm; - if (pstate == NULL) - tbmiterator = node->tbmiterator; - else - shared_tbmiterator = node->shared_tbmiterator; - tbmres = node->tbmres; /* * If we haven't yet performed the underlying index scan, do it, and begin @@ -117,17 +111,7 @@ BitmapHeapNext(BitmapHeapScanState *node) elog(ERROR, 
"unrecognized result from subplan"); node->tbm = tbm; - node->tbmiterator = tbmiterator = tbm_begin_iterate(tbm); - node->tbmres = tbmres = NULL; - -#ifdef USE_PREFETCH - if (node->prefetch_maximum > 0) - { - node->prefetch_iterator = tbm_begin_iterate(tbm); - node->prefetch_pages = 0; - node->prefetch_target = -1; - } -#endif /* USE_PREFETCH */ + node->tbmiterator = tbm_begin_iterate(tbm); } else { @@ -143,180 +127,45 @@ BitmapHeapNext(BitmapHeapScanState *node) elog(ERROR, "unrecognized result from subplan"); node->tbm = tbm; - /* * Prepare to iterate over the TBM. This will return the * dsa_pointer of the iterator state which will be used by * multiple processes to iterate jointly. */ - pstate->tbmiterator = tbm_prepare_shared_iterate(tbm); -#ifdef USE_PREFETCH - if (node->prefetch_maximum > 0) - { - pstate->prefetch_iterator = - tbm_prepare_shared_iterate(tbm); - - /* - * We don't need the mutex here as we haven't yet woke up - * others. - */ - pstate->prefetch_pages = 0; - pstate->prefetch_target = -1; - } -#endif + pstate->tbmiterator = + tbm_prepare_shared_iterate(tbm); /* We have initialized the shared state so wake up others. */ BitmapDoneInitializingSharedState(pstate); } /* Allocate a private iterator and attach the shared state to it */ - node->shared_tbmiterator = shared_tbmiterator = + node->shared_tbmiterator = tbm_attach_shared_iterate(dsa, pstate->tbmiterator); - node->tbmres = tbmres = NULL; - -#ifdef USE_PREFETCH - if (node->prefetch_maximum > 0) - { - node->shared_prefetch_iterator = - tbm_attach_shared_iterate(dsa, pstate->prefetch_iterator); - } -#endif /* USE_PREFETCH */ } node->initialized = true; + /* do any required setup, such as setting up streaming read helper */ + // TODO: modify for parallel as relevant + table_bitmap_scan_setup(node); + /* get the first block */ + while (!table_scan_bitmap_next_block(scan, &node->tbmres)); + if (node->tbmres == NULL) + return NULL; } + + // TODO: seems like it would be more clear to have an independent function + // getting the next tuple and block and then only have the recheck here. + // the loop condition would be next_tuple != NULL for (;;) { - bool skip_fetch; - CHECK_FOR_INTERRUPTS(); - - /* - * Get next page of results if needed - */ - if (tbmres == NULL) - { - if (!pstate) - node->tbmres = tbmres = tbm_iterate(tbmiterator); - else - node->tbmres = tbmres = tbm_shared_iterate(shared_tbmiterator); - if (tbmres == NULL) - { - /* no more entries in the bitmap */ - break; - } - - BitmapAdjustPrefetchIterator(node, tbmres); - - /* - * We can skip fetching the heap page if we don't need any fields - * from the heap, and the bitmap entries don't need rechecking, - * and all tuples on the page are visible to our transaction. - * - * XXX: It's a layering violation that we do these checks above - * tableam, they should probably moved below it at some point. - */ - skip_fetch = (node->can_skip_fetch && - !tbmres->recheck && - VM_ALL_VISIBLE(node->ss.ss_currentRelation, - tbmres->blockno, - &node->vmbuffer)); - - if (skip_fetch) - { - /* can't be lossy in the skip_fetch case */ - Assert(tbmres->ntuples >= 0); - - /* - * The number of tuples on this page is put into - * node->return_empty_tuples. 
- */ - node->return_empty_tuples = tbmres->ntuples; - } - else if (!table_scan_bitmap_next_block(scan, tbmres)) - { - /* AM doesn't think this block is valid, skip */ - continue; - } - - if (tbmres->ntuples >= 0) - node->exact_pages++; - else - node->lossy_pages++; - - /* Adjust the prefetch target */ - BitmapAdjustPrefetchTarget(node); - } - else - { - /* - * Continuing in previously obtained page. - */ - -#ifdef USE_PREFETCH - - /* - * Try to prefetch at least a few pages even before we get to the - * second page if we don't stop reading after the first tuple. - */ - if (!pstate) - { - if (node->prefetch_target < node->prefetch_maximum) - node->prefetch_target++; - } - else if (pstate->prefetch_target < node->prefetch_maximum) - { - /* take spinlock while updating shared state */ - SpinLockAcquire(&pstate->mutex); - if (pstate->prefetch_target < node->prefetch_maximum) - pstate->prefetch_target++; - SpinLockRelease(&pstate->mutex); - } -#endif /* USE_PREFETCH */ - } - - /* - * We issue prefetch requests *after* fetching the current page to try - * to avoid having prefetching interfere with the main I/O. Also, this - * should happen only when we have determined there is still something - * to do on the current page, else we may uselessly prefetch the same - * page we are just about to request for real. - * - * XXX: It's a layering violation that we do these checks above - * tableam, they should probably moved below it at some point. - */ - BitmapPrefetch(node, scan); - - if (node->return_empty_tuples > 0) - { - /* - * If we don't have to fetch the tuple, just return nulls. - */ - ExecStoreAllNullTuple(slot); - - if (--node->return_empty_tuples == 0) - { - /* no more tuples to return in the next round */ - node->tbmres = tbmres = NULL; - } - } - else + /* Attempt to fetch tuple from AM. */ + if (table_scan_bitmap_next_tuple(scan, node->tbmres, slot)) { - /* - * Attempt to fetch tuple from AM. - */ - if (!table_scan_bitmap_next_tuple(scan, tbmres, slot)) - { - /* nothing more to look at on this page */ - node->tbmres = tbmres = NULL; - continue; - } - - /* - * If we are using lossy info, we have to recheck the qual - * conditions at every tuple. - */ - if (tbmres->recheck) + // TODO: couldn't we have recheck set to true when it was only because + // the bitmap was lossy and not because the qual needs to be rechecked? + if (node->tbmres->recheck) { econtext->ecxt_scantuple = slot; if (!ExecQualAndReset(node->bitmapqualorig, econtext)) @@ -327,16 +176,23 @@ BitmapHeapNext(BitmapHeapScanState *node) continue; } } + return slot; } - /* OK to return this tuple */ - return slot; - } + /* + * Get next page of results + */ + while (!table_scan_bitmap_next_block(scan, &node->tbmres)); - /* - * if we get here it means we are at the end of the scan.. 
- */ - return ExecClearTuple(slot); + /* if we get here it means we are at the end of the scan */ + if (node->tbmres == NULL) + return NULL; + + if (node->tbmres->ntuples >= 0) + node->exact_pages++; + else + node->lossy_pages++; + } } /* @@ -354,235 +210,6 @@ BitmapDoneInitializingSharedState(ParallelBitmapHeapState *pstate) ConditionVariableBroadcast(&pstate->cv); } -/* - * BitmapAdjustPrefetchIterator - Adjust the prefetch iterator - */ -static inline void -BitmapAdjustPrefetchIterator(BitmapHeapScanState *node, - TBMIterateResult *tbmres) -{ -#ifdef USE_PREFETCH - ParallelBitmapHeapState *pstate = node->pstate; - - if (pstate == NULL) - { - TBMIterator *prefetch_iterator = node->prefetch_iterator; - - if (node->prefetch_pages > 0) - { - /* The main iterator has closed the distance by one page */ - node->prefetch_pages--; - } - else if (prefetch_iterator) - { - /* Do not let the prefetch iterator get behind the main one */ - TBMIterateResult *tbmpre = tbm_iterate(prefetch_iterator); - - if (tbmpre == NULL || tbmpre->blockno != tbmres->blockno) - elog(ERROR, "prefetch and main iterators are out of sync"); - } - return; - } - - if (node->prefetch_maximum > 0) - { - TBMSharedIterator *prefetch_iterator = node->shared_prefetch_iterator; - - SpinLockAcquire(&pstate->mutex); - if (pstate->prefetch_pages > 0) - { - pstate->prefetch_pages--; - SpinLockRelease(&pstate->mutex); - } - else - { - /* Release the mutex before iterating */ - SpinLockRelease(&pstate->mutex); - - /* - * In case of shared mode, we can not ensure that the current - * blockno of the main iterator and that of the prefetch iterator - * are same. It's possible that whatever blockno we are - * prefetching will be processed by another process. Therefore, - * we don't validate the blockno here as we do in non-parallel - * case. - */ - if (prefetch_iterator) - tbm_shared_iterate(prefetch_iterator); - } - } -#endif /* USE_PREFETCH */ -} - -/* - * BitmapAdjustPrefetchTarget - Adjust the prefetch target - * - * Increase prefetch target if it's not yet at the max. Note that - * we will increase it to zero after fetching the very first - * page/tuple, then to one after the second tuple is fetched, then - * it doubles as later pages are fetched. - */ -static inline void -BitmapAdjustPrefetchTarget(BitmapHeapScanState *node) -{ -#ifdef USE_PREFETCH - ParallelBitmapHeapState *pstate = node->pstate; - - if (pstate == NULL) - { - if (node->prefetch_target >= node->prefetch_maximum) - /* don't increase any further */ ; - else if (node->prefetch_target >= node->prefetch_maximum / 2) - node->prefetch_target = node->prefetch_maximum; - else if (node->prefetch_target > 0) - node->prefetch_target *= 2; - else - node->prefetch_target++; - return; - } - - /* Do an unlocked check first to save spinlock acquisitions. 
*/ - if (pstate->prefetch_target < node->prefetch_maximum) - { - SpinLockAcquire(&pstate->mutex); - if (pstate->prefetch_target >= node->prefetch_maximum) - /* don't increase any further */ ; - else if (pstate->prefetch_target >= node->prefetch_maximum / 2) - pstate->prefetch_target = node->prefetch_maximum; - else if (pstate->prefetch_target > 0) - pstate->prefetch_target *= 2; - else - pstate->prefetch_target++; - SpinLockRelease(&pstate->mutex); - } -#endif /* USE_PREFETCH */ -} - -/* - * BitmapPrefetch - Prefetch, if prefetch_pages are behind prefetch_target - */ -static inline void -BitmapPrefetch(BitmapHeapScanState *node, TableScanDesc scan) -{ - /* - * FIXME: This really should just all be replaced by using one iterator - * and a PgStreamingRead. tbm_iterate() actually does a fair bit of work, - * we don't want to repeat that. Nor is it good to do the buffer mapping - * lookups twice. - */ -#ifdef USE_PREFETCH - ParallelBitmapHeapState *pstate = node->pstate; - bool issued_prefetch = false; - - if (pstate == NULL) - { - TBMIterator *prefetch_iterator = node->prefetch_iterator; - - if (prefetch_iterator) - { - while (node->prefetch_pages < node->prefetch_target) - { - TBMIterateResult *tbmpre = tbm_iterate(prefetch_iterator); - bool skip_fetch; - - if (tbmpre == NULL) - { - /* No more pages to prefetch */ - tbm_end_iterate(prefetch_iterator); - node->prefetch_iterator = NULL; - break; - } - node->prefetch_pages++; - - /* - * If we expect not to have to actually read this heap page, - * skip this prefetch call, but continue to run the prefetch - * logic normally. (Would it be better not to increment - * prefetch_pages?) - * - * This depends on the assumption that the index AM will - * report the same recheck flag for this future heap page as - * it did for the current heap page; which is not a certainty - * but is true in many cases. - */ - skip_fetch = (node->can_skip_fetch && - (node->tbmres ? !node->tbmres->recheck : false) && - VM_ALL_VISIBLE(node->ss.ss_currentRelation, - tbmpre->blockno, - &node->pvmbuffer)); - - if (!skip_fetch) - { - PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, tbmpre->blockno); - issued_prefetch = true; - } - } - } - - return; - } - - if (pstate->prefetch_pages < pstate->prefetch_target) - { - TBMSharedIterator *prefetch_iterator = node->shared_prefetch_iterator; - - if (prefetch_iterator) - { - while (1) - { - TBMIterateResult *tbmpre; - bool do_prefetch = false; - bool skip_fetch; - - /* - * Recheck under the mutex. If some other process has already - * done enough prefetching then we need not to do anything. - */ - SpinLockAcquire(&pstate->mutex); - if (pstate->prefetch_pages < pstate->prefetch_target) - { - pstate->prefetch_pages++; - do_prefetch = true; - } - SpinLockRelease(&pstate->mutex); - - if (!do_prefetch) - return; - - tbmpre = tbm_shared_iterate(prefetch_iterator); - if (tbmpre == NULL) - { - /* No more pages to prefetch */ - tbm_end_shared_iterate(prefetch_iterator); - node->shared_prefetch_iterator = NULL; - break; - } - - /* As above, skip prefetch if we expect not to need page */ - skip_fetch = (node->can_skip_fetch && - (node->tbmres ? 
!node->tbmres->recheck : false) && - VM_ALL_VISIBLE(node->ss.ss_currentRelation, - tbmpre->blockno, - &node->pvmbuffer)); - - if (!skip_fetch) - { - PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, tbmpre->blockno); - issued_prefetch = true; - } - } - } - } - - /* - * The PrefetchBuffer() calls staged IOs, but didn't necessarily submit - * them, as it is more efficient to amortize the syscall cost across - * multiple calls. - */ - if (issued_prefetch) - pgaio_submit_pending(true); -#endif /* USE_PREFETCH */ -} /* * BitmapHeapRecheck -- access method routine to recheck a tuple in EvalPlanQual @@ -631,12 +258,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node) /* release bitmaps and buffers if any */ if (node->tbmiterator) tbm_end_iterate(node->tbmiterator); - if (node->prefetch_iterator) - tbm_end_iterate(node->prefetch_iterator); if (node->shared_tbmiterator) tbm_end_shared_iterate(node->shared_tbmiterator); - if (node->shared_prefetch_iterator) - tbm_end_shared_iterate(node->shared_prefetch_iterator); if (node->tbm) tbm_free(node->tbm); if (node->vmbuffer != InvalidBuffer) @@ -646,10 +269,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node) node->tbm = NULL; node->tbmiterator = NULL; node->tbmres = NULL; - node->prefetch_iterator = NULL; node->initialized = false; node->shared_tbmiterator = NULL; - node->shared_prefetch_iterator = NULL; node->vmbuffer = InvalidBuffer; node->pvmbuffer = InvalidBuffer; @@ -699,14 +320,10 @@ ExecEndBitmapHeapScan(BitmapHeapScanState *node) */ if (node->tbmiterator) tbm_end_iterate(node->tbmiterator); - if (node->prefetch_iterator) - tbm_end_iterate(node->prefetch_iterator); if (node->tbm) tbm_free(node->tbm); if (node->shared_tbmiterator) tbm_end_shared_iterate(node->shared_tbmiterator); - if (node->shared_prefetch_iterator) - tbm_end_shared_iterate(node->shared_prefetch_iterator); if (node->vmbuffer != InvalidBuffer) ReleaseBuffer(node->vmbuffer); if (node->pvmbuffer != InvalidBuffer) @@ -750,18 +367,15 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags) scanstate->tbm = NULL; scanstate->tbmiterator = NULL; scanstate->tbmres = NULL; - scanstate->return_empty_tuples = 0; scanstate->vmbuffer = InvalidBuffer; scanstate->pvmbuffer = InvalidBuffer; scanstate->exact_pages = 0; scanstate->lossy_pages = 0; - scanstate->prefetch_iterator = NULL; scanstate->prefetch_pages = 0; scanstate->prefetch_target = 0; scanstate->pscan_len = 0; scanstate->initialized = false; scanstate->shared_tbmiterator = NULL; - scanstate->shared_prefetch_iterator = NULL; scanstate->pstate = NULL; /* @@ -909,7 +523,6 @@ ExecBitmapHeapInitializeDSM(BitmapHeapScanState *node, pstate = shm_toc_allocate(pcxt->toc, node->pscan_len); pstate->tbmiterator = 0; - pstate->prefetch_iterator = 0; /* Initialize the mutex */ SpinLockInit(&pstate->mutex); @@ -946,11 +559,7 @@ ExecBitmapHeapReInitializeDSM(BitmapHeapScanState *node, if (DsaPointerIsValid(pstate->tbmiterator)) tbm_free_shared_area(dsa, pstate->tbmiterator); - if (DsaPointerIsValid(pstate->prefetch_iterator)) - tbm_free_shared_area(dsa, pstate->prefetch_iterator); - pstate->tbmiterator = InvalidDsaPointer; - pstate->prefetch_iterator = InvalidDsaPointer; } /* ---------------------------------------------------------------- diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c index c5feacbff4..9ac0fa98d0 100644 --- a/src/backend/nodes/tidbitmap.c +++ b/src/backend/nodes/tidbitmap.c @@ -180,7 +180,6 @@ struct TBMIterator int spageptr; /* next spages index */ int schunkptr; /* next schunks index */ 
int schunkbit; /* next bit to check in current schunk */ - TBMIterateResult output; /* MUST BE LAST (because variable-size) */ }; /* @@ -221,7 +220,6 @@ struct TBMSharedIterator PTEntryArray *ptbase; /* pagetable element array */ PTIterationArray *ptpages; /* sorted exact page index list */ PTIterationArray *ptchunks; /* sorted lossy page index list */ - TBMIterateResult output; /* MUST BE LAST (because variable-size) */ }; /* Local function prototypes */ @@ -695,8 +693,7 @@ tbm_begin_iterate(TIDBitmap *tbm) * Create the TBMIterator struct, with enough trailing space to serve the * needs of the TBMIterateResult sub-struct. */ - iterator = (TBMIterator *) palloc(sizeof(TBMIterator) + - MAX_TUPLES_PER_PAGE * sizeof(OffsetNumber)); + iterator = (TBMIterator *) palloc(sizeof(TBMIterator)); iterator->tbm = tbm; /* @@ -966,11 +963,10 @@ tbm_advance_schunkbit(PagetableEntry *chunk, int *schunkbitp) * be examined, but the condition must be rechecked anyway. (For ease of * testing, recheck is always set true when ntuples < 0.) */ -TBMIterateResult * -tbm_iterate(TBMIterator *iterator) +void +tbm_iterate(TBMIterator *iterator, TBMIterateResult *tbmres) { TIDBitmap *tbm = iterator->tbm; - TBMIterateResult *output = &(iterator->output); Assert(tbm->iterating == TBM_ITERATING_PRIVATE); @@ -1008,11 +1004,11 @@ tbm_iterate(TBMIterator *iterator) chunk_blockno < tbm->spages[iterator->spageptr]->blockno) { /* Return a lossy page indicator from the chunk */ - output->blockno = chunk_blockno; - output->ntuples = -1; - output->recheck = true; + tbmres->blockno = chunk_blockno; + tbmres->ntuples = -1; + tbmres->recheck = true; iterator->schunkbit++; - return output; + return; } } @@ -1028,16 +1024,16 @@ tbm_iterate(TBMIterator *iterator) page = tbm->spages[iterator->spageptr]; /* scan bitmap to extract individual offset numbers */ - ntuples = tbm_extract_page_tuple(page, output); - output->blockno = page->blockno; - output->ntuples = ntuples; - output->recheck = page->recheck; + ntuples = tbm_extract_page_tuple(page, tbmres); + tbmres->blockno = page->blockno; + tbmres->ntuples = ntuples; + tbmres->recheck = page->recheck; iterator->spageptr++; - return output; + return; } /* Nothing more in the bitmap */ - return NULL; + tbmres->blockno = InvalidBlockNumber; } /* @@ -1047,10 +1043,9 @@ tbm_iterate(TBMIterator *iterator) * across multiple processes. We need to acquire the iterator LWLock, * before accessing the shared members. 
*/ -TBMIterateResult * -tbm_shared_iterate(TBMSharedIterator *iterator) +void +tbm_shared_iterate(TBMSharedIterator *iterator, TBMIterateResult *tbmres) { - TBMIterateResult *output = &iterator->output; TBMSharedIteratorState *istate = iterator->state; PagetableEntry *ptbase = NULL; int *idxpages = NULL; @@ -1101,13 +1096,13 @@ tbm_shared_iterate(TBMSharedIterator *iterator) chunk_blockno < ptbase[idxpages[istate->spageptr]].blockno) { /* Return a lossy page indicator from the chunk */ - output->blockno = chunk_blockno; - output->ntuples = -1; - output->recheck = true; + tbmres->blockno = chunk_blockno; + tbmres->ntuples = -1; + tbmres->recheck = true; istate->schunkbit++; LWLockRelease(&istate->lock); - return output; + return; } } @@ -1117,21 +1112,21 @@ tbm_shared_iterate(TBMSharedIterator *iterator) int ntuples; /* scan bitmap to extract individual offset numbers */ - ntuples = tbm_extract_page_tuple(page, output); - output->blockno = page->blockno; - output->ntuples = ntuples; - output->recheck = page->recheck; + ntuples = tbm_extract_page_tuple(page, tbmres); + tbmres->blockno = page->blockno; + tbmres->ntuples = ntuples; + tbmres->recheck = page->recheck; istate->spageptr++; LWLockRelease(&istate->lock); - return output; + return; } LWLockRelease(&istate->lock); /* Nothing more in the bitmap */ - return NULL; + tbmres->blockno = InvalidBlockNumber; } /* diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h index 670a40b4be..1122c098c7 100644 --- a/src/include/access/gin_private.h +++ b/src/include/access/gin_private.h @@ -352,6 +352,11 @@ typedef struct GinScanEntryData TIDBitmap *matchBitmap; TBMIterator *matchIterator; TBMIterateResult *matchResult; + // TODO: a temporary hack to deal with the fact that I am + // 1) not sure if InvalidBlockNumber can come up for other reasons than exhausting the bitmap + // and 2) not having taken the time yet to check all the places where matchResult == NULL + // is used to make sure I can replace it with something else + TBMIterateResult *savedMatchResult; /* used for Posting list and one page in Posting tree */ ItemPointerData *list; diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h index 331f5c6716..c4d653e923 100644 --- a/src/include/access/heapam.h +++ b/src/include/access/heapam.h @@ -20,6 +20,7 @@ #include "access/skey.h" #include "access/table.h" /* for backward compatibility */ #include "access/tableam.h" +#include "nodes/execnodes.h" #include "nodes/lockoptions.h" #include "nodes/primnodes.h" #include "storage/bufpage.h" @@ -225,5 +226,6 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data, CommandId *cmin, CommandId *cmax); extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple, Buffer buffer, Snapshot snapshot); +extern void bitmapheap_pgsr_alloc(BitmapHeapScanState *scanstate); #endif /* HEAPAM_H */ diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h index 414b6b4d57..fea54384ec 100644 --- a/src/include/access/tableam.h +++ b/src/include/access/tableam.h @@ -786,7 +786,7 @@ typedef struct TableAmRoutine * scan_bitmap_next_tuple need to exist, or neither. 
*/ bool (*scan_bitmap_next_block) (TableScanDesc scan, - struct TBMIterateResult *tbmres); + struct TBMIterateResult **tbmres); /* * Fetch the next tuple of a bitmap table scan into `slot` and return true @@ -1929,7 +1929,7 @@ table_relation_estimate_size(Relation rel, int32 *attr_widths, */ static inline bool table_scan_bitmap_next_block(TableScanDesc scan, - struct TBMIterateResult *tbmres) + struct TBMIterateResult **tbmres) { /* * We don't expect direct calls to table_scan_bitmap_next_block with valid diff --git a/src/include/executor/nodeBitmapHeapscan.h b/src/include/executor/nodeBitmapHeapscan.h index 3b0bd5acb8..64d8c6a07c 100644 --- a/src/include/executor/nodeBitmapHeapscan.h +++ b/src/include/executor/nodeBitmapHeapscan.h @@ -28,5 +28,6 @@ extern void ExecBitmapHeapReInitializeDSM(BitmapHeapScanState *node, ParallelContext *pcxt); extern void ExecBitmapHeapInitializeWorker(BitmapHeapScanState *node, ParallelWorkerContext *pwcxt); +extern void table_bitmap_scan_setup(BitmapHeapScanState *scanstate); #endif /* NODEBITMAPHEAPSCAN_H */ diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index e31ad6204e..8d083af3d3 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -1545,7 +1545,6 @@ typedef enum typedef struct ParallelBitmapHeapState { dsa_pointer tbmiterator; - dsa_pointer prefetch_iterator; slock_t mutex; int prefetch_pages; int prefetch_target; @@ -1586,19 +1585,16 @@ typedef struct BitmapHeapScanState TBMIterator *tbmiterator; TBMIterateResult *tbmres; bool can_skip_fetch; - int return_empty_tuples; Buffer vmbuffer; Buffer pvmbuffer; long exact_pages; long lossy_pages; - TBMIterator *prefetch_iterator; int prefetch_pages; int prefetch_target; int prefetch_maximum; Size pscan_len; bool initialized; TBMSharedIterator *shared_tbmiterator; - TBMSharedIterator *shared_prefetch_iterator; ParallelBitmapHeapState *pstate; } BitmapHeapScanState; diff --git a/src/include/nodes/tidbitmap.h b/src/include/nodes/tidbitmap.h index bc67166105..236de80f23 100644 --- a/src/include/nodes/tidbitmap.h +++ b/src/include/nodes/tidbitmap.h @@ -23,6 +23,8 @@ #define TIDBITMAP_H #include "storage/itemptr.h" +// TODO: not great that I am including this now +#include "storage/buf.h" #include "utils/dsa.h" @@ -40,6 +42,7 @@ typedef struct TBMSharedIterator TBMSharedIterator; typedef struct TBMIterateResult { BlockNumber blockno; /* page number containing tuples */ + Buffer buffer; int ntuples; /* -1 indicates lossy result */ bool recheck; /* should the tuples be rechecked? */ /* Note: recheck is always true if ntuples < 0 */ @@ -64,8 +67,8 @@ extern bool tbm_is_empty(const TIDBitmap *tbm); extern TBMIterator *tbm_begin_iterate(TIDBitmap *tbm); extern dsa_pointer tbm_prepare_shared_iterate(TIDBitmap *tbm); -extern TBMIterateResult *tbm_iterate(TBMIterator *iterator); -extern TBMIterateResult *tbm_shared_iterate(TBMSharedIterator *iterator); +extern void tbm_iterate(TBMIterator *iterator, TBMIterateResult *tbmres); +extern void tbm_shared_iterate(TBMSharedIterator *iterator, TBMIterateResult *tbmres); extern void tbm_end_iterate(TBMIterator *iterator); extern void tbm_end_shared_iterate(TBMSharedIterator *iterator); extern TBMSharedIterator *tbm_attach_shared_iterate(dsa_area *dsa, diff --git a/src/include/storage/aio.h b/src/include/storage/aio.h index 9a07f06b9f..8e1aa48827 100644 --- a/src/include/storage/aio.h +++ b/src/include/storage/aio.h @@ -39,7 +39,7 @@ typedef enum IoMethod } IoMethod; /* We'll default to bgworker. 
*/ -#define DEFAULT_IO_METHOD IOMETHOD_WORKER +#define DEFAULT_IO_METHOD IOMETHOD_IO_URING /* GUCs */ extern int io_method; -- 2.27.0