Hi,

Here is a patch to allow PostgreSQL to use $SUBJECT.  It is from the
AIO patch-set[1].  It adds three new settings, defaulting to off:
  io_data_direct = whether to use O_DIRECT for main data files
  io_wal_direct = ... for WAL
  io_wal_init_direct = ... for WAL-file initialisation

O_DIRECT asks the kernel to avoid caching file data as much as
possible.  Here's a fun quote about it[2]:

"The exact semantics of Direct I/O (O_DIRECT) are not well specified.
It is not a part of POSIX, or SUS, or any other formal standards
specification.  The exact meaning of O_DIRECT has historically been
negotiated in non-public discussions between powerful enterprise
database companies and proprietary Unix systems, and its behaviour has
generally been passed down as oral lore rather than as a formal set of
requirements and specifications."

It gives the kernel the opportunity to move data directly between
PostgreSQL's user space buffers and the storage hardware using DMA
hardware, that is, without CPU involvement or copying.  Not all
storage stacks can do that, for various reasons, but even if not, the
caching policy should ideally still use temporary buffers and avoid
polluting the page cache.

These settings currently destroy performance, and are not intended to
be used by end-users, yet!  That's why we filed them under
DEVELOPER_OPTIONS.  You don't get automatic read-ahead, concurrency,
clustering or (of course) buffering from the kernel.  The idea is that
later parts of the AIO patch-set will introduce mechanisms to replace
what the kernel is doing for us today, and then more, since we ought
to be even better at predicting our own future I/O than the kernel is,
so that we'll finish up ahead.  Even with all that, you wouldn't want
to turn it on by default because the default shared_buffers would be
insufficient for any real system, and there are portability problems.

Examples of slowness:

* every 8KB sequential read or write becomes a full round trip to the
  storage, one at a time

* data that is written to WAL and then read back in by WAL sender will
  incur a full I/O round trip (that's probably not really an AIO
  problem, that's something we should probably address by using shared
  memory instead of files, as noted as a TODO item in the source code)

Memory alignment patches:

Direct I/O generally needs to be done to/from VM page-aligned
addresses, but only "standard" 4KB pages, even when larger VM pages
are in use (if there is an exotic system where that isn't true, it
won't work).  We need to deal with buffers on the stack, the heap and
in shmem.  For the stack, see patch 0001.  For the heap and shared
memory, see patch 0002, but David Rowley is going to propose that part
separately, as MemoryContext API adjustments are a specialised enough
topic to deserve another thread; here I include a copy as a
dependency.  The main direct I/O patch is 0003.

Assorted portability notes:

I expect this to "work" (that is, successfully destroy performance) on
typical developer systems running at least Linux, macOS, Windows and
FreeBSD.  By work, I mean: not be rejected by PostgreSQL, not be
rejected by the kernel, and influence kernel cache behaviour on common
filesystems.  It might be rejected with ENOTSUP, EINVAL etc on some
more exotic filesystems and OSes.  Of currently supported OSes, only
OpenBSD and Solaris don't have O_DIRECT at all, and we'll reject the
GUCs.  For macOS and Windows we internally translate our own
PG_O_DIRECT flag to the correct flags/calls (committed a while
back[3]).

On Windows, scatter/gather is available only with direct I/O, so a
true pwritev would in theory be possible, but that has some more
complications and is left for later patches (probably using native
interfaces, not disguising as POSIX).

There may be systems on which 8KB offset alignment will not work at
all or not work well, and that's expected.
For example, BTRFS, ZFS, JFS "big file", UFS etc allow
larger-than-8KB blocks/records, and an 8KB write will have to trigger
a read-before-write.  Note that offset/length alignment requirements
(blocks) are independent of buffer alignment requirements (memory
pages, 4KB).

The behaviour and cache coherency of files that have open descriptors
using both direct and non-direct flags may be complicated and vary
between systems.  The patch currently lets you change the GUCs at
runtime so backends can disagree: that should probably not be allowed,
but is like that now for experimentation.  More study is required.

If someone has a compiler that we don't know how to do
pg_attribute_aligned() for, then we can't make correctly aligned stack
buffers, so in that case direct I/O is disabled, but I don't know of
such a system (maybe aCC, but we dropped it).  That's why smgr code
can only assert that pointers are IO-aligned if PG_O_DIRECT != 0, and
why PG_O_DIRECT is forced to 0 if there is no pg_attribute_aligned()
macro, disabling the GUCs.

This seems to be an independent enough piece to get into the tree on
its own, with the proviso that it's not actually useful yet other than
for experimentation.  Thoughts?

These patches have been hacked on at various times by Andres Freund,
David Rowley and me.

[1] https://wiki.postgresql.org/wiki/AIO
[2] https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO%27s_Semantics
[3] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BADiyyHe0cun2wfT%2BSVnFVqNYPxoO6J9zcZkVO7%2BNGig%40mail.gmail.com

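To make the point about the two independent alignment requirements
concrete, here is a minimal sketch, not code from the patches.  The
SKETCH_* constants and sketch_* function names are hypothetical; the
values 4096 and 8192 are assumptions that mirror PG_IO_ALIGN_SIZE from
patch 0001 and the default BLCKSZ.  A request is only a plausible
candidate for direct I/O if the buffer address passes the memory-page
check and the offset/length pass the block check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IO_ALIGN_SIZE 4096	/* assumed buffer alignment (memory page) */
#define SKETCH_BLCKSZ        8192	/* assumed block size for offsets/lengths */

/* True if the buffer address is a multiple of the assumed page size. */
bool
sketch_buffer_is_io_aligned(const void *buffer)
{
	return ((uintptr_t) buffer % SKETCH_IO_ALIGN_SIZE) == 0;
}

/* True if the offset and length are whole multiples of the block size. */
bool
sketch_range_is_block_aligned(uint64_t offset, size_t length)
{
	return length > 0 &&
		offset % SKETCH_BLCKSZ == 0 &&
		length % SKETCH_BLCKSZ == 0;
}

/* Both checks are independent; a direct I/O request needs both to pass. */
bool
sketch_direct_io_ok(const void *buffer, uint64_t offset, size_t length)
{
	return sketch_buffer_is_io_aligned(buffer) &&
		sketch_range_is_block_aligned(offset, length);
}

/* Assertion wrapper, in the spirit of the alignment asserts added to md.c. */
void
sketch_assert_buffer_aligned(const void *buffer)
{
	assert(sketch_buffer_is_io_aligned(buffer));
}
```

Note that a buffer at a 4KB boundary passes even if it is not at an
8KB boundary, while a 512-byte length fails regardless of where the
buffer lives.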
From 87a0c14600506d2a33a5a6bedc6e58d70ff7acc7 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Wed, 24 Jun 2020 16:35:49 -0700
Subject: [PATCH 1/3] Align PGAlignedBlock to expected page size.

In order to be allowed to use O_DIRECT, we need to align buffers to the
page or sector size.

Author: Andres Freund <and...@anarazel.de>
Author: Thomas Munro <thomas.mu...@gmail.com>
---
 src/include/c.h                | 20 ++++++++++++--------
 src/include/pg_config_manual.h |  8 ++++++++
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/src/include/c.h b/src/include/c.h
index d70ed84ac5..0deaca0414 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -1070,17 +1070,18 @@ extern void ExceptionalCondition(const char *conditionName,
 
 /*
  * Use this, not "char buf[BLCKSZ]", to declare a field or local variable
- * holding a page buffer, if that page might be accessed as a page and not
- * just a string of bytes. Otherwise the variable might be under-aligned,
- * causing problems on alignment-picky hardware. (In some places, we use
- * this to declare buffers even though we only pass them to read() and
- * write(), because copying to/from aligned buffers is usually faster than
- * using unaligned buffers.) We include both "double" and "int64" in the
- * union to ensure that the compiler knows the value must be MAXALIGN'ed
- * (cf. configure's computation of MAXIMUM_ALIGNOF).
+ * holding a page buffer, if that page might be accessed as a page or passed to
+ * an I/O function and not just a string of bytes. Otherwise the variable
+ * might be under-aligned, causing problems on alignment-picky hardware, or if
+ * PG_O_DIRECT is used. We include both "double" and "int64" in the union to
+ * ensure that the compiler knows the value must be MAXALIGN'ed (cf.
+ * configure's computation of MAXIMUM_ALIGNOF).
  */
 typedef union PGAlignedBlock
 {
+#ifdef pg_attribute_aligned
+	pg_attribute_aligned(PG_IO_ALIGN_SIZE)
+#endif
 	char		data[BLCKSZ];
 	double		force_align_d;
 	int64		force_align_i64;
@@ -1089,6 +1090,9 @@ typedef union PGAlignedBlock
 /* Same, but for an XLOG_BLCKSZ-sized buffer */
 typedef union PGAlignedXLogBlock
 {
+#ifdef pg_attribute_aligned
+	pg_attribute_aligned(PG_IO_ALIGN_SIZE)
+#endif
 	char		data[XLOG_BLCKSZ];
 	double		force_align_d;
 	int64		force_align_i64;
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index f2a106f983..a2ad08a110 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -227,6 +227,14 @@
  */
 #define PG_CACHE_LINE_SIZE		128
 
+/*
+ * Assumed memory alignment requirement for direct I/O. The real requirement
+ * may be based on sectors or pages. The default is the typical modern sector
+ * size and virtual memory page size, which is enough for currently known
+ * systems.
+ */
+#define PG_IO_ALIGN_SIZE		4096
+
 /*
  *------------------------------------------------------------------------
  * The following symbols are for enabling debugging code, not for
-- 
2.35.1

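As a standalone illustration of what patch 1/3 is after, here is a
rough sketch that uses C11 alignas in place of PostgreSQL's
pg_attribute_aligned() macro.  SketchAlignedBlock and the SKETCH_*
constants are hypothetical names, and the sizes are assumptions
mirroring PG_IO_ALIGN_SIZE and the default BLCKSZ.  It assumes a
compiler that honours over-aligned automatic variables, as GCC and
Clang do.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define SKETCH_IO_ALIGN_SIZE 4096	/* assumed, mirrors PG_IO_ALIGN_SIZE */
#define SKETCH_BLCKSZ        8192	/* assumed, mirrors the default BLCKSZ */

/* A page-sized buffer whose data member is forced to I/O alignment. */
typedef union SketchAlignedBlock
{
	alignas(SKETCH_IO_ALIGN_SIZE) char data[SKETCH_BLCKSZ];
	double		force_align_d;
	int64_t		force_align_i64;
} SketchAlignedBlock;

/* Returns 1 if a block declared on the stack really is I/O-aligned. */
int
sketch_stack_block_is_aligned(void)
{
	SketchAlignedBlock block;

	block.data[0] = 0;			/* touch the buffer so it is materialised */
	return ((uintptr_t) block.data % SKETCH_IO_ALIGN_SIZE) == 0;
}
```

Without the alignment request, the union would only be guaranteed the
alignment of double/int64_t, which is not enough for O_DIRECT.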
From 7a1521dcafbc42b2482d16e8dd0781dfbd5ef2b4 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 18 Oct 2022 09:47:45 -0700
Subject: [PATCH 2/3] XXX palloc_io_aligned() -- not for review here

This patch will be posted for review by David Rowley in its own thread,
but a copy is included here as a dependency.
---
 contrib/bloom/blinsert.c                   |  2 +-
 src/backend/access/gist/gistbuild.c        |  8 +-
 src/backend/access/gist/gistbuildbuffers.c |  5 +-
 src/backend/access/heap/rewriteheap.c      |  2 +-
 src/backend/access/nbtree/nbtree.c         |  2 +-
 src/backend/access/nbtree/nbtsort.c        |  8 +-
 src/backend/access/spgist/spginsert.c      |  2 +-
 src/backend/nodes/gen_node_support.pl      |  2 +-
 src/backend/storage/buffer/buf_init.c      |  7 +-
 src/backend/storage/buffer/localbuf.c      |  4 +-
 src/backend/storage/page/bufpage.c         |  2 +-
 src/backend/storage/smgr/md.c              | 14 ++-
 src/backend/utils/mmgr/mcxt.c              | 99 ++++++++++++++++++++--
 src/include/nodes/memnodes.h               |  5 +-
 src/include/utils/memutils_internal.h      |  4 +-
 src/include/utils/palloc.h                 |  5 ++
 16 files changed, 141 insertions(+), 30 deletions(-)

diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index dd26d6ac29..b0da3ac529 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -166,7 +166,7 @@ blbuildempty(Relation index)
 	Page		metapage;
 
 	/* Construct metapage. */
-	metapage = (Page) palloc(BLCKSZ);
+	metapage = (Page) palloc_io_aligned(BLCKSZ, 0);
 	BloomFillMetapage(index, metapage);
 
 	/*
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index fb0f466708..2daa9b2e10 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -415,7 +415,7 @@ gist_indexsortbuild(GISTBuildState *state)
 	 * Write an empty page as a placeholder for the root page. It will be
 	 * replaced with the real root page at the end.
 	 */
-	page = palloc0(BLCKSZ);
+	page = palloc_io_aligned(BLCKSZ, MCXT_ALLOC_ZERO);
 	smgrextend(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
 			   page, true);
 	state->pages_allocated++;
@@ -509,7 +509,7 @@ gist_indexsortbuild_levelstate_add(GISTBuildState *state,
 		levelstate->current_page++;
 
 	if (levelstate->pages[levelstate->current_page] == NULL)
-		levelstate->pages[levelstate->current_page] = palloc(BLCKSZ);
+		levelstate->pages[levelstate->current_page] = palloc_io_aligned(BLCKSZ, 0);
 
 	newPage = levelstate->pages[levelstate->current_page];
 	gistinitpage(newPage, old_page_flags);
@@ -579,7 +579,7 @@ gist_indexsortbuild_levelstate_flush(GISTBuildState *state,
 
 		/* Create page and copy data */
 		data = (char *) (dist->list);
-		target = palloc0(BLCKSZ);
+		target = (Page) palloc_io_aligned(BLCKSZ, 0);
 		gistinitpage(target, isleaf ? F_LEAF : 0);
 		for (int i = 0; i < dist->block.num; i++)
 		{
@@ -630,7 +630,7 @@ gist_indexsortbuild_levelstate_flush(GISTBuildState *state,
 	if (parent == NULL)
 	{
 		parent = palloc0(sizeof(GistSortedBuildLevelState));
-		parent->pages[0] = (Page) palloc(BLCKSZ);
+		parent->pages[0] = (Page) palloc_io_aligned(BLCKSZ, 0);
 		parent->parent = NULL;
 		gistinitpage(parent->pages[0], 0);
 
diff --git a/src/backend/access/gist/gistbuildbuffers.c b/src/backend/access/gist/gistbuildbuffers.c
index 538e3880c9..9e188633ae 100644
--- a/src/backend/access/gist/gistbuildbuffers.c
+++ b/src/backend/access/gist/gistbuildbuffers.c
@@ -186,8 +186,9 @@ gistAllocateNewPageBuffer(GISTBuildBuffers *gfbb)
 {
 	GISTNodeBufferPage *pageBuffer;
 
-	pageBuffer = (GISTNodeBufferPage *) MemoryContextAllocZero(gfbb->context,
-															   BLCKSZ);
+	pageBuffer = (GISTNodeBufferPage *)
+		MemoryContextAllocIOAligned(gfbb->context,
+									BLCKSZ, MCXT_ALLOC_ZERO);
 	pageBuffer->prev = InvalidBlockNumber;
 
 	/* Set page free space */
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index b01b39b008..6fe7f1aed4 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -257,7 +257,7 @@ begin_heap_rewrite(Relation old_heap, Relation new_heap, TransactionId oldest_xm
 
 	state->rs_old_rel = old_heap;
 	state->rs_new_rel = new_heap;
-	state->rs_buffer = (Page) palloc(BLCKSZ);
+	state->rs_buffer = (Page) palloc_io_aligned(BLCKSZ, 0);
 	/* new_heap needn't be empty, just locked */
 	state->rs_blockno = RelationGetNumberOfBlocks(new_heap);
 	state->rs_buffer_valid = false;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index b52eca8f38..924da953aa 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -153,7 +153,7 @@ btbuildempty(Relation index)
 	Page		metapage;
 
 	/* Construct metapage. */
-	metapage = (Page) palloc(BLCKSZ);
+	metapage = (Page) palloc_io_aligned(BLCKSZ, 0);
 	_bt_initmetapage(metapage, P_NONE, 0, _bt_allequalimage(index, false));
 
 	/*
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 501e011ce1..563e6cce1f 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -619,7 +619,7 @@ _bt_blnewpage(uint32 level)
 	Page		page;
 	BTPageOpaque opaque;
 
-	page = (Page) palloc(BLCKSZ);
+	page = (Page) palloc_io_aligned(BLCKSZ, 0);
 
 	/* Zero the page and set up standard page header info */
 	_bt_pageinit(page, BLCKSZ);
@@ -660,7 +660,9 @@ _bt_blwritepage(BTWriteState *wstate, Page page, BlockNumber blkno)
 	while (blkno > wstate->btws_pages_written)
 	{
 		if (!wstate->btws_zeropage)
-			wstate->btws_zeropage = (Page) palloc0(BLCKSZ);
+			wstate->btws_zeropage =
+				(Page) palloc_io_aligned(BLCKSZ, MCXT_ALLOC_ZERO);
+
 		/* don't set checksum for all-zero page */
 		smgrextend(RelationGetSmgr(wstate->index), MAIN_FORKNUM,
 				   wstate->btws_pages_written++,
@@ -1170,7 +1172,7 @@ _bt_uppershutdown(BTWriteState *wstate, BTPageState *state)
 	 * set to point to "P_NONE"). This changes the index to the "valid" state
 	 * by filling in a valid magic number in the metapage.
 	 */
-	metapage = (Page) palloc(BLCKSZ);
+	metapage = (Page) palloc_io_aligned(BLCKSZ, 0);
 	_bt_initmetapage(metapage, rootblkno, rootlevel,
 					 wstate->inskey->allequalimage);
 	_bt_blwritepage(wstate, metapage, BTREE_METAPAGE);
diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c
index c6821b5952..d5b83710e4 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -158,7 +158,7 @@ spgbuildempty(Relation index)
 	Page		page;
 
 	/* Construct metapage. */
-	page = (Page) palloc(BLCKSZ);
+	page = (Page) palloc_io_aligned(BLCKSZ, 0);
 	SpGistInitMetapage(page);
 
 	/*
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 81b8c184a9..9598056821 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -142,7 +142,7 @@ my @abstract_types = qw(Node);
 # they otherwise don't participate in node support.
 my @extra_tags = qw(
   IntList OidList XidList
-  AllocSetContext GenerationContext SlabContext
+  AllocSetContext GenerationContext SlabContext AlignedAllocRedirectContext
   TIDBitmap
   WindowObjectData
 );
diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index 6b6264854e..edd9bd48c3 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -79,8 +79,9 @@ InitBufferPool(void)
 						&foundDescs);
 
 	BufferBlocks = (char *)
-		ShmemInitStruct("Buffer Blocks",
-						NBuffers * (Size) BLCKSZ, &foundBufs);
+		TYPEALIGN(BLCKSZ,
+				  ShmemInitStruct("Buffer Blocks",
+								  (NBuffers + 1) * (Size) BLCKSZ, &foundBufs));
 
 	/* Align condition variables to cacheline boundary. */
 	BufferIOCVArray = (ConditionVariableMinimallyPadded *)
@@ -164,6 +165,8 @@ BufferShmemSize(void)
 	size = add_size(size, PG_CACHE_LINE_SIZE);
 
 	/* size of data pages */
+	/* to allow aligning buffer blocks */
+	size = add_size(size, BLCKSZ);
 	size = add_size(size, mul_size(NBuffers, BLCKSZ));
 
 	/* size of stuff controlled by freelist.c */
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 30d67d1c40..f51d3527f6 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -546,8 +546,8 @@ GetLocalBufferStorage(void)
 		/* And don't overflow MaxAllocSize, either */
 		num_bufs = Min(num_bufs, MaxAllocSize / BLCKSZ);
 
-		cur_block = (char *) MemoryContextAlloc(LocalBufferContext,
-												num_bufs * BLCKSZ);
+		cur_block = (char *) MemoryContextAllocIOAligned(LocalBufferContext,
+														 num_bufs * BLCKSZ, 0);
 		next_buf_in_block = 0;
 		num_bufs_in_block = num_bufs;
 	}
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 8b617c7e79..42f6f1782a 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -1522,7 +1522,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	 * and second to avoid wasting space in processes that never call this.
 	 */
 	if (pageCopy == NULL)
-		pageCopy = MemoryContextAlloc(TopMemoryContext, BLCKSZ);
+		pageCopy = MemoryContextAllocIOAligned(TopMemoryContext, BLCKSZ, 0);
 
 	memcpy(pageCopy, (char *) page, BLCKSZ);
 	((PageHeader) pageCopy)->pd_checksum = pg_checksum_page(pageCopy, blkno);
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index a515bb36ac..719721a894 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -439,6 +439,10 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 	int			nbytes;
 	MdfdVec    *v;
 
+#if PG_O_DIRECT != 0
+	AssertPointerAlignment(buffer, PG_IO_ALIGN_SIZE);
+#endif
+
 	/* This assert is too expensive to have on normally ... */
 #ifdef CHECK_WRITE_VS_EXTEND
 	Assert(blocknum >= mdnblocks(reln, forknum));
@@ -661,6 +665,10 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 	int			nbytes;
 	MdfdVec    *v;
 
+#if PG_O_DIRECT != 0
+	AssertPointerAlignment(buffer, PG_IO_ALIGN_SIZE);
+#endif
+
 	TRACE_POSTGRESQL_SMGR_MD_READ_START(forknum, blocknum,
 										reln->smgr_rlocator.locator.spcOid,
 										reln->smgr_rlocator.locator.dbOid,
@@ -726,6 +734,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 	int			nbytes;
 	MdfdVec    *v;
 
+#if PG_O_DIRECT != 0
+	AssertPointerAlignment(buffer, PG_IO_ALIGN_SIZE);
+#endif
+
 	/* This assert is too expensive to have on normally ... */
 #ifdef CHECK_WRITE_VS_EXTEND
 	Assert(blocknum < mdnblocks(reln, forknum));
@@ -1280,7 +1292,7 @@ _mdfd_getseg(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
 			 */
 			if (nblocks < ((BlockNumber) RELSEG_SIZE))
 			{
-				char	   *zerobuf = palloc0(BLCKSZ);
+				char	   *zerobuf = palloc_io_aligned(BLCKSZ, MCXT_ALLOC_ZERO);
 
 				mdextend(reln, forknum,
 						 nextsegno * ((BlockNumber) RELSEG_SIZE) - 1,
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index f526ca82c1..807c0f3af3 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -36,6 +36,9 @@ static void BogusFree(void *pointer);
 static void *BogusRealloc(void *pointer, Size size);
 static MemoryContext BogusGetChunkContext(void *pointer);
 static Size BogusGetChunkSpace(void *pointer);
+static void AlignedAllocFree(void *pointer);
+static MemoryContext AlignedAllocGetChunkContext(void *pointer);
+
 
 /*****************************************************************************
  *	  GLOBAL MEMORY															 *
@@ -84,6 +87,10 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_SLAB_ID].check = SlabCheck,
 #endif
 
+	/* in here */
+	[MCTX_ALIGNED_REDIRECT_ID].get_chunk_context = AlignedAllocGetChunkContext,
+	[MCTX_ALIGNED_REDIRECT_ID].free_p = AlignedAllocFree,
+
 	/*
 	 * Unused (as yet) IDs should have dummy entries here. This allows us to
 	 * fail cleanly if a bogus pointer is passed to pfree or the like. It
@@ -110,11 +117,6 @@ static const MemoryContextMethods mcxt_methods[] = {
 	[MCTX_UNUSED4_ID].realloc = BogusRealloc,
 	[MCTX_UNUSED4_ID].get_chunk_context = BogusGetChunkContext,
 	[MCTX_UNUSED4_ID].get_chunk_space = BogusGetChunkSpace,
-
-	[MCTX_UNUSED5_ID].free_p = BogusFree,
-	[MCTX_UNUSED5_ID].realloc = BogusRealloc,
-	[MCTX_UNUSED5_ID].get_chunk_context = BogusGetChunkContext,
-	[MCTX_UNUSED5_ID].get_chunk_space = BogusGetChunkSpace,
 };
 
 /*
@@ -1306,11 +1308,16 @@ void
 pfree(void *pointer)
 {
 #ifdef USE_VALGRIND
+	MemoryContextMethodID method = GetMemoryChunkMethodID(pointer);
 	MemoryContext context = GetMemoryChunkContext(pointer);
 #endif
 
 	MCXT_METHOD(pointer, free_p) (pointer);
-	VALGRIND_MEMPOOL_FREE(context, pointer);
+
+#ifdef USE_VALGRIND
+	if (method != MCTX_ALIGNED_REDIRECT_ID)
+		VALGRIND_MEMPOOL_FREE(context, pointer);
+#endif
 }
 
 /*
@@ -1497,3 +1504,83 @@ pchomp(const char *in)
 		n--;
 	return pnstrdup(in, n);
 }
+
+/*
+ * pointer to fake memory context + pointer to actual allocation
+ */
+#define ALIGNED_ALLOC_CHUNK_SIZE (sizeof(uintptr_t) + sizeof(uintptr_t))
+
+#include "utils/memutils_memorychunk.h"
+
+static void
+AlignedAllocFree(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+	void	   *unaligned;
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	unaligned = MemoryChunkGetBlock(chunk);
+
+	pfree(unaligned);
+}
+
+MemoryContext
+AlignedAllocGetChunkContext(void *pointer)
+{
+	MemoryChunk *chunk = PointerGetMemoryChunk(pointer);
+
+	Assert(!MemoryChunkIsExternal(chunk));
+
+	return GetMemoryChunkContext(MemoryChunkGetBlock(chunk));
+}
+
+void *
+MemoryContextAllocAligned(MemoryContext context,
+						  Size size, Size alignto, int flags)
+{
+	Size		alloc_size;
+	void	   *unaligned;
+	void	   *aligned;
+
+	/* wouldn't make much sense to waste that much space */
+	Assert(alignto < (128 * 1024 * 1024));
+
+	if (alignto < MAXIMUM_ALIGNOF)
+		return palloc_extended(size, flags);
+
+	/* allocate enough space for alignment padding */
+	alloc_size = size + alignto + sizeof(MemoryChunk);
+
+	unaligned = MemoryContextAllocExtended(context, alloc_size, flags);
+
+	aligned = (char *) unaligned + sizeof(MemoryChunk);
+	aligned = (void *) (TYPEALIGN(alignto, aligned) - sizeof(MemoryChunk));
+
+	MemoryChunkSetHdrMask(aligned, unaligned, 0, MCTX_ALIGNED_REDIRECT_ID);
+
+	/* XXX: should we adjust valgrind state here? */
+
+	Assert((char *) TYPEALIGN(alignto, MemoryChunkGetPointer(aligned)) == MemoryChunkGetPointer(aligned));
+
+	return MemoryChunkGetPointer(aligned);
+}
+
+void *
+MemoryContextAllocIOAligned(MemoryContext context, Size size, int flags)
+{
+	// FIXME: don't hardcode page size
+	return MemoryContextAllocAligned(context, size, 4096, flags);
+}
+
+void *
+palloc_aligned(Size size, Size alignto, int flags)
+{
+	return MemoryContextAllocAligned(CurrentMemoryContext, size, alignto, flags);
+}
+
+void *
+palloc_io_aligned(Size size, int flags)
+{
+	return MemoryContextAllocIOAligned(CurrentMemoryContext, size, flags);
+}
diff --git a/src/include/nodes/memnodes.h b/src/include/nodes/memnodes.h
index 63d07358cd..dcfe41806a 100644
--- a/src/include/nodes/memnodes.h
+++ b/src/include/nodes/memnodes.h
@@ -104,10 +104,11 @@ typedef struct MemoryContextData
  *
  * Add new context types to the set accepted by this macro.
  */
-#define MemoryContextIsValid(context) \
+#define MemoryContextIsValid(context)  \
 	((context) != NULL && \
 	 (IsA((context), AllocSetContext) || \
 	  IsA((context), SlabContext) || \
-	  IsA((context), GenerationContext)))
+	  IsA((context), GenerationContext) || \
+	  IsA((context), AlignedAllocRedirectContext)))
 
 #endif							/* MEMNODES_H */
diff --git a/src/include/utils/memutils_internal.h b/src/include/utils/memutils_internal.h
index bc2cbdd506..9611a192a2 100644
--- a/src/include/utils/memutils_internal.h
+++ b/src/include/utils/memutils_internal.h
@@ -92,8 +92,8 @@ typedef enum MemoryContextMethodID
 	MCTX_ASET_ID,
 	MCTX_GENERATION_ID,
 	MCTX_SLAB_ID,
-	MCTX_UNUSED4_ID,			/* available */
-	MCTX_UNUSED5_ID				/* 111 occurs in wipe_mem'd memory */
+	MCTX_ALIGNED_REDIRECT_ID,
+	MCTX_UNUSED4_ID				/* 111 occurs in wipe_mem'd memory */
 } MemoryContextMethodID;
 
 /*
diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h
index 8eee0e2938..0b0ba2a953 100644
--- a/src/include/utils/palloc.h
+++ b/src/include/utils/palloc.h
@@ -73,10 +73,15 @@ extern void *MemoryContextAllocZero(MemoryContext context, Size size);
 extern void *MemoryContextAllocZeroAligned(MemoryContext context, Size size);
 extern void *MemoryContextAllocExtended(MemoryContext context,
 										Size size, int flags);
+extern void *MemoryContextAllocAligned(MemoryContext context,
+									   Size size, Size alignto, int flags);
+extern void *MemoryContextAllocIOAligned(MemoryContext context, Size size, int flags);
 
 extern void *palloc(Size size);
 extern void *palloc0(Size size);
 extern void *palloc_extended(Size size, int flags);
+extern void *palloc_aligned(Size size, Size alignto, int flags);
+extern void *palloc_io_aligned(Size size, int flags);
 extern pg_nodiscard void *repalloc(void *pointer, Size size);
 extern pg_nodiscard void *repalloc_extended(void *pointer,
 											Size size, int flags);
-- 
2.35.1

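The over-allocate-and-round-up idea behind MemoryContextAllocAligned()
in patch 2/3 can be sketched outside PostgreSQL as follows.  This is
an illustration under stated assumptions, not the patch's mechanism:
it uses plain malloc() and a raw back-pointer stored just before the
aligned address, where the patch instead writes a MemoryChunk header
tagged MCTX_ALIGNED_REDIRECT_ID so that pfree() can find the original
allocation.  All sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round an address up to the next multiple of alignto (a power of two). */
#define SKETCH_TYPEALIGN(alignto, ptr) \
	(((uintptr_t) (ptr) + ((uintptr_t) (alignto) - 1)) & \
	 ~((uintptr_t) (alignto) - 1))

/* Allocate size bytes at an address that is a multiple of alignto. */
void *
sketch_alloc_aligned(size_t size, size_t alignto)
{
	char	   *unaligned;
	char	   *aligned;

	/* alignto must be a non-zero power of two */
	assert(alignto != 0 && (alignto & (alignto - 1)) == 0);

	/* room for the payload, worst-case padding and the back-pointer */
	unaligned = malloc(size + alignto + sizeof(void *));
	if (unaligned == NULL)
		return NULL;

	/* leave space for the back-pointer, then round up */
	aligned = (char *) SKETCH_TYPEALIGN(alignto, unaligned + sizeof(void *));

	/* remember where the real allocation starts */
	memcpy(aligned - sizeof(void *), &unaligned, sizeof(void *));

	return aligned;
}

/* Free memory obtained from sketch_alloc_aligned(). */
void
sketch_free_aligned(void *aligned)
{
	void	   *unaligned;

	if (aligned == NULL)
		return;

	/* recover the original pointer stashed before the aligned address */
	memcpy(&unaligned, (char *) aligned - sizeof(void *), sizeof(void *));
	free(unaligned);
}
```

The cost is up to alignto extra bytes per allocation, which is the
same trade-off the patch makes with its alloc_size computation.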
From 819a406f029b04ab6a500f63fe9c154332b65d8e Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 3 Oct 2022 21:58:22 -0700
Subject: [PATCH 3/3] Add direct I/O settings (developer-only).

Provide a way to ask the kernel to use O_DIRECT (or local equivalent)
for data and WAL files. This hurts performance currently and is not
intended for end-users yet. Later proposed work will introduce our own
I/O clustering, read-ahead, etc to replace the kernel features that are
disabled with this option.

This replaces the previous logic that would use O_DIRECT for the WAL in
limited and obscure cases, now that there is an explicit setting.

Discussion: https://postgr.es/m/
Author: Andres Freund <and...@anarazel.de>
Author: Thomas Munro <thomas.mu...@gmail.com>
---
 doc/src/sgml/config.sgml                    | 51 ++++++++++++++++++++
 src/backend/access/transam/xlog.c           | 53 +++++++++++++--------
 src/backend/access/transam/xlogprefetcher.c |  2 +-
 src/backend/storage/buffer/bufmgr.c         | 13 +++--
 src/backend/storage/buffer/localbuf.c       |  4 +-
 src/backend/storage/file/fd.c               |  5 ++
 src/backend/storage/smgr/md.c               | 29 +++++++++--
 src/backend/storage/smgr/smgr.c             | 20 ++++++++
 src/backend/utils/misc/guc_tables.c         | 33 +++++++++++++
 src/include/access/xlog.h                   |  2 +
 src/include/storage/fd.h                    |  6 ++-
 src/include/storage/smgr.h                  |  5 ++
 src/include/utils/guc_hooks.h               |  2 +
 13 files changed, 190 insertions(+), 35 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 559eb898a9..2d860dd900 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11011,6 +11011,57 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-io-data-direct" xreflabel="io_data_direct">
+      <term><varname>io_data_direct</varname> (<type>boolean</type>)
+      <indexterm>
+        <primary><varname>io_data_direct</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Ask the kernel to minimize caching effects for relation data files
+        using <literal>O_DIRECT</literal> (most Unix-like systems),
+        <literal>F_NOCACHE</literal> (macOS) or
+        <literal>FILE_FLAG_NO_BUFFERING</literal> (Windows). Currently this
+        hurts performance, and is intended for developer testing only.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="guc-io-wal-direct" xreflabel="io_wal_direct">
+      <term><varname>io_wal_direct</varname> (<type>boolean</type>)
+      <indexterm>
+        <primary><varname>io_wal_direct</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Ask the kernel to minimize caching effects while writing WAL files
+        using <literal>O_DIRECT</literal> (most Unix-like systems),
+        <literal>F_NOCACHE</literal> (macOS) or
+        <literal>FILE_FLAG_NO_BUFFERING</literal> (Windows). Currently this
+        hurts performance, and is intended for developer testing only.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="guc-io-wal-init-direct" xreflabel="io_wal_init_direct">
+      <term><varname>io_wal_init_direct</varname> (<type>boolean</type>)
+      <indexterm>
+        <primary><varname>io_wal_init_direct</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Ask the kernel to minimize caching effects while initializing WAL files
+        using <literal>O_DIRECT</literal> (most Unix-like systems),
+        <literal>F_NOCACHE</literal> (macOS) or
+        <literal>FILE_FLAG_NO_BUFFERING</literal> (Windows). Currently this
+        hurts performance, and is intended for developer testing only.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-post-auth-delay" xreflabel="post_auth_delay">
       <term><varname>post_auth_delay</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8f10effe3a..5663bdf856 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -138,6 +138,8 @@ int			wal_retrieve_retry_interval = 5000;
 int			max_slot_wal_keep_size_mb = -1;
 int			wal_decode_buffer_size = 512 * 1024;
 bool		track_wal_io_timing = false;
+bool		io_wal_direct = false;
+bool		io_wal_init_direct = false;
 
 #ifdef WAL_DEBUG
 bool		XLOG_DEBUG = false;
@@ -2926,6 +2928,7 @@ XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli,
 	XLogSegNo	max_segno;
 	int			fd;
 	int			save_errno;
+	int			open_flags = O_RDWR | O_CREAT | O_EXCL | PG_BINARY;
 
 	Assert(logtli != 0);
 
@@ -2958,8 +2961,11 @@ XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli,
 
 	unlink(tmppath);
 
+	if (io_wal_init_direct)
+		open_flags |= PG_O_DIRECT;
+
 	/* do not use get_sync_bit() here --- want to fsync only at end of fill */
-	fd = BasicOpenFile(tmppath, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
+	fd = BasicOpenFile(tmppath, open_flags);
 	if (fd < 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
@@ -3373,7 +3379,7 @@ XLogFileClose(void)
 	 * use the cache to read the WAL segment.
 	 */
 #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
-	if (!XLogIsNeeded())
+	if (!XLogIsNeeded() && !io_wal_direct)
 		(void) posix_fadvise(openLogFile, 0, 0, POSIX_FADV_DONTNEED);
 #endif
 
@@ -4473,6 +4479,21 @@ show_in_hot_standby(void)
 	return RecoveryInProgress() ? "on" : "off";
 }
 
+/*
+ * GUC check for direct I/O support.
+ */
+bool
+check_io_wal_direct(bool *newval, void **extra, GucSource source)
+{
+#if PG_O_DIRECT == 0
+	if (*newval)
+	{
+		GUC_check_errdetail("io_wal_direct and io_wal_init_direct are not supported on this platform.");
+		return false;
+	}
+#endif
+	return true;
+}
 
 /*
  * Read the control file, set respective GUCs.
@@ -8056,35 +8077,27 @@ xlog_redo(XLogReaderState *record)
 }
 
 /*
- * Return the (possible) sync flag used for opening a file, depending on the
- * value of the GUC wal_sync_method.
+ * Return the extra open flags used for opening a file, depending on the
+ * value of the GUCs wal_sync_method, fsync and io_wal_direct.
  */
 static int
 get_sync_bit(int method)
 {
 	int			o_direct_flag = 0;
 
-	/* If fsync is disabled, never open in sync mode */
-	if (!enableFsync)
-		return 0;
-
 	/*
-	 * Optimize writes by bypassing kernel cache with O_DIRECT when using
-	 * O_SYNC and O_DSYNC. But only if archiving and streaming are disabled,
-	 * otherwise the archive command or walsender process will read the WAL
-	 * soon after writing it, which is guaranteed to cause a physical read if
-	 * we bypassed the kernel cache. We also skip the
-	 * posix_fadvise(POSIX_FADV_DONTNEED) call in XLogFileClose() for the same
-	 * reason.
-	 *
-	 * Never use O_DIRECT in walreceiver process for similar reasons; the WAL
+	 * Use O_DIRECT if requested, except in walreceiver process. The WAL
 	 * written by walreceiver is normally read by the startup process soon
-	 * after it's written.  Also, walreceiver performs unaligned writes, which
+	 * after it's written. Also, walreceiver performs unaligned writes, which
 	 * don't work with O_DIRECT, so it is required for correctness too.
 	 */
-	if (!XLogIsNeeded() && !AmWalReceiverProcess())
+	if (io_wal_direct && !AmWalReceiverProcess())
 		o_direct_flag = PG_O_DIRECT;
 
+	/* If fsync is disabled, never open in sync mode */
+	if (!enableFsync)
+		return o_direct_flag;
+
 	switch (method)
 	{
 			/*
@@ -8096,7 +8109,7 @@ get_sync_bit(int method)
 		case SYNC_METHOD_FSYNC:
 		case SYNC_METHOD_FSYNC_WRITETHROUGH:
 		case SYNC_METHOD_FDATASYNC:
-			return 0;
+			return o_direct_flag;
 #ifdef O_SYNC
 		case SYNC_METHOD_OPEN:
 			return O_SYNC | o_direct_flag;
diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 0cf03945ee..d840078afc 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -785,7 +785,7 @@ XLogPrefetcherNextBlock(uintptr_t pgsr_private, XLogRecPtr *lsn)
 				block->prefetch_buffer = InvalidBuffer;
 				return LRQ_NEXT_IO;
 			}
-			else
+			else if (!io_data_direct)
 			{
 				/*
 				 * This shouldn't be possible, because we already determined
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6b95381481..9918855f37 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -535,7 +535,7 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
 		 * Try to initiate an asynchronous read. This returns false in
 		 * recovery if the relation file doesn't exist.
 		 */
-		if (smgrprefetch(smgr_reln, forkNum, blockNum))
+		if (!io_data_direct && smgrprefetch(smgr_reln, forkNum, blockNum))
 			result.initiated_io = true;
 #endif							/* USE_PREFETCH */
 	}
@@ -582,11 +582,11 @@ PrefetchSharedBuffer(SMgrRelation smgr_reln,
  * the kernel and therefore didn't really initiate I/O, and no way to know when
  * the I/O completes other than using synchronous ReadBuffer().
  *
- * 3. Otherwise, the buffer wasn't already cached by PostgreSQL, and either
+ * 3. Otherwise, the buffer wasn't already cached by PostgreSQL, and
  * USE_PREFETCH is not defined (this build doesn't support prefetching due to
- * lack of a kernel facility), or the underlying relation file wasn't found and
- * we are in recovery. (If the relation file wasn't found and we are not in
- * recovery, an error is raised).
+ * lack of a kernel facility), io_data_direct is enabled, or the underlying
+ * relation file wasn't found and we are in recovery. (If the relation file
+ * wasn't found and we are not in recovery, an error is raised).
  */
 PrefetchBufferResult
 PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
@@ -4908,6 +4908,9 @@ ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *tag)
 {
 	PendingWriteback *pending;
 
+	if (io_data_direct)
+		return;
+
 	/*
 	 * Add buffer to the pending writeback array, unless writeback control is
 	 * disabled.
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index f51d3527f6..f9c82a789e 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -87,8 +87,8 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
 	{
 #ifdef USE_PREFETCH
 		/* Not in buffers, so initiate prefetch */
-		smgrprefetch(smgr, forkNum, blockNum);
-		result.initiated_io = true;
+		if (!io_data_direct && smgrprefetch(smgr, forkNum, blockNum))
+			result.initiated_io = true;
 #endif							/* USE_PREFETCH */
 	}
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 4151cafec5..aa720952f8 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2021,6 +2021,11 @@ FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info)
 	if (nbytes <= 0)
 		return;
 
+#ifdef PG_O_DIRECT
+	if (VfdCache[file].fileFlags & PG_O_DIRECT)
+		return;
+#endif
+
 	returnCode = FileAccess(file);
 	if (returnCode < 0)
 		return;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 719721a894..20ec37c310 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -142,6 +142,21 @@ static MdfdVec *_mdfd_getseg(SMgrRelation reln, ForkNumber forknum,
 static BlockNumber _mdnblocks(SMgrRelation reln, ForkNumber forknum,
 							  MdfdVec *seg);
 
+static inline int
+_mdfd_open_flags(ForkNumber forkNum)
+{
+	int			flags = O_RDWR | PG_BINARY;
+
+	/*
+	 * XXX: not clear if direct IO ever is interesting for other forks? The
+	 * FSM fork currently often ends up very fragmented when using direct IO,
+	 * for example.
+	 */
+	if (io_data_direct /* && forkNum == MAIN_FORKNUM */)
+		flags |= PG_O_DIRECT;
+
+	return flags;
+}
 
 /*
  * mdinit() -- Initialize private state for magnetic disk storage manager.
@@ -205,14 +220,14 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
 
 	path = relpath(reln->smgr_rlocator, forknum);
 
-	fd = PathNameOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
+	fd = PathNameOpenFile(path, _mdfd_open_flags(forknum) | O_CREAT | O_EXCL);
 
 	if (fd < 0)
 	{
 		int			save_errno = errno;
 
 		if (isRedo)
-			fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
+			fd = PathNameOpenFile(path, _mdfd_open_flags(forknum));
 		if (fd < 0)
 		{
 			/* be sure to report the error reported by create, not open */
@@ -513,7 +528,7 @@ mdopenfork(SMgrRelation reln, ForkNumber forknum, int behavior)
 
 	path = relpath(reln->smgr_rlocator, forknum);
 
-	fd = PathNameOpenFile(path, O_RDWR | PG_BINARY);
+	fd = PathNameOpenFile(path, _mdfd_open_flags(forknum));
 
 	if (fd < 0)
 	{
@@ -584,6 +599,8 @@ mdprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
 	off_t		seekpos;
 	MdfdVec    *v;
 
+	Assert(!io_data_direct);
+
 	v = _mdfd_getseg(reln, forknum, blocknum, false,
 					 InRecovery ?
EXTENSION_RETURN_NULL : EXTENSION_FAIL); if (v == NULL) @@ -609,6 +626,8 @@ void mdwriteback(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, BlockNumber nblocks) { + Assert(!io_data_direct); + /* * Issue flush requests in as few requests as possible; have to split at * segment boundaries though, since those are actually separate files. @@ -1186,7 +1205,7 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber forknum, BlockNumber segno, fullpath = _mdfd_segpath(reln, forknum, segno); /* open the file */ - fd = PathNameOpenFile(fullpath, O_RDWR | PG_BINARY | oflags); + fd = PathNameOpenFile(fullpath, _mdfd_open_flags(forknum) | oflags); pfree(fullpath); @@ -1395,7 +1414,7 @@ mdsyncfiletag(const FileTag *ftag, char *path) strlcpy(path, p, MAXPGPATH); pfree(p); - file = PathNameOpenFile(path, O_RDWR | PG_BINARY); + file = PathNameOpenFile(path, _mdfd_open_flags(ftag->forknum)); if (file < 0) return -1; need_to_close = true; diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c index c1a5febcbf..706a52b9f1 100644 --- a/src/backend/storage/smgr/smgr.c +++ b/src/backend/storage/smgr/smgr.c @@ -20,6 +20,7 @@ #include "access/xlogutils.h" #include "lib/ilist.h" #include "storage/bufmgr.h" +#include "storage/fd.h" #include "storage/ipc.h" #include "storage/md.h" #include "storage/smgr.h" @@ -27,6 +28,9 @@ #include "utils/inval.h" +/* GUCs */ +bool io_data_direct = false; + /* * This struct of function pointers defines the API between smgr.c and * any individual storage manager module. Note that smgr subfunctions are @@ -735,3 +739,19 @@ ProcessBarrierSmgrRelease(void) smgrreleaseall(); return true; } + +/* + * Check if this build allows smgr implementations to enable direct I/O. 
+ */ +bool +check_io_data_direct(bool *newval, void **extra, GucSource source) +{ +#if PG_O_DIRECT == 0 + if (*newval) + { + GUC_check_errdetail("io_data_direct is not supported on this platform."); + return false; + } +#endif + return true; +} diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 05ab087934..e324378ad4 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -1925,6 +1925,39 @@ struct config_bool ConfigureNamesBool[] = NULL, NULL, NULL }, + { + {"io_data_direct", PGC_SUSET, DEVELOPER_OPTIONS, + gettext_noop("Access data files with direct I/O."), + NULL, + GUC_NOT_IN_SAMPLE + }, + &io_data_direct, + false, + check_io_data_direct, NULL, NULL + }, + + { + {"io_wal_direct", PGC_SUSET, DEVELOPER_OPTIONS, + gettext_noop("Write WAL files with direct I/O."), + NULL, + GUC_NOT_IN_SAMPLE + }, + &io_wal_direct, + false, + check_io_wal_direct, NULL, NULL + }, + + { + {"io_wal_init_direct", PGC_SUSET, DEVELOPER_OPTIONS, + gettext_noop("Initialize WAL files with direct I/O."), + NULL, + GUC_NOT_IN_SAMPLE + }, + &io_wal_init_direct, + false, + check_io_wal_direct, NULL, NULL + }, + /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL, NULL diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index 1fbd48fbda..6220370036 100644 --- a/src/include/access/xlog.h +++ b/src/include/access/xlog.h @@ -51,6 +51,8 @@ extern PGDLLIMPORT char *wal_consistency_checking_string; extern PGDLLIMPORT bool log_checkpoints; extern PGDLLIMPORT bool track_wal_io_timing; extern PGDLLIMPORT int wal_decode_buffer_size; +extern PGDLLIMPORT bool io_wal_direct; +extern PGDLLIMPORT bool io_wal_init_direct; extern PGDLLIMPORT int CheckPointSegments; diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h index c0a212487d..283ff21e31 100644 --- a/src/include/storage/fd.h +++ b/src/include/storage/fd.h @@ -44,6 +44,7 @@ #define FD_H #include <dirent.h> +#include <fcntl.h> 
typedef enum RecoveryInitSyncMethod { @@ -82,9 +83,10 @@ extern PGDLLIMPORT int max_safe_fds; * to the appropriate Windows flag in src/port/open.c. We simulate it with * fcntl(F_NOCACHE) on macOS inside fd.c's open() wrapper. We use the name * PG_O_DIRECT rather than defining O_DIRECT in that case (probably not a good - * idea on a Unix). + * idea on a Unix). We can only use it if the compiler will correctly align + * PGAlignedBlock for us, though. */ -#if defined(O_DIRECT) +#if defined(O_DIRECT) && defined(pg_attribute_aligned) #define PG_O_DIRECT O_DIRECT #elif defined(F_NOCACHE) #define PG_O_DIRECT 0x80000000 diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h index a07715356b..ef75934a16 100644 --- a/src/include/storage/smgr.h +++ b/src/include/storage/smgr.h @@ -17,6 +17,10 @@ #include "lib/ilist.h" #include "storage/block.h" #include "storage/relfilelocator.h" +#include "utils/guc.h" + +/* GUCs */ +extern PGDLLIMPORT bool io_data_direct; /* * smgr.c maintains a table of SMgrRelation objects, which are essentially @@ -107,5 +111,6 @@ extern void smgrtruncate(SMgrRelation reln, ForkNumber *forknum, extern void smgrimmedsync(SMgrRelation reln, ForkNumber forknum); extern void AtEOXact_SMgr(void); extern bool ProcessBarrierSmgrRelease(void); +extern bool check_io_data_direct(bool *newval, void **extra, GucSource source); #endif /* SMGR_H */ diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h index f1a9a183b4..a9748f6b34 100644 --- a/src/include/utils/guc_hooks.h +++ b/src/include/utils/guc_hooks.h @@ -59,6 +59,8 @@ extern bool check_effective_io_concurrency(int *newval, void **extra, GucSource source); extern bool check_huge_page_size(int *newval, void **extra, GucSource source); extern const char *show_in_hot_standby(void); +extern bool check_io_data_direct(bool *newval, void **extra, GucSource source); +extern bool check_io_wal_direct(bool *newval, void **extra, GucSource source); extern bool check_locale_messages(char 
**newval, void **extra, GucSource source); extern void assign_locale_messages(const char *newval, void *extra); extern bool check_locale_monetary(char **newval, void **extra, GucSource source); -- 2.35.1
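
For anyone reviewing the get_sync_bit() hunk, here is a small standalone sketch of how the extra open flags combine after the patch. It is not the PostgreSQL code: the SKETCH_* constants and enum values are made up, and the three boolean parameters are hypothetical stand-ins for the io_wal_direct GUC, the fsync GUC and AmWalReceiverProcess(). Only the sync methods visible in the hunk are modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real open(2) flags; values are arbitrary. */
#define SKETCH_O_DIRECT 0x1
#define SKETCH_O_SYNC   0x2

/* Hypothetical subset of the wal_sync_method values shown in the hunk. */
typedef enum
{
	SKETCH_SYNC_FSYNC,
	SKETCH_SYNC_FDATASYNC,
	SKETCH_SYNC_OPEN
} SketchSyncMethod;

/*
 * Mirror of the patched control flow: direct I/O is decided first and is
 * then carried through every return path, including the fsync=off one.
 */
static int
sketch_sync_bit(SketchSyncMethod method, bool wal_direct,
				bool fsync_enabled, bool am_walreceiver)
{
	int			o_direct_flag = 0;

	/* Use direct I/O if requested, except in the walreceiver. */
	if (wal_direct && !am_walreceiver)
		o_direct_flag = SKETCH_O_DIRECT;

	/* With fsync disabled, never add a sync flag, but keep direct I/O. */
	if (!fsync_enabled)
		return o_direct_flag;

	switch (method)
	{
		case SKETCH_SYNC_FSYNC:
		case SKETCH_SYNC_FDATASYNC:
			return o_direct_flag;
		case SKETCH_SYNC_OPEN:
			return SKETCH_O_SYNC | o_direct_flag;
	}

	return o_direct_flag;
}
```

The point of the sketch is that the direct I/O decision no longer depends on fsync, the sync method, or XLogIsNeeded(): before the patch the fsync=off path and the fsync/fdatasync methods returned 0, whereas now they return whatever direct I/O flag was chosen.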