v5 attached. On Thu, May 4, 2023 at 12:44 PM Andres Freund <and...@anarazel.de> wrote: > On 2023-04-27 11:36:49 -0400, Melanie Plageman wrote: > > > > /* and finally tell the kernel to write the data to > > > > storage */ > > > > reln = smgropen(currlocator, InvalidBackendId); > > > > smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, > > > > nblocks); > > > > Yes, as it is currently, IssuePendingWritebacks() is only used for shared > > buffers. My rationale for including IOObject is that localbuf.c calls > > smgr* functions and there isn't anything stopping it from calling > > smgrwriteback() or using WritebackContexts (AFAICT). > > I think it's extremely unlikely that we'll ever do that, because it's very > common to have temp tables that are bigger than temp_buffers. We basically > hope that the kernel can do good caching for us there. > > > > > Or I actually think we might not even need to pass around the io_* > > > parameters and could just pass immediate values to the > > > pgstat_count_io_op_time call. If we ever start using shared buffers > > > for thing other than relation files (for example SLRU?), we'll have to > > > consider the target individually for each buffer block. That being > > > said, I'm fine with how it is either. > > > > In IssuePendingWritebacks() we don't actually know which IOContext we > > are issuing writebacks for when we call pgstat_count_io_op_time() (we do > > issue pending writebacks for other IOContexts than IOCONTEXT_NORMAL). I > > agree IOObject is not strictly necessary right now. I've kept IOObject a > > member of WritebackContext for the reasons I mention above, however, I > > am open to removing it if it adds confusion. > > I don't think it's really worth adding struct members for potential future > safety. We can just add them later if we end up needing them.
I've removed both members of WritebackContext and hard-coded IOOBJECT_RELATION in the call to pgstat_count_io_op_time(). > > From 7cdd6fc78ed82180a705ab9667714f80d08c5f7d Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplage...@gmail.com> > > Date: Mon, 24 Apr 2023 18:21:54 -0400 > > Subject: [PATCH v4] Add writeback to pg_stat_io > > > > 28e626bde00 added the notion of IOOps but neglected to include > > writeback. With the addition of IO timing to pg_stat_io in ac8d53dae5, > > the omission of writeback caused some confusion. Checkpointer write > > timing in pg_stat_io often differed greatly from the write timing > > written to the log. To fix this, add IOOp IOOP_WRITEBACK and track > > writebacks and writeback timing in pg_stat_io. > > For the future: It'd be good to note that catversion needs to be increased. Noted. I've added it to the commit message since I did a new version anyway. > > index 99f7f95c39..27b6f1a0a0 100644 > > --- a/doc/src/sgml/monitoring.sgml > > +++ b/doc/src/sgml/monitoring.sgml > > @@ -3867,6 +3867,32 @@ SELECT pid, wait_event_type, wait_event FROM > > pg_stat_activity WHERE wait_event i > > </entry> > > </row> > > > > + <row> > > + <entry role="catalog_table_entry"> > > + <para role="column_definition"> > > + <structfield>writebacks</structfield> <type>bigint</type> > > + </para> > > + <para> > > + Number of units of size <varname>op_bytes</varname> which the > > backend > > + requested the kernel write out to permanent storage. > > + </para> > > + </entry> > > + </row> > > I think the reference to "backend" here is somewhat misplaced - it could be > checkpointer or bgwriter as well. We don't reference the backend in other > comparable columns of pgsio either... So, I tried to come up with something that doesn't make reference to any "requester" of the writeback and the best I could do was: "Number of units of size op_bytes requested that the kernel write out." This is awfully awkward sounding. "backend_type" is the name of the column in pg_stat_io. Client backends are always referred to as such in the pg_stat_io documentation. Thus, I think it is reasonable to use the word "backend" and assume people understand it could be any type of backend. However, since the existing docs for pg_stat_bgwriter use "backend" to mean "client backend", and I see a few uses of the word "process" in the stats docs, I've changed my use of the word "backend" to "process". > > diff --git a/src/backend/storage/buffer/buf_init.c > > b/src/backend/storage/buffer/buf_init.c > > index 0057443f0c..a7182fe95a 100644 > > --- a/src/backend/storage/buffer/buf_init.c > > +++ b/src/backend/storage/buffer/buf_init.c > > @@ -145,9 +145,15 @@ InitBufferPool(void) > > /* Init other shared buffer-management stuff */ > > StrategyInitialize(!foundDescs); > > > > - /* Initialize per-backend file flush context */ > > - WritebackContextInit(&BackendWritebackContext, > > - &backend_flush_after); > > + /* > > + * Initialize per-backend file flush context. IOContext is > > initialized to > > + * IOCONTEXT_NORMAL because this is the most common context. IOObject > > is > > + * initialized to IOOBJECT_RELATION because writeback is currently > > only > > + * requested for permanent relations in shared buffers. The backend > > can > > + * overwrite these as appropriate. > > + */ > > + WritebackContextInit(&BackendWritebackContext, IOOBJECT_RELATION, > > + IOCONTEXT_NORMAL, > > &backend_flush_after); > > } > > > > This seems somewhat icky. I've removed both IOObject and IOContext from WritebackContext. > > /* > > diff --git a/src/backend/storage/buffer/bufmgr.c > > b/src/backend/storage/buffer/bufmgr.c > > index 1fa689052e..116910cdfe 100644 > > --- a/src/backend/storage/buffer/bufmgr.c > > +++ b/src/backend/storage/buffer/bufmgr.c > > @@ -1685,6 +1685,8 @@ again: > > FlushBuffer(buf_hdr, NULL, IOOBJECT_RELATION, io_context); > > LWLockRelease(content_lock); > > > > + BackendWritebackContext.io_object = IOOBJECT_RELATION; > > + BackendWritebackContext.io_context = io_context; > > ScheduleBufferTagForWriteback(&BackendWritebackContext, > > > > &buf_hdr->tag); > > } > > What about passing the io_context to ScheduleBufferTagForWriteback instead? I assume we don't want to include the time spent in sort_pending_writebacks(), so I've added io_context as a parameter to ScheduleBufferTagForWriteback and threaded it through IssuePendingWritebacks() as well. Because WritebackContext was just called "context" as a function parameter to these functions and that was easy to confuse with "io_context", I've changed the name of the WritebackContext function parameter to "wb_context". I separated this rename into its own commit so that the diff for the commit adding writeback is more clear. I assume the committer will squash those commits. > > --- a/src/test/regress/sql/stats.sql > > +++ b/src/test/regress/sql/stats.sql > > Hm. Could we add a test for this? While it's not implemented everywhere, we > still issue the smgrwriteback() afaics. The default for the _flush_after GUCs > changes, but backend_flush_after is USERSET, so we could just change it for a > single command. I couldn't come up with a way to write a test for this. GetVictimBuffer() is only called when flushing a dirty buffer. I tried adding backend_flush_after = 1 and sum(writebacks) to the test for reuses with vacuum strategy, but there were never more writebacks (in any context) after doing the full table rewrite with VACUUM. I presume this is because checkpointer or bgwriter is writing out the dirty buffers before our client backend gets to reusing them. And, since bgwriter/checkpointer_flush_after are not USERSET, I don't think we can guarantee this will cause writeback operations. Just anecdotally, I increased size of the table to exceed checkpoint_flush_after on my Postgres instance and I could get the test to cause writeback, but that doesn't work for a portable test. This was the same reason we couldn't test writes for VACUUM strategy. I did notice while working on this that, with the addition of the VACUUM parameter BUFFER_USAGE_LIMIT, we could decrease the size of the table in the vacuum strategy reuses test. Not sure if this is legit to commit now since it isn't required for the writeback patch set, but I included a patch for it in this patchset. On Thu, May 4, 2023 at 12:57 PM Andres Freund <and...@anarazel.de> wrote: > On 2023-04-24 21:29:48 -0400, Melanie Plageman wrote: > > > 2) I'm a little nervous about not including IOObject in the writeback > > context. Technically, there is nothing stopping local buffer code from > > calling IssuePendingWritebacks(). Right now, local buffer code doesn't > > do ScheduleBufferTagForWriteback(). But it doesn't seem quite right to > > hardcode in IOOBJECT_RELATION when there is nothing wrong with > > requesting writeback of local buffers (AFAIK). What do you think? > > I think it'd be wrong on performance grounds ;). We could add an assertion to > ScheduleBufferTagForWriteback(), I guess, to document that fact? Now that it doesn't have access to IOObject, no need. I've added comments elsewhere. > > 3) Should any restrictions be added to pgstat_tracks_io_object() or > > pgstat_tracks_io_op()? I couldn't think of any backend types or IO > > contexts which would not do writeback as a rule. Also, though we don't > > do writeback for temp tables now, it isn't nonsensical to do so. In > > this version, I didn't add any restrictions. > > I think the temp table restriction could be encoded for now, I don't forsee > that changing anytime soon. I've done that in the attached v5. - Melanie
From 0ab9af06934ab551f2224493e913ae26000b3f87 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Sat, 6 May 2023 12:24:28 -0400 Subject: [PATCH v5 1/3] BUFFER_USAGE_LIMIT reduces needed test table size Using the minimum BUFFER_USAGE_LIMIT value, the VACUUM ring buffer is smaller and we can make one of the pg_stat_io test tables smaller while still causing reuses. --- src/test/regress/expected/stats.out | 6 ++++-- src/test/regress/sql/stats.sql | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out index 813d6d39ea..d4e4889e7b 100644 --- a/src/test/regress/expected/stats.out +++ b/src/test/regress/expected/stats.out @@ -1337,11 +1337,13 @@ SET wal_skip_threshold = '1 kB'; SELECT sum(reuses) AS reuses, sum(reads) AS reads FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_before_ CREATE TABLE test_io_vac_strategy(a int, b int) WITH (autovacuum_enabled = 'false'); -INSERT INTO test_io_vac_strategy SELECT i, i from generate_series(1, 8000)i; +INSERT INTO test_io_vac_strategy SELECT i, i from generate_series(1, 4500)i; -- Ensure that the next VACUUM will need to perform IO by rewriting the table -- first with VACUUM (FULL). VACUUM (FULL) test_io_vac_strategy; -VACUUM (PARALLEL 0) test_io_vac_strategy; +-- Use the minimum BUFFER_USAGE_LIMIT to cause reuses with the smallest table +-- possible. +VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) test_io_vac_strategy; SELECT pg_stat_force_next_flush(); pg_stat_force_next_flush -------------------------- diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql index 99a28bb79c..7ba341314f 100644 --- a/src/test/regress/sql/stats.sql +++ b/src/test/regress/sql/stats.sql @@ -670,11 +670,13 @@ SET wal_skip_threshold = '1 kB'; SELECT sum(reuses) AS reuses, sum(reads) AS reads FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_before_ CREATE TABLE test_io_vac_strategy(a int, b int) WITH (autovacuum_enabled = 'false'); -INSERT INTO test_io_vac_strategy SELECT i, i from generate_series(1, 8000)i; +INSERT INTO test_io_vac_strategy SELECT i, i from generate_series(1, 4500)i; -- Ensure that the next VACUUM will need to perform IO by rewriting the table -- first with VACUUM (FULL). VACUUM (FULL) test_io_vac_strategy; -VACUUM (PARALLEL 0) test_io_vac_strategy; +-- Use the minimum BUFFER_USAGE_LIMIT to cause reuses with the smallest table +-- possible. +VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) test_io_vac_strategy; SELECT pg_stat_force_next_flush(); SELECT sum(reuses) AS reuses, sum(reads) AS reads FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_after_ -- 2.37.2
From 87a9fa0da3895083d24bb2ee9a744a5afa11e5af Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Sat, 6 May 2023 11:11:04 -0400 Subject: [PATCH v5 2/3] Update parameter name context to wb_context For clarity of review, renaming the function parameter "context" in ScheduleBufferTagForWriteback() and IssuePendingWritebacks() to "wb_context" is a separate commit. The next commit adds an "io_context" parameter and "wb_context" makes it more clear which is which. --- src/backend/storage/buffer/bufmgr.c | 28 ++++++++++++++-------------- src/include/storage/buf_internals.h | 4 ++-- 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 1fa689052e..041d5dc129 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -5445,7 +5445,7 @@ WritebackContextInit(WritebackContext *context, int *max_pending) * Add buffer to list of pending writeback requests. */ void -ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *tag) +ScheduleBufferTagForWriteback(WritebackContext *wb_context, BufferTag *tag) { PendingWriteback *pending; @@ -5456,11 +5456,11 @@ ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *tag) * Add buffer to the pending writeback array, unless writeback control is * disabled. */ - if (*context->max_pending > 0) + if (*wb_context->max_pending > 0) { - Assert(*context->max_pending <= WRITEBACK_MAX_PENDING_FLUSHES); + Assert(*wb_context->max_pending <= WRITEBACK_MAX_PENDING_FLUSHES); - pending = &context->pending_writebacks[context->nr_pending++]; + pending = &wb_context->pending_writebacks[wb_context->nr_pending++]; pending->tag = *tag; } @@ -5470,8 +5470,8 @@ ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *tag) * includes the case where previously an item has been added, but control * is now disabled. */ - if (context->nr_pending >= *context->max_pending) - IssuePendingWritebacks(context); + if (wb_context->nr_pending >= *wb_context->max_pending) + IssuePendingWritebacks(wb_context); } #define ST_SORT sort_pending_writebacks @@ -5489,25 +5489,25 @@ ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *tag) * error out - it's just a hint. */ void -IssuePendingWritebacks(WritebackContext *context) +IssuePendingWritebacks(WritebackContext *wb_context) { int i; - if (context->nr_pending == 0) + if (wb_context->nr_pending == 0) return; /* * Executing the writes in-order can make them a lot faster, and allows to * merge writeback requests to consecutive blocks into larger writebacks. */ - sort_pending_writebacks(context->pending_writebacks, context->nr_pending); + sort_pending_writebacks(wb_context->pending_writebacks, wb_context->nr_pending); /* * Coalesce neighbouring writes, but nothing else. For that we iterate * through the, now sorted, array of pending flushes, and look forward to * find all neighbouring (or identical) writes. */ - for (i = 0; i < context->nr_pending; i++) + for (i = 0; i < wb_context->nr_pending; i++) { PendingWriteback *cur; PendingWriteback *next; @@ -5517,7 +5517,7 @@ IssuePendingWritebacks(WritebackContext *context) RelFileLocator currlocator; Size nblocks = 1; - cur = &context->pending_writebacks[i]; + cur = &wb_context->pending_writebacks[i]; tag = cur->tag; currlocator = BufTagGetRelFileLocator(&tag); @@ -5525,10 +5525,10 @@ IssuePendingWritebacks(WritebackContext *context) * Peek ahead, into following writeback requests, to see if they can * be combined with the current one. */ - for (ahead = 0; i + ahead + 1 < context->nr_pending; ahead++) + for (ahead = 0; i + ahead + 1 < wb_context->nr_pending; ahead++) { - next = &context->pending_writebacks[i + ahead + 1]; + next = &wb_context->pending_writebacks[i + ahead + 1]; /* different file, stop */ if (!RelFileLocatorEquals(currlocator, @@ -5555,7 +5555,7 @@ IssuePendingWritebacks(WritebackContext *context) smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks); } - context->nr_pending = 0; + wb_context->nr_pending = 0; } diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h index 136cf8fbaf..f8ac811379 100644 --- a/src/include/storage/buf_internals.h +++ b/src/include/storage/buf_internals.h @@ -388,8 +388,8 @@ extern PGDLLIMPORT CkptSortItem *CkptBufferIds; */ /* bufmgr.c */ extern void WritebackContextInit(WritebackContext *context, int *max_pending); -extern void IssuePendingWritebacks(WritebackContext *context); -extern void ScheduleBufferTagForWriteback(WritebackContext *context, BufferTag *tag); +extern void IssuePendingWritebacks(WritebackContext *wb_context); +extern void ScheduleBufferTagForWriteback(WritebackContext *wb_context, BufferTag *tag); /* freelist.c */ extern IOContext IOContextForStrategy(BufferAccessStrategy strategy); -- 2.37.2
From 8420944a3b57a292ce56f34a985498844e8dd3c1 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Sat, 6 May 2023 11:16:58 -0400 Subject: [PATCH v5 3/3] Add writeback to pg_stat_io 28e626bde00 added the concept of IOOps but neglected to include writeback operations. ac8d53dae5 added time spent doing these I/O operations. Without counting writeback, checkpointer write timing in the log often differed substantially from that in pg_stat_io. To fix this, add IOOp IOOP_WRITEBACK and track writeback in pg_stat_io. NOTE: Catalog version must be increased. Reviewed-by: Kyotaro Horiguchi <horikyota....@gmail.com> Reported-by: Andres Freund <and...@anarazel.de> Discussion: https://postgr.es/m/20230419172326.dhgyo4wrrhulovt6%40awork3.anarazel.de --- doc/src/sgml/monitoring.sgml | 26 +++++++++++++++++++++ src/backend/catalog/system_views.sql | 2 ++ src/backend/storage/buffer/bufmgr.c | 32 ++++++++++++++++++++------ src/backend/utils/activity/pgstat_io.c | 14 ++++++----- src/backend/utils/adt/pgstatfuncs.c | 5 ++++ src/include/catalog/pg_proc.dat | 6 ++--- src/include/pgstat.h | 3 ++- src/include/storage/buf_internals.h | 5 ++-- src/test/regress/expected/rules.out | 4 +++- src/test/regress/expected/stats.out | 4 ++-- src/test/regress/sql/stats.sql | 4 ++-- 11 files changed, 81 insertions(+), 24 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 99f7f95c39..cf13ade205 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -3867,6 +3867,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i </entry> </row> + <row> + <entry role="catalog_table_entry"> + <para role="column_definition"> + <structfield>writebacks</structfield> <type>bigint</type> + </para> + <para> + Number of units of size <varname>op_bytes</varname> which the process + requested the kernel write out to permanent storage. + </para> + </entry> + </row> + + <row> + <entry role="catalog_table_entry"> + <para role="column_definition"> + <structfield>writeback_time</structfield> <type>double precision</type> + </para> + <para> + Time spent in writeback operations in milliseconds (if + <xref linkend="guc-track-io-timing"/> is enabled, otherwise zero). This + includes the time spent queueing write-out requests and, potentially, + the time spent to write out the dirty data. + </para> + </entry> + </row> + <row> <entry role="catalog_table_entry"> <para role="column_definition"> diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 48aacf66ee..d0c932ad0e 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1131,6 +1131,8 @@ SELECT b.read_time, b.writes, b.write_time, + b.writebacks, + b.writeback_time, b.extends, b.extend_time, b.op_bytes, diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 041d5dc129..3952476d15 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -1685,7 +1685,7 @@ again: FlushBuffer(buf_hdr, NULL, IOOBJECT_RELATION, io_context); LWLockRelease(content_lock); - ScheduleBufferTagForWriteback(&BackendWritebackContext, + ScheduleBufferTagForWriteback(&BackendWritebackContext, io_context, &buf_hdr->tag); } @@ -2725,8 +2725,11 @@ BufferSync(int flags) CheckpointWriteDelay(flags, (double) num_processed / num_to_scan); } - /* issue all pending flushes */ - IssuePendingWritebacks(&wb_context); + /* + * Issue all pending flushes. Only checkpointer calls BufferSync(), so + * IOContext will always be IOCONTEXT_NORMAL. + */ + IssuePendingWritebacks(&wb_context, IOCONTEXT_NORMAL); pfree(per_ts_stat); per_ts_stat = NULL; @@ -3110,7 +3113,11 @@ SyncOneBuffer(int buf_id, bool skip_recently_used, WritebackContext *wb_context) UnpinBuffer(bufHdr); - ScheduleBufferTagForWriteback(wb_context, &tag); + /* + * SyncOneBuffer() is only called by checkpointer and bgwriter, so + * IOContext will always be IOCONTEXT_NORMAL. + */ + ScheduleBufferTagForWriteback(wb_context, IOCONTEXT_NORMAL, &tag); return result | BUF_WRITTEN; } @@ -5445,7 +5452,8 @@ WritebackContextInit(WritebackContext *context, int *max_pending) * Add buffer to list of pending writeback requests. */ void -ScheduleBufferTagForWriteback(WritebackContext *wb_context, BufferTag *tag) +ScheduleBufferTagForWriteback(WritebackContext *wb_context, IOContext io_context, + BufferTag *tag) { PendingWriteback *pending; @@ -5471,7 +5479,7 @@ ScheduleBufferTagForWriteback(WritebackContext *wb_context, BufferTag *tag) * is now disabled. */ if (wb_context->nr_pending >= *wb_context->max_pending) - IssuePendingWritebacks(wb_context); + IssuePendingWritebacks(wb_context, io_context); } #define ST_SORT sort_pending_writebacks @@ -5489,8 +5497,9 @@ ScheduleBufferTagForWriteback(WritebackContext *wb_context, BufferTag *tag) * error out - it's just a hint. */ void -IssuePendingWritebacks(WritebackContext *wb_context) +IssuePendingWritebacks(WritebackContext *wb_context, IOContext io_context) { + instr_time io_start; int i; if (wb_context->nr_pending == 0) @@ -5502,6 +5511,8 @@ IssuePendingWritebacks(WritebackContext *wb_context) */ sort_pending_writebacks(wb_context->pending_writebacks, wb_context->nr_pending); + io_start = pgstat_prepare_io_time(); + /* * Coalesce neighbouring writes, but nothing else. For that we iterate * through the, now sorted, array of pending flushes, and look forward to @@ -5555,6 +5566,13 @@ IssuePendingWritebacks(WritebackContext *wb_context) smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks); } + /* + * Assume that writeback requests are only issued for buffers containing + * blocks of permanent relations. + */ + pgstat_count_io_op_time(IOOBJECT_RELATION, io_context, + IOOP_WRITEBACK, io_start, wb_context->nr_pending); + wb_context->nr_pending = 0; } diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c index 25735eb6c0..eb7d35d422 100644 --- a/src/backend/utils/activity/pgstat_io.c +++ b/src/backend/utils/activity/pgstat_io.c @@ -424,6 +424,14 @@ pgstat_tracks_io_op(BackendType bktype, IOObject io_object, bktype == B_CHECKPOINTER) && io_op == IOOP_EXTEND) return false; + /* + * Temporary tables are not logged and thus do not require fsync'ing. + * Writeback is not requested for temporary tables. + */ + if (io_object == IOOBJECT_TEMP_RELATION && + (io_op == IOOP_FSYNC || io_op == IOOP_WRITEBACK)) + return false; + /* * Some IOOps are not valid in certain IOContexts and some IOOps are only * valid in certain contexts. @@ -448,12 +456,6 @@ pgstat_tracks_io_op(BackendType bktype, IOObject io_object, if (strategy_io_context && io_op == IOOP_FSYNC) return false; - /* - * Temporary tables are not logged and thus do not require fsync'ing. - */ - if (io_context == IOCONTEXT_NORMAL && - io_object == IOOBJECT_TEMP_RELATION && io_op == IOOP_FSYNC) - return false; return true; } diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 0d57b6a7c0..70da0a2de1 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -1268,6 +1268,8 @@ typedef enum io_stat_col IO_COL_READ_TIME, IO_COL_WRITES, IO_COL_WRITE_TIME, + IO_COL_WRITEBACKS, + IO_COL_WRITEBACK_TIME, IO_COL_EXTENDS, IO_COL_EXTEND_TIME, IO_COL_CONVERSION, @@ -1303,6 +1305,8 @@ pgstat_get_io_op_index(IOOp io_op) return IO_COL_REUSES; case IOOP_WRITE: return IO_COL_WRITES; + case IOOP_WRITEBACK: + return IO_COL_WRITEBACKS; } elog(ERROR, "unrecognized IOOp value: %d", io_op); @@ -1322,6 +1326,7 @@ pgstat_get_io_time_index(IOOp io_op) { case IOOP_READ: case IOOP_WRITE: + case IOOP_WRITEBACK: case IOOP_EXTEND: case IOOP_FSYNC: return pgstat_get_io_op_index(io_op) + 1; diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index b2bc81b15f..e766b9b600 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -5760,9 +5760,9 @@ proname => 'pg_stat_get_io', provolatile => 'v', prorows => '30', proretset => 't', proparallel => 'r', prorettype => 'record', proargtypes => '', - proallargtypes => '{text,text,text,int8,float8,int8,float8,int8,float8,int8,int8,int8,int8,int8,float8,timestamptz}', - proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}', - proargnames => '{backend_type,object,context,reads,read_time,writes,write_time,extends,extend_time,op_bytes,hits,evictions,reuses,fsyncs,fsync_time,stats_reset}', + proallargtypes => '{text,text,text,int8,float8,int8,float8,int8,float8,int8,float8,int8,int8,int8,int8,int8,float8,timestamptz}', + proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}', + proargnames => '{backend_type,object,context,reads,read_time,writes,write_time,writebacks,writeback_time,extends,extend_time,op_bytes,hits,evictions,reuses,fsyncs,fsync_time,stats_reset}', prosrc => 'pg_stat_get_io' }, { oid => '1136', descr => 'statistics: information about WAL activity', diff --git a/src/include/pgstat.h b/src/include/pgstat.h index fff4ad5b6d..57a2c0866a 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -300,9 +300,10 @@ typedef enum IOOp IOOP_READ, IOOP_REUSE, IOOP_WRITE, + IOOP_WRITEBACK, } IOOp; -#define IOOP_NUM_TYPES (IOOP_WRITE + 1) +#define IOOP_NUM_TYPES (IOOP_WRITEBACK + 1) typedef struct PgStat_BktypeIO { diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h index f8ac811379..30807d5d97 100644 --- a/src/include/storage/buf_internals.h +++ b/src/include/storage/buf_internals.h @@ -388,8 +388,9 @@ extern PGDLLIMPORT CkptSortItem *CkptBufferIds; */ /* bufmgr.c */ extern void WritebackContextInit(WritebackContext *context, int *max_pending); -extern void IssuePendingWritebacks(WritebackContext *wb_context); -extern void ScheduleBufferTagForWriteback(WritebackContext *wb_context, BufferTag *tag); +extern void IssuePendingWritebacks(WritebackContext *wb_context, IOContext io_context); +extern void ScheduleBufferTagForWriteback(WritebackContext *wb_context, + IOContext io_context, BufferTag *tag); /* freelist.c */ extern IOContext IOContextForStrategy(BufferAccessStrategy strategy); diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 69957687fe..0bd6db2bf8 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1887,6 +1887,8 @@ pg_stat_io| SELECT backend_type, read_time, writes, write_time, + writebacks, + writeback_time, extends, extend_time, op_bytes, @@ -1896,7 +1898,7 @@ pg_stat_io| SELECT backend_type, fsyncs, fsync_time, stats_reset - FROM pg_stat_get_io() b(backend_type, object, context, reads, read_time, writes, write_time, extends, extend_time, op_bytes, hits, evictions, reuses, fsyncs, fsync_time, stats_reset); + FROM pg_stat_get_io() b(backend_type, object, context, reads, read_time, writes, write_time, writebacks, writeback_time, extends, extend_time, op_bytes, hits, evictions, reuses, fsyncs, fsync_time, stats_reset); pg_stat_progress_analyze| SELECT s.pid, s.datid, d.datname, diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out index d4e4889e7b..b836d70775 100644 --- a/src/test/regress/expected/stats.out +++ b/src/test/regress/expected/stats.out @@ -1386,7 +1386,7 @@ SELECT pg_stat_have_stats('io', 0, 0); t (1 row) -SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(hits) AS io_stats_pre_reset +SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(writebacks) + sum(hits) AS io_stats_pre_reset FROM pg_stat_io \gset SELECT pg_stat_reset_shared('io'); pg_stat_reset_shared @@ -1394,7 +1394,7 @@ SELECT pg_stat_reset_shared('io'); (1 row) -SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(hits) AS io_stats_post_reset +SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(writebacks) + sum(hits) AS io_stats_post_reset FROM pg_stat_io \gset SELECT :io_stats_post_reset < :io_stats_pre_reset; ?column? diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql index 7ba341314f..ec0b98687e 100644 --- a/src/test/regress/sql/stats.sql +++ b/src/test/regress/sql/stats.sql @@ -696,10 +696,10 @@ SELECT :io_sum_bulkwrite_strategy_extends_after > :io_sum_bulkwrite_strategy_ext -- Test IO stats reset SELECT pg_stat_have_stats('io', 0, 0); -SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(hits) AS io_stats_pre_reset +SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(writebacks) + sum(hits) AS io_stats_pre_reset FROM pg_stat_io \gset SELECT pg_stat_reset_shared('io'); -SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(hits) AS io_stats_post_reset +SELECT sum(evictions) + sum(reuses) + sum(extends) + sum(fsyncs) + sum(reads) + sum(writes) + sum(writebacks) + sum(hits) AS io_stats_post_reset FROM pg_stat_io \gset SELECT :io_stats_post_reset < :io_stats_pre_reset; -- 2.37.2