Hi, On Fri, 19 Apr 2024 at 11:01, Nazir Bilal Yavuz <byavu...@gmail.com> wrote: > > If I am not missing any new changes, the only problem is reading > variable bytes now. We have discussed a couple of solutions:
With the recent commit [1], pg_stat_io tracks IOs as bytes instead of blocks. This solves the variable IO size problem. I encountered another problem while rebasing the patch. The problem is basically we do not expect any pending stats while restoring the stats at the initdb. However, WAL IOs (WAL read and WAL init IOs for now) may happen before restoring the stats, so we end up having pending stats before restoring them and that causes initdb to fail. I wrote this problem to another thread [2] but this thread is a better place to discuss it, so rewriting the problem: This is where we restore stats and do not expect any pending stats at the Assert: ''' pgstat_restore_stats() -> pgstat_read_statsfile() -> pgstat_reset_after_failure() -> pgstat_drop_all_entries() -> pgstat_drop_entry_internal() -> We have an assertion there which checks if there is a pending stat entry: /* should already have released local reference */ if (pgStatEntryRefHash) Assert(!pgstat_entry_ref_hash_lookup(pgStatEntryRefHash, shent->key)); ''' This is where the WAL read happens before restoring the stats: ''' BootstrapModeMain() -> InitPostgres() -> StartupXLOG() -> ReadCheckpointRecord() -> InitWalRecovery() -> ... -> XLogReadAhead() -> XLogDecodeNextRecord() -> ReadPageInternal() -> state->routine.page_read = XLogPageRead() then WAL read happens ''' So, this assert fails because we have pending stats for the PGSTAT_KIND_BACKEND. It is only PGSTAT_KIND_BACKEND because all fixed-numbered stats (which include PGSTAT_KIND_IO) are reset there: 'pgstat_reset_after_failure() -> kind_info->reset_all_cb()' at the pgstat_reset_after_failure(). It seems that we do not care about stats that happen before restoring the stats part as we reset all fixed-numbered stats there, so not counting these WAL IOs at the initdb may be a one solution. A simple reproducer patch is attached, it includes two pgstat_count_io_op() calls. I did not include the rest of the patchset as I thought it may increase the complexity. To reproduce, just run initdb on assert enabled build after applying the patch. Then you should see: creating configuration files ... ok running bootstrap script ... TRAP: failed Assert("!pgstat_entry_ref_hash_lookup(pgStatEntryRefHash, shent->key)"), File: "../../postgres/src/backend/utils/activity/pgstat_shmem.c", Line: 859, PID: 51001 .../install/bin/postgres(ExceptionalCondition+0xab) [0x55da0959feea] I would be happy to hear your thoughts. [1] f92c854cf [2] postgr.es/m/CAN55FZ1uOq%3DFVJObp0bdj-Z8q1ZRNmA-RymPqbMD%2Bp4QaHXP3A%40mail.gmail.com -- Regards, Nazir Bilal Yavuz Microsoft
From 93bbfd611b86654356a84415367fe562eae74b68 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz <byavu...@gmail.com> Date: Thu, 16 Jan 2025 10:10:46 +0300 Subject: [PATCH] Break initdb with pending stats To reproduce, run initdb on asserts enabled build. --- src/backend/access/transam/xlog.c | 2 ++ src/backend/access/transam/xlogrecovery.c | 5 +++++ 2 files changed, 7 insertions(+) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index bf3dbda901d..857f8379f3e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3322,6 +3322,8 @@ XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli, } pgstat_report_wait_end(); + pgstat_count_io_op(IOOBJECT_RELATION, IOCONTEXT_NORMAL, IOOP_FSYNC, 1, 0); + if (close(fd) != 0) ereport(ERROR, (errcode_for_file_access(), diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 0bbe2eea206..cc29b19ca0a 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -3406,6 +3406,9 @@ retry: int save_errno = errno; pgstat_report_wait_end(); + + pgstat_count_io_op(IOOBJECT_RELATION, IOCONTEXT_NORMAL, IOOP_READ, 1, r); + XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size); if (r < 0) { @@ -3426,6 +3429,8 @@ retry: } pgstat_report_wait_end(); + pgstat_count_io_op(IOOBJECT_RELATION, IOCONTEXT_NORMAL, IOOP_READ, 1, r); + Assert(targetSegNo == readSegNo); Assert(targetPageOff == readOff); Assert(reqLen <= readLen); -- 2.47.1