Hi,

On Fri, 19 Apr 2024 at 11:01, Nazir Bilal Yavuz <byavu...@gmail.com> wrote:
>
> If I am not missing any new changes, the only problem is reading
> variable bytes now. We have discussed a couple of solutions:

With the recent commit [1], pg_stat_io tracks IOs as bytes instead of
blocks. This solves the variable IO size problem.

I encountered another problem while rebasing the patch. The problem is
basically we do not expect any pending stats while restoring the stats
at the initdb. However, WAL IOs (WAL read and WAL init IOs for now)
may happen before restoring the stats, so we end up having pending
stats before restoring them and that causes initdb to fail.

I wrote this problem to another thread [2] but this thread is a better
place to discuss it, so rewriting the problem:

This is where we restore stats and do not expect any pending stats at
the Assert:

'''
pgstat_restore_stats() ->
pgstat_read_statsfile() ->
pgstat_reset_after_failure() ->
pgstat_drop_all_entries() ->
pgstat_drop_entry_internal() ->
We have an assertion there which checks if there is a pending stat entry:

    /* should already have released local reference */
    if (pgStatEntryRefHash)
        Assert(!pgstat_entry_ref_hash_lookup(pgStatEntryRefHash, shent->key));
'''

This is where the WAL read happens before restoring the stats:

'''
BootstrapModeMain() ->
InitPostgres() ->
StartupXLOG() ->
ReadCheckpointRecord() ->
InitWalRecovery() ->
... ->
XLogReadAhead() ->
XLogDecodeNextRecord() ->
ReadPageInternal() ->
state->routine.page_read = XLogPageRead() then WAL read happens
'''

So, this assert fails because we have pending stats for the
PGSTAT_KIND_BACKEND. It is only PGSTAT_KIND_BACKEND because all
fixed-numbered stats (which include PGSTAT_KIND_IO) are reset there:
'pgstat_reset_after_failure() -> kind_info->reset_all_cb()' at the
pgstat_reset_after_failure(). It seems that we do not care about stats
that happen before restoring the stats part as we reset all
fixed-numbered stats there, so not counting these WAL IOs at the
initdb may be a one solution.

A simple reproducer patch is attached, it includes two
pgstat_count_io_op() calls. I did not include the rest of the patchset
as I thought it may increase the complexity. To reproduce, just run
initdb on assert enabled build after applying the patch. Then you
should see:

creating configuration files ... ok
running bootstrap script ... TRAP: failed
Assert("!pgstat_entry_ref_hash_lookup(pgStatEntryRefHash,
shent->key)"), File:
"../../postgres/src/backend/utils/activity/pgstat_shmem.c", Line: 859,
PID: 51001
.../install/bin/postgres(ExceptionalCondition+0xab) [0x55da0959feea]

I would be happy to hear your thoughts.

[1] f92c854cf
[2] 
postgr.es/m/CAN55FZ1uOq%3DFVJObp0bdj-Z8q1ZRNmA-RymPqbMD%2Bp4QaHXP3A%40mail.gmail.com

-- 
Regards,
Nazir Bilal Yavuz
Microsoft
From 93bbfd611b86654356a84415367fe562eae74b68 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz <byavu...@gmail.com>
Date: Thu, 16 Jan 2025 10:10:46 +0300
Subject: [PATCH] Break initdb with pending stats

To reproduce, run initdb on asserts enabled build.
---
 src/backend/access/transam/xlog.c         | 2 ++
 src/backend/access/transam/xlogrecovery.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index bf3dbda901d..857f8379f3e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3322,6 +3322,8 @@ XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli,
 	}
 	pgstat_report_wait_end();
 
+	pgstat_count_io_op(IOOBJECT_RELATION, IOCONTEXT_NORMAL, IOOP_FSYNC, 1, 0);
+
 	if (close(fd) != 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 0bbe2eea206..cc29b19ca0a 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -3406,6 +3406,9 @@ retry:
 		int			save_errno = errno;
 
 		pgstat_report_wait_end();
+
+		pgstat_count_io_op(IOOBJECT_RELATION, IOCONTEXT_NORMAL, IOOP_READ, 1, r);
+
 		XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size);
 		if (r < 0)
 		{
@@ -3426,6 +3429,8 @@ retry:
 	}
 	pgstat_report_wait_end();
 
+	pgstat_count_io_op(IOOBJECT_RELATION, IOCONTEXT_NORMAL, IOOP_READ, 1, r);
+
 	Assert(targetSegNo == readSegNo);
 	Assert(targetPageOff == readOff);
 	Assert(reqLen <= readLen);
-- 
2.47.1

Reply via email to