Hi,

On 2025-03-12 20:41:29 -0400, Tom Lane wrote:
> I happened to notice these entries in a log file on a
> buildfarm member [1]:
>
> 2025-03-12 15:39:53.265 UTC [7296] WARNING: found incorrect redo LSN 0/159FB60 (expected 0/40000028)
> 2025-03-12 15:39:53.265 UTC [7296] LOG: corrupted statistics file "pg_stat/pgstat.stat"
>
> (this is near the end of the pg_upgrade_server.log log file).
> I don't think these are related to that run's subsequent test failure,
> which looks to be good old Windows randomness. I then looked into the
> logs of a local BF instance that also runs xversion-upgrade tests, and
> darned if I didn't find
>
> 2025-03-12 14:59:15.792 EDT [2216647] LOG: database system was shut down at 2025-03-12 14:59:13 EDT
> 2025-03-12 14:59:15.794 EDT [2216647] WARNING: found incorrect redo LSN 0/46F73F18 (expected 0/47000028)
> 2025-03-12 14:59:15.794 EDT [2216647] LOG: corrupted statistics file "pg_stat/pgstat.stat"
> 2025-03-12 14:59:15.795 EDT [2216644] LOG: database system is ready to accept connections
>
> despite that run having completed with no report of trouble.
> So this may have been going on for quite some time without our
> noticing. The "corrupted statistics file" whine is most likely
> caused by pg_upgrade copying the old system's pgstat.stat file
> into the new installation --- is that a good idea? I have
> no idea what's causing the redo LSN complaint, but it seems
> like that might deserve closer investigation.

I think the two issues are closely related: this is code that was
introduced in b860848232aa as part of the work to make pgstats somewhat
crash-safe.

Greetings,

Andres Freund