On Wed, Jan 15, 2025 at 05:20:57PM +0300, Nazir Bilal Yavuz wrote: > I think allowing only pgStatPendingContext to have > MemoryContextAllowInCriticalSection() is not enough. We need to allow > at least pgStatSharedRefContext as well to have > MemoryContextAllowInCriticalSection() as it can be allocated too. > > ''' > pgstat_prep_pending_entry() -> > pgstat_get_entry_ref() -> > pgstat_get_entry_ref_cached() -> > MemoryContextAlloc(pgStatSharedRefContext, sizeof(PgStat_EntryRef)) > '''
Yep, I was pretty sure that we have a bit more going on. Last time I began digging into the issue I was loading injection_points in shared_preload_libraries with stats enabled to see how much I could break, and this was one pattern once I've forced the pending part. I didn't get through the whole exercise. > While doing the initdb, we are restoring stats with the > pgstat_restore_stats() and we do not expect any pending stats. The > problem goes like that: > > ''' > pgstat_restore_stats() -> > pgstat_read_statsfile() -> > pgstat_reset_after_failure() -> > pgstat_drop_all_entries() -> > pgstat_drop_entry_internal() -> > We have an assertion there which checks if there is a pending stat entry: > > /* should already have released local reference */ > if (pgStatEntryRefHash) > Assert(!pgstat_entry_ref_hash_lookup(pgStatEntryRefHash, shent->key)); > ''' You mean that this is not something that happens on HEAD, but as an effect of trying to add WAL stats to pg_stat_io, right? > I was a bit surprised that Bertrand did not encounter the same problem > while working on the 'per backend WAL statistics' patch. Then I found > the reason, it is because this problem happens only for WAL read and > WAL init IOs which are starting to be tracked in my patch. By saying > that, I could not decide which thread to write about this problem, > migrating WAL stats thread or this thread. Since this thread is active > and other stats may cause the same problem, I decided to write here. > Please warn me if you think I need to write this to the migrating WAL > stats thread. I'd rather treat that on a separate thread (please add me in CC if I'm not there yet!). I am OK to discuss any issues related to backend statistics here if HEAD is impacted based on how the feature is limited currently now, though. Another, simpler, idea would be to force more calls pgstat_prep_backend_pending() where this is problematic, like before entering a critical section when generating a WAL record. I am not sure that it would take care of everything, as you mention; that would leave some holes depending on the kind of interactions with the pgstats dshash we do in these paths. -- Michael
signature.asc
Description: PGP signature