Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Andres Freund
Hi, On 2024-01-23 22:00:01 +0300, Alexander Lakhin wrote: > 23.01.2024 20:30, Andres Freund wrote: > > I don't think that's viable and would cause more problems than it solves, > > it'd > > make us think that we might have an old postgres process hanging around that > > needs to be terminated befo

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Nathan Bossart
On Tue, Jan 23, 2024 at 06:33:25PM +0100, Alvaro Herrera wrote: > On 2024-Jan-22, Nathan Bossart wrote: > >> Here is a patch. > > Looks reasonable. Committed. Thank you for the report and the reviews. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Alexander Lakhin
23.01.2024 20:30, Andres Freund wrote: I don't think that's viable and would cause more problems than it solves, it'd make us think that we might have an old postgres process hanging around that needs to be terminated before we can start up. And I simply don't see the point - we already record whe

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Nathan Bossart
On Tue, Jan 23, 2024 at 12:22:58PM -0600, Nathan Bossart wrote: > On Tue, Jan 23, 2024 at 06:33:25PM +0100, Alvaro Herrera wrote: >> Does this actually detect a problem if you take out the fix? I think >> what's going to happen is that postmaster is going to crash, then do the >> recovery cycle, t

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Nathan Bossart
On Tue, Jan 23, 2024 at 06:33:25PM +0100, Alvaro Herrera wrote: > On 2024-Jan-22, Nathan Bossart wrote: >> This might be a topic for another thread, but I do wonder whether we could >> put a generic pg_controldata check in node->stop that would at least make >> sure that the state is some sort of e

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Alvaro Herrera
On 2024-Jan-22, Nathan Bossart wrote: > Here is a patch. Looks reasonable. > This might be a topic for another thread, but I do wonder whether we could > put a generic pg_controldata check in node->stop that would at least make > sure that the state is some sort of expected shut-down state. My

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Andres Freund
Hi, On 2024-01-23 08:00:00 +0300, Alexander Lakhin wrote: > 22.01.2024 23:41, Andres Freund wrote: > > ISTM that we shouldn't basically silently overlook shutdowns due to crashes > > in > > the tests. How to not do so is unfortunately not immediately obvious to > > me... > > > > FWIW, I encoun

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Andres Freund
Hi, On 2024-01-22 16:27:43 -0600, Nathan Bossart wrote: > Here is a patch. LGTM. > This might be a topic for another thread, but I do wonder whether we could > put a generic pg_controldata check in node->stop that would at least make > sure that the state is some sort of expected shut-down stat

Re: core dumps in auto_prewarm, tests succeed

2024-01-23 Thread Nathan Bossart
On Mon, Jan 22, 2024 at 04:27:43PM -0600, Nathan Bossart wrote: > Here is a patch. I'd like to fix these crashes sooner than later, so I will plan on committing this tonight (barring objections or feedback). If this needs to be revisited later for some reason, I'm happy to do so. -- Nathan Boss

Re: core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Alexander Lakhin
Hello Andres, 22.01.2024 23:41, Andres Freund wrote: Hi, I noticed that I was getting core dumps while executing the tests, without the tests failing. Backtraces are variations of this: ... ISTM that we shouldn't basically silently overlook shutdowns due to crashes in the tests. How to not do s

Re: core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Nathan Bossart
On Mon, Jan 22, 2024 at 03:38:15PM -0600, Nathan Bossart wrote: > On Mon, Jan 22, 2024 at 01:24:54PM -0800, Andres Freund wrote: >> On 2024-01-22 15:19:36 -0600, Nathan Bossart wrote: >>> I think this is because the autoprewarm state was moved to a DSM segment, >>> and DSM segments are detached bef

Re: core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Nathan Bossart
On Mon, Jan 22, 2024 at 01:24:54PM -0800, Andres Freund wrote: > On 2024-01-22 15:19:36 -0600, Nathan Bossart wrote: >> I think this is because the autoprewarm state was moved to a DSM segment, >> and DSM segments are detached before the on_shmem_exit callbacks are called >> during process exit. M

Re: core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Andres Freund
Hi, On 2024-01-22 15:19:36 -0600, Nathan Bossart wrote: > On Mon, Jan 22, 2024 at 02:44:57PM -0600, Nathan Bossart wrote: > > On Mon, Jan 22, 2024 at 12:41:17PM -0800, Andres Freund wrote: > >> I noticed that I was getting core dumps while executing the tests, without > >> the > >> tests failing.

Re: core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Nathan Bossart
On Mon, Jan 22, 2024 at 02:44:57PM -0600, Nathan Bossart wrote: > On Mon, Jan 22, 2024 at 12:41:17PM -0800, Andres Freund wrote: >> I noticed that I was getting core dumps while executing the tests, without >> the >> tests failing. Backtraces are variations of this: > > Looking, thanks for the hea

Re: core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Nathan Bossart
On Mon, Jan 22, 2024 at 12:41:17PM -0800, Andres Freund wrote: > I noticed that I was getting core dumps while executing the tests, without the > tests failing. Backtraces are variations of this: Looking, thanks for the heads-up. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

core dumps in auto_prewarm, tests succeed

2024-01-22 Thread Andres Freund
Hi, I noticed that I was getting core dumps while executing the tests, without the tests failing. Backtraces are variations of this: #0 0x00ca29cd in pg_atomic_read_u32_impl (ptr=0x7fe13497a004) at ../../../../../home/andres/src/postgresql/src/include/port/atomics/generic.h:48 #1 0x