Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-23 Thread Michael Paquier
On Sun, May 23, 2021 at 02:05:59PM +0900, Amit Langote wrote: > On Sun, May 23, 2021 at 10:28 AM Tom Lane wrote: >> I wrote: >> > ... wrong. Running v13 branch tip under CLOBBER_CACHE_ALWAYS provokes >> > a core dump in 013_partition.pl, so 1375422c is not to blame. Now >> > I'm wondering how fa

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-22 Thread Amit Langote
On Sun, May 23, 2021 at 10:28 AM Tom Lane wrote: > I wrote: > > ... wrong. Running v13 branch tip under CLOBBER_CACHE_ALWAYS provokes > > a core dump in 013_partition.pl, so 1375422c is not to blame. Now > > I'm wondering how far back there's a live issue. > > Oh, of course, it's directly the fa

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-22 Thread Tom Lane
I wrote: > ... wrong. Running v13 branch tip under CLOBBER_CACHE_ALWAYS provokes > a core dump in 013_partition.pl, so 1375422c is not to blame. Now > I'm wondering how far back there's a live issue. Oh, of course, it's directly the fault of the patch that added support for partitioned target ta

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-22 Thread Tom Lane
I wrote: > Amit Langote writes: >> BTW, I think we'd need to cherry-pick f3b141c4825 (or maybe parts of >> it) into v13 branch for back-patching this. > I already did a fair amount of that yesterday, cf 84f5c2908 et al. > But that does raise the question of how far we need to back-patch this. > I

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-22 Thread Tom Lane
Amit Langote writes: > BTW, I think we'd need to cherry-pick f3b141c4825 (or maybe parts of > it) into v13 branch for back-patching this. I already did a fair amount of that yesterday, cf 84f5c2908 et al. But that does raise the question of how far we need to back-patch this. I gather that the wh

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-21 Thread Amit Langote
On Sat, May 22, 2021 at 6:01 AM Tom Lane wrote: > Amit Langote writes: > > IMHO, it would be better to keep the lowest-level > > apply_handle_XXX_internal() out of this, because presumably we're only > > cleaning up the mess in higher-level callers. Somewhat related, one > > of the intentions be

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-21 Thread Tom Lane
I wrote: > BTW, I wonder whether it wouldn't be a good idea for the > postmaster to log something along the lines of "stopping > because restart_after_crash is off". The present behavior > can be quite mysterious otherwise (it certainly confused me). Concretely, I suggest the attached. While che

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-21 Thread Tom Lane
I wrote: > I count three distinct bugs that were exposed by this attempt: > ... > 2. Said bug causes a segfault in the apply worker process. > This causes the parent postmaster to give up and die. > I don't understand why we don't treat that like a crash > in a regular backend, considering that an

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-21 Thread Tom Lane
Amit Langote writes: > IMHO, it would be better to keep the lowest-level > apply_handle_XXX_internal() out of this, because presumably we're only > cleaning up the mess in higher-level callers. Somewhat related, one > of the intentions behind a04daa97a43, which removed > es_result_relation_info i

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-20 Thread Amit Kapila
On Fri, May 21, 2021 at 6:29 AM Michael Paquier wrote: > > On Thu, May 20, 2021 at 02:57:40PM -0400, Tom Lane wrote: > > Here's a version that does it like that. I'm not entirely convinced > > whether this is better or not. > > Hmm. I think that this is better. This makes the code easier to > f

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-20 Thread Michael Paquier
On Thu, May 20, 2021 at 02:57:40PM -0400, Tom Lane wrote: > Here's a version that does it like that. I'm not entirely convinced > whether this is better or not. Hmm. I think that this is better. This makes the code easier to follow, and the extra information is useful for debugging. The change

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-20 Thread Tom Lane
I wrote: > Michael Paquier writes: >> On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote: >>> * Replace the edata->resultRelInfo field with two fields, one for >>> the original parent and one for the actual/current target. Perhaps >>> this is worth doing, not sure. >> This one sounds more

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Tom Lane
Michael Paquier writes: > On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote: >> * Replace the edata->resultRelInfo field with two fields, one for >> the original parent and one for the actual/current target. Perhaps >> this is worth doing, not sure. > This one sounds more natural to me, t

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Amit Langote
On Thu, May 20, 2021 at 5:23 AM Tom Lane wrote: > Amit Langote writes: > > IOW, the patch you posted earlier seems like the way to go. > > I really dislike that patch. I think it's doubling down on the messy, > unstructured coding patterns that got us into this situation to begin > with. I'd pr

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Michael Paquier
On Wed, May 19, 2021 at 02:36:03PM -0400, Andrew Dunstan wrote: > Yeah, this area needs substantial improvement. I have seen similar sorts > of nasty hangs, where the script is waiting forever for some process > that hasn't got the shutdown message. At least we probably need some way > of making su

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Michael Paquier
On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote: > I really dislike that patch. I think it's doubling down on the messy, > unstructured coding patterns that got us into this situation to begin > with. I'd prefer to expend a little effort on refactoring so that > the ExecCleanupTupleRouti

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Tom Lane
Amit Langote writes: > IOW, the patch you posted earlier seems like the way to go. I really dislike that patch. I think it's doubling down on the messy, unstructured coding patterns that got us into this situation to begin with. I'd prefer to expend a little effort on refactoring so that the Ex

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Andrew Dunstan
On 5/18/21 11:03 PM, Michael Paquier wrote: > >> 3. Once the subscriber1 postmaster has exited, the TAP >> test will eventually time out, and then this happens: >> >> [.. logs ..] >> >> That is, because we failed to shut down subscriber1, the >> test script neglects to shut down subscriber2, and

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-19 Thread Tomas Vondra
On 5/19/21 1:42 AM, Tom Lane wrote: I discovered $SUBJECT after wondering why hyrax hadn't reported in recently, and trying to run check-world under CCA to see if anything got stuck. Indeed it did --- although this doesn't explain the radio silence from hyrax, because that animal doesn't run any

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Amit Langote
On Wed, May 19, 2021 at 2:05 PM Michael Paquier wrote: > On Wed, May 19, 2021 at 10:26:28AM +0530, Amit Kapila wrote: > > How about moving AfterTriggerEndQuery() to apply_handle_*_internal > > calls? That way, we might not even need to change Push/Pop calls. > > Isn't that going to be a problem wh

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Amit Kapila
On Wed, May 19, 2021 at 10:35 AM Michael Paquier wrote: > > On Wed, May 19, 2021 at 10:26:28AM +0530, Amit Kapila wrote: > > How about moving AfterTriggerEndQuery() to apply_handle_*_internal > > calls? That way, we might not even need to change Push/Pop calls. > > Isn't that going to be a problem

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Michael Paquier
On Wed, May 19, 2021 at 10:26:28AM +0530, Amit Kapila wrote: > How about moving AfterTriggerEndQuery() to apply_handle_*_internal > calls? That way, we might not even need to change Push/Pop calls. Isn't that going to be a problem when a tuple is moved to a new partition in the tuple routing? Thi

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Amit Kapila
On Wed, May 19, 2021 at 9:54 AM Michael Paquier wrote: > > On Tue, May 18, 2021 at 11:46:25PM -0400, Tom Lane wrote: > > I was wondering if we could move the ExecCleanupTupleRouting call > > into finish_estate. copyfrom.c, for example, does that during > > its shutdown function. Compare also the

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Michael Paquier
On Tue, May 18, 2021 at 11:46:25PM -0400, Tom Lane wrote: > I was wondering if we could move the ExecCleanupTupleRouting call > into finish_estate. copyfrom.c, for example, does that during > its shutdown function. Compare also the worker.c changes proposed > in Yeah, the first patch I wrote for

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Tom Lane
Michael Paquier writes: > On Tue, May 18, 2021 at 07:42:08PM -0400, Tom Lane wrote: >> This might go away if worker.c weren't so creatively different >> from the other code paths concerned with executor shutdown. > The tuple routing has made the whole worker logic messier by a larger > degree com

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Amit Langote
On Wed, May 19, 2021 at 12:04 PM Michael Paquier wrote: > On Tue, May 18, 2021 at 07:42:08PM -0400, Tom Lane wrote: > > I count three distinct bugs that were exposed by this attempt: > > > > 1. In the part of 013_partition.pl that tests firing AFTER > > triggers on partitioned tables, we have a ca

Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

2021-05-18 Thread Michael Paquier
On Tue, May 18, 2021 at 07:42:08PM -0400, Tom Lane wrote: > I count three distinct bugs that were exposed by this attempt: > > 1. In the part of 013_partition.pl that tests firing AFTER > triggers on partitioned tables, we have a case of continuing > to access a relcache entry that's already been