On Tue, Jun 13, 2023 at 11:22 AM torikoshia <torikos...@oss.nttdata.com> wrote: > > On 2023-06-13 00:52, James Coleman wrote: > >> > >> > I've attached v27. The important change here in 0001 is that it > >> > guarantees the interrupt handler is re-entrant, since that was a bug > >> > exposed by my testing. I've also included 0002 which is only meant for > >> > testing (it attempts to log in the plan in every > >> > CHECK_FOR_INTERRUPTS() call). > >> > >> When SIGINT is sent during ProcessLogQueryPlanInterrupt(), > >> ProcessLogQueryPlanInterruptActive can remain true. > >> After that, pg_log_query_plan() does nothing but just returns. > >> > >> AFAIU, v26 logs plan for each pg_log_query_plan() even when another > >> pg_log_query_plan() is running, but it doesn't cause errors or > >> critical > >> problem. > >> > >> Considering the problem solved and introduced by v27, I think v26 > >> might > >> be fine. > >> How do you think? > > > > The testing I did with calling this during every CFI is what uncovered > > the re-entrancy problem. IIRC (without running that test again) the > > problem was a stack overflow. Now, to be sure this is a particularly > > degenerate case because in real-world usage it'd be impossible in > > practice, I think, to trigger that many calls to this function (and by > > extension the interrupt handler). > > Yeah.In addition, currently only superusers are allowed to execute > pg_log_query_plan(), I think we don't need to think about cases > where users are malicious. > > > If SIGINT is the only concern we could reset > > ProcessLogQueryPlanInterruptActive in error handling code. I admit > > that part of my thought process here is thinking ahead to an > > additional patch I'd like to see on top of this, which is logging a > > query plan before cleaning up when statement timeout occurs. > > I remember this is what you wanted do.[1] > > > The > > re-entrancy issue becomes more interesting then, I think, since we > > would then have automated calling of the logging code. BTW: I'd > > thought that would make a nice follow-up patch for this, but if you'd > > prefer I could add it as another patch in the series here. > > > > What do you think about resetting the flag versus just not having it? > > If I understand you correctly, adding the flag is not necessary for this > proposal. > To keep the patch simple, I prefer not having it. >
I'm going to re-run tests with my patch version + resetting the flag on SIGINT (and any other error condition) to be certain that the issue you uncovered (where backends get stuck after a SIGINT not responding to the requested plan logging) wasn't masking any other issues. As long as that run is clean also then I believe the patch is safe as-is even without the re-entrancy guard. I'll report back with the results of that testing. Regards, James Coleman