On Wed, Jun 24, 2026 at 11:35 PM Andrei Lepikhov <[email protected]> wrote:
Thanks for your review!

2. GetCurrentQueryDesc, SetCurrentQueryDesc - need 'extern' in the .h file 3. ExplainPrintPlan, ExplainPrintTriggers - have exact duplicates in the explain.h
4. ExplainStringAssemble() has '0' input for a boolean parameter.
5. ExecSetExecProcNodeRecurse: I don't see PlanState::initPlan. Is this intentional?
6. INJECTION_POINT("log-query-interrupt", ...) - it is called inside a
signal-safe zone. In tests, the caller might do a lot of things in the attached
routine, causing hidden problems later.

I agree that these points should be addressed.
I'll post an updated patch.

I've also received some off-list feedback, and I'm planning to incorporate
those comments into the updated patch as well.


On 2026-06-26 21:50, Robert Haas wrote:
On Thu, Jun 25, 2026 at 6:33 AM Andrei Lepikhov <[email protected]> wrote:
Ok, so just mention this behaviour in the documentation - let people know that they should potentially wait for a minute or two more to see the EXPLAIN. Real-life with huge queries and big machines provide us with examples where a backend might delay a response to a signal for quite a substantial time.

Sometimes nothing happens after such an async operation, and we need to identify
the problem: has the backend stalled, is the ‘logging plan’ algorithm
ineffective, or is something else happening?

I don't think we document this in other, similar cases. Many things
can be delayed if the machine is overloaded, but unless I am missing
something, having to wait over a minute for this to work would be
*extremely* unusual and only happen in case of a machine that is
absolutely being crushed by the load. And in that case everything else
will be slow, too.

I think a delay caused by signal handling can also occur with the existing functions, such as pg_cancel_backend() and pg_log_backend_memory_contexts().

Since the documentation does not specifically mention such delays for those
functions, I thought it may not be necessary to document this point for
pg_log_query_plan() either.


On the other hand, since pg_log_query_plan() emits the log output at the
next time ExecProcNodeFirst() is executed, if there is a delay before
ExecProcNodeFirst() is called, there would also be a delay before the
plan is logged.

For example, this can happen with a query such as SELECT pg_sleep(100),
or when executing a time-consuming sort operation like the following:

  BEGIN;
  DECLARE sort_probe NO SCROLL CURSOR FOR
  SELECT g, md5(g::text) AS k
  FROM generate_series(1, 100000000) AS g
  ORDER BY k;

  FETCH 1 FROM sort_probe;

I think this behavior is specific to pg_log_query_plan(), so I am
starting to wonder whether something should be mentioned in the
documentation.


Thanks,

--
Atsushi Torikoshi
Seconded from NTT DATA CORPORATION to SRA OSS K.K.


Reply via email to