On 2023-06-03 02:51, James Coleman wrote:
Hello,
Thanks for working on this patch!
On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikos...@oss.nttdata.com>
wrote:
BTW, since this patch depends on ProcessInterrupts() and EXPLAIN codes
which is used in auto_explain, I'm feeling that the following
discussion
also applies to this patch.
> --
>
https://www.postgresql.org/message-id/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com
>
> explaining a query is a pretty
> complicated operation that involves catalog lookups and lots of
> complicated stuff, and I don't think that it would be safe to do all
> of that at any arbitrary point in the code where ProcessInterrupts()
> happened to fire.
If I can't come up with some workaround during the next Commitfest,
I'm
going to withdraw this proposal.
While at PGCon this week I'd brought up this idea with a few people
without realizing you'd already worked on it previously, and then
after I discovered this thread several of us (Greg, Ronan, David,
Heikki, and myself -- all cc'd) discussed the safety concerns over
dinner last night.
Our conclusion was that all of the concerns we could come up with (for
example, walking though the code for ExplainTargetRel() and discussing
the relevant catalog and syscache accesses) all applied specifically
to Robert's concerns about running explain inside an aborted
transaction. After all, I'd originally started that thread to ask
about running auto-explain after a query timeout.
To put it positively: we believe that, for example, catalog accesses
inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
an existing valid transaction/query state, as it would be for this
patch -- are safe. If there were problems, then those problems are
likely bugs we already have in other CFI cases.
Thanks a lot for the discussion and sharing it!
I really appreciate it.
BTW I'm not sure whether all the CFI are called in valid transaction,
do you think we should check each of them?
Another concern Robert raised may apply here: what if a person tries
to cancel a query when we're explaining? I believe that's a reasonable
question to ask, but I believe it's a trade-off that's worth making
for the (significant) introspection benefits this patch would provide.
Is the concern here limited to the case where explain code goes crazy
as Robert pointed out?
If so, this may be a trade-off worth doing.
I am a little concerned about whether there will be cases where the
explain code is not problematic but just takes long time.
On to the patch itself: overall I think it looks like it's in pretty
good shape. I also noticed we don't have any tests (I assume it'd have
to be TAP tests) of the actual output happening, and I think it would
be worth adding that.
Checking the log output may be better, but I didn't add it since there
is only a little content that can be checked, and similar function
pg_log_backend_memory_contexts() does not do such type of tests.
Are you interested in re-opening this patch? I'd be happy to provide
further review and help to try to push this along.
Sure, I'm going to re-open this.
I've rebased the patch and attached as v26.
Thanks again for your work!
--
Regards,
--
Atsushi Torikoshi
NTT DATA CORPORATION