On Fri, 30 Aug 2024 at 22:12, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Jelte Fennema-Nio <postg...@jeltef.nl> writes: > > On Fri, Aug 30, 2024, 21:21 Tom Lane <t...@sss.pgh.pa.us> wrote: > >> While we're piling on, has anyone noticed that *non* Windows buildfarm > >> animals are also failing this test pretty frequently? > > > Yes. Fixes are here (see the ~10 emails above in the thread for details): > > https://www.postgresql.org/message-id/cageczqqo8cn2rw45xuymvzxesssst7-bcruuzufmbgqc3ue...@mail.gmail.com > > Hmm. I'm not convinced that 0001 is an actual *fix*, but it should > at least reduce the frequency of occurrence a lot, which'd help.
I also don't think it's an actual fix, but I couldn't think of a way to fix this. And since this only happens if you cancel right at the start of a postgres_fdw query, I don't think it's worth investing too much time on a fix. > I don't want to move the test case to where you propose, because > that's basically not sensible. But can't we avoid remote estimates > by just cross-joining ft1 to itself, and not using the tables for > which remote estimate is enabled? Yeah that should work too (I just saw your next email, where you said it's committed like this). > I think 0002 is probably outright wrong, or at least the change to > disable_statement_timeout is. Once we get to that, we don't want > to throw a timeout error any more, even if an interrupt was received > just before it. The disable_statement_timeout change was not the part of that patch that was necessary for stable output, only the change in the first branch of enable_statement_timeout was necessary. The reason being that enable_statement_timeout is called multiple times for a query, because start_xact_command is called multiple times in exec_simple_query. The change to disable_statement_timeout just seemed like the logical extension of that change, especially since there was basically a verbatim copy of disable_statement_timeout in the second branch of enable_statement_timeout. To make sure I understand your suggestion correctly: Are you saying you would want to completely remove the outstanding interrupt if it was caused by de statement_timout when disable_statement_timeout is called? Because I agree that would probably make sense, but that sounds like a more impactful change. But the current behaviour seems strictly worse than the behaviour proposed in the patch to me, because currently the backend would still be interrupted, but the error would indicate a reason for the interrupt that is simply incorrect i.e. it will say it was cancelled due to a user request, which never happened.