I wrote:
> Hmm. I'm not convinced that 0001 is an actual *fix*, but it should
> at least reduce the frequency of occurrence a lot, which'd help.

After enabling log_statement = all to verify what commands are being
sent to the remote, I realized that there's a third thing this patch
can do to stabilize matters: issue a regular remote query inside the
test transaction, before we enable the timeout. This will ensure
that we've dealt with configure_remote_session() and started a
remote transaction, so that there aren't extra round trips happening
for that while the clock is running.

Pushed with that addition and some comment-tweaking. We'll see
whether that actually makes things more stable, but I don't think
it could make it worse.

			regards, tom lane