Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

Thomas Munro Wed, 17 Jul 2024 18:07:41 -0700

On Thu, Jul 18, 2024 at 7:00 AM Alexander Lakhin <[email protected]> wrote:
> As far as I can see (having analyzed a number of runs), the hanging occurs
> when some itimer-related activity happens before "peek_socket" in this
> event sequence:
> [main] postgres {pid} select_stuff::wait: res after verify 0
> [main] postgres {pid} select_stuff::wait: returning 0
> [main] postgres {pid} select: sel.wait returns 0
> [main] postgres {pid} peek_socket: read_ready: 0, write_ready: 1, except_ready: 0
>
> (See the last occurrence of the sequence in the log.)


Yeah, right, there's a lot going on between those two lines from the
[main] thread.  There are messages from helper threads [itimer], [sig]
and [socksel].  At a guess, [socksel] might be doing extra secret
communication over our socket in order to exchange SO_PEERCRED
information.  Huh, is that always there?  Seems worth filing a bug
report.

For the record, I know of one other occasional test failure on Cygwin:
it randomly panics in SnapBuildSerialize().  While I don't expect
there to be any users of PostgreSQL on Cygwin (it was unusably broken
before we refactored the postmaster in v16), that one is interesting
because (1) it also happens on native Windows builds, and (2) at least
one candidate fix[1] sounds like it would speed up logical replication
on all operating systems.

[1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BJ4jSFk%3D-hdoZdcx%2Bp7ru6xuipzCZY-kiKoDc2FjsV7g%40mail.gmail.com#afb5dc4208cc0776a060145f9571dec2
