On Tue, Dec 13, 2016 at 11:19 AM, Andres Freund <and...@anarazel.de> wrote: > On 2016-12-12 16:46:38 +0900, Michael Paquier wrote: >> OK, I think that I have spotted an issue after additional read of the >> code. When a WSA event is used for read/write activity on a socket, >> the same WSA event gets reused again and again. That's fine for >> performance reasons > > It's actually also required to not loose events, > i.e. correctness. Windows events are edge, not level triggered.
Windows might not lose the event, but PostgreSQL can certainly lose the event. In the Windows implementation of WaitEventSetBlock, we'll ignore WL_SOCKET_READABLE if WL_LATCH_SET also fires. Now, since the event is edge-triggered[1], it never fires again and we hang. Oops. Even if we rewrote WaitEventSetBlock to always process all of the events, there's nothing in the latch API that requires the caller to read from the socket the first time WL_SOCKET_READABLE shows up. The caller is perfectly entitled to write: if (rv & WL_LATCH_SET) // do stuff else if (rv & WL_SOCKET_READABLE) // do other stuff I suppose we could make a coding rule that the caller must not rely on WL_SOCKET_READABLE being level-triggered, but I think that's likely to lead to a stream of bugs that only show up on Windows, because most people are used to the Linux semantics and will naturally write code like the above which isn't robust in the face of an edge-triggered event -- just as you yourself did in WaitEventSetBlock. > The > previous implementation wasn't correct. So just resetting things ain't > ok. How was it incorrect? The previous implementation *worked* and the new one doesn't, at least in the case complained of here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms741576(v=vs.85).aspx -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers