On 11/30/2020 9:22 PM, Norton Allen wrote:
> Yeah, so now the example no longer blocks for me. Unfortunately these
> bugs are not present in my application, so I will need to keep working
> on this.

After paring the main application down and building it back up, I
finally isolated the condition that was causing this blocking behavior.
The issue arises when a client calls connect() twice to the same server
on non-blocking unix-domain sockets before calling select().

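For readers who want to see the shape of the triggering pattern, here is a minimal sketch of a client that starts two non-blocking connects and only then calls select(). It is not the code in rapid_connects; the helper names start_nonblocking_connect() and wait_for_both() are hypothetical, and the socket path is whatever the server listens on.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a non-blocking unix-domain stream socket
 * and start connecting to `path`. EINPROGRESS means "still pending".
 * Returns the descriptor, or -1 on error. */
int start_nonblocking_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd, flags;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 &&
        errno != EINPROGRESS) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: select() for writability until both connects have
 * completed, checking SO_ERROR on each. Returns 0 on success, -1 on error. */
int wait_for_both(int fd1, int fd2)
{
    int fds[2] = { fd1, fd2 };
    int done[2] = { 0, 0 };

    while (!done[0] || !done[1]) {
        fd_set wfds;
        int i, maxfd = -1;

        FD_ZERO(&wfds);
        for (i = 0; i < 2; i++) {
            if (!done[i]) {
                FD_SET(fds[i], &wfds);
                if (fds[i] > maxfd)
                    maxfd = fds[i];
            }
        }
        if (select(maxfd + 1, NULL, &wfds, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (i = 0; i < 2; i++) {
            int err = 0;
            socklen_t len = sizeof(err);

            if (done[i] || !FD_ISSET(fds[i], &wfds))
                continue;
            if (getsockopt(fds[i], SOL_SOCKET, SO_ERROR, &err, &len) < 0 ||
                err != 0)
                return -1;
            done[i] = 1;
        }
    }
    return 0;
}
```

Calling start_nonblocking_connect() twice and then wait_for_both() on the two descriptors reproduces the ordering described above: both connect() calls are issued before the first select().
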
There are a few pieces to this. With the client configured to connect()
just once, I can see that the server's select() returns as soon as the
client calls connect(), but then the server's accept() blocks until the
client calls select(). That is not proper non-blocking behavior; it
appears that the Cygwin implementation requires the client and server
to handshake synchronously in order to complete the connect()
operation.

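The server side of that sequence can be sketched roughly as follows. This is an illustration under assumptions, not the server in rapid_connects: listen_unix() and accept_when_ready() are hypothetical names, and the backlog of 5 is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind and listen on a unix-domain stream socket at
 * `path`, removing any stale socket file first. Returns the listening
 * descriptor, or -1 on error. */
int listen_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait in select() until listen_fd is readable,
 * which signals a pending connection, then accept() it.
 * Returns the accepted descriptor, or -1 on error. */
int accept_when_ready(int listen_fd)
{
    fd_set rfds;

    assert(listen_fd >= 0 && listen_fd < FD_SETSIZE);
    for (;;) {
        FD_ZERO(&rfds);
        FD_SET(listen_fd, &rfds);
        if (select(listen_fd + 1, &rfds, NULL, NULL, NULL) >= 0)
            break;
        if (errno != EINTR)
            return -1;
    }
    /* This accept() is the call described above as stalling under Cygwin
     * until the client reaches its own select(). */
    return accept(listen_fd, NULL, NULL);
}
```
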
I tried running this under Ubuntu 16.04 and found that connect()
succeeded immediately, so no subsequent select() is required, and there
appears to be no opportunity for this collision. That holds true even
if the server is not waiting in select() to process the connect() with
accept().

A workaround for this issue may be to keep the socket blocking until
after connect() returns, and only then switch it to non-blocking mode.

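A minimal sketch of that workaround, assuming the client can tolerate a blocking connect(): the socket is left in its default blocking mode for connect(), and O_NONBLOCK is set afterwards with fcntl(). The function name connect_then_nonblock() is hypothetical, and I have not verified this against the Cygwin behavior described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to the unix-domain stream socket at `path`
 * while the descriptor is still blocking, then switch it to non-blocking.
 * Returns the connected descriptor, or -1 on error (errno is preserved). */
int connect_then_nonblock(const char *path)
{
    struct sockaddr_un addr;
    int fd, flags, saved;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    /* Blocking connect: returns only once the connection has been
     * established or has failed. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* Only now make the descriptor non-blocking for later I/O. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;
    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

The trade-off is that the client can stall in connect() if the server is slow to respond, so this only suits callers that can afford to wait at connection time.
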
I have pushed the new minimal example program, 'rapid_connects', to
https://github.com/nthallen/cygwin_unix

The server is run as before:

    $ ./rapid_connects server

The client can be run in two different modes. To connect with just one
socket:

    $ ./rapid_connects client1

To connect with two:

    $ ./rapid_connects client2

My immediate strategy will be to develop a workaround for my project.
Having spent a day inside cygwin1.dll, I can see that I face a steep
learning curve before I can make much of a contribution there.