On 4/6/2021 10:50 AM, sten.kristian.ivars...@gmail.com wrote:

Using AF_UNIX/SOCK_DGRAM with the current version (3.2.0) seems to
drop messages, or at least they are not received in the same order
they are sent

[snip]

Thanks for the test case.  I can confirm the problem.  I'm not
familiar enough with the current AF_UNIX implementation to debug
this easily.  I'd rather spend my time on the new implementation (on
the topic/af_unix branch).  It turns out that your test case fails
there too, but in a completely different way, due to a bug in sendto
for datagrams.  I'll see if I can fix that bug and then try again.

Ken

Ok, too bad it wasn't our own code base but good that the "mystery"
is verified

I finally succeeded in building topic/af_unix (after finding out what
version of zlib was needed), but without adding -D__WITH_AF_UNIX to
CXXFLAGS, so I haven't tested it yet

Is it sufficient to add the define to the "main" Makefile, or do you
have to add it to all the Makefiles? I guess I can find out though

I do it on the configure line, like this:

   ../af_unix/configure CXXFLAGS="-g -O0 -D__WITH_AF_UNIX" --prefix=...

Is topic/af_unix fairly up to date with the master branch?

Yes, I periodically cherry-pick commits from master to topic/af_unix.
I'll do that again right now.

Either way, I'll be glad to help out testing topic/af_unix

Thanks!

I've now pushed a fix for that sendto bug, and your test case runs without
error on the topic/af_unix branch.

It seems like the test case does work now with topic/af_unix in blocking mode,
but when using non-blocking mode (with MSG_DONTWAIT) there are some issues, I think:

1. When the queue is empty with non-blocking recv(), errno is set to EPIPE, but
I think it should be EAGAIN (or maybe the pipe really is getting broken for
some reason?)

2. When using non-blocking recv() and no message is written at all, it seems 
like recv() blocks forever

3. Using non-blocking recv() where the "client" sends fewer than "count"
messages, recv() sometimes blocks forever (as well)
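
The behaviour expected in item 1 can be checked with a small sketch like the one below. It is not the snipped test case: the function name and the socket path passed to it are made up for illustration. It binds an AF_UNIX datagram socket, calls recv() once with MSG_DONTWAIT before anything has been sent, and returns the errno it observed. On Linux this gives EAGAIN (which has the same value as EWOULDBLOCK there).

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the original test case.
   Binds a datagram socket to `path`, calls recv() once with MSG_DONTWAIT
   while nothing has been sent, and returns the errno observed.
   Returns 0 if recv() unexpectedly succeeded, -1 if the setup failed. */
int errno_of_empty_nonblocking_recv(const char *path)
{
    struct sockaddr_un addr;
    char buf[64];
    ssize_t n;
    int fd, err;

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
    unlink(path);               /* remove a stale socket file, if any */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }

    /* The descriptor itself is blocking; only this call is non-blocking. */
    n = recv(fd, buf, sizeof buf, MSG_DONTWAIT);
    err = (n < 0) ? errno : 0;

    close(fd);
    unlink(path);
    return err;
}
```

Calling it with a scratch path such as "/tmp/dgram_empty.sock" and comparing the result against EAGAIN, EWOULDBLOCK and EPIPE shows which of the cases above applies on a given build.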


My naïve analysis is that for the first issue (if it is one) the wrong errno is set, and
for the second issue recv() blocks if no sendto() is done after the first recv(), i.e.
nothing kicks the "reader thread" in the butt to make it realise the queue is empty. It
is not entirely clear, though, what POSIX says about creating blocking descriptors and then
using non-blocking flags with recv(), but this works on Linux anyway
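
For reference, the pattern in question (a blocking descriptor, with MSG_DONTWAIT passed per call) might look like the sketch below. It is an illustration only, not the original test case: the function name is made up, and it uses socketpair() instead of a bound socket with sendto() to keep it short. It sends a few numbered datagrams, then drains the queue with non-blocking recv() calls, stopping cleanly on EAGAIN/EWOULDBLOCK when fewer than "count" messages were sent.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send `sent` numbered datagrams, then try to receive
   up to `count` of them with MSG_DONTWAIT on a blocking descriptor.
   Returns how many arrived in order, or -1 on an error or an
   out-of-order message. Keep `sent` small: send() is blocking here and
   would wait if the receive queue filled up. */
int drain_in_order(int sent, int count)
{
    int sv[2];
    int i, got = 0;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    for (i = 0; i < sent; i++) {
        if (send(sv[0], &i, sizeof i, 0) != (ssize_t)sizeof i) {
            got = -1;
            goto out;
        }
    }

    while (got < count) {
        int value;
        ssize_t n = recv(sv[1], &value, sizeof value, MSG_DONTWAIT);

        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;          /* queue is empty: stop instead of blocking */
            got = -1;           /* any other errno is reported as a failure */
            break;
        }
        if (n != (ssize_t)sizeof value || value != got) {
            got = -1;           /* short or out-of-order datagram */
            break;
        }
        got++;
    }

out:
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

With this sketch, drain_in_order(3, 10) corresponds to case 3 above (fewer messages than "count") and drain_in_order(0, 5) to case 2 (no message written at all). On Linux both return without blocking.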

The explanation is actually much simpler. In the recv code where a bound datagram socket waits for a remote socket to connect to the pipe, I simply forgot to handle MSG_DONTWAIT. I've pushed a fix. Please retest.

I should add that in all my work so far on the topic/af_unix branch, I've thought mainly about stream sockets. So there may still be things remaining to be implemented for the datagram case.

Ken
