Corinna Vinschen wrote:
On Oct 10 20:04, Corinna Vinschen wrote:
On Oct 10 18:36, Christian Franke wrote:
After a nonblocking connect(), postfix calls poll() with pollfd.events =
POLLIN only. If poll() succeeds, it calls recv(). This fails with ENOTCONN
because the state is still connect_pending.
Oh. So it doesn't check if the connect succeeded? Does it check the
poll result for POLLERR or does it explicitely check for revents==POLLIN?
Hmm.
[...time passes...]
It looks like you catched a long-standing bug here.
This isn't even AF_LOCAL specific. The original comment in the
write_selected branch is misleading: The AF_LOCAL specific part is just
the call to af_local_connect, not setting the connect_state. There was
a previous, longer comment at one point which I shortened for no good
reason in 2005:
/* eid credential transaction on successful non-blocking connect.
Since the read bit indicates an error, don't start transaction
if it's set. */
However, If I'm not completely mistaken, your patch would only work in
the aforementioned scenario if setsockopt(SO_PEERCRED) has been called.
Otherwise the handshake would be skipped on the connect side and thus
the handshake would fail on the server side. There's also the problem
that read_ready may indicate an error. And POLLERR is only set if the
socket is polled for POLLOUT so a failing connect would go unnoticed.
In short, the whole code is written under the assumption that any sane
application calling nonblocking connect would always call select/poll to
check if connect succeeded in the first place. Obviously, as postfix
shows, this is a wrong assumption.
I'm not yet sure how to fix this, but I'll look into this next week.
I applied a fix which, I think, is much more elegant than the former
solution. The af_local_connect call is now called as soon as an
FD_CONNECT event is generated and read by a call to wait_event. It
worked for me, so I have tender hopes that I didn't miss something.
I also applied your patch on top of this new stuff and I'm just building
a new developer snapshot for testing.
A quick test of current postfix draft with the snapshot works as
expected. Thanks.
In setsockopt I added a check for
socket family and type so setsockopt(SO_PEERCRED) only works for
AF_LOCAL/SOCK_STREAM sockets, just as the entire handshake stuff.
Probably not needed because this check was already in
af_local_set_no_getpeereid() itself.
I
also added a comment to explain why we do this and a FIXME comment so we
don't forget we're still looking for a more generic solution for the
SO_PEERCRED exchange.
Definitely, at least because the current AF_LOCAL emulation has some
security issues.
Thanks,
Christian