At Wed, 10 Aug 2011 20:37:46 +0300,
Timo Sirainen wrote:
> On 2.8.2011, at 5.25, SATOH Fumiyasu wrote:
>
> >>> Dovecot ignores EINPROGRESS on connect(2) for non-blocking fd.
> >>> This is wrong. After that, read(2) to fd (or write(2) to fd) fails
> >>> with ENOTCONN if the connection of fd is not completed.
> >>>
> >>> The attached patch fixes this problem.
>
> If you do that, then there's no point in making the socket
> non-blocking before connect().
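
For illustration, the wait that the manual pages quoted below describe
might look roughly like this in C. This is only a sketch, not the
attached patch and not Dovecot code: connect_nonblock_and_wait(),
wait_connect_finished() and connect_loopback() are hypothetical names,
and the timeout handling is an assumption.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until a pending connect() on fd finishes.
 * Returns 0 on success, -1 with errno set on failure or timeout. */
static int wait_connect_finished(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ret, err = 0;
    socklen_t len = sizeof(err);

    pfd.fd = fd;
    pfd.events = POLLOUT;   /* writable means the connect() has finished */
    pfd.revents = 0;

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -1;
    if (ret == 0) {
        errno = ETIMEDOUT;  /* assumption: report a poll timeout this way */
        return -1;
    }

    /* Writability alone does not mean success; SO_ERROR has the result. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}

/* Hypothetical helper: non-blocking connect() that treats EINPROGRESS
 * as "not finished yet" instead of ignoring it. */
static int connect_nonblock_and_wait(int fd, const struct sockaddr *sa,
                                     socklen_t sa_len, int timeout_ms)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    if (connect(fd, sa, sa_len) == 0)
        return 0;           /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;          /* real failure */
    return wait_connect_finished(fd, timeout_ms);
}

/* Usage example: connect to 127.0.0.1:port over TCP.
 * Returns a connected fd, or -1 with errno set. */
static int connect_loopback(unsigned short port, int timeout_ms)
{
    struct sockaddr_in sin;
    int fd, saved_errno;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(port);

    if (connect_nonblock_and_wait(fd, (const struct sockaddr *)&sin,
                                  sizeof(sin), timeout_ms) < 0) {
        saved_errno = errno;    /* close() may overwrite errno */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}
```

Only after connect_nonblock_and_wait() returns 0 is it safe to call
read(2) or write(2) on the fd. The fd stays non-blocking afterwards.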

Linux connect(2) manpage said:

    EINPROGRESS
        The socket is nonblocking and the connection cannot be
        completed immediately. It is possible to select(2) or poll(2)
        for completion by selecting the socket for writing. After
        select(2) indicates writability, use getsockopt(2) to read the
        SO_ERROR option at level SOL_SOCKET to determine whether
        connect() completed successfully (SO_ERROR is zero) or
        unsuccessfully (SO_ERROR is one of the usual error codes listed
        here, explaining the reason for the failure).

Solaris 10 connect(3SOCKET) manpage said:

    EINPROGRESS
        The socket is non-blocking, and the connection cannot be
        completed immediately. You can use select(3C) to complete the
        connection by selecting the socket for writing.

Windows connect function document said
(http://msdn.microsoft.com/en-us/library/ms737625%28v=vs.85%29.aspx):

    With a nonblocking socket, the connection attempt cannot be
    completed immediately. In this case, connect will return
    SOCKET_ERROR, and WSAGetLastError will return WSAEWOULDBLOCK.
    In this case, there are three possible scenarios:

    * Use the select function to determine the completion of the
      connection request by checking to see if the socket is writeable.
    * If the application is using WSAAsyncSelect to indicate interest
      in connection events, then the application will receive an
      FD_CONNECT notification indicating that the connect operation is
      complete (successfully or not).
    * If the application is using WSAEventSelect to indicate interest
      in connection events, then the associated event object will be
      signaled indicating that the connect operation is complete
      (successfully or not).

> > On a high-load Solaris 10 box, dovecot-lda fails to query (I/O) to
> > dovecot dict socket with ENOTCONN. My patch fixes this problem.
>
> I think Linux/etc returns EAGAIN in such situation. Maybe the right
> fix is to just add EINPROGRESS check for net_connect_unix_with_retries()?
> (With some extra changes so that it actually sees that errno from
> net_connect_unix())

I think you MUST wait for the fd to complete connect() before calling
read() or write() on the fd in such a situation.

-- 
-- Name: SATOH Fumiyasu (fumiyas @ osstech co jp)
-- Business Home: http://www.OSSTech.co.jp/
-- Personal Home: http://www.SFO.jp/blog/