On Wed, Apr 03, 2024 at 01:14:34PM +0100, Sad Clouds wrote:
S> Hello, I have a client/server networking application that exhibits
S> TCP socket handling errors. This only happens on FreeBSD, while NetBSD,
S> Linux, Solaris, etc. all seem to work correctly. I was hoping to get
S> some advice on what could be the root cause.
S> 
S> I have two processes, a client and a server, sending and receiving data
S> to/from each other on 127.0.0.1.
S> 
S> Client connects to server and calls send(2)/recv(2) in a loop. This is
S> a bidirectional data exchange. When all send data is transferred,
S> client calls shutdown(sockfd, SHUT_WR) and continues receiving data on
S> the same socket until recv(2) returns 0 bytes, which signals end of
S> receive data. At this stage client calls close(sockfd) and terminates.
S> 
S> Server has the same data transfer loop as the client.
S> 
S> I frequently get ECONNRESET when calling close(2), sometimes from the
S> server and sometimes from the client process. This should not be
S> happening, but I'm not sure what could be causing it.
S> 
S> The client logic is as follows:
S> 
S> 1. Set sockfd nonblocking.
S> 2. Call send(2)/recv(2) in a loop until N bytes have been transferred in
S>    each direction.
S> 3. Set sockfd blocking.
S> 4. Call send_buf() to send control handshake to server.
S> 5. Call shutdown(sockfd, SHUT_WR) to signal end of send data from client.
S> 6. Call recv_buf() to receive control handshake from server.
S> 7. Call recv_buf() and verify it returned 0 bytes to indicate end of data
S>    from server.
S> 8. Call close(sockfd) and verify success.
S> 
S> Step 8 sometimes fails and returns ECONNRESET.
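
For readers following along, steps 3 to 8 above could look roughly like the
sketch below. It is not the poster's code: client_finish() is a hypothetical
name, the 4-byte "DONE" token stands in for a control handshake whose format
the post does not describe, and plain send(2)/recv(2) with MSG_WAITALL is
used in place of the send_buf()/recv_buf() wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumption: the control handshake is a fixed 4-byte token. The original
 * post does not say what it contains. */
#define CTRL_LEN 4

/* Hypothetical helper covering steps 3-8 of the client logic.
 * Returns 0 on success, or the number of the step that failed
 * (errno then holds the cause for failed system calls). */
static int client_finish(int fd)
{
    char ctrl[CTRL_LEN] = { 'D', 'O', 'N', 'E' };
    char extra;
    int flags;

    assert(fd >= 0);

    /* Step 3: switch the socket back to blocking mode. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
        return 3;

    /* Step 4: send the control handshake. A real send_buf() would loop
     * on short writes; a single send(2) is enough for 4 bytes here. */
    if (send(fd, ctrl, CTRL_LEN, 0) != CTRL_LEN)
        return 4;

    /* Step 5: half-close, telling the peer we have no more data. */
    if (shutdown(fd, SHUT_WR) == -1)
        return 5;

    /* Step 6: wait for the peer's control handshake. */
    if (recv(fd, ctrl, CTRL_LEN, MSG_WAITALL) != CTRL_LEN)
        return 6;

    /* Step 7: the next read must return 0, i.e. the peer's end of data. */
    if (recv(fd, &extra, 1, 0) != 0)
        return 7;

    /* Step 8: close and report failure; this is where the post sees
     * ECONNRESET on FreeBSD. */
    if (close(fd) == -1)
        return 8;

    return 0;
}
```

Returning the step number makes it easy to log which call failed and then
inspect errno, which is how a failure at step 8 would be told apart from a
failure earlier in the sequence.
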
S> 
S> Functions send_buf() and recv_buf() are wrappers around send(2) and
S> recv(2) which restart those system calls until the specified number of
S> buffer bytes have been fully transferred or 0 is returned in the case
S> of recv_buf() indicating end of data. They are designed to work with
S> blocking file descriptors and avoid short reads/writes.
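
The wrappers themselves were not posted, so the following is only a guess at
their shape based on the description above. The return conventions and the
choice to restart on EINTR are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed shape of send_buf(): send exactly len bytes on a blocking
 * socket. Returns 0 on success, -1 on error (errno set by send(2)). */
static int send_buf(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: restart the call */
            return -1;
        }
        p += n;                     /* short write: advance and retry */
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed shape of recv_buf(): receive up to len bytes on a blocking
 * socket. Returns the number of bytes received: len on a full read,
 * fewer (possibly 0) if the peer's end of data arrived first, or -1
 * on error (errno set by recv(2)). */
static ssize_t recv_buf(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL || len == 0);

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: restart the call */
            return -1;
        }
        if (n == 0)
            break;                  /* peer sent FIN: end of data */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

With these conventions, step 7 of the client logic amounts to checking that
recv_buf() returns 0.
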
S> 
S> I don't understand why close(2) sometimes returns ECONNRESET when the
S> previous recv(2) call at step 7 returned 0 bytes, indicating the remote
S> TCP end sent us a FIN.
S> 
S> I don't set the SO_LINGER socket option, and when I checked the default
S> on FreeBSD it reports l_onoff=0, l_linger=0, so there should be no
S> immediate RST on socket close(2).
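
The linger check described above can be reproduced with getsockopt(2). This
is a minimal sketch; get_linger() is a hypothetical helper name, not
something from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: read the SO_LINGER setting of a socket.
 * Returns 0 and fills *out on success, -1 on error (errno set by
 * getsockopt(2)). */
static int get_linger(int fd, struct linger *out)
{
    socklen_t len = sizeof(*out);

    assert(out != NULL);

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```

After a successful call, out->l_onoff and out->l_linger hold the two values
quoted in the post.
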
S> 
S> Does anyone have any suggestions or ideas?

Please take a look at https://reviews.freebsd.org/D48148

-- 
Gleb Smirnoff
