I agree this would be fine if PostgreSQL worked the way you say below. However, PostgreSQL does not look at the number of bytes written and continue sending after that many bytes. On this error it simply clears its buffer of bytes to send, in this code:
pqcomm.c:1075

        /*
         * We drop the buffered data anyway so that processing can
         * continue, even though we'll probably quit soon.
         */
        PqSendPointer = 0;
        return EOF;

The result as I saw on a system where this was occurring, was that when PostgreSQL was sending back a large result set, there was simply a fragment of it missing.

scot.

-----Original Message-----
From: Jeff Davis [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 08, 2008 2:02 PM
To: Scot Loach
Cc: pgsql-bugs@postgresql.org
Subject: RE: [BUGS] BUG #3855: backend sends corrupted data on EHOSTDOWN error

On Tue, 2008-01-08 at 12:57 -0500, Scot Loach wrote:
> This may be true, but I still think PostgreSQL should be more
> defensive and actively terminate the connection when this happens
> (like ssh does)

I think postgresql's behavior is well within reason. Let me explain:

What is happening is that FreeBSD *actually sends the data* before returning EHOSTDOWN as an error, and leaving the TCP connection open! At the time I was tracking this problem down, I wrote a C program to demonstrate that fact. This is the core of the reason why it's a protocol violation in PostgreSQL (or SSL error) rather than a disconnection.

I think PostgreSQL is making the assumption here that an unrecognized error code from send() that leaves the connection in a good state, is a temporary error that may be resolved. Thus, PostgreSQL assumes that due to the error, no data was written, and re-sends the data, succeeding this time. I reason that the openssl library makes similar assumptions (i.e. assuming an error means the data wasn't sent, and resets some internal SSL protocol state), otherwise I wouldn't get SSL errors afterward, but it would manifest itself as a PostgreSQL protocol violation regardless of whether you're using SSL or not.

If the OS leaves a TCP connection open, I think it is perfectly reasonable for an application to assume that the OS has sent exactly as many bytes as it said it sent; no more, no less.
I would lean toward the opinion that postgresql works just fine now, and that TCP is explicitly designed to prevent these kinds of problems, and we only see this problem because FreeBSD 6.1 TCP is broken.

Regards,
	Jeff Davis
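
For illustration, the two behaviors argued over in this thread (dropping the buffered data and carrying on, versus terminating the connection on the error) can be compared with a small simulation. This is only a sketch under stated assumptions, not the PostgreSQL source: `struct conn`, `struct fake_sock`, `fake_send`, `queue_bytes`, `flush_buf` and `run_scenario` are all hypothetical names, and the fake socket simply fails without transmitting anything, which is simpler than the FreeBSD behavior described above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef EHOSTDOWN
#define EHOSTDOWN EIO            /* fallback for platforms lacking EHOSTDOWN */
#endif

#define BUF_SIZE 64

/* Hypothetical fake socket: records every byte that "reaches the wire". */
struct fake_sock {
    char   wire[2 * BUF_SIZE];
    size_t wire_len;
    size_t chunk;                /* max bytes accepted per send call */
    int    calls;                /* number of send calls so far */
    int    fail_on_call;         /* this call number fails with EHOSTDOWN */
};

/* Hypothetical connection state; not the real pqcomm.c variables. */
struct conn {
    char   buf[BUF_SIZE];        /* pending output */
    size_t len;                  /* bytes pending in buf */
    int    broken;               /* set once the connection is given up */
    struct fake_sock sock;
};

/* Accepts at most `chunk` bytes, or fails once with EHOSTDOWN. */
static long fake_send(struct fake_sock *s, const char *p, size_t n)
{
    s->calls++;
    if (s->calls == s->fail_on_call ||
        s->wire_len + s->chunk > sizeof s->wire) {
        errno = EHOSTDOWN;
        return -1;
    }
    if (n > s->chunk)
        n = s->chunk;
    memcpy(s->wire + s->wire_len, p, n);
    s->wire_len += n;
    return (long) n;
}

/* Appends a string to the pending output; returns -1 if it does not fit. */
static int queue_bytes(struct conn *c, const char *s)
{
    size_t n = strlen(s);

    if (c->len + n > BUF_SIZE)
        return -1;
    memcpy(c->buf + c->len, s, n);
    c->len += n;
    return 0;
}

/*
 * Sends the pending output. On a send error the buffer is dropped, as in
 * the quoted snippet. If close_on_error is nonzero, the connection is also
 * marked broken so that nothing further is ever sent on it.
 */
static int flush_buf(struct conn *c, int close_on_error)
{
    size_t sent = 0;

    if (c->broken)
        return -1;
    while (sent < c->len) {
        long r = fake_send(&c->sock, c->buf + sent, c->len - sent);

        if (r <= 0) {
            c->len = 0;          /* drop the buffered data */
            if (close_on_error)
                c->broken = 1;
            return -1;
        }
        sent += (size_t) r;
    }
    c->len = 0;
    return 0;
}

/* Runs one scenario and copies what reached the wire into out. */
void run_scenario(int close_on_error, char *out, size_t out_size)
{
    struct conn c;

    memset(&c, 0, sizeof c);
    c.sock.chunk = 4;            /* 4 bytes per send call */
    c.sock.fail_on_call = 2;     /* the second send call fails */

    queue_bytes(&c, "AAAABBBB");
    flush_buf(&c, close_on_error);   /* "AAAA" goes out, then the error hits */
    queue_bytes(&c, "CCCC");
    flush_buf(&c, close_on_error);

    assert(out_size > c.sock.wire_len);
    memcpy(out, c.sock.wire, c.sock.wire_len);
    out[c.sock.wire_len] = '\0';
}
```

With `close_on_error` set to 0, the wire ends up holding "AAAACCCC": the "BBBB" fragment is silently missing and the stream continues as if nothing happened. With `close_on_error` set to 1, the wire holds only "AAAA" and nothing more is sent, so the peer would see a disconnection rather than a corrupted stream.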