The following bug has been logged online:

Bug reference:      3855
Logged by:          Scot Loach
Email address:      [EMAIL PROTECTED]
PostgreSQL version: 8.2.4
Operating system:   freebsd 6.1
Description:        backend sends corrupted data on EHOSTDOWN error
Details:

On FreeBSD, a send() call on the backend socket can fail with the error code EHOSTDOWN. This can happen, for example, if a host on the local LAN is temporarily unreachable. In this case the socket is not closed, and it may recover from this state. If it recovers, the backend may continue sending results from a query, but it will have dropped some data from the reply. The client is then out of sync with the server, which usually causes it to read an invalid message length. This can cause various issues, such as clients crashing or, more commonly, blocking forever while trying to read a large response that the server will never send.

This is due to the way the backend handles errors. The following code (pqcomm.c:1075) is what runs when an error occurs on the write:

        /*
         * We drop the buffered data anyway so that processing can
         * continue, even though we'll probably quit soon.
         */
        PqSendPointer = 0;
        return EOF;

This sets PqSendPointer to 0, which effectively discards any data that was waiting to be sent. The EOF return value propagates up the stack to pqformat.c, where it is ignored:

    void
    pq_endmessage(StringInfo buf)
    {
        /* msgtype was saved in cursor field */
        (void) pq_putmessage(buf->cursor, buf->data, buf->len);
        /* no need to complain about any failure, since pqcomm.c already did */
        pfree(buf->data);
        buf->data = NULL;
    }

In other words, postgres seems to expect that the connection will somehow be closed. For most errors that does happen: the stack closes the TCP connection and no harm is done. With this particular error, however, the connection stays open. The client waits forever for bytes the server will never send, and the server sits idle in its transaction, holding locks and waiting for a command from the client that will never come.

The backend should either close the connection itself in this case, or handle the error better by not clearing the send buffer.
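
To make the failure mode concrete, here is a minimal, self-contained sketch of the two buffer policies. It is not PostgreSQL code: SketchBuf, SketchPeer, sketch_send, sketch_put and sketch_flush are all hypothetical names invented for illustration, and the "peer" is an in-memory stand-in for a socket that fails one send call with EHOSTDOWN and then recovers. It only illustrates the second option above (keeping the unsent bytes); closing the connection on error would be the other way to avoid the desync.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef EHOSTDOWN
#define EHOSTDOWN 112           /* fallback so the sketch builds anywhere */
#endif

#define SKETCH_BUFSIZE 256

/* Hypothetical send buffer; "len" plays the role of PqSendPointer. */
typedef struct {
    char   data[SKETCH_BUFSIZE];
    size_t len;
} SketchBuf;

/* Hypothetical in-memory peer: accepts at most "chunk" bytes per call and
 * fails call number "fail_on_call" (1-based, 0 = never) with EHOSTDOWN. */
typedef struct {
    char   received[SKETCH_BUFSIZE];
    size_t nreceived;
    size_t chunk;
    int    calls;
    int    fail_on_call;
} SketchPeer;

long sketch_send(SketchPeer *p, const char *buf, size_t len)
{
    p->calls++;
    if (p->calls == p->fail_on_call) {
        errno = EHOSTDOWN;      /* transient failure; the peer stays open */
        return -1;
    }
    if (len > p->chunk)
        len = p->chunk;
    assert(p->nreceived + len <= sizeof p->received);
    memcpy(p->received + p->nreceived, buf, len);
    p->nreceived += len;
    return (long) len;
}

/* Queue bytes for sending; returns 0 on success, -1 if they do not fit. */
int sketch_put(SketchBuf *b, const char *s, size_t len)
{
    if (len > sizeof b->data - b->len)
        return -1;
    memcpy(b->data + b->len, s, len);
    b->len += len;
    return 0;
}

/* Flush the buffer; returns 0 on success, -1 on a send error.
 * keep_on_error == 0 mimics the reported behaviour (discard everything).
 * keep_on_error == 1 keeps the unsent tail so a later flush can resume. */
int sketch_flush(SketchBuf *b, SketchPeer *p, int keep_on_error)
{
    size_t off = 0;

    while (off < b->len) {
        long n = sketch_send(p, b->data + off, b->len - off);

        if (n < 0) {
            if (keep_on_error) {
                /* Move the unsent bytes to the front of the buffer. */
                memmove(b->data, b->data + off, b->len - off);
                b->len -= off;
            } else {
                b->len = 0;     /* unsent bytes are lost */
            }
            return -1;
        }
        off += (size_t) n;
    }
    b->len = 0;
    return 0;
}
```

With a chunk size of 4 and the second send call failing, queueing "AAAABBBB", flushing, then queueing "CCCC" and flushing again delivers "AAAACCCC" under the discard policy: the "BBBB" bytes vanish from the middle of the stream, which is the kind of gap that leaves a client misreading message boundaries. Under the keep policy the same sequence delivers "AAAABBBBCCCC". Retrying is only sensible if the error really is transient; a real fix would also need to decide when to give up and close the socket.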