I wrote:
> Lastly, I noticed that if I tried this repeatedly on a Unix socket,
> I sometimes got
>
> psql: server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> could not send startup packet: Broken pipe
>
> rather than the expected results.  I think what is happening here is a
> race condition, such that if the postmaster closes the socket without
> having read the startup packet, the client might not have actually
> gotten to its send() yet, and then it will get EPIPE from send() before
> it gets to the point of reading the error response.  I tried to fix this
> by having report_fork_failure_to_client eat any pending data before
> responding:

I've applied patches for the other two issues, but I'm having second
thoughts about trying to hack around this one.  The proposed patch
doesn't really eliminate the problem, and in any case the message is
not totally off base: the server did close the connection unexpectedly.
It'd be nicer if users didn't have to look in the server log to find
out why, but we can't guarantee that.

However, I've developed a second concern about
report_fork_failure_to_client, which is its habit of sending the fork
failure message formatted according to 2.0 protocol.  This causes libpq
(and possibly other clients) to suppose that it's talking to a pre-7.4
server and try again in 2.0 protocol.  So if the fork failure is
transient, you have the problem of being unexpectedly and silently
downgraded to 2.0 protocol.

We could fix that by changing the function to send the message in 3.0
protocol always --- it would take more code but it's certainly doable.
The trouble with that is that a pre-7.4 libpq would see the error
message as garbage; and I'm not sure how pleasantly the JDBC driver
handles it either, if it is trying to use 2.0 protocol.  A more
long-range point about it is that the next time we make a protocol
version bump that affects the format of error messages, the problem
comes right back.
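To make the difference between the two wire formats concrete, here is a rough Python sketch (not the postmaster's C code). It builds an ErrorResponse in 2.0 style ('E' plus a NUL-terminated string) and in 3.0 style ('E', a 4-byte length, then typed fields). The function names, the SQLSTATE "53000", and the length-plausibility check are illustrative assumptions, not something taken from the sources.

```python
import struct


def error_v2(message: str) -> bytes:
    """Protocol 2.0 ErrorResponse: 'E' followed by a NUL-terminated string."""
    return b"E" + message.encode("utf-8") + b"\0"


def error_v3(severity: str, sqlstate: str, message: str) -> bytes:
    """Protocol 3.0 ErrorResponse: 'E', Int32 length, typed fields, NUL."""
    body = b""
    # Field codes: S = severity, C = SQLSTATE code, M = primary message.
    for code, value in ((b"S", severity), (b"C", sqlstate), (b"M", message)):
        body += code + value.encode("utf-8") + b"\0"
    body += b"\0"  # a lone zero byte terminates the field list
    # The length word counts itself (4 bytes) but not the leading 'E'.
    return b"E" + struct.pack("!I", len(body) + 4) + body


def looks_like_v3_error(packet: bytes, max_len: int = 30000) -> bool:
    """Hypothetical client-side check: is the word after 'E' a sane length?

    In a 2.0-style message those four bytes are message text, which
    decodes to an implausibly large number. The bounds are assumptions.
    """
    if len(packet) < 5 or packet[:1] != b"E":
        return False
    (length,) = struct.unpack("!I", packet[1:5])
    return 8 <= length <= max_len


msg = "could not fork new process for connection: Resource temporarily unavailable\n"
v2_packet = error_v2(msg)
v3_packet = error_v3("FATAL", "53000", msg)  # SQLSTATE chosen for illustration
```

A client that applies a check like looks_like_v3_error() to v2_packet would conclude it is talking to an old server and retry in 2.0 protocol, which is the silent downgrade described above. The same check accepts v3_packet, but a client that only speaks 2.0 would read the length bytes and field codes as part of the message text.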
It'd be better if the message somehow indicated that the server hadn't
made any attempt to match the client protocol version.  I guess if we
went up to 3.0 protocol, we could include a SQLSTATE value in the
message and libpq could test that before making assumptions.

Thoughts?

			regards, tom lane