On Wed, Feb 23, 2022 at 21:47, l...@laurent-hasson.com <l...@laurent-hasson.com> wrote:
>
> > -----Original Message-----
> > From: l...@laurent-hasson.com <l...@laurent-hasson.com>
> > Sent: Saturday, December 4, 2021 14:18
> > To: Justin Pryzby <pry...@telsasoft.com>
> > Cc: pgsql-performa...@postgresql.org
> > Subject: RE: An I/O error occurred while sending to the backend (PG 13.4)
> >
> > > -----Original Message-----
> > > From: Justin Pryzby <pry...@telsasoft.com>
> > > Sent: Saturday, December 4, 2021 12:59
> > > To: l...@laurent-hasson.com
> > > Cc: pgsql-performa...@postgresql.org
> > > Subject: Re: An I/O error occurred while sending to the backend (PG 13.4)
> > >
> > > On Sat, Dec 04, 2021 at 05:32:10PM +0000, l...@laurent-hasson.com wrote:
> > > > I have a data warehouse with a fairly complex ETL process that has
> > > > been running for years now across PG 9.6, 11.2 and now 13.4 for the
> > > > past couple of months. I have been getting the error "An I/O error
> > > > occurred while sending to the backend" quite often under load in 13.4
> > > > which I never used to get on 11.2. I have applied some tricks,
> > > > particularly with the socketTimeout JDBC configuration.
> > > >
> > > > So my first question is whether anyone has any idea why this is
> > > > happening? My hardware and general PG configuration have not
> > > > changed between 11.2 and 13.4 and I NEVER experienced this on 11.2
> > > > in about 2y of production.
> > > >
> > > > Second, I have one stored procedure that takes a very long time to
> > > > run (40mn more or less), so obviously, I'd need to set socketTimeout
> > > > to something like 1h in order to call it and not timeout. That
> > > > doesn't seem reasonable?
> > >
> > > Is the DB server local or remote (TCP/IP) to the client?
> > >
> > > Could you collect the corresponding postgres query logs when this
> > > happens?
> > >
> > > It'd be nice to see a network trace for this too. Using tcpdump or
> > > wireshark. Preferably from the client side.
> > >
> > > FWIW, I suspect the JDBC socketTimeout is a bad workaround.
> > >
> > > --
> > > Justin
> >
> > It's a remote server, but all on a local network. Network performance is,
> > I am sure, not the issue. Also, the system is on Windows Server. What are
> > you expecting to see out of a tcpdump? I'll try to get PG logs on the
> > failing query.
> >
> > Thank you,
> > Laurent.
>
> Hello Justin,
>
> It has been ages! The issue has been happening a bit more often recently,
> as much as once every 10 days or so. As a reminder, the setup is Postgres
> 13.4 on Windows Server with 16 cores and 64GB memory.

I can't understand why you are still using 13.4. [1]
There was a long discussion about the issue with 13.4, and a fix was made in the project for a DLL bottleneck.

Why not use 13.6?

regards,
Ranier Vilela

[1] https://www.postgresql.org/message-id/MN2PR15MB2560BBB3EC911D973C2FE3F885A89%40MN2PR15MB2560.namprd15.prod.outlook.com