On Thu, Feb 24, 2022 at 12:47:42AM +0000, l...@laurent-hasson.com wrote:
>     On Sat, Dec 04, 2021 at 05:32:10PM +0000, l...@laurent-hasson.com wrote:
>     >  I have a data warehouse with a fairly complex ETL process that has
>     >  been running for years now across PG 9.6, 11.2 and now 13.4 for the
>     >  past couple of months. I have been getting the error "An I/O error
>     >  occurred while sending to the backend" quite often under load in 13.4
>     >  which I never used to get on 11.2. I have applied some tricks, 
> particularly
>     >  with the socketTimeout JDBC configuration.

>  It'd be nice to see a network trace for this too.  Using tcpdump or
>  wireshark.  Preferably from the client side.
> 
> Hello Justin,
> 
> It has been ages! The issue has been happening a bit more often recently, as 
> much as once every 10 days or so. As a reminder, the set up is Postgres 13.4 
> on Windows Server with 16cores and 64GB memory. The scenario where this 
> occurs is an ETL tool called Pentaho Kettle (V7) connecting to the DB for 
> DataWarehouse workloads. The tool is Java-based and connects via JDBC using 
> postgresql-42.2.5.jar. There are no particular settings besides the 
> socketTimeout setting mentioned above.
> 
> The workload has some steps being lots of quick transactions for dimension 
> tables for example, but some fact table calculations, especially large 
> pivots, can make queries run for 40mn up to over an hour (a few of those).
> 
> I caught these in the logs at the time of a failure but unsure what to make 
> of that:
> 
> 2022-02-21 02:10:43.605 EST [1368] LOG:  unexpected EOF on client connection 
> with an open transaction
> 2022-02-21 02:10:43.605 EST [3304] LOG:  could not receive data from client: 
> An existing connection was forcibly closed by the remote host.
>       
> 2022-02-21 02:10:43.605 EST [3304] LOG:  unexpected EOF on client connection 
> with an open transaction
> 2022-02-21 02:31:38.808 EST [1704] LOG:  setsockopt(TCP_USER_TIMEOUT) not 
> supported

I suggest to enable CSV logging, which has many more columns of data.
Some of them might provide an insight - I'm not sure.
log_destination=csvlog (in addition to whatever else you have set).

And the aforementioned network trace.  You could set a capture filter on TCP
SYN|RST so it's not absurdly large.  From my notes, it might look like this:
(tcp[tcpflags]&(tcp-rst|tcp-syn|tcp-fin)!=0)

-- 
Justin


Reply via email to