>From: James Pang (chaolpan) <chaol...@cisco.com>
>Sent: Friday, February 2, 2024 7:47 AM
>To: Laurenz Albe <laurenz.a...@cybertec.at>; pgsql-performance@lists.postgresql.org
>Subject: RE: huge SubtransSLRU and SubtransBuffer wait_event


> Our case is: 1) we use a PL/pgSQL procedure1 --> procedure2 (update table
> xxxx; commit); 2) an application JDBC client calls procedure1 (it's a
> long-running job that can sometimes last > 1 hour). During this time window,
> other PostgreSQL JDBC clients (100-200) come in at the same time, and then
> we quickly see MultiXactOffset and SubtransSLRU waits increase very quickly.


Hi


We had the same problem here: 
https://gitlab.com/nibioopensource/resolve-overlap-and-gap . There we can have 
more than 50 threads pushing millions of rows into common tables, with a single 
final PostGIS Topology structure built as the last step. We also need try/catch 
handling. The code is wrapped in functions and procedures and called from psql.
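The try/catch point matters here: in PL/pgSQL, every BEGIN ... EXCEPTION block 
opens an implicit subtransaction, and doing that per row across many sessions is 
exactly what fills SubtransSLRU. A minimal sketch (table and function names are 
made up for illustration):

```sql
-- Hypothetical sketch: each BEGIN ... EXCEPTION block below starts an
-- implicit subtransaction. Called once per row from 50+ workers, this
-- burns through subtransaction IDs and pressures SubtransSLRU.
CREATE OR REPLACE FUNCTION safe_insert(p_val text) RETURNS void AS $$
BEGIN
  BEGIN  -- implicit subtransaction starts here
    INSERT INTO shared_table(val) VALUES (p_val);
  EXCEPTION WHEN unique_violation THEN
    NULL;  -- swallow duplicates; only the subtransaction is rolled back
  END;
END;
$$ LANGUAGE plpgsql;
```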


Just as a test, we tried compiling with a higher number of subtransaction lock 
slots, but that only made the problem appear a little later.


For us the solution was to save temporary results in arrays, as in this commit: 
https://gitlab.com/nibioopensource/resolve-overlap-and-gap/-/commit/679bea2b4b1ba4c9e84923b65c62c32c3aed6c21#a22cbe80eb0e36ea21e4f8036e0a4109b2ff2379_611_617
The key is to do as much work as possible without involving any common data 
structures, for instance by using arrays to hold temporary results and not 
touching a shared final table before it's really needed.
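The array approach could look roughly like this (a sketch only; table and 
column names are invented, the real code is in the commit linked above):

```sql
-- Hypothetical sketch: do the heavy per-row work against local state,
-- collect results in a PL/pgSQL array, and write to the shared table
-- exactly once at the end instead of row by row.
DO $$
DECLARE
  results int[] := '{}';
  r record;
BEGIN
  FOR r IN SELECT id FROM staging WHERE worker_no = 7 LOOP
    -- expensive per-row work happens here, touching no shared table
    results := results || r.id;
  END LOOP;
  -- a single write to the common structure at the very end
  INSERT INTO final_table(id) SELECT unnest(results);
END;
$$;
```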


Then, in a final step, we insert all the prepared data into the final common 
data structure, and there we also try to avoid try/catch when possible. The 
system can then run with very high CPU load for 99% of the work, and only at 
the very end do we start to involve the common database structure.
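One way to avoid try/catch at that final step, assuming the conflicts you 
expect are key collisions, is ON CONFLICT instead of an exception block, since 
it handles duplicates without opening a subtransaction (names again invented):

```sql
-- Hypothetical sketch: bulk-insert the prepared per-worker results in
-- one statement; ON CONFLICT replaces a PL/pgSQL exception handler,
-- so no subtransaction is created for error handling.
INSERT INTO final_table(id, geom)
SELECT id, geom FROM per_worker_results
ON CONFLICT (id) DO NOTHING;
```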


Another way to avoid locks is to let each thread work on its own data as much 
as possible. This means breaking up the input, sorting out which data is unique 
to each thread, and postponing the common data to a later stage. When working 
with PostGIS Topology, for instance, we actually split the data to make sure 
that no two threads work on the same area, and then at a later stage another 
thread pushes the shared data/areas into the final data structure.
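The per-thread split could be sketched along these lines (purely illustrative; 
the real partitioning in resolve-overlap-and-gap is more involved). The idea is 
to bucket geometries spatially so a given worker only ever reads its own cells:

```sql
-- Hypothetical sketch: bucket rows by the x-coordinate of their
-- centroid so each worker gets a disjoint strip of the map and no
-- two workers touch the same rows (:worker_no is a psql variable).
SELECT id, geom
FROM input_data
WHERE width_bucket(ST_X(ST_Centroid(geom)), 0, 10000, 50) = :worker_no;
```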


These steps seem to have solved the problem for us, which started out here: 
https://postgrespro.com/list/thread-id/2478202


Lars
