We also reverted this param:

cmdv3=# show max_parallel_workers_per_gather;
 max_parallel_workers_per_gather
---------------------------------
 2
(1 row)

It was set to 8.
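(For reference, a minimal sketch of applying that revert cluster-wide without a restart, assuming superuser access; ALTER SYSTEM writes the value to postgresql.auto.conf and pg_reload_conf() signals the server to re-read its configuration. The value 2 is simply the one shown above.)

ALTER SYSTEM SET max_parallel_workers_per_gather = 2;  -- revert from 8
SELECT pg_reload_conf();                               -- reload config, no restart needed
SHOW max_parallel_workers_per_gather;                  -- expect 2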
On Tue, Jan 21, 2020 at 16:06 Nicola Contu <nicola.co...@gmail.com> wrote:

> Hey Thomas,
> after a few months, we started having this issue again.
> So we reverted the work_mem parameter to 600MB instead of 2GB.
> But the issue is still there. A query went to segmentation fault, the DB
> went into recovery mode, and our app went read-only for a few minutes.
>
> I understand we can increase max_connections so we can have many more
> segments.
>
> My question is: is there a way to see how many segments we have reached?
> Currently we have 220 max_connections, so by your formula (64 + 2 * 220)
> we have about 500 shared segments.
> We would like to increase that number to 300 or 400, but it would be great
> to know beforehand that this will solve the issue, as it requires a restart
> of the service.
>
> I know you were also talking about redesigning this part of PostgreSQL. Do
> you know if anything has changed in any of the newer versions after 11.5?
>
> Thanks a lot,
> Nicola
>
> On Thu, Sep 12, 2019 at 01:01 Thomas Munro <thomas.mu...@gmail.com> wrote:
>
>> On Wed, Sep 11, 2019 at 11:20 PM Nicola Contu <nicola.co...@gmail.com> wrote:
>> > If the error persists I will try to revert the work_mem.
>> > Thanks a lot
>>
>> Hi Nicola,
>>
>> It's hard to say exactly what the cause of the problem is in your case
>> and how to avoid it without knowing what your query plans look like.
>> PostgreSQL allows 64 + 2 * max_connections DSM segments to exist at a
>> time, and it needs a number of them that depends on work_mem (in the
>> case of Parallel Hash Join and Parallel Bitmap Index Scan), and also on
>> the number of Gather nodes that appear in the plan, which in some
>> unusual cases can result from partitioning.
>>
>> I've seen people reach this error by running a lot of parallel queries
>> concurrently. If that's the cause, then you can definitely get some
>> relief by turning work_mem down, or by turning max_connections up (even
>> though you don't want to allow more connections -- because it influences
>> the formula for deciding on the DSM segment limit). We should probably
>> adjust some of the internal constants to give us more slots, to avoid
>> that problem, as discussed here:
>>
>> https://www.postgresql.org/message-id/flat/CA%2BhUKGL6H2BpGbiF7Lj6QiTjTGyTLW_vLR%3DSn2tEBeTcYXiMKw%40mail.gmail.com
>>
>> I've also seen people reach this error by somehow coming up with plans
>> that had a very large number of Gather nodes in them, corresponding to
>> partitions; that's probably a bad plan (it would presumably be better to
>> terminate parallelism higher up in the plan, but these plans do seem to
>> exist in the wild; I don't recall exactly why). I think we need a bit of
>> a redesign so that if there are multiple Gather nodes, they share the
>> same main DSM segment, instead of blowing through this limit.
>>
>> --
>> Thomas Munro
>> https://enterprisedb.com
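(As a quick sanity check of the slot arithmetic in Thomas's reply, a minimal sketch assuming the 64 + 2 * max_connections formula quoted above; the column aliases and the candidate values 300 and 400 are only illustrative, taken from the question in the thread.)

SELECT 64 + 2 * current_setting('max_connections')::int AS current_dsm_slot_limit,
       64 + 2 * 300 AS limit_if_max_connections_300,
       64 + 2 * 400 AS limit_if_max_connections_400;
-- With max_connections = 220 the first column is 504, i.e. the "about 500"
-- mentioned above; changing max_connections itself still requires a restart.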