Hi,
> That's helpful, thanks!
>
> One thing to note is that postgres' work_mem is confusing - it applies to
> individual query execution nodes, not the whole query. Additionally, when you
> use parallel workers, each of the parallel workers can use the "full"
> work_mem, rather than work_mem being evenly distributed across workers.
>
> Within that, the memory usage in the EXPLAIN ANALYZE isn't entirely unexpected
> (I'd say it's unreasonable if you're not a postgres dev, but that's a
> different issue).
>
> We can see each of the Hash nodes use ~1GB, which is due to
> (1 leader + 4 workers) * work_mem = 5 * 200MB = 1GB.
>
> In this specific query we probably could free the memory in the "lower" hash
> join nodes once the node directly above has finished building, but we don't
> have the logic for that today.

I would understand 1 GB, or even 2 GB (considering hash_mem_multiplier), but
the server ran out of memory after using more than 64 GB.

> Of course that doesn't explain why the memory usage / temp file creation is so
> extreme with a lower limit / fewer workers. There aren't any bad statistics
> afaics, nor can I really see a potential for a significant skew, it looks to
> me that the hashtables are all built on a single text field and that nearly
> all the input rows are distinct.
>
> Could you post the table definition (\d+) and the database definition
> (\l). One random guess I have is that you ended up with a "non-deterministic"
> collation for that column and that we end up with a bad hash due to that.

I was able to eliminate the columns not involved in the query while still
retaining the problematic behavior (that's the reason for the new table names):

postgres=# \d inreq
                                 Table "public.inreq"
         Column           |            Type             | Collation | Nullable | Default
--------------------------+-----------------------------+-----------+----------+---------
 input_sequence           | bigint                      |           | not null |
 msg_type                 | character varying(8)        |           |          |
 originalrequest_id       | bigint                      |           |          |
 receive_time             | timestamp without time zone |           |          |
 related_output_sequence  | bigint                      |           |          |
 msg_status               | character varying(15)       |           |          |
Indexes:
    "inreq_pkey" PRIMARY KEY, btree (input_sequence)
    "inreq_originalrequest_id_idx" btree (originalrequest_id)

postgres=# \d outreq
                      Table "public.outreq"
         Column         |  Type  | Collation | Nullable | Default
------------------------+--------+-----------+----------+---------
 output_sequence        | bigint |           | not null |
 input_sequence         | bigint |           |          |
 reply_input_sequence   | bigint |           |          |
 related_input_sequence | bigint |           |          |
Indexes:
    "outreq_pkey" PRIMARY KEY, btree (output_sequence)
    "outreq_input_sequence_idx" btree (input_sequence)
    "outreq_reply_input_sequence_idx" btree (reply_input_sequence)

postgres=# SELECT datname, datcollate FROM pg_database WHERE datname = current_database();
 datname  | datcollate
----------+-------------
 postgres | en_US.UTF-8
(1 row)
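Since the \d output shows no per-column collation and en_US.UTF-8 is a libc
collation (libc collations are deterministic), I don't believe a
non-deterministic collation is involved here. For completeness, a query along
these lines (assuming PostgreSQL 12 or later, where pg_collation has the
collisdeterministic flag) should list any non-deterministic collations defined
in the database:

postgres=# SELECT collname, collprovider
postgres-#   FROM pg_collation
postgres-#  WHERE NOT collisdeterministic;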
A dump of the two tables above can be found at
https://drive.google.com/file/d/1ep1MYjNzlFaICL3GlPZaMdpOxRG6WCGz/view?usp=sharing
(compressed size 1 GB; size of the database after import 14 GB).

# prepare my_query (timestamp, bigint) as
SELECT t.input_sequence,
       rec_tro.output_sequence,
       r.input_sequence,
       rpl_rec_tro.output_sequence,
       rpl_snd_tro.output_sequence,
       snd_tro.output_sequence,
       t.msg_type
  FROM inreq t
  LEFT JOIN outreq rec_tro     ON rec_tro.input_sequence = t.input_sequence
  LEFT JOIN inreq  r           ON r.originalRequest_id = t.input_sequence
  LEFT JOIN outreq rpl_rec_tro ON rpl_rec_tro.input_sequence = r.input_sequence
  LEFT JOIN outreq rpl_snd_tro ON rpl_snd_tro.reply_input_sequence = r.input_sequence
  LEFT JOIN outreq snd_tro     ON snd_tro.reply_input_sequence = t.input_sequence
 WHERE t.receive_time < $1
   AND t.input_sequence < $2
   AND t.msg_status IN ('COMPLETED', 'REJECTED')
 ORDER BY t.msg_status DESC, t.input_sequence;

# EXPLAIN (ANALYZE, BUFFERS) EXECUTE my_query('2024-05-18 00:00:00', 202406020168279904);

Best Regards,
Radu
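P.S. For reference, the worst case I would expect per Hash node from the
explanation above, taking hash_mem_multiplier = 2.0 (the value implied by my
2 GB figure):

    (1 leader + 4 workers) * work_mem * hash_mem_multiplier
        = 5 * 200MB * 2.0
        = 2 GB

which is still far below the more than 64 GB the server actually consumed.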