On Thu, Mar 3, 2022 at 15:32, Marc Rechté <ma...@rechte.fr> wrote:
> On 03/03/2022 at 16:31, Tom Lane wrote:
> > Marc Rechté <ma...@rechte.fr> writes:
> >> We have a pg_restore which fails due to RAM over-consumption of the
> >> corresponding PG backend, which ends up with the OOM killer.
> >> The table has one PK, one index, and 3 FK constraints, active while
> >> restoring.
> >> The dump contains over 200M rows for that table and is in custom format,
> >> which corresponds to 37 GB of total relation size in the original DB.
> >
> > The FKs would result in queueing row trigger events, which would occupy
> > some memory. But those should only need ~12 bytes per FK per row,
> > which works out to less than 10GB for this number of rows, so it may
> > be that you've hit something else that we would consider a leak.
> >
> > Does memory consumption hold steady if you drop the FK constraints?
> >
> > If not, as others have noted, we'd need more info to investigate
> > this. The leak is probably independent of the specific data in
> > the table, so maybe you could make a small self-contained example
> > using a script to generate dummy data.
> >
> > regards, tom lane
>
> Actually the number of rows is 232735712.
> Accordingly the RAM consumption would be 232735712 rows x 12 bytes x 3 FKs
> ≈ 7.8 GiB.
> This is close to the 8.1g I reported earlier (actually it was closer to
> 7.8 GB, due to GiB vs. GB confusion).
> So there is no memory leak.
>
> It took 16 hours on my box to reach that RAM consumption, and then the
> COPY failed when checking the first FK (as the referenced table was empty).
>
> I dropped the PK, index, and 3 FK constraints and started over the
> pg_restore:
>
> 11 minutes to load the table (I did not have time to note RAM consumption)
>
> I then created the PK and index:
>
> 24 minutes
>
> For the FKs, I don't know because the referenced tables are empty (but
> I'll be able to test next week, if deemed necessary).
>
> 16 hours vs. 35 minutes to reach the same state.

Maybe it's not feasible, but one way to help the Postgres developers fix
this is to provide flame graphs [1] of these slow operations. The data
itself is out of reach for confidentiality and privacy reasons, but a
flame graph exposes only function call stacks, not the data.

My 2c here.

regards,
Ranier Vilela

[1] https://www.brendangregg.com/flamegraphs.html
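As a starting point for the self-contained example Tom asked for above,
here is a rough sketch (the table names and row counts are made up; scale
to taste). A single large INSERT queues FK trigger events the same way
one long COPY does during pg_restore:

    psql <<'SQL'
    CREATE TABLE parent (id int PRIMARY KEY);

    -- One PK and three FKs, mirroring the shape of the problem table.
    CREATE TABLE child (
        id int PRIMARY KEY,
        p1 int REFERENCES parent (id),
        p2 int REFERENCES parent (id),
        p3 int REFERENCES parent (id)
    );

    INSERT INTO parent SELECT g FROM generate_series(1, 1000) g;

    -- Watch the backend's memory grow while this runs: each row queues
    -- three after-row trigger events (~12 bytes per FK per row).
    INSERT INTO child
        SELECT g, g % 1000 + 1, g % 1000 + 1, g % 1000 + 1
        FROM generate_series(1, 100000000) g;
    SQL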
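And for the record, a flame graph of the slow restore could be captured
along these lines (assuming Linux perf and the FlameGraph scripts from
[1] are available; <backend_pid> is a placeholder):

    # Find the backend running the long COPY issued by pg_restore.
    psql -c "SELECT pid, query FROM pg_stat_activity WHERE query LIKE 'COPY%';"

    # Sample that backend's call stacks at 99 Hz for 60 seconds.
    perf record -F 99 -g -p <backend_pid> -- sleep 60

    # Fold the stacks and render the graph.
    perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > restore.svg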