Hello.

At Tue, 31 Dec 2019 13:16:49 +0300, Maxim Orlov <m.or...@postgrespro.ru> wrote in
> Now we decided to drop many tables, let's say 1000 or 10000, not in a
> single transaction but each table in a separate one. So, because of
> the "plain" shared_buffers memory, we have to run a for loop over the
> buffers for every relation, which leads to a lag between master and
> slave.
>
> In a real-world scenario such an issue leads not to minutes of lag,
> but hours. At the same time PostgreSQL has a good routine to delete
> many relations in a single transaction.
>
> So, to get rid of this kind of issue the following idea came up: what
> if we do not delete the buffers of each relation right away, but
> instead store the relations in an array and prevent the shared
> buffers that correspond to the deleted relations from being flushed.
> Then, when the array reaches its maximum size, we walk all buffers
> only once to "free" the shared buffers corresponding to the deleted
> relations.
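If I read the proposal right, it amounts to something like the toy model below. This is only my own standalone C sketch, not the actual patch; every name except LAZY_DELETE_ARRAY_SIZE is invented for illustration, and locking, the buffer mapping hash, fork numbers and so on are all ignored.

#include <stdbool.h>
#include <stdint.h>

#define NBUFFERS               1024   /* stand-in for shared_buffers */
#define LAZY_DELETE_ARRAY_SIZE 128    /* array-size threshold from the patch */

typedef struct ToyRelFileNode
{
    uint32_t    spcNode;
    uint32_t    dbNode;
    uint32_t    relNode;
} ToyRelFileNode;

typedef struct ToyBufferDesc
{
    ToyRelFileNode tag_rnode;   /* which relation the page belongs to */
    uint32_t    tag_blockNum;
    bool        valid;
    bool        dirty;
} ToyBufferDesc;

static ToyBufferDesc buffers[NBUFFERS];

static ToyRelFileNode pending_drops[LAZY_DELETE_ARRAY_SIZE];
static int  n_pending_drops = 0;

static bool
rnode_equals(ToyRelFileNode a, ToyRelFileNode b)
{
    return a.spcNode == b.spcNode &&
           a.dbNode == b.dbNode &&
           a.relNode == b.relNode;
}

/*
 * One sweep over the whole pool invalidates the buffers of all pending
 * relations at once, instead of one full sweep per dropped relation.
 */
static void
lazy_drop_sweep(void)
{
    for (int i = 0; i < NBUFFERS; i++)
    {
        if (!buffers[i].valid)
            continue;
        for (int j = 0; j < n_pending_drops; j++)
        {
            if (rnode_equals(buffers[i].tag_rnode, pending_drops[j]))
            {
                buffers[i].valid = false;   /* drop without writing out */
                buffers[i].dirty = false;
                break;
            }
        }
    }
    n_pending_drops = 0;
}

/*
 * Instead of scanning shared buffers immediately at drop time, just
 * remember the relfilenode; the real scan runs only when the array
 * fills up.
 */
static void
lazy_drop_relation(ToyRelFileNode rnode)
{
    pending_drops[n_pending_drops++] = rnode;
    if (n_pending_drops == LAZY_DELETE_ARRAY_SIZE)
        lazy_drop_sweep();
}

int
main(void)
{
    /* Drop enough relations to trigger exactly one sweep of the pool. */
    for (uint32_t rel = 16384; rel < 16384 + LAZY_DELETE_ARRAY_SIZE; rel++)
        lazy_drop_relation((ToyRelFileNode){ 1663, 13000, rel });
    return 0;
}

So the buffer pool is walked once per LAZY_DELETE_ARRAY_SIZE drops instead of once per drop, which is where the gain comes from.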
That is a great performance gain, but the proposal seems to lead to database corruption, and we must avoid such cases.

A relfilenode can be reused right after commit. There can be a case where readers of the reused relfilenode see pages of the already-removed file that are still sitting in shared buffers. On the other hand, buffers newly allocated for the reused relfilenode are not flushed out until the lazy-invalidation machinery actually frees the "garbage" buffers, which leads to a broken database after a crash. And when the machinery finally does run, it throws those buffers away, the correct ones included (a concrete timeline of the reuse hazard is sketched in the PS below).

As for performance, the hash lookup needed at every FlushBuffer call could be a cost for unrelated transactions. It also leaves the garbage buffers sitting there as dead space until more than LAZY_DELETE_ARRAY_SIZE relfilenodes have been removed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
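PS: To make the reuse hazard concrete, here is a minimal standalone C sketch of my own (all names invented; real shared buffers are keyed by the full buffer tag, but the relfilenode part of the tag is what gets reused):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy buffer: identified by (relfilenode, block) only, the way a cached
 * page of a reused relfilenode would be found. */
typedef struct ToyBuffer
{
    uint32_t    relNode;
    uint32_t    blockNum;
    bool        valid;
    char        page[8];
} ToyBuffer;

static ToyBuffer pool[4];

/* The lookup cannot tell whether a cached page belongs to the old or the
 * new owner of a reused relfilenode. */
static ToyBuffer *
lookup(uint32_t relNode, uint32_t blockNum)
{
    for (int i = 0; i < 4; i++)
        if (pool[i].valid && pool[i].relNode == relNode &&
            pool[i].blockNum == blockNum)
            return &pool[i];
    return NULL;
}

int
main(void)
{
    /* 1. Table A uses relfilenode 16384; its block 0 is cached. */
    pool[0] = (ToyBuffer){ .relNode = 16384, .blockNum = 0,
                           .valid = true, .page = "old A" };

    /* 2. DROP TABLE a: with lazy invalidation the buffer stays valid;
     *    the relfilenode is only queued for a later sweep. */

    /* 3. Relfilenode 16384 is reused for a new table B before the
     *    sweep has run. */

    /* 4. A reader of B's block 0 hits A's stale page instead of
     *    reading B's file from disk. */
    ToyBuffer  *buf = lookup(16384, 0);
    printf("reader of the reused relfilenode sees: %s\n",
           buf ? buf->page : "(disk)");
    return 0;
}

The symmetric problem is the crash case: if flushing is suppressed for everything in the pending array, B's dirty pages carrying the same relfilenode are never written out until the sweep runs.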