Hello.

At Tue, 31 Dec 2019 13:16:49 +0300, Maxim Orlov <m.or...@postgrespro.ru> wrote in
> Now we decided to drop many tables, let's say 1000 or 10000, not in a
> single transaction but each table in a separate one. So, because of
> the "plain" shared_buffers memory, we have to run a for loop over the
> buffers for every relation, which leads to a lag between master and
> slave.
>
> In a real-world scenario such an issue leads not to minutes of lag,
> but hours. At the same time PostgreSQL has a good routine to delete
> many relations in a single transaction.
>
> So, to get rid of this kind of issue the following idea came up: what
> if we do not delete the buffers of each relation right away, but
> instead store the relations in an array and prevent the shared
> buffers that correspond to the deleted relations from being flushed.
> Then, when the array reaches its maximum size, we walk all buffers
> only once to "free" the shared buffers corresponding to the deleted
> relations.
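If I read the proposal right, it amounts to something like the toy model below. This is only my own standalone C sketch, not the actual patch; every name except LAZY_DELETE_ARRAY_SIZE is invented for illustration, and locking, the buffer mapping hash, fork numbers and so on are all ignored.

#include <stdbool.h>
#include <stdint.h>

#define NBUFFERS               1024   /* stand-in for shared_buffers */
#define LAZY_DELETE_ARRAY_SIZE 128    /* array-size threshold from the patch */

typedef struct ToyRelFileNode
{
    uint32_t    spcNode;
    uint32_t    dbNode;
    uint32_t    relNode;
} ToyRelFileNode;

typedef struct ToyBufferDesc
{
    ToyRelFileNode tag_rnode;   /* which relation the page belongs to */
    uint32_t    tag_blockNum;
    bool        valid;
    bool        dirty;
} ToyBufferDesc;

static ToyBufferDesc buffers[NBUFFERS];

static ToyRelFileNode pending_drops[LAZY_DELETE_ARRAY_SIZE];
static int  n_pending_drops = 0;

static bool
rnode_equals(ToyRelFileNode a, ToyRelFileNode b)
{
    return a.spcNode == b.spcNode &&
           a.dbNode == b.dbNode &&
           a.relNode == b.relNode;
}

/*
 * One sweep over the whole pool invalidates the buffers of all pending
 * relations at once, instead of one full sweep per dropped relation.
 */
static void
lazy_drop_sweep(void)
{
    for (int i = 0; i < NBUFFERS; i++)
    {
        if (!buffers[i].valid)
            continue;
        for (int j = 0; j < n_pending_drops; j++)
        {
            if (rnode_equals(buffers[i].tag_rnode, pending_drops[j]))
            {
                buffers[i].valid = false;   /* drop without writing out */
                buffers[i].dirty = false;
                break;
            }
        }
    }
    n_pending_drops = 0;
}

/*
 * Instead of scanning shared buffers immediately at drop time, just
 * remember the relfilenode; the real scan runs only when the array
 * fills up.
 */
static void
lazy_drop_relation(ToyRelFileNode rnode)
{
    pending_drops[n_pending_drops++] = rnode;
    if (n_pending_drops == LAZY_DELETE_ARRAY_SIZE)
        lazy_drop_sweep();
}

int
main(void)
{
    /* Drop enough relations to trigger exactly one sweep of the pool. */
    for (uint32_t rel = 16384; rel < 16384 + LAZY_DELETE_ARRAY_SIZE; rel++)
        lazy_drop_relation((ToyRelFileNode){ 1663, 13000, rel });
    return 0;
}

So the buffer pool is walked once per LAZY_DELETE_ARRAY_SIZE drops instead of once per drop, which is where the gain comes from.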
That is a great performance gain, but the proposal seems to lead to database corruption, and we must avoid such cases.

A relfilenode can be reused right after commit. There can be a case where readers of the reused relfilenode see pages of the already-removed file that are still sitting in shared buffers. On the other hand, buffers newly allocated for the reused relfilenode are not flushed out until the lazy-invalidation machinery actually frees the "garbage" buffers, which leads to a broken database after a crash. And when the machinery finally does run, it throws those buffers away, the correct ones included (a concrete timeline of the reuse hazard is sketched in the PS below).

As for performance, the hash lookup needed at every FlushBuffer call could be a cost for unrelated transactions. It also leaves the garbage buffers sitting there as dead space until more than LAZY_DELETE_ARRAY_SIZE relfilenodes have been removed.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
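PS: To make the reuse hazard concrete, here is a minimal standalone C sketch of my own (all names invented; real shared buffers are keyed by the full buffer tag, but the relfilenode part of the tag is what gets reused):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy buffer: identified by (relfilenode, block) only, the way a cached
 * page of a reused relfilenode would be found. */
typedef struct ToyBuffer
{
    uint32_t    relNode;
    uint32_t    blockNum;
    bool        valid;
    char        page[8];
} ToyBuffer;

static ToyBuffer pool[4];

/* The lookup cannot tell whether a cached page belongs to the old or the
 * new owner of a reused relfilenode. */
static ToyBuffer *
lookup(uint32_t relNode, uint32_t blockNum)
{
    for (int i = 0; i < 4; i++)
        if (pool[i].valid && pool[i].relNode == relNode &&
            pool[i].blockNum == blockNum)
            return &pool[i];
    return NULL;
}

int
main(void)
{
    /* 1. Table A uses relfilenode 16384; its block 0 is cached. */
    pool[0] = (ToyBuffer){ .relNode = 16384, .blockNum = 0,
                           .valid = true, .page = "old A" };

    /* 2. DROP TABLE a: with lazy invalidation the buffer stays valid;
     *    the relfilenode is only queued for a later sweep. */

    /* 3. Relfilenode 16384 is reused for a new table B before the
     *    sweep has run. */

    /* 4. A reader of B's block 0 hits A's stale page instead of
     *    reading B's file from disk. */
    ToyBuffer  *buf = lookup(16384, 0);
    printf("reader of the reused relfilenode sees: %s\n",
           buf ? buf->page : "(disk)");
    return 0;
}

The symmetric problem is the crash case: if flushing is suppressed for everything in the pending array, B's dirty pages carrying the same relfilenode are never written out until the sweep runs.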