On Fri, Aug 7, 2020 at 9:33 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapil...@gmail.com> writes:
> > On Sat, Aug 1, 2020 at 1:53 AM Andres Freund <and...@anarazel.de> wrote:
> >> We could also just use pg_class.relpages. It'll probably mostly be
> >> accurate enough?
>
> > Don't we need the accurate 'number of blocks' if we want to invalidate
> > all the buffers? Basically, I think we need to perform BufTableLookup
> > for all the blocks in the relation and then Invalidate all buffers.
>
> Yeah, there is no room for "good enough" here. If a dirty buffer remains
> in the system, the checkpointer will eventually try to flush it, and fail
> (because there's no file to write it to), and then checkpointing will be
> stuck. So we cannot afford to risk missing any buffers.
>
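To make the point in the quoted exchange concrete, here is a minimal sketch of why a per-block lookup needs an exact block count. It is a toy model under stated assumptions, not PostgreSQL code: ToyBuffer and every toy_* function are hypothetical names invented for illustration, and the linear search only stands in for a real buffer-table lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and functions, NOT the PostgreSQL
 * buffer manager API. */
typedef struct ToyBuffer {
    bool     valid;   /* slot currently holds a page */
    bool     dirty;   /* page would have to be written out later */
    unsigned rel;     /* id of the owning relation */
    unsigned block;   /* block number within that relation */
} ToyBuffer;

/* Stand-in for a buffer-table lookup: returns the slot index or -1. */
int toy_lookup(const ToyBuffer *pool, size_t nbuffers,
               unsigned rel, unsigned block)
{
    for (size_t i = 0; i < nbuffers; i++)
        if (pool[i].valid && pool[i].rel == rel && pool[i].block == block)
            return (int) i;
    return -1;
}

/* Strategy A: probe blocks 0 .. nblocks-1 one by one.
 * Any cached block >= nblocks is silently missed. */
size_t toy_invalidate_by_lookup(ToyBuffer *pool, size_t nbuffers,
                                unsigned rel, unsigned nblocks)
{
    size_t dropped = 0;

    assert(pool != NULL);
    for (unsigned b = 0; b < nblocks; b++) {
        int slot = toy_lookup(pool, nbuffers, rel, b);

        if (slot >= 0) {
            pool[slot].valid = false;
            pool[slot].dirty = false;
            dropped++;
        }
    }
    return dropped;
}

/* Strategy B: scan every slot; needs no block count at all. */
size_t toy_invalidate_by_scan(ToyBuffer *pool, size_t nbuffers, unsigned rel)
{
    size_t dropped = 0;

    assert(pool != NULL);
    for (size_t i = 0; i < nbuffers; i++) {
        if (pool[i].valid && pool[i].rel == rel) {
            pool[i].valid = false;
            pool[i].dirty = false;
            dropped++;
        }
    }
    return dropped;
}

/* Count dirty buffers that still belong to the (dropped) relation. */
size_t toy_dirty_leftovers(const ToyBuffer *pool, size_t nbuffers, unsigned rel)
{
    size_t n = 0;

    for (size_t i = 0; i < nbuffers; i++)
        if (pool[i].valid && pool[i].dirty && pool[i].rel == rel)
            n++;
    return n;
}
```

In this model, if the pool caches blocks 0, 2 and 5 of a relation and a stale estimate says the relation has 3 blocks, the lookup strategy leaves block 5 behind as a dirty buffer with no file to go to, while the full scan clears it. The scan pays for that safety by touching every slot regardless of the relation's size.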
Right, this reminds me of the discussion we had last time on this topic,
where we decided that we can't even rely on smgrnblocks to find the exact
number of blocks, because lseek might lie about the EOF position [1]. So we
need some mechanism anyway to push the information about the "to be
truncated or dropped" relations to a background process (the checkpointer
and/or others) to avoid flush issues. But maybe it is better to push the
responsibility for invalidating the buffers of truncated/dropped relations
to the background process as well. However, I feel that in cases where the
relation size is greater than the number of shared buffers, there might not
be much benefit in pushing this operation to the background, unless there
are already a few other entries for dropped relations, so that the cost of
scanning the buffers can be amortized.

[1] - https://www.postgresql.org/message-id/16664.1435414204%40sss.pgh.pa.us

--
With Regards,
Amit Kapila.