Re: Make relfile tombstone files conditional on WAL level

Thomas Munro Thu, 06 Jan 2022 00:47:40 -0800

On Thu, Jan 6, 2022 at 9:13 PM Dilip Kumar <[email protected]> wrote:
> On Thu, Jan 6, 2022 at 1:12 PM Dilip Kumar <[email protected]> wrote:
> > > I think this idea is worth more consideration. It seems like 2^56
> > > relfilenodes ought to be enough for anyone, recalling that you can
> > > only ever have 2^64 bytes of WAL. So if we do this, we can eliminate a
> > > bunch of code that is there to guard against relfilenodes being
> > > reused. In particular, we can remove the code that leaves a 0-length
> > > tombstone file around until the next checkpoint to guard against
> > > relfilenode reuse.
> >
> > +1


+1

> IMHO a few top-level points for implementing this idea would be as
> listed here:
>
> 1) the "relfilenode" member inside the RelFileNode will now be 64
> bits, and "forkNum" is removed altogether from the BufferTag.  So
> whenever we want the relfilenode or the fork number we can apply
> the respective mask and fetch its value.
> 2) GetNewRelFileNode() will no longer loop, checking for file
> existence and retrying with another relfilenode.
> 3) Modify mdunlinkfork() so that we perform the unlink request
> immediately, making sure to register_forget_request() before the unlink.
> 4) In the checkpointer, we no longer need any handling for pendingUnlinks.

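For point 1, a minimal sketch of the masking, assuming the layout floated
upthread (56 bits of relfilenode in the low bits, 8 bits of fork number in
the high bits). The macro and function names here are made up for
illustration and are not existing PostgreSQL symbols:

```c
#include <stdint.h>

/*
 * Hypothetical layout (an assumption, not PostgreSQL's actual BufferTag):
 * low 56 bits hold the relfilenode, high 8 bits hold the fork number.
 */
#define RELNUMBER_BITS 56
#define RELNUMBER_MASK ((UINT64_C(1) << RELNUMBER_BITS) - 1)

/* Combine a 56-bit relfilenode and an 8-bit fork number into one word. */
static inline uint64_t
pack_relfork(uint64_t relnumber, uint8_t forknum)
{
    return ((uint64_t) forknum << RELNUMBER_BITS) |
           (relnumber & RELNUMBER_MASK);
}

/* Extract the relfilenode by masking off the fork-number bits. */
static inline uint64_t
unpack_relnumber(uint64_t packed)
{
    return packed & RELNUMBER_MASK;
}

/* Extract the fork number by shifting the relfilenode bits away. */
static inline uint8_t
unpack_forknum(uint64_t packed)
{
    return (uint8_t) (packed >> RELNUMBER_BITS);
}
```

Which half goes in the high bits is arbitrary; the only point is that both
values come back out of a single 64-bit field.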
Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID.  We'd need
a new allocator, and in general the values won't be able to match the
OID (at least while OIDs are 32 bits).
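
A rough sketch of what such an allocator might look like, assuming a plain
monotonically increasing 56-bit counter whose values are never handed out
twice. The type and function names are hypothetical, and the sketch ignores
locking and making the counter survive a crash, both of which a real
implementation would presumably need:

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest value that fits in 56 bits (assumed width from this thread). */
#define MAX_RELNUMBER ((UINT64_C(1) << 56) - 1)

/* Hypothetical allocator state: just the next value to hand out. */
typedef struct RelNumberAllocator
{
    uint64_t next;
} RelNumberAllocator;

/*
 * Hand out the next relfilenode.  Returns false once the 56-bit space is
 * exhausted; values are never reused, so no file-existence check is needed.
 */
static bool
alloc_relnumber(RelNumberAllocator *alloc, uint64_t *out)
{
    if (alloc->next > MAX_RELNUMBER)
        return false;
    *out = alloc->next++;
    return true;
}
```

Since the counter is independent of the OID counter, the value returned
generally differs from the relation's OID.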
