On Thu, Jan 6, 2022 at 9:13 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> On Thu, Jan 6, 2022 at 1:12 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > > I think this idea is worth more consideration. It seems like 2^56
> > > relfilenodes ought to be enough for anyone, recalling that you can
> > > only ever have 2^64 bytes of WAL. So if we do this, we can eliminate a
> > > bunch of code that is there to guard against relfilenodes being
> > > reused. In particular, we can remove the code that leaves a 0-length
> > > tombstone file around until the next checkpoint to guard against
> > > relfilenode reuse.
> >
> > +1

+1

> IMHO a few top level points for implementing this idea would be as listed
> here,
>
> 1) the "relfilenode" member inside the RelFileNode will be now 64
> bits and remove the "forkNum" all together from the BufferTag. So
> now whenever we want to use the relfilenode or fork number we can use
> the respective mask and fetch their values.
> 2) GetNewRelFileNode() will not loop for checking the file existence
> and retry with other relfilenode.
> 3) Modify mdunlinkfork() so that we immediately perform the unlink
> request, make sure to register_forget_request() before unlink.
> 4) In checkpointer, now we don't need any handling for pendingUnlinks.

Another problem is that relfilenodes are normally allocated with
GetNewOidWithIndex(), and initially match a relation's OID. We'd need
a new allocator, and in general they won't be able to match the OID
(at least while we have 32-bit OIDs).
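
To make point 1 and the allocator question a bit more concrete, here is a
rough sketch of what I have in mind. It assumes the fork number goes in
the high 8 bits and the relfilenode in the low 56 bits of one 64-bit
value; that layout and every sketch_* name are made up for illustration
and are not existing PostgreSQL code. The allocator is just a counter
that never hands out the same value twice; a real one would presumably
need locking and some WAL-logged persistence, which I have left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption, not the patch):
 *   bits 63..56  fork number (8 bits)
 *   bits 55..0   relfilenode (56 bits)
 */
#define SKETCH_RELFILENODE_BITS 56
#define SKETCH_RELFILENODE_MASK ((UINT64_C(1) << SKETCH_RELFILENODE_BITS) - 1)

/* Combine a 56-bit relfilenode and an 8-bit fork number into one value. */
static inline uint64_t
sketch_pack(uint64_t relfilenode, uint8_t forknum)
{
    assert(relfilenode <= SKETCH_RELFILENODE_MASK);
    return ((uint64_t) forknum << SKETCH_RELFILENODE_BITS) | relfilenode;
}

/* Fetch the relfilenode by masking off the fork number bits. */
static inline uint64_t
sketch_relfilenode(uint64_t packed)
{
    return packed & SKETCH_RELFILENODE_MASK;
}

/* Fetch the fork number by shifting the relfilenode bits away. */
static inline uint8_t
sketch_forknum(uint64_t packed)
{
    return (uint8_t) (packed >> SKETCH_RELFILENODE_BITS);
}

/*
 * Hypothetical allocator: a plain counter, so values are never reused
 * and no file-existence retry loop is needed.  Not crash-safe and not
 * thread-safe as written.
 */
static uint64_t sketch_next_relfilenode = 1;

static inline uint64_t
sketch_new_relfilenode(void)
{
    assert(sketch_next_relfilenode <= SKETCH_RELFILENODE_MASK);
    return sketch_next_relfilenode++;
}
```

Since values from such a counter are independent of the OID counter,
nothing would tie a new relation's relfilenode to its OID.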