On Thu, Sep 2, 2021 at 8:52 PM Robert Haas <robertmh...@gmail.com> wrote:
PFA, updated version of the patch, where I have fixed the issues reported by you and also done some more refactoring and patch split, next I am planning to post the patch with another approach where we scan the directory instead of scanning the pg_class for identifying the relfilenodes. For specific comments please find my response inline, > Andres pointed out that this approach ends up accessing relations > without taking a lock on them. It doesn't look like you did anything > about that. Now I have acquired a lock before scanning the pg_class as well as other relfilenode. > > + /* Built-in oids are mapped directly */ > + if (classForm->oid < FirstGenbkiObjectId) > + relfilenode = classForm->oid; > + else if (OidIsValid(classForm->relfilenode)) > + relfilenode = classForm->relfilenode; > + else > + continue; > > Am I missing something, or is this totally busted? Handled the mapped relation using relmapper. > /* > + * Now drop all buffers holding data of the target database; they should > + * no longer be dirty so DropDatabaseBuffers is safe. > > The way things worked before, this was true, but now AFAICS it's > false. I'm not sure whether that means that DropDatabaseBuffers() here > is actually unsafe or whether it just means that you haven't updated > the comment to explain the reason. Now we can only drop the buffer specific to old tablespace not the new tablespace so can not directly use the dboid, so extended the DropDatabaseBuffers interface to take tablespace oid as and input and updated the comments accordingly. > + * Since we copy the file directly without looking at the shared buffers, > + * we'd better first flush out any pages of the source relation that are > + * in shared buffers. We assume no new changes will be made while we are > + * holding exclusive lock on the rel. > > Ditto. I think these comments is related to index_copy_data() and this is still valid, it is showing in the patch due to some refactoring so I have separated out this refactoring patch as 0003 to avoid confusion. > > + /* As always, WAL must hit the disk before the data update does. */ > > Actually, the way it's coded now, part of the on-disk changes are done > before WAL is issued, and part are done after. I doubt that's the > right idea. There's nothing special about writing the actual payload > bytes vs. the other on-disk changes (creating directories and files). > In any case the ordering deserves a better-considered comment than > this one. Changed, now WAL first and then disk change. Open question: - Scan pg_class vs scan directories - Whether to retain the old created database mechanism as option or not. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
v3-0002-Extend-relmap-interfaces.patch
Description: Binary data
v3-0003-Refactor-index_copy_data.patch
Description: Binary data
v3-0004-Extend-bufmgr-interfaces.patch
Description: Binary data
v3-0001-Refactor-relmap-load-and-relmap-write-functions.patch
Description: Binary data
v3-0005-New-interface-to-lock-relation-id.patch
Description: Binary data
v3-0006-WAL-logged-CREATE-DATABASE.patch
Description: Binary data