On Mon, Mar 8, 2021 at 1:26 PM Peter Smith <smithpb2...@gmail.com> wrote: > > On Mon, Mar 8, 2021 at 4:19 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Mon, Mar 8, 2021 at 10:04 AM Peter Smith <smithpb2...@gmail.com> wrote: > > > > > > On Sun, Mar 7, 2021 at 3:00 PM Amit Kapila <amit.kapil...@gmail.com> > > > wrote: > > > > > > > > On Sun, Mar 7, 2021 at 7:35 AM Peter Smith <smithpb2...@gmail.com> > > > > wrote: > > > > > > > > > > Please find attached the latest patch set v51* > > > > > > > > > > > > > Few more comments on v51-0006-Fix-apply-worker-empty-prepare: > > > > ====================================================== > > > > 1. > > > > +/* > > > > + * A Prepare spoolfile hash entry. We create this entry in the > > > > psf_hash. This is > > > > + * for maintaining a mapping between the name of the prepared > > > > spoolfile, and the > > > > + * corresponding fileset handles of same. > > > > + */ > > > > +typedef struct PsfHashEntry > > > > +{ > > > > + char name[MAXPGPATH]; /* Hash key --- must be first */ > > > > + bool allow_delete; /* ok to delete? */ > > > > +} PsfHashEntry; > > > > + > > > > > > > > IIUC, this has table is used for two purposes in the patch (a) to > > > > check for existence of prepare spool file where we anyway to check it > > > > on disk if not found in the hash table. (b) to allow the prepare spool > > > > file to be removed on proc_exit. > > > > > > > > I think we don't need the optimization provided by (a) because it will > > > > be too rare a case to deserve any optimization, we might write a > > > > comment in prepare_spoolfile_exists to indicate such an optimization. > > > > For (b), we can use a simple list to track files to be removed on > > > > proc_exit something like we do in CreateLockFile. I think avoiding > > > > hash table usage will reduce the code and chances of bugs in this > > > > area. It won't be easy to write a lot of automated tests to test this > > > > functionality so it is better to avoid minor optimizations at this > > > > stage. > > > > > > Our data structure psf_hash also needs to be able to discover the > > > entry for a specific spool file and remove it. e.g. anything marked as > > > "allow_delete = false" (during prepare) must be able to be re-found > > > and removed from that structure at commit_prepared or > > > rollback_prepared time. > > > > > > > But, I think that is not reliable because after restart the entry > > might not be present and we anyway need to check the presence of the > > file on disk. Actually, you don't need any manipulation with list or > > hash at commit_prepared or rollback_prepared, we should just remove > > the entry for it at the prepare time and there should be an assert if > > we find that entry in the in-memory structure. > > > > > Looking at CreateLockFile code, I don't see that it is ever deleting > > > entries from its "lock_files" list on-the-fly, so it's not really a > > > fair comparison to say just use a List like CreateLockFile. > > > > > > > Sure, but you can additionally traverse the list and find the required > > entry. > > > > > So, even though we (currently) only have a single data member > > > "allow_delete", I think the requirement to do a key lookup/delete > > > makes a HTAB a more appropriate data structure than a List. > > > > > > > Actually, that member is also not required at all because you just > > need it till the time of prepare and then remove it. > > > > OK, I plan to change like this. > - Now the whole hash simply means "delete-on-exit". If the key (aka > filename) exists, delete that file on exit. If not don't > - Remove the "allow_delete" member (as you say it can be redundant > using the new interpretation above) > - the *only* code that CREATES a key will be when > prepare_spoolfile_create is called from begin_prepare. > - at apply_handle_prepare time the key is REMOVED (so that file will > not be deleted in case of a restart / error before commit/rollback) >
So, the only real place where you need to perform any search is at the prepare time and I think it should always be the first element if we use the list here. Am I missing something? If not, I don't see why you want to prefer HTAB over a simple list? You can remove the first element and probably have an assert to confirm it is the correct element (by checking the path) you are removing. -- With Regards, Amit Kapila.