On Mon, 17 Aug 2020 at 15:05, Ashutosh Sharma <ashu.coe...@gmail.com> wrote: > > Hello Masahiko-san, > > Thanks for the review. Please check the comments inline below: > > On Fri, Aug 14, 2020 at 10:07 AM Masahiko Sawada > <masahiko.saw...@2ndquadrant.com> wrote: > > > Thank you for updating the patch! Here are my comments on v5 patch: > > > > --- a/contrib/Makefile > > +++ b/contrib/Makefile > > @@ -35,6 +35,7 @@ SUBDIRS = \ > > pg_standby \ > > pg_stat_statements \ > > pg_trgm \ > > + pg_surgery \ > > pgcrypto \ > > > > I guess we use alphabetical order here. So pg_surgery should be placed > > before pg_trgm. > > > > Okay, will take care of this in the next version of patch. > > > --- > > + if (heap_force_opt == HEAP_FORCE_KILL) > > + ItemIdSetDead(itemid); > > > > I think that if the page is an all-visible page, we should clear an > > all-visible bit on the visibility map corresponding to the page and > > PD_ALL_VISIBLE on the page header. Otherwise, index only scan would > > return the wrong results. > > > > I think we should let VACUUM do that. Please note that this module is > intended to be used only on a damaged relation and should only be > operated on damaged tuples of such relations. And the execution of any > of the functions provided by this module on a damaged relation must be > followed by VACUUM with DISABLE_PAGE_SKIPPING option on that relation. > This is necessary to bring back a damaged relation to the sane state > once a surgery is performed on it. I will try to add this note in the > documentation for this module. > > > --- > > + /* > > + * We do not mark the buffer dirty or do WAL logging for unmodifed > > + * pages. > > + */ > > + if (!did_modify_page) > > + goto skip_wal; > > + > > + /* Mark buffer dirty before we write WAL. */ > > + MarkBufferDirty(buf); > > + > > + /* XLOG stuff */ > > + if (RelationNeedsWAL(rel)) > > + log_newpage_buffer(buf, true); > > + > > +skip_wal: > > + END_CRIT_SECTION(); > > + > > > > s/unmodifed/unmodified/ > > > > okay, will fix this typo. > > > Do we really need to use goto? I think we can modify it like follows: > > > > if (did_modity_page) > > { > > /* Mark buffer dirty before we write WAL. */ > > MarkBufferDirty(buf); > > > > /* XLOG stuff */ > > if (RelationNeedsWAL(rel)) > > log_newpage_buffer(buf, true); > > } > > > > END_CRIT_SECTION(); > > > > No, we don't need it. We can achieve the same by checking the status > of did_modify_page flag as you suggested. I will do this change in the > next version. > > > --- > > pg_force_freeze() can revival a tuple that is already deleted but not > > vacuumed yet. Therefore, the user might need to reindex indexes after > > using that function. For instance, with the following script, the last > > two queries: index scan and seq scan, will return different results. > > > > set enable_seqscan to off; > > set enable_bitmapscan to off; > > set enable_indexonlyscan to off; > > create table tbl (a int primary key); > > insert into tbl values (1); > > > > update tbl set a = a + 100 where a = 1; > > > > explain analyze select * from tbl where a < 200; > > > > -- revive deleted tuple on heap > > select heap_force_freeze('tbl', array['(0,1)'::tid]); > > > > -- index scan returns 2 tuples > > explain analyze select * from tbl where a < 200; > > > > -- seq scan returns 1 tuple > > set enable_seqscan to on; > > explain analyze select * from tbl; > > > > I am not sure if this is the right use-case of pg_force_freeze > function. I think we should only be running pg_force_freeze function > on a tuple for which VACUUM reports "found xmin ABC from before > relfrozenxid PQR" sort of error otherwise it might worsen the things > instead of making it better.
Should this also be documented? I think that it's hard to force the user to always use this module in the right situation but we need to show at least when to use. > > Also, if a tuple updated and moved to another partition is revived by > > heap_force_freeze(), its ctid still has special values: > > MovedPartitionsOffsetNumber and MovedPartitionsBlockNumber. I don't > > see a problem yet caused by a visible tuple having the special ctid > > value, but it might be worth considering either to reset ctid value as > > well or to not freezing already-deleted tuple. > > > > For this as well, the answer remains the same as above. Perhaps the same is true when a tuple header is corrupted including xmin and ctid for some reason and the user wants to fix it? I'm concerned that a live tuple having the wrong ctid will cause SEGV or PANIC error in the future. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services