On Thu, Aug 20, 2020 at 11:04 AM Masahiko Sawada <masahiko.saw...@2ndquadrant.com> wrote: > > On Wed, 19 Aug 2020 at 20:45, Ashutosh Sharma <ashu.coe...@gmail.com> wrote: > > > > On Wed, Aug 19, 2020 at 3:55 PM Masahiko Sawada > > <masahiko.saw...@2ndquadrant.com> wrote: > > > > > > On Wed, 19 Aug 2020 at 15:09, Ashutosh Sharma <ashu.coe...@gmail.com> > > > wrote: > > > > > > > > On Wed, Aug 19, 2020 at 9:27 AM Masahiko Sawada > > > > <masahiko.saw...@2ndquadrant.com> wrote: > > > > > > > > > > On Mon, 17 Aug 2020 at 15:05, Ashutosh Sharma <ashu.coe...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > pg_force_freeze() can revival a tuple that is already deleted but > > > > > > > not > > > > > > > vacuumed yet. Therefore, the user might need to reindex indexes > > > > > > > after > > > > > > > using that function. For instance, with the following script, the > > > > > > > last > > > > > > > two queries: index scan and seq scan, will return different > > > > > > > results. > > > > > > > > > > > > > > set enable_seqscan to off; > > > > > > > set enable_bitmapscan to off; > > > > > > > set enable_indexonlyscan to off; > > > > > > > create table tbl (a int primary key); > > > > > > > insert into tbl values (1); > > > > > > > > > > > > > > update tbl set a = a + 100 where a = 1; > > > > > > > > > > > > > > explain analyze select * from tbl where a < 200; > > > > > > > > > > > > > > -- revive deleted tuple on heap > > > > > > > select heap_force_freeze('tbl', array['(0,1)'::tid]); > > > > > > > > > > > > > > -- index scan returns 2 tuples > > > > > > > explain analyze select * from tbl where a < 200; > > > > > > > > > > > > > > -- seq scan returns 1 tuple > > > > > > > set enable_seqscan to on; > > > > > > > explain analyze select * from tbl; > > > > > > > > > > > > > > > > > > > I am not sure if this is the right use-case of pg_force_freeze > > > > > > function. I think we should only be running pg_force_freeze function > > > > > > on a tuple for which VACUUM reports "found xmin ABC from before > > > > > > relfrozenxid PQR" sort of error otherwise it might worsen the things > > > > > > instead of making it better. > > > > > > > > > > Should this also be documented? I think that it's hard to force the > > > > > user to always use this module in the right situation but we need to > > > > > show at least when to use. > > > > > > > > > > > > > I've already added some examples in the documentation explaining the > > > > use-case of force_freeze function. If required, I will also add a note > > > > about it. > > > > > > > > > > > Also, if a tuple updated and moved to another partition is > > > > > > > revived by > > > > > > > heap_force_freeze(), its ctid still has special values: > > > > > > > MovedPartitionsOffsetNumber and MovedPartitionsBlockNumber. I > > > > > > > don't > > > > > > > see a problem yet caused by a visible tuple having the special > > > > > > > ctid > > > > > > > value, but it might be worth considering either to reset ctid > > > > > > > value as > > > > > > > well or to not freezing already-deleted tuple. > > > > > > > > > > > > > > > > > > > For this as well, the answer remains the same as above. > > > > > > > > > > Perhaps the same is true when a tuple header is corrupted including > > > > > xmin and ctid for some reason and the user wants to fix it? I'm > > > > > concerned that a live tuple having the wrong ctid will cause SEGV or > > > > > PANIC error in the future. > > > > > > > > > > > > > If a tuple header itself is corrupted, then I think we must kill that > > > > tuple. If only xmin and t_ctid fields are corrupted, then probably we > > > > can think of resetting the ctid value of that tuple. However, it won't > > > > be always possible to detect the corrupted ctid value. It's quite > > > > possible that the corrupted ctid field has valid values for block > > > > number and offset number in it, but it's actually corrupted and it > > > > would be difficult to consider such ctid as corrupted. Hence, we can't > > > > do anything about such types of corruption. Probably in such cases we > > > > need to run VACUUM FULL on such tables so that new ctid gets generated > > > > for each tuple in the table. > > > > > > Understood. > > > > > > Perhaps such corruption will be able to be detected by new heapam > > > check functions discussed on another thread. My point was that it > > > might be better to attempt making the tuple header sane state as much > > > as possible when fixing a live tuple in order to prevent further > > > problems such as databases crash by malicious attackers. > > > > > > > Agreed. So, to handle the ctid related concern that you raised, I'm > > planning to do the following changes to ensure that the tuple being > > marked as frozen contains the correct item pointer value. Please let > > me know if you are okay with these changes. > > Given that a live tuple never indicates to ve moved partitions, I > guess the first condition in the if statement is not necessary. The > rest looks good to me, although other hackers might think differently. >
Okay, thanks for confirming. I am planning to go ahead with this approach. Will later see what others have to say about it. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com