On Sat, Apr 21, 2018 at 1:26 AM, Andrew Gierth <and...@tao11.riddles.org.uk> wrote: >>>>>> "Amit" == Amit Kapila <amit.kapil...@gmail.com> writes: > > Amit> I haven't tried to reproduce it, but I can see the possibility of > Amit> the problem described by you. What should we do next? I could see > Amit> few possibilities: (a) Vacuum for main and toast table should > Amit> always use same OldestXmin, > > I don't see how this could be arranged without either disabling the > ability to vacuum the toast table independently, or storing the xmin > somewhere. >
I think we can use the same xmin for both main heap and toast by computing it before scanning the main heap (lazy_vacuum_rel) and then pass it down. I have tried it and came up with the attached patch. > Amit> (b) Before removing the row from toast table, we should ensure > Amit> that row in the main table is removed, > > We can't find the main table row version(s) without scanning the whole > main table, so this amounts (again) to disabling vacuum on the toast > table only. > > Amit> (c) Change the logic during rewrite such that it can detect this > Amit> situation and skip copying the tuple in the main table, > > This seems more promising, but the problem is how to detect the error > safely (since currently, it's just ereport()ed from deep inside the > toast code). How about: > > 1) add a toast_datum_exists() call or similar that returns false if the > (first block of) the specified external toast item is missing > > 2) when copying a RECENTLY_DEAD tuple, check all its external column > values using this function beforehand, and if any are missing, treat the > tuple as DEAD instead. > > Amit> on a quick look, this seems tricky, because the toast table TID > Amit> might have been reused by that time, > > Toast pointers don't point to TIDs fortunately, they point to OIDs. The > OID could have been reused (if there's been an OID wraparound since the > toast item was created), but that should be harmless: it means that we'd > incorrectly copy the new value with the old tuple, but the old tuple is > never going to be visible to anybody ever again so nothing will see that. > Yeah, that's right, but it gives some uneasy feeling that we are attaching the wrong toast value. If you think there is some basic flaw in Approach-1 (as in attached patch) or it requires some major surgery then we can look further into this. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
use_same_oldestxmin_mainheap_toast_v1.patch
Description: Binary data