On 2016-06-16 12:43:34 -0400, Robert Haas wrote:
> On Thu, Jun 16, 2016 at 12:14 PM, Andres Freund <and...@anarazel.de> wrote:
> >> > The issue isn't there without the feature, because we (should) never
> >> > access a tuple/detoast a column when it's invisible enough for the
> >> > corresponding toast tuple to be vacuumed away. But with
> >> > old_snapshot_timeout that's obviously (intentionally) not the case
> >> > anymore. Due to old_snapshot_threshold we'll prune tuples which,
> >> > without it, would still be considered HEAPTUPLE_RECENTLY_DEAD.
> >>
> >> Is there really an assumption that the heap and the TOAST heap are
> >> only ever vacuumed with the same OldestXmin value? Because that seems
> >> like it would be massively flaky.
> >
> > There's not. They can be vacuumed days apart. But if we vacuum the toast
> > table with an OldestXmin, and encounter a dead toast tuple, by the
> > definition of OldestXmin (excluding STO), there cannot be a session
> > reading the referencing tuple anymore - so that shouldn't matter.
>
> I don't understand how STO changes that. I'm not saying it doesn't
> change it, but I don't understand why it would.

Because we advance OldestXmin more aggressively, while allowing
snapshots that are *older* than OldestXmin to access old tuples on pages
which haven't been touched.

> The root of my confusion is: if we prune a tuple, we'll bump the page
> LSN, so any session that is still referencing that tuple will error
> out as soon as it touches the page on which that tuple used to exist.

Right. On the main table. But we don't perform that check on the toast
table/pages. So if we prune toast tuples which are still referenced by
the (unvacuumed) main relation, we can get into trouble.

> It won't even survive long enough to care that the tuple isn't there
> any more.
>
> Maybe it would help if you lay out the whole sequence of events, like:
>
> S1: Does this.
> S2: Does that.
> S1: Now does something else.

I presume it'd be something like:

Assuming a 'toasted' table, which contains one row, with a 1GB field.

S1: BEGIN REPEATABLE READ;
S1: SELECT SUM(length(one_gb_record)) FROM toasted;
S2: DELETE FROM toasted;
AUTOVAC: vacuum toasted's toast table, it's large. skip toasted, it's small
S1: SELECT SUM(length(one_gb_record)) FROM toasted;
<missing chunk error>
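
The sequence of events in this mail can be walked through with a small toy
model. This is only a sketch under loud assumptions: none of the classes or
function names below exist in PostgreSQL, pages are reduced to a dict plus an
LSN, and the model only tracks the LSN bump caused by pruning (the case
discussed in this thread), ignoring every other change that touches a page,
including the DELETE itself. It shows why the old-snapshot check on the main
relation does not protect a read that goes on to fetch toast chunks.

```python
# Toy model only: hypothetical names, not PostgreSQL internals.

class SnapshotTooOld(Exception):
    pass


class MissingChunk(Exception):
    pass


class Page:
    def __init__(self):
        self.lsn = 0      # bumped whenever something is pruned from the page
        self.items = {}   # key -> stored value


class Snapshot:
    def __init__(self, lsn):
        self.lsn = lsn         # WAL position when the snapshot was taken
        self.expired = False   # True once older than the threshold


def read_main(page, snap, key):
    # Main-relation reads perform the old-snapshot check.
    if snap.expired and page.lsn > snap.lsn:
        raise SnapshotTooOld()
    return page.items[key]


def read_toast(page, chunk_id):
    # Toast reads perform no such check: the gap described in this mail.
    if chunk_id not in page.items:
        raise MissingChunk(chunk_id)
    return page.items[chunk_id]


def prune(page, key, current_lsn):
    # Remove a dead item and bump the page LSN.
    del page.items[key]
    page.lsn = current_lsn


def run_scenario(vacuum_main):
    main, toast = Page(), Page()
    toast.items["chunk-1"] = "x" * 8   # stands in for the 1GB field
    main.items["row"] = "chunk-1"      # main tuple holds a toast pointer
    wal_lsn = 1

    # S1: BEGIN REPEATABLE READ; first SELECT succeeds.
    s1 = Snapshot(lsn=wal_lsn)
    assert read_toast(toast, read_main(main, s1, "row"))

    # S2: DELETE FROM toasted; the row is dead for new snapshots but is
    # still physically present. Time passes and S1's snapshot ages out.
    s1.expired = True

    # AUTOVAC: the toast table is vacuumed, the main table maybe not.
    wal_lsn += 1
    prune(toast, "chunk-1", wal_lsn)
    if vacuum_main:
        wal_lsn += 1
        prune(main, "row", wal_lsn)

    # S1: second SELECT.
    try:
        read_toast(toast, read_main(main, s1, "row"))
    except (SnapshotTooOld, MissingChunk) as exc:
        return type(exc).__name__
    return "ok"
```

In this model, run_scenario(vacuum_main=False) reproduces the case above: the
main page was never pruned, so the check passes and the toast fetch fails.
With vacuum_main=True the main-page check fires first, which is the behaviour
Robert describes.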