Hi,

On 2019-12-19 07:08:06 -0800, Mark Dilger wrote:
> > As soon as a transaction aborts, the TOAST rows can be vacuumed
> > away, but the READ UNCOMMITTED transaction might've already seen the
> > main tuple. This is not even a particularly tight race, necessarily,
> > since for example the table might be scanned, feeding tuples into a
> > tuplesort, and then the detoasting might happen further up in the query
> > tree after the sort has completed.
>
> I don't know if this could be fixed without adding overhead to toast
> processing for non-RECOVERY transactions, but perhaps it doesn't need
> to be fixed at all. Perhaps you just accept that in RECOVERY mode you
> can't see toast data, and instead get NULLs for all such rows. Now,
> that could have security implications if somebody defines a policy
> where NULL in a toast column means "allow" rather than "deny" for
> some issue, but if this RECOVERY mode is limited to superusers, that
> isn't such a big objection.

I mean, that's just a small part of the issue. You can get *different*
data back for toast columns - incompatible with the datatype, leading to
crashes. You can get *different* data back for the same query, running
it twice, because data that was just inserted can get pruned away if the
inserting transaction aborted.

> There may be a number of other gotchas still to be resolved, but
> abandoning the patch at this stage strikes me as premature.

I think iff we'd want this feature, you'd have to actually use a much
larger hammer, and change the snapshot logic to include information
about which aborted transactions are visible, and whose rows cannot be
removed. And then vacuuming/hot pruning need to be changed to respect
that. And note that'll affect *all* sessions, not just the one wanting
to use READ UNCOMMITTED.

Greetings,

Andres Freund