On Mon, Oct 22, 2018 at 07:15:38PM -0300, Alvaro Herrera wrote: > On 2018-Oct-22, Andres Freund wrote: >> Hm? My point is that this fix just puts a band-aid onto *one* of the >> places that read a XLOG_RUNNING_XACTS. Which still leaves the contents >> of WAL record corrupted. There's not even a note at the WAL-record's >> definition or its logging denoting that the contents are not what you'd >> expect. I don't mean that the fix would break logical decoding, but >> that it's possible that an equivalent of the problem affecting hot >> standby also affects logical decoding. And even leaving those two users >> aside, it's possible that there will be further vulernable internal >> users or extensions parsing the WAL. > > Ah! I misinterpreted what you were saying. I agree we shouldn't let > the WAL message have wrong data. Of course we shouldn't just add a > code comment stating that the data is wrong :-)
Well, following the same kind of thoughts, txid_current_snapshot() uses sort_snapshot() to remove all the duplicates after fetching its data from GetSnapshotData(), so wouldn't we want to do something about removal of duplicates if dummy PGXACT entries are found while scanning the ProcArray also in this case? What I would think we should do is not only to patch GetRunningTransactionData() but also GetSnapshotData() so as we don't have duplicates also in this case, and do things in such a way that both code paths use the same logic, and that we don't need to have sort_snapshot() anymore. That would be more costly though... -- Michael
signature.asc
Description: PGP signature