On Mon, Aug 1, 2022 at 10:13 PM Simon Riggs <simon.ri...@enterprisedb.com> wrote:
>
> "A mathematical catastrophe is a point in a model of an input-output
> system, where a vanishingly small change in the input can produce a
> large change in the output."
>
> We have just such a change in Postgres: when a snapshot overflows. In
> this case it takes only one subxid over the subxid cache limit to slow
> down every request in XidInMVCCSnapshot(), which becomes painful when
> a long running transaction exists at the same time. This situation has
> been noted by various bloggers, but is illustrated clearly in the
> attached diagram, generated by test results from Julien Tachoires.
>
> The reason for the slowdown is clear: when we overflow we check every
> xid against subtrans, producing a large stream of lookups. Some
> previous hackers have tried to speed up subtrans - this patch takes a
> different approach: remove as many subtrans lookups as possible. (So
> is not competing with those other solutions).
>
> Attached patch improves on the situation, as also shown in the attached
> diagram.
>
> The patch does these things:
>
> 1. Rework XidInMVCCSnapshot() so that it always checks the snapshot
> first, before attempting to lookup subtrans. A related change means
> that we always keep full subxid info in the snapshot, even if one of
> the backends has overflowed.
>
> 2. Use binary search for standby snapshots, since the snapshot subxip
> is in sorted order.
>
> 3. Rework GetTopmostTransaction so that it a) checks xmin as it goes,
> b) only does one iteration on standby snapshots, both of which save
> subtrans lookups in appropriate cases.
> (This was newly added in v6)
>
> Now, is this a panacea? Not at all. What this patch does is smooth out
> the catastrophic effect so that a few overflowed subxids don't spoil
> everybody else's performance, but eventually, if many or all sessions
> have their overflowed subxid caches then the performance will descend
> as before, albeit that the attached patch has some additional
> optimizations (2, 3 above). So what this gives is a better flight
> envelope in case of a small number of occasional overflows.
>
> Please review. Thank you.
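
To make the lookup order in points 1 and 2 concrete, here is a rough standalone C sketch. It is not the patch's code and does not use PostgreSQL's real structures. Every name in it (SketchSnapshot, sorted_contains, sketch_xid_in_snapshot, demo_topmost_parent, subtrans_lookups) is hypothetical. It also assumes both arrays are sorted and compares xids as plain integers, ignoring wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* stand-in for TransactionId (assumption) */

/* Hypothetical, simplified snapshot; not PostgreSQL's SnapshotData. */
typedef struct SketchSnapshot
{
    const Xid  *xip;            /* in-progress top-level xids, sorted here */
    size_t      xcnt;
    const Xid  *subxip;         /* known in-progress subxids, sorted here */
    size_t      subxcnt;
    bool        suboverflowed;  /* some backend overflowed its subxid cache */
    Xid       (*topmost_parent) (Xid);  /* stand-in for the subtrans lookup */
} SketchSnapshot;

/* Counts how often the expensive fallback lookup is taken. */
unsigned    subtrans_lookups = 0;

/* Binary search over a sorted xid array (point 2). */
bool
sorted_contains(const Xid *a, size_t n, Xid key)
{
    size_t      lo = 0;
    size_t      hi = n;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (a[mid] == key)
            return true;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}

/*
 * Point 1: consult the snapshot arrays first, and fall back to the
 * subtrans-style lookup only when the snapshot is overflowed and the xid
 * was not found in either array.
 */
bool
sketch_xid_in_snapshot(const SketchSnapshot *snap, Xid xid)
{
    assert(snap != NULL);

    if (sorted_contains(snap->xip, snap->xcnt, xid))
        return true;
    if (sorted_contains(snap->subxip, snap->subxcnt, xid))
        return true;
    if (!snap->suboverflowed)
        return false;           /* arrays are complete, no lookup needed */

    subtrans_lookups++;
    return sorted_contains(snap->xip, snap->xcnt, snap->topmost_parent(xid));
}

/* Demo-only parent map (assumption): xids 201..299 are subxids of 200. */
Xid
demo_topmost_parent(Xid xid)
{
    return (xid > 200 && xid < 300) ? 200 : xid;
}
```

With an overflowed snapshot that still carries subxids 201 and 202, a check for 202 is answered from the arrays alone. Only an xid missing from both arrays, such as 265 in the demo map, reaches the fallback lookup.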

+1, I had a quick look at the patch to understand the idea, and I think
it looks really promising.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com