Hi, On 2021-08-03 22:23:58 +0300, Michail Nikolaev wrote: > > I'm not sure that we need to care as much about the cost of > > KnownAssignedXidsGetAndSetXmin() - for one, the caching we now have makes > > that > > less frequent. > > It is still about 5-7% of CPU for a typical workload, a considerable > amount for me.
I'm not saying we shouldn't optimize things. Just that it's less pressing. And what kind of price we're willing to optimize may have changed. > And a lot of systems still work on 12 and 13. I don't see us backporting performance improvements around this to 12 and 13, so I don't think that matters much... We've done that a few times, but usually when there's some accidentally quadratic behaviour or such. > > But more importantly, it'd not be hard to maintain an > > occasionally (or opportunistically) maintained denser version for > > GetSnapshotData() - there's only a single writer, making the concurrency > > issues a lot simpler. > > Could you please explain it in more detail? > Single writer and GetSnapshotData() already exclusively hold > ProcArrayLock at the moment of KnownAssignedXidsRemove, > KnownAssignedXidsGetAndSetXmin, and sometimes KnownAssignedXidsAdd. GetSnapshotData() only locks ProcArrayLock in shared mode. The problem is that we don't want to add a lot of work KnownAssignedXidsAdd/Remove, because very often nobody will build a snapshot for that moment and building a sorted, gap-free, linear array of xids isn't cheap. In my experience it's more common to be bottlenecked by replay CPU performance than on replica snapshot building. During recovery, where there's only one writer to the procarray / known xids, it might not be hard to opportunistically build a dense version of the known xids whenever a snapshot is built. Greetings, Andres Freund