On Wed, 14 Aug 2024 at 01:13, Heikki Linnakangas <hlinn...@iki.fi> wrote:
>
> On 05/04/2024 13:49, Andrey M. Borodin wrote:
> >> On 5 Apr 2024, at 02:08, Kirill Reshke <reshkekir...@gmail.com> wrote:
>
> Thanks for taking a look, Kirill!
>
> >> maybe we need some hooks here? Or maybe, we can take CSN here from
> >> extension somehow.
> >
> > I really like the idea of CSN-provider-as-extension.
> > But it's very important to move on with CSN, at least on standby, to
> > make CSN actually happen some day.
> > So, from my perspective, having LSN-as-CSN is already a huge step
> > forward.
>
> Yeah, I really don't want to expand the scope of this.
>
> Here's a new version. Rebased, and lots of comments updated.
>
> I added a tiny cache of the CSN lookups into SnapshotData, which can
> hold the values of 4 XIDs that are known to be visible to the snapshot,
> and 4 invisible XIDs. This is pretty arbitrary, but the idea is to have
> something very small to speed up the common case where 1-2 XIDs are
> repeatedly looked up, without adding too much overhead.
>
> I did some performance testing of the visibility checks using these CSN
> snapshots. The tests run SELECTs with a SeqScan in a standby, over a
> table where all the rows have xmin/xmax values that are still
> in-progress in the primary.
>
> Three test scenarios:
>
> 1. large-xact: one large transaction inserted all the rows. All rows
> have the same XMIN, which is still in progress.
>
> 2. many-subxacts: one large transaction inserted each row in a separate
> subtransaction. All rows have a different XMIN, but they're all
> subtransactions of the same top-level transaction. (This causes the
> subxids cache in the proc array to overflow.)
>
> 3. few-subxacts: All rows are inserted, committed, and vacuum frozen.
> Then the rows are DELETEd using 10 separate subtransactions, in an
> interleaved fashion, so the XMAX values cycle like this: "1, 2, 3, 4,
> 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, ...". The point of this is that these
> sub-XIDs fit in the subxids cache in the procarray, but the pattern
> defeats the simple 4-element cache that I added.
>
> The test script I used is attached. I repeated it a few times with
> master and the patches here, and picked the fastest runs for each. Just
> eyeballing the results, there's about 10% variance in these numbers.
> Smaller is better.
>
> Master:
>
> large-xact: 4.57732510566711
> many-subxacts: 18.6958119869232
> few-subxacts: 16.467698097229
>
> Patched:
>
> large-xact: 10.2999930381775
> many-subxacts: 11.6501438617706
> few-subxacts: 19.8457028865814
>
> With cache:
>
> large-xact: 3.68792295455933
> many-subxacts: 13.3662350177765
> few-subxacts: 21.4426419734955
>
> The 'large-xact' results show that the CSN lookups are slower than the
> binary search on the 'xids' array. Not a surprise. The 4-element cache
> fixes the regression, which is also not a surprise.
>
> The 'many-subxacts' results show that the CSN lookups are faster than
> the current method in master when the subxids cache has overflowed.
> That makes sense: on master, we always perform a lookup in pg_subtrans
> if the subxids cache has overflowed, which is more or less the same
> overhead as the CSN lookup. But we avoid the binary search on the xids
> array after that.
>
> The 'few-subxacts' results show a regression when the 4-element cache
> is not effective. I think that's acceptable: the CSN approach has many
> benefits, and I don't think this is a very common scenario. But if
> necessary, it could perhaps be alleviated with more caching, or by
> trying to compensate by optimizing elsewhere.
>
> --
> Heikki Linnakangas
> Neon (https://neon.tech)
Thanks for the update. I will try to find time for perf-testing this.

First, some random suggestions. Sorry for being too nit-picky.

1) In 0002:

> +/*
> + * Number of shared CSNLog buffers.
> + */
> +static Size
> +CSNLogShmemBuffers(void)
> +{
> +	return Min(32, Max(16, NBuffers / 512));
> +}

Should we make this a GUC? (A rough sketch of what I mean is at the
end of this mail.)

2) In 0002, CSNLogShmemInit:

> +	//SlruPagePrecedesUnitTests(CsnlogCtl, SUBTRANS_XACTS_PER_PAGE);

Remove this?

3) In 0002, InitCSNLogPage:

> +	SimpleLruZeroPage(CsnlogCtl, pageno);

We can use ZeroCSNLogPage here. That would justify the existence of
this function a little bit more.

4) In 0002:

> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -27,7 +27,7 @@
>   * removed. This is achieved by using the replication slot mechanism.
>   *
>   * As the percentage of transactions modifying the catalog normally is fairly
> - * small in comparisons to ones only manipulating user data, we keep track of
> + * small in comparison to ones only manipulating user data, we keep track of
>   * the committed catalog modifying ones inside [xmin, xmax) instead of keeping
>   * track of all running transactions like it's done in a normal snapshot. Note
>   * that we're generally only looking at transactions that have acquired an

This change is unrelated to the 0002 patch; let's just push it as a
separate change.

Overall, 0002 looks straightforward, though big. I wonder, however,
how we can verify that this change does not lead to any unpleasant
problems, such as observing uncommitted changes on replicas, data
corruption, and so on. Maybe some basic injection-point-based TAP test
is desirable here?

--
Best regards,
Kirill Reshke
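
PS: Regarding 1), here is a rough sketch of one way the GUC could look,
loosely following the pattern of the recently added SLRU buffer GUCs
(subtransaction_buffers and friends), where 0 means "derive the size
from shared_buffers". The name csnlog_buffers is just a placeholder,
not a proposal for the final spelling:

/*
 * Hypothetical GUC; the guc_tables.c entry could mirror the one for
 * subtransaction_buffers.  0 (the default) keeps today's auto-tuning.
 */
int			csnlog_buffers = 0;

static Size
CSNLogShmemBuffers(void)
{
	/* An explicit setting wins. */
	if (csnlog_buffers > 0)
		return csnlog_buffers;

	/* Otherwise fall back to the current shared_buffers-based formula. */
	return Min(32, Max(16, NBuffers / 512));
}

That way the current behaviour stays the default, and anyone who sees
contention on the CSN log SLRU can raise the buffer count without
recompiling.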