At Wed, 22 Dec 2021 13:11:38 +0000, "houzj.f...@fujitsu.com" <houzj.f...@fujitsu.com> wrote in > Hi, > > When reviewing some logical replication patches. I noticed that in > function get_rel_sync_entry() we always invoke get_rel_relispartition() > and get_rel_relkind() at the beginning which could cause unnecessary > cache access. > > --- > get_rel_sync_entry(PGOutputData *data, Oid relid) > { > RelationSyncEntry *entry; > bool am_partition = get_rel_relispartition(relid); > char relkind = get_rel_relkind(relid); > --- > > The extra cost could sometimes be noticeable because get_rel_sync_entry is a > hot function which is executed for each change. And the 'am_partition' and > 'relkind' are necessary only when we need to rebuild the RelationSyncEntry. > > Here is the perf result for the case when inserted large amounts of data into > a > un-published table in which case the cost is noticeable. > > --12.83%--pgoutput_change > |--11.84%--get_rel_sync_entry > |--4.76%--get_rel_relispartition > |--4.70%--get_rel_relkind > > So, I think it would be better if we do the initialization only when > RelationSyncEntry in invalid. > > Attach a small patch which delay the initialization. > > Thoughts ?
A simple benchmarking that replicates pgbench workload showed me that the function doesn't enter the path to use the am_partition and relkind in almost all (99.999..%) cases and I don't think it is a special case. Thus I think we can expect that we gain about 10% without any possibility of loss. Addition to that, it is simply a good practice to keep variable scopes narrow. So +1 for this change. regards. -- Kyotaro Horiguchi NTT Open Source Software Center