At Wed, 22 Dec 2021 13:11:38 +0000, "houzj.f...@fujitsu.com" 
<houzj.f...@fujitsu.com> wrote in 
> Hi,
> 
> When reviewing some logical replication patches. I noticed that in
> function get_rel_sync_entry() we always invoke get_rel_relispartition() 
> and get_rel_relkind() at the beginning which could cause unnecessary
> cache access.
> 
> ---
> get_rel_sync_entry(PGOutputData *data, Oid relid)
> {
>       RelationSyncEntry *entry;
>       bool            am_partition = get_rel_relispartition(relid);
>       char            relkind = get_rel_relkind(relid);
> ---
> 
> The extra cost could sometimes be noticeable because get_rel_sync_entry is a
> hot function which is executed for each change. And the 'am_partition' and
> 'relkind' are necessary only when we need to rebuild the RelationSyncEntry.
> 
> Here is the perf result for the case when inserted large amounts of data into 
> a
> un-published table in which case the cost is noticeable.
> 
> --12.83%--pgoutput_change
>     |--11.84%--get_rel_sync_entry
>           |--4.76%--get_rel_relispartition
>               |--4.70%--get_rel_relkind
> 
> So, I think it would be better if we do the initialization only when
> RelationSyncEntry in invalid.
> 
> Attach a small patch which delay the initialization.
> 
> Thoughts ?

A simple benchmarking that replicates pgbench workload showed me that
the function doesn't enter the path to use the am_partition and
relkind in almost all (99.999..%) cases and I don't think it is a
special case. Thus I think we can expect that we gain about 10%
without any possibility of loss.

Addition to that, it is simply a good practice to keep variable scopes
narrow.

So +1 for this change.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Reply via email to