On Tue, Oct 25, 2022 at 8:38 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Fri, Oct 21, 2022 at 6:32 PM houzj.f...@fujitsu.com
> <houzj.f...@fujitsu.com> wrote:
>
> I've started to review this patch. I tested v40-0001 patch and have
> one question:
>
> IIUC even when most of the changes in the transaction are filtered out
> in pgoutput (eg., by relation filter or row filter), the walsender
> sends STREAM_START. This means that the subscriber could end up
> launching parallel apply workers also for almost empty (and streamed)
> transactions. For example, I created three subscriptions each of which
> subscribes to a different table. When I loaded a large amount of data
> into one table, all three (leader) apply workers received START_STREAM
> and launched their parallel apply workers.
>

The apply workers are launched only the first time; after that we
maintain a pool, so we don't need to restart them.

> However, two of them
> finished without applying any data. I think this behaviour looks
> problematic since it wastes workers and rather decreases the apply
> performance if the changes are not large. Is it worth considering a
> way to delay launching a parallel apply worker until we find out the
> amount of changes is actually large?
>

I think even if the changes are small, there may not be much difference,
because we have observed that the performance improvement comes from not
writing to a file.

> For example, the leader worker
> writes the streamed changes to files as usual and launches a parallel
> worker if the amount of changes exceeds a threshold or the leader
> receives the second segment. After that, the leader worker switches to
> send the streamed changes to parallel workers via shm_mq instead of
> files.
>

I think writing to a file won't be a good idea, as that can hamper the
performance benefit in some cases, and I am not sure it is worth it.

--
With Regards,
Amit Kapila.
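
For readers following the thread, the delayed-launch idea quoted above can be sketched roughly as below. This is only a toy model in Python, not code from the patch: the class `StreamedXact`, its fields, and the threshold parameter are all hypothetical names, and handing the already-spooled changes to the worker at switch time is an assumption the thread does not spell out.

```python
from dataclasses import dataclass, field

# Hypothetical labels for where the leader sends streamed changes.
SPOOL_TO_FILE = "file"
SEND_TO_WORKER = "shm_mq"


@dataclass
class StreamedXact:
    """Toy model of one streamed transaction as seen by the leader."""
    threshold_bytes: int                 # hypothetical size threshold
    mode: str = SPOOL_TO_FILE
    segments_seen: int = 0
    bytes_seen: int = 0
    spooled: list = field(default_factory=list)    # stands in for the file
    forwarded: list = field(default_factory=list)  # stands in for shm_mq

    def stream_start(self) -> None:
        # Each streamed segment begins with a stream-start message.
        self.segments_seen += 1
        if self.mode == SPOOL_TO_FILE and self.segments_seen >= 2:
            self._switch_to_worker()

    def change(self, payload: bytes) -> None:
        self.bytes_seen += len(payload)
        if self.mode == SPOOL_TO_FILE and self.bytes_seen > self.threshold_bytes:
            self._switch_to_worker()
        if self.mode == SPOOL_TO_FILE:
            self.spooled.append(payload)
        else:
            self.forwarded.append(payload)

    def _switch_to_worker(self) -> None:
        # Assumption: spooled changes go to the worker first, in order,
        # so that the apply order is preserved.
        self.mode = SEND_TO_WORKER
        self.forwarded.extend(self.spooled)
        self.spooled.clear()
```

In this model a small transaction that ends within its first segment never leaves the file path, which is also where the objection in the reply applies: the spooling phase is the part that was observed to cost performance.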