On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut <peter.eisentr...@enterprisedb.com> wrote:
>
> On 27.05.21 12:04, Amit Kapila wrote:
> >>> Also, I am thinking that instead of a stat view, do we need
> >>> to consider having a system table (pg_replication_conflicts or
> >>> something like that) for this because what if stats information is
> >>> lost (say either due to crash or due to udp packet loss), can we rely
> >>> on stats view for this?
> >> Yeah, it seems better to use a catalog.
> >>
> > Okay.
>
> Could you store it shared memory? You don't need it to be crash safe,
> since the subscription will just run into the same error again after
> restart. You just don't want it to be lost, like with the statistics
> collector.
>
But won't that be costly in cases where we have errors in the processing
of very large transactions? The subscription has to process all the data
before it gets an error. I think we can even imagine this feature being
extended to use commitLSN as a skip candidate, in which case we can even
avoid getting the data of that transaction from the publisher. So if this
information is persistent, the user can even set the skip identifier
after the restart, before the publisher can send all the data.

Also, I think we can't assume that after the restart we will get the same
error, because the user can perform some operations after the restart and
before we try to apply the same transaction. It might be that the user
wants to see all the errors before setting the skip identifier (and/or
method).

I think the XID (or another identifier like commitLSN) that the user
specifies for skipping the transaction has to be stored in the catalog,
because otherwise, after the restart, we won't remember it and the user
won't know that they need to set it again.

Now, say we have multiple skip identifiers (XIDs, commitLSN, ..): isn't
it better to store all conflict-related information in a separate catalog
like pg_subscription_conflict or something like that? I think it might
also be better to later extend it for auto conflict resolution, where the
user can specify auto conflict resolution info for a subscription. Is it
better to store all such information in pg_subscription or to have a
separate catalog? It is possible that even if we have a separate catalog
for conflict info, we might not want to store error info there.

--
With Regards,
Amit Kapila.