On Wed, Jul 3, 2024 at 3:35 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Wed, Jul 3, 2024 at 2:16 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Wed, Jul 3, 2024 at 12:30 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Wed, Jul 3, 2024 at 11:29 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > > But waiting after applying the operations and before applying the
> > > commit would mean that we need to wait with the locks held. That could
> > > be a recipe for deadlocks in the system. I see your point related to
> > > performance but as we are not expecting clock skew in normal cases, we
> > > shouldn't be too much bothered on the performance due to this. If
> > > there is clock skew, we expect users to fix it, this is just a
> > > worst-case aid for users.
> >
> > But if we make it wait at the very first operation that means we will
> > not suck more decoded data from the network and wouldn't that make the
> > sender wait for the network buffer to get sucked in by the receiver?
> >
>
> That would be true even if we wait just before applying the commit
> record considering the transaction is small and the wait time is
> large.
>
> > Also, we already have a handling of parallel apply workers so if we do
> > not have an issue of deadlock there or if we can handle those issues
> > there we can do it here as well no?
> >
>
> Parallel apply workers won't wait for a long time. There is some
> similarity and in both cases, deadlock will be detected but chances of
> such implementation-related deadlocks will be higher if we start
> waiting for a random amount of times. The other possibility is that we
> can keep a cap on the max clock skew time above which we will give
> ERROR even if the user has configured wait.

+1. But I think the cap has to be on the wait time. As an example, let's say
the user has configured a 'clock skew tolerance' of 10 sec while the actual
clock skew between nodes is 5 min. It means we will mostly have to wait
'5 min - 10 sec' to bring the clock skew within the tolerable limit, which is
a huge waiting time. We can keep a max limit on this wait time.

thanks
Shveta
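
To make the arithmetic in the example concrete, here is a rough sketch in Python. It is purely illustrative and not taken from any patch: the names required_wait, tolerance, max_wait and SkewWaitTooLong are hypothetical, and raising an error once the cap is exceeded is an assumption about how such a limit might behave.

```python
from datetime import timedelta


class SkewWaitTooLong(Exception):
    """Hypothetical error raised when the needed wait exceeds the cap."""


def required_wait(actual_skew: timedelta,
                  tolerance: timedelta,
                  max_wait: timedelta) -> timedelta:
    """Return how long we would wait to bring the skew within tolerance.

    All parameter names are illustrative assumptions, not real settings.
    """
    # No wait is needed when the skew is already within the tolerated limit.
    if actual_skew <= tolerance:
        return timedelta(0)

    wait = actual_skew - tolerance
    # The cap applies to the wait time itself, not to the clock skew.
    if wait > max_wait:
        raise SkewWaitTooLong(f"wait of {wait} exceeds cap of {max_wait}")
    return wait


if __name__ == "__main__":
    # Numbers from the example: tolerance 10 sec, actual skew 5 min.
    wait = required_wait(timedelta(minutes=5), timedelta(seconds=10),
                         max_wait=timedelta(minutes=10))
    print(wait.total_seconds())  # 290.0
```

With a cap of, say, 1 min instead of 10 min, the same inputs would raise the error rather than wait for 290 seconds.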