On Thu, Aug 22, 2024 at 3:44 PM shveta malik <shveta.ma...@gmail.com> wrote:
>
> For clock-skew and timestamp based resolution, if needed, I will post
> another email for the design items where suggestions are needed.
>

Please find below the issues that need some thought and approval for
time-based resolution and clock-skew.

1) Time based conflict resolution and two phase transactions:

Time based conflict resolution (last_update_wins) is the one resolution
which will not result in data-divergence, provided clock-skew is taken
care of. But when it comes to two-phase transactions, that might not be
the case. For a two-phase transaction, we do not have the commit
timestamp when the changes are being applied. Thus for time-based
comparison, it was initially decided to use the prepare timestamp, but
that may result in data-divergence. Please see the example at [1].

The example at [1] is a tricky situation, and thus in the initial draft
we decided to restrict usage of 2pc and CDR together. The plan is:

a) During Create subscription, if the user has given the
last_update_wins resolver for any conflict_type and 'two_phase' is also
enabled, we ERROR out.
b) During Alter subscription, if the user tries to update the resolver
to 'last_update_wins' but 'two_phase' is enabled, we ERROR out.

Another solution could be to save both prepare_ts and commit_ts. When
any txn comes up for conflict resolution, we first check whether
prepare_ts is available and use that; otherwise we use commit_ts.
Availability of prepare_ts would indicate that it was a prepared txn,
and thus even if it is committed, we should use prepare_ts for
comparison for consistency. This will have some overhead of storing
prepare_ts along with commit_ts. But if the number of prepared txns is
reasonably small, this overhead should be small.

We currently plan to go with restricting 2pc and last_update_wins
together, unless others have different opinions.

~~

2) parallel apply worker and conflict-resolution:

As discussed in [2] (see the last paragraph in [2]), for streaming of
in-progress transactions by a parallel worker, we do not have the
commit-timestamp with each change, and thus it makes sense to disable
the parallel apply worker with CDR.
The plan is to not start the parallel apply worker if 'last_update_wins'
is configured for any conflict_type.

~~

3) parallel apply worker and clock skew management:

Regarding clock-skew management as discussed in [3], we will wait for
the local clock to come within the tolerable range during 'begin' rather
than before 'commit'. This wait needs the commit-timestamp at the
beginning, thus we plan to also restrict starting the pa-worker when
clock-skew related GUCs are configured.

Earlier we had restricted both 2pc and parallel worker start when
detect_conflict was enabled, but now that the detect_conflict parameter
is removed, we will change the implementation to restrict all 3 cases
above when last_update_wins is configured. When the changes are done,
we will post the patch.

~~

4) <not related to timestamp and clock skew>

Earlier, when 'detect_conflict' was enabled, we were giving a WARNING if
'track_commit_timestamp' was not enabled. This was during CREATE and
ALTER subscription. Now with this parameter removed, this WARNING has
also been removed. But I think we need to bring back this WARNING.
Currently the default resolver set may work without
'track_commit_timestamp', but when the user gives CONFLICT RESOLVER
explicitly in create-sub or alter-sub (with any values; it does not
matter if a few of them are defaults), we may still emit this warning
to alert the user:

2024-07-26 09:14:03.152 IST [195415] WARNING:  conflict detection could be incomplete due to disabled track_commit_timestamp
2024-07-26 09:14:03.152 IST [195415] DETAIL:  Conflicts update_differ and delete_differ cannot be detected, and the origin and commit timestamp for the local row will not be logged.

Thoughts?

If we emit this WARNING during each resolution, it may flood our log
files, thus it seems better to emit it during create or alter
subscription instead of during resolution.

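To make the restrictions in points 1) to 4) easier to review side by side, here is a minimal sketch in Python. It is not patch code: the class, field, and function names are hypothetical, the ERROR text is invented for illustration, and only the WARNING text is taken from the log output quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriptionOptions:
    # Hypothetical stand-in for the subscription settings discussed above.
    two_phase: bool = False
    resolvers: dict = field(default_factory=dict)  # conflict_type -> resolver
    resolvers_given: bool = False       # CONFLICT RESOLVER specified explicitly
    clock_skew_gucs_set: bool = False   # any clock-skew related GUC configured
    track_commit_timestamp: bool = False


def uses_last_update_wins(opts):
    """True if last_update_wins is configured for any conflict_type."""
    return "last_update_wins" in opts.resolvers.values()


def validate_subscription(opts):
    """Checks meant for CREATE/ALTER SUBSCRIPTION time, not for each resolution.

    Raises ValueError for the ERROR case in point 1) and returns a list of
    warning strings for point 4).
    """
    if opts.two_phase and uses_last_update_wins(opts):
        # Message text is illustrative only.
        raise ValueError("last_update_wins cannot be combined with two_phase")
    warnings = []
    if opts.resolvers_given and not opts.track_commit_timestamp:
        warnings.append(
            "conflict detection could be incomplete due to disabled "
            "track_commit_timestamp"
        )
    return warnings


def may_start_parallel_apply(opts):
    """Points 2) and 3): no pa-worker with last_update_wins or clock-skew GUCs."""
    return not (uses_last_update_wins(opts) or opts.clock_skew_gucs_set)
```

The sketch keeps the warning at create/alter time, matching the concern that emitting it during each resolution may flood the log files.
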
~~

[1]: Example of 2pc inconsistency:
---------------------------------------------------------
Two nodes, A and B, are subscribed to each other and have identical
data. The last_update_wins strategy is configured.

Both contain the data: '1, x, node'.

Timeline of Events:

9:00 AM on Node A: A transaction (txn1) is prepared to update the row to
'1, x, nodeAAA'. We'll refer to this as change1 on Node A.

9:01 AM on Node B: An update occurs for the row, changing it to
'1, x, nodeBBB'. This update is then sent to Node A. We'll call this
change2 on Node B.

At 9:02 AM:
--Node A: Still holds '1, x, node' because txn1 is not yet committed.
--Node B: Holds '1, x, nodeBBB'.
--Node B receives the prepared transaction from Node A at 9:02 AM and
raises an update_differ conflict.
--Since the local change occurred at 9:01 AM, which is later than the
9:00 AM prepare-timestamp from Node A, Node B retains its local change.

At 9:05 AM:
--Node A commits the prepared txn1.
--The apply worker on Node A has been waiting to apply the changes from
Node B because the tuple was locked by txn1.
--Once the commit occurs, the apply worker proceeds with the update from
Node B.
--When update_differ is triggered, since the 9:05 AM commit-timestamp
from Node A is later than the 9:01 AM commit-timestamp from Node B,
Node A’s update wins.

Final Data on Nodes:
Node A: '1, x, nodeAAA'
Node B: '1, x, nodeBBB'

Despite the last_update_wins resolution, the nodes end up with different
data.

The data divergence happened because on node B, we used change1's
prepare_ts (9:00) for comparison; while on node A, we used change1's
commit_ts (9:05) for comparison.
---------------------------------------------------------

[2]: https://www.postgresql.org/message-id/CAFiTN-sf23K%3DsRsnxw-BKNJqg5P6JXcqXBBkx%3DEULX8QGSQYaw%40mail.gmail.com

[3]: https://www.postgresql.org/message-id/CAA4eK1%2BhdMmwEEiMb4z6x7JgQbw1jU2XyP1U7dNObyUe4JQQWg%40mail.gmail.com

thanks
Shveta
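
For anyone who wants to replay the timeline in [1], the following is a rough Python sketch of the two comparisons. All names here (ts, remote_wins, replay) are hypothetical and not taken from the patch, and tie handling for equal timestamps is left out. It compares the behaviour in the example (node A uses commit_ts for its local row) with the alternative from point 1), where prepare_ts is used on both nodes whenever it is available.

```python
from datetime import datetime


def ts(hhmm):
    """Parse an 'HH:MM' string into a comparable timestamp."""
    return datetime.strptime(hhmm, "%H:%M")


def remote_wins(local_ts, remote_ts):
    """last_update_wins: the remote change is applied only if it is newer."""
    return remote_ts > local_ts


def replay(use_prepare_ts_on_a):
    """Replay the timeline from [1]; return (value on node A, value on node B)."""
    prepare_ts, commit_ts = ts("09:00"), ts("09:05")  # change1 (txn1 on node A)
    change2_ts = ts("09:01")                          # change2 (update on node B)

    # 9:02 on node B: the local row carries change2, and the prepared txn1
    # arrives from node A and is compared using its prepare timestamp.
    node_b = "nodeAAA" if remote_wins(change2_ts, prepare_ts) else "nodeBBB"

    # 9:05 on node A: txn1 has committed, so the local row carries change1.
    # change2 from node B is compared against the timestamp kept for change1.
    local_ts_a = prepare_ts if use_prepare_ts_on_a else commit_ts
    node_a = "nodeBBB" if remote_wins(local_ts_a, change2_ts) else "nodeAAA"
    return node_a, node_b


if __name__ == "__main__":
    print(replay(use_prepare_ts_on_a=False))  # node A compares with commit_ts
    print(replay(use_prepare_ts_on_a=True))   # node A compares with prepare_ts
```

Under these assumptions the first call reproduces the divergence (nodeAAA on A, nodeBBB on B), while the second ends with nodeBBB on both nodes, because both sides compare change1 using the same 9:00 timestamp.
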