On Mon, Jul 24, 2023 at 4:22 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > > On 7/24/23 12:40, Amit Kapila wrote: > > On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat > > <ashutosh.bapat....@gmail.com> wrote: > >> > >> 0005, 0006 and 0007 are all related to the initial sequence sync. [3] > >> resulted in 0007 and I think we need it. That leaves 0005 and 0006 to > >> be reviewed in this response. > >> > >> I followed the discussion starting [1] till [2]. The second one > >> mentions the interlock mechanism which has been implemented in 0005 > >> and 0006. While I don't have an objection to allowing LOCKing a > >> sequence using the LOCK command, I am not sure whether it will > >> actually work or is even needed. > >> > >> The problem described in [1] seems to be the same as the problem > >> described in [2]. In both cases we see the sequence moving backwards > >> during CATCHUP. At the end of catchup the sequence is in the right > >> state in both the cases. > >> > > > > I think we could see backward sequence value even after the catchup > > phase (after the sync worker is exited and or the state of rel is > > marked as 'ready' in pg_subscription_rel). The point is that there is > > no guarantee that we will process all the pending WAL before > > considering the sequence state is 'SYNCDONE' and or 'READY'. For > > example, after copy_sequence, I see values like: > > > > postgres=# select * from s; > > last_value | log_cnt | is_called > > ------------+---------+----------- > > 165 | 0 | t > > (1 row) > > postgres=# select nextval('s'); > > nextval > > --------- > > 166 > > (1 row) > > postgres=# select nextval('s'); > > nextval > > --------- > > 167 > > (1 row) > > postgres=# select currval('s'); > > currval > > --------- > > 167 > > (1 row) > > > > Then during the catchup phase: > > postgres=# select * from s; > > last_value | log_cnt | is_called > > ------------+---------+----------- > > 33 | 0 | t > > (1 row) > > postgres=# select * from s; > > last_value | log_cnt | is_called > > ------------+---------+----------- > > 66 | 0 | t > > (1 row) > > > > postgres=# select * from pg_subscription_rel; > > srsubid | srrelid | srsubstate | srsublsn > > ---------+---------+------------+----------- > > 16394 | 16390 | r | 0/16374E8 > > 16394 | 16393 | s | 0/1637700 > > (2 rows) > > > > postgres=# select * from pg_subscription_rel; > > srsubid | srrelid | srsubstate | srsublsn > > ---------+---------+------------+----------- > > 16394 | 16390 | r | 0/16374E8 > > 16394 | 16393 | r | 0/1637700 > > (2 rows) > > > > Here Sequence relid id 16393. You can see sequence state is marked as ready. > > > > postgres=# select * from s; > > last_value | log_cnt | is_called > > ------------+---------+----------- > > 66 | 0 | t > > (1 row) > > > > Even after that, see below the value of the sequence is still not > > caught up. Later, when the apply worker processes all the WAL, the > > sequence state will be caught up. > > > > postgres=# select * from s; > > last_value | log_cnt | is_called > > ------------+---------+----------- > > 165 | 0 | t > > (1 row) > > > > So, there will be a window where the sequence won't be caught up for a > > certain period of time and any usage of it (even after the sync is > > finished) during that time could result in inconsistent behaviour. > > > > I'm rather confused about which node these queries are executed on. > Presumably some of it is on publisher, some on subscriber? >
These are all on the subscriber. > Can you create a reproducer (TAP test demonstrating this?) I guess it > might require adding some sleeps to hit the right timing ... > I have used the debugger to reproduce this as it needs quite some coordination. I just wanted to see if the sequence can go backward and didn't catch up completely before the sequence state is marked 'ready'. On the publisher side, I created a publication with a table and a sequence. Then did the following steps: SELECT nextval('s') FROM generate_series(1,50); insert into t1 values(1); SELECT nextval('s') FROM generate_series(51,150); Then on the subscriber side with some debugging aid, I could find the values in the sequence shown in the previous email. Sorry, I haven't recorded each and every step but, if you think it helps, I can again try to reproduce it and share the steps. -- With Regards, Amit Kapila.