On Fri, Mar 10, 2023 at 3:25 PM Önder Kalacı <onderkal...@gmail.com> wrote: > >> > > Done with an important caveat. I think this reorganization of the test helped > us to find one edge case regarding dropped columns. > > I realized that the dropped columns also get into the tuples_equal() > function. And, > the remote sends NULL to for the dropped columns(i.e., remoteslot), but > index_getnext_slot() (or table_scan_getnextslot) indeed fills the dropped > columns on the outslot. So, the dropped columns are not NULL in the outslot. > > This triggers tuples_equal to fail. To fix that, I improved the tuples_equal > such that it skips the dropped columns. >
Good catch. By any chance, have you tried with generated columns? See logicalrep_write_tuple()/logicalrep_write_attrs() where we neither send anything for dropped columns nor for generated columns. Similarly, on receiving side, in logicalrep_rel_open() and slot_store_data(), we seem to be using NULL for such columns. > I also spend quite a bit of time understanding how/why this impacts > HEAD. See steps below on HEAD, where REPLICA IDENTITY FULL > fails to replica the data properly: > > > -- pub > CREATE TABLE test (drop_1 jsonb, x int, drop_2 numeric, y text, drop_3 > timestamptz); > ALTER TABLE test REPLICA IDENTITY FULL; > INSERT INTO test SELECT NULL, i, i, (i)::text, now() FROM > generate_series(0,1)i; > CREATE PUBLICATION pub FOR ALL TABLES; > > -- sub > CREATE TABLE test (drop_1 jsonb, x int, drop_2 numeric, y text, drop_3 > timestamptz); > CREATE SUBSCRIPTION sub CONNECTION 'host=localhost port=5432 dbname=postgres' > PUBLICATION pub; > > -- show that before dropping the columns, the data in the source and > -- target are deleted properly > DELETE FROM test WHERE x = 0; > > -- both on the source and target > SELECT count(*) FROM test WHERE x = 0; > ┌───────┐ > │ count │ > ├───────┤ > │ 0 │ > └───────┘ > (1 row) > > -- drop columns on both the the source > ALTER TABLE test DROP COLUMN drop_1; > ALTER TABLE test DROP COLUMN drop_2; > ALTER TABLE test DROP COLUMN drop_3; > > -- drop columns on both the the target > ALTER TABLE test DROP COLUMN drop_1; > ALTER TABLE test DROP COLUMN drop_2; > ALTER TABLE test DROP COLUMN drop_3; > > -- on the target > ALTER SUBSCRIPTION sub REFRESH PUBLICATION; > > > -- after dropping the columns > DELETE FROM test WHERE x = 1; > > -- source > SELECT count(*) FROM test WHERE x = 1; > ┌───────┐ > │ count │ > ├───────┤ > │ 0 │ > └───────┘ > (1 row) > > -- target, OOPS wrong result!!!! > SELECT count(*) FROM test WHERE x = 1; > > ┌───────┐ > │ count │ > ├───────┤ > │ 1 │ > └───────┘ > (1 row) > > > Should we have a separate patch for the tuples_equal change so that we > might want to backport? > Yes, it would be better to report and discuss this in a separate thread, > Attached as v40_0001 on the patch. > > Note that I need to have that commit as 0001 so that 0002 patch > passes the tests. > I think we can add such a test (which relies on existing buggy behavior) later after fixing the existing bug. For now, it would be better to remove that test and add it after we fix dropped columns issue in HEAD. -- With Regards, Amit Kapila.