On Fri, Mar 10, 2023 at 7:58 PM Önder Kalacı <onderkal...@gmail.com> wrote: > > > I think one option could be to drop some cases altogether, but not sure we'd > want that. > > As a semi-related question: Are you aware of any setting that'd make > pg_stat_all_indexes > reflect the changes sooner? It is hard to debug what is the bottleneck in the > tests, but > I have a suspicion that there might be several poll_query_until() calls on > pg_stat_all_indexes, which might be the reason? >
Yeah, I also think poll_query_until() calls on pg_stat_all_indexes is the main reason for these tests taking more time. When I commented those polls, it drastically reduces the test time. On looking at pgstat_report_stat(), it seems we don't report stats sooner than 1s and as most of this patch's test relies on stats, it leads to taking more time. I don't have a better idea to verify this patch without checking whether the index scan is really used by referring to pg_stat_all_indexes. I think trying to reduce the poll calls may help in reducing the test timings further. Some ideas on those lines are as follows: 1. +# Testcase start: SUBSCRIPTION USES INDEX WITH PUB/SUB DIFFERENT DATA VIA +# A UNIQUE INDEX THAT IS NOT PRIMARY KEY OR REPLICA IDENTITY No need to use Delete test separate for this. 2. +$node_publisher->safe_psql('postgres', + "UPDATE test_replica_id_full SET x = x + 1 WHERE x = 1;"); +$node_publisher->safe_psql('postgres', + "UPDATE test_replica_id_full SET x = x + 1 WHERE x = 3;"); + +# check if the index is used even when the index has NULL values +$node_subscriber->poll_query_until( + 'postgres', q{select idx_scan=2 from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';} +) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates test_replica_id_full table"; Here, I think only one update is sufficient. 3. +$node_subscriber->safe_psql('postgres', + "CREATE INDEX people_last_names ON people(lastname)"); + +# wait until the index is created +$node_subscriber->poll_query_until( + 'postgres', q{select count(*)=1 from pg_stat_all_indexes where indexrelname = 'people_last_names';} +) or die "Timed out while waiting for creating index people_last_names"; I don't think we need this poll. 4. +# update 2 rows +$node_publisher->safe_psql('postgres', + "UPDATE people SET firstname = 'no-name' WHERE firstname = 'first_name_1';"); +$node_publisher->safe_psql('postgres', + "UPDATE people SET firstname = 'no-name' WHERE firstname = 'first_name_3' AND lastname = 'last_name_3';"); + +# wait until the index is used on the subscriber +$node_subscriber->poll_query_until( + 'postgres', q{select idx_scan=2 from pg_stat_all_indexes where indexrelname = 'people_names';} +) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates two rows via index scan with index on expressions and columns"; + +$node_publisher->safe_psql('postgres', + "DELETE FROM people WHERE firstname = 'no-name';"); + +# wait until the index is used on the subscriber +$node_subscriber->poll_query_until( + 'postgres', q{select idx_scan=4 from pg_stat_all_indexes where indexrelname = 'people_names';} +) or die "Timed out while waiting for check subscriber tap_sub_rep_full deletes two rows via index scan with index on expressions and columns"; I think having one update or delete should be sufficient. 5. +# update rows, moving them to other partitions +$node_publisher->safe_psql('postgres', + "UPDATE users_table_part SET value_1 = 0 WHERE user_id = 4;"); + +# wait until the index is used on the subscriber +$node_subscriber->poll_query_until( + 'postgres', q{select sum(idx_scan)=1 from pg_stat_all_indexes where indexrelname ilike 'users_table_part_%';} +) or die "Timed out while waiting for updates on partitioned table with index"; + +# delete rows from different partitions +$node_publisher->safe_psql('postgres', + "DELETE FROM users_table_part WHERE user_id = 1 and value_1 = 1;"); +$node_publisher->safe_psql('postgres', + "DELETE FROM users_table_part WHERE user_id = 12 and value_1 = 12;"); + +# wait until the index is used on the subscriber +$node_subscriber->poll_query_until( + 'postgres', q{select sum(idx_scan)=3 from pg_stat_all_indexes where indexrelname ilike 'users_table_part_%';} +) or die "Timed out while waiting for check subscriber tap_sub_rep_full updates partitioned table"; + Can we combine these two polls? 6. +# Testcase start: SUBSCRIPTION USES INDEX WITH MULTIPLE ROWS AND COLUMNS, ALSO +# DROPS COLUMN In this test, let's try to update/delete 2-3 rows instead of 20. And after drop columns, let's keep just one of the update or delete. 7. Apart from the above, I think it is better to use wait_for_catchup() consistently before trying to verify the data on the subscriber. We always use it in other tests. I guess here you are relying on the poll for index scans to ensure that data is replicated but I feel it may still be better to use wait_for_catchup(). Similarly, wait_for_subscription_sync uses the publisher name and appname in other tests, so it is better to be consistent. It can avoid random failures by ensuring data is synced. -- With Regards, Amit Kapila.