On Fri, Feb 17, 2023 at 10:44:49PM +0100, Daniel Gustafsson wrote: > When adding a check to pg_upgrade a while back I noticed in a profile that the > cluster compatibility check phase spend a lot of time in connectToServer. > Some > of this can be attributed to data type checks which each run serially in turn > connecting to each database to run the check, and this seemed like a place > where we can do better. > > The attached patch moves the checks from individual functions, which each > loops > over all databases, into a struct which is consumed by a single umbrella check > where all data type queries are executed against a database using the same > connection. This way we can amortize the connectToServer overhead across more > accesses to the database.
This change consolidates all the data type checks, so instead of 7 separate loops through all the databases, there is just one. However, I wonder if we are leaving too much on the table, as there are a number of other functions that also loop over all the databases: * get_loadable_libraries * get_db_and_rel_infos * report_extension_updates * old_9_6_invalidate_hash_indexes * check_for_isn_and_int8_passing_mismatch * check_for_user_defined_postfix_ops * check_for_incompatible_polymorphics * check_for_tables_with_oids * check_for_user_defined_encoding_conversions I suspect consolidating get_loadable_libraries, get_db_and_rel_infos, and report_extension_updates would be prohibitively complicated and not worth the effort. old_9_6_invalidate_hash_indexes is only needed for unsupported versions, so that might not be worth consolidating. check_for_isn_and_int8_passing_mismatch only loops through all databases when float8_pass_by_value in the control data differs, so that might not be worth it, either. The last 4 are for supported versions and, from a very quick glance, seem possible to consolidate. That would bring us to a total of 11 separate loops that we could consolidate into one. However, the data type checks seem to follow a nice pattern, so perhaps this is easier said than done. IIUC with the patch, pg_upgrade will immediately fail as soon as a single check in a database fails. I believe this differs from the current behavior where all matches for a given check in the cluster are logged before failing. I wonder if it'd be better to perform all of the data type checks in all databases before failing so that all of the violations are reported. Else, users would have to run pg_upgrade, fix a violation, run pg_upgrade again, fix another one, etc. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com