On Mon, Aug 22, 2022 at 3:55 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > To solve that problem, how about rewriting the system table in the new > cluster which has a conflicting relfilenode? I think we can probably > do this conflict checking before processing the tables from the old > cluster.
Thanks for chiming in. Right now, there are two parts to the relfilenumber preservation system, and this scheme doesn't quite fit into either of them. First, the dump includes commands to set pg_class.relfilenode in the new cluster to the same value that it had in the old cluster. The dump can't include any SQL commands that depend on what's happening in the new cluster because pg_dump(all) only connects to a single cluster, which in this case is the old cluster. Second, pg_upgrade itself copies the files from the old cluster to the new cluster. This doesn't involve a database connection at all. So there's no part of the current relfilenode preservation mechanism that can look at the old AND the new database and decide on some SQL to execute against the new database. I thought for a while that we could use the information that's already gathered by get_rel_infos() to do what you're suggesting here, but it doesn't quite work, because that function excludes system tables, and we can't afford to do that here. We'd either need to modify that query to include system tables - at least for the new cluster - or run a separate one to gather information about system tables in the new cluster. Then, we could put all the pg_class.relfilenode values we found in the new cluster into a hash table, loop over the list of rels this function found in the old cluster, and for each one, probe into the hash table. If we find a match, that's a system table that needs to be moved out of the way before calling create_new_objects(), or maybe inside that function but before it runs pg_restore. That doesn't seem too crazy, I think. It's a little bit of new mechanism, but it doesn't sound horrific. It's got the advantage of being significantly cheaper than my proposal of moving everything out of the way unconditionally, and at the same time it retains one of the key advantages of that proposal - IMV, anyway - which is that we don't need separate relfilenode ranges for user and system objects any more. So I guess on balance I kind of like it, but maybe I'm missing something. -- Robert Haas EDB: http://www.enterprisedb.com