On Tue, Mar 17, 2020 at 8:56 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > Yikes. Well, if there aren't obvious operational problems, it might be > that the data is actually UTF8-clean, or almost entirely so. Maybe you > could look at the problem as being one of validation. In that case, > it'd be possible to consider not taking the production DB down, but just > doing a pg_dump from it and seeing if you can restore somewhere else. > If not, fix the broken data; repeat till clean. After that you could > do pg_upgrade with a clear conscience. I think you'll still end up > manually fixing the inconsistent datcollate/datctype settings though. >
For this test, would we restore into an en_US.UTF-8/UTF8 database? Then, assuming no errors (or fixing any errors until clean), we change the datcollate/datctype settings in prod and proceed with pg_upgrade (obviously after testing all of that heavily)? What are the ramifications of changing collation like that? Should we consider rebuilding indexes ASAP after that? Don. -- Don Seiler www.seiler.us