Re: Mixed Locales and Upgrading

2020-06-15 Thread Don Seiler
Good morning, Back with a follow-up question to all this. I'm wondering if we shouldn't also change the locale settings for postgres/template0/template1 to match our new desires setting en_US.UTF-8 with UTF8 encoding. We haven't written anything to postgres. Some of our DB clusters have changed th

Re: Mixed Locales and Upgrading

2020-04-29 Thread Tom Lane
Adrian Klaver writes: > On 4/29/20 1:10 PM, Don Seiler wrote: >> On Tue, Apr 7, 2020 at 11:41 AM Don Seiler > > wrote: >> Since I'm not changing the postgres or template0 databases (leaving >> those as en_US/LATIN1), do I keep lc_collate/lc_ctype as en_US? It's >> just the

Re: Mixed Locales and Upgrading

2020-04-29 Thread Adrian Klaver
On 4/29/20 1:10 PM, Don Seiler wrote: On Tue, Apr 7, 2020 at 11:41 AM Don Seiler > wrote: Follow-up question, the locale setting on the host would still be set to en_US (as would the postgres and template0 databases). Should I look to change that locale on the

Re: Mixed Locales and Upgrading

2020-04-29 Thread Don Seiler
On Tue, Apr 7, 2020 at 11:41 AM Don Seiler wrote: > > Follow-up question, the locale setting on the host would still be set to > en_US (as would the postgres and template0 databases). Should I look to > change that locale on the system to en_US.UTF-8, or even just for the > postgres user that the

Re: Mixed Locales and Upgrading

2020-04-08 Thread Peter J. Holzer
On 2020-03-30 17:30:32 -0400, Tom Lane wrote: > Don Seiler writes: > > Actually, would I need to re-index on text columns that we know contain > > UUID strings? UUID characters seem to be pretty basic alphanumeric ASCII > > characters. > > I think you're all right with respect to those, since the

Re: Mixed Locales and Upgrading

2020-04-08 Thread Peter Eisentraut
On 2020-04-07 18:41, Don Seiler wrote: Follow-up question, the locale setting on the host would still be set to en_US (as would the postgres and template0 databases). Should I look to change that locale on the system to en_US.UTF-8, or even just for the postgres user that the DB cluster runs as

Re: Mixed Locales and Upgrading

2020-04-07 Thread Don Seiler
On Mon, Mar 30, 2020 at 4:39 PM Don Seiler wrote: > On Mon, Mar 30, 2020 at 4:30 PM Tom Lane wrote: > >> Don Seiler writes: >> > Actually, would I need to re-index on text columns that we know contain >> > UUID strings? UUID characters seem to be pretty basic alphanumeric ASCII >> > characters.

Re: Mixed Locales and Upgrading

2020-03-30 Thread Don Seiler
On Mon, Mar 30, 2020 at 4:30 PM Tom Lane wrote: > Don Seiler writes: > > Actually, would I need to re-index on text columns that we know contain > > UUID strings? UUID characters seem to be pretty basic alphanumeric ASCII > > characters. > > I think you're all right with respect to those, since

Re: Mixed Locales and Upgrading

2020-03-30 Thread Tom Lane
Don Seiler writes: > Actually, would I need to re-index on text columns that we know contain > UUID strings? UUID characters seem to be pretty basic alphanumeric ASCII > characters. I think you're all right with respect to those, since they're the same under any encoding. It's columns containing

Re: Mixed Locales and Upgrading

2020-03-30 Thread Don Seiler
On Sun, Mar 22, 2020 at 4:48 PM Don Seiler wrote: > > Here's the fun part. A lot of the tables use UUIDv4 strings for primary > keys. However these are stored in text/varchar columns. > Actually, would I need to re-index on text columns that we know contain UUID strings? UUID characters seem to

Re: Mixed Locales and Upgrading

2020-03-22 Thread Don Seiler
On Tue, Mar 17, 2020 at 9:25 PM Michael Paquier wrote: > > There is no way to know how much indexes would get broken without > having a look at it. Anything ASCII-based should be of no problem. > If you have a doubt, reindexing evey index which includes text column > data is the best course of a

Re: Mixed Locales and Upgrading

2020-03-17 Thread Michael Paquier
On Tue, Mar 17, 2020 at 10:45:50AM -0400, Tom Lane wrote: > Don Seiler writes: >> What are the ramifications of changing collation like that? Should we >> consider rebuilding indexes ASAP after that? > > Text indexes would definitely be at risk here. I'm not really certain > how bad the problem

Re: Mixed Locales and Upgrading

2020-03-17 Thread Tom Lane
Don Seiler writes: > On Tue, Mar 17, 2020 at 8:56 AM Tom Lane wrote: >> Yikes. Well, if there aren't obvious operational problems, it might be >> that the data is actually UTF8-clean, or almost entirely so. Maybe you >> could look at the problem as being one of validation. > For this test, wou

Re: Mixed Locales and Upgrading

2020-03-17 Thread Don Seiler
On Tue, Mar 17, 2020 at 8:56 AM Tom Lane wrote: > > Yikes. Well, if there aren't obvious operational problems, it might be > that the data is actually UTF8-clean, or almost entirely so. Maybe you > could look at the problem as being one of validation. In that case, > it'd be possible to consid

Re: Mixed Locales and Upgrading

2020-03-17 Thread Tom Lane
Don Seiler writes: > On Mon, Mar 16, 2020 at 10:28 AM Tom Lane wrote: >> I don't think you should use pg_upgrade here at all. A dump/restore >> is really the only way to make sure that you have validly encoded data. > That is what I thought, and probably not what they'll want to hear given > th

Re: Mixed Locales and Upgrading

2020-03-17 Thread Don Seiler
On Tue, Mar 17, 2020 at 8:06 AM Don Seiler wrote: > On Mon, Mar 16, 2020 at 10:28 AM Tom Lane wrote: > >> > Well, in principle you could likewise manually update pg_database's >> datcollate and datctype columns to say "en_US.utf8". However, there's >> a much bigger problem here --- what steps i

Re: Mixed Locales and Upgrading

2020-03-17 Thread Don Seiler
On Mon, Mar 16, 2020 at 10:28 AM Tom Lane wrote: > > Egad. > My thoughts exactly. > Well, in principle you could likewise manually update pg_database's > datcollate and datctype columns to say "en_US.utf8". However, there's > a much bigger problem here --- what steps if any did this cowboy ta

Re: Mixed Locales and Upgrading

2020-03-16 Thread Tom Lane
Don Seiler writes: > However this database has encoding UTF8 while still having ctype and > collation of en_US. I've since found that when this was last upgraded, they > ran "update pg_database set encoding = pg_char_to_encoding('UTF8') where > datname = 'test';" to change the encoding. Egad. >

Mixed Locales and Upgrading

2020-03-16 Thread Don Seiler
Good morning, I have a few clusters that need to be upgraded from PG 9.6 or 10 to 12. Normally I'd just plan to pg_upgrade the lot and be good to go. However I've found that quite a few (including our biggest/busiest database) have mixed locales. In the case of the biggest/busiest database, the cl