Re: Encoding/collation question

2019-12-18 Thread Karsten Hilbert
On Thu, Dec 12, 2019 at 08:35:53AM -0500, Tom Lane wrote:
> C collation basically devolves to strcmp/memcmp, which are as standard
> and well-defined as can be. If you're happy with the way it sorts
> things then there's no reason not to use it.
So that's the collation to use when "technical" so

Re: Encoding/collation question

2019-12-12 Thread Rich Shepard
On Thu, 12 Dec 2019, Andrew Gierth wrote:
Note that it's perfectly fine to use UTF8 encoding and C collation (this has the effect of sorting strings in Unicode codepoint order); this is as fast for comparisons as LATIN1/C is.
Andrew,
This is really useful insight. I've not thought of the rela
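
Andrew Gierth's point quoted above, that UTF8 encoding with C collation sorts strings in Unicode codepoint order, can be illustrated outside the database. What follows is a minimal Python sketch, not PostgreSQL code: it assumes that a C-collation comparison behaves like a byte-wise comparison of the UTF-8 encoded strings, and the helper name c_collation_key and the sample words are made up for illustration.

```python
# Sketch: byte-wise ordering of UTF-8 text (what a memcmp-style
# comparison sees) versus ordering by Unicode code point.

def c_collation_key(text: str) -> bytes:
    """Hypothetical helper: the bytes a byte-wise comparison would see."""
    return text.encode("utf-8")

# Illustrative sample data, not taken from the thread.
words = ["banana", "Apple", "élan", "Zebra", "cherry", "日本", "😀"]

# Sort by raw UTF-8 bytes; Python compares bytes objects lexicographically.
by_bytes = sorted(words, key=c_collation_key)

# Sort by the sequence of code points of each string.
by_codepoint = sorted(words, key=lambda s: [ord(ch) for ch in s])

print(by_bytes)
# ['Apple', 'Zebra', 'banana', 'cherry', 'élan', '日本', '😀']
print(by_bytes == by_codepoint)
# True
```

Note that uppercase ASCII sorts before lowercase, and non-ASCII characters sort after all ASCII ones. That is the ordering one accepts in exchange for the cheap comparisons discussed in the thread.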

Re: Encoding/collation question

2019-12-12 Thread Tom Lane
Karsten Hilbert writes:
> Question: is C collation expected to be future-proof /
> rock-solid /stable -- like UTF8 for encoding choice -- or
> could it end up like the SQL-ASCII encoding did: Yeah, we
> support it, it's been in use a long time, it should work,
> but, nah, one doesn't really want t

Re: Encoding/collation question

2019-12-12 Thread Karsten Hilbert
On Thu, Dec 12, 2019 at 05:03:59AM +, Andrew Gierth wrote:
> Rich> I doubt that my use will notice meaningful differences. Since
> Rich> there are only two or three databases in UTF8 and its collation
> Rich> perhaps I'll convert those to LATIN1 and C.
>
> Note that it's perfectly fine to u

Re: Encoding/collation question

2019-12-11 Thread Andrew Gierth
> "Rich" == Rich Shepard writes:
Rich> I doubt that my use will notice meaningful differences. Since
Rich> there are only two or three databases in UTF8 and its collation
Rich> perhaps I'll convert those to LATIN1 and C.
Note that it's perfectly fine to use UTF8 encoding and C collation (

Re: Encoding/collation question

2019-12-11 Thread Rich Shepard
On Wed, 11 Dec 2019, Tom Lane wrote:
String comparisons in non-C collations tend to be a lot slower than they are in C collation. Whether this makes a noticeable difference to you depends on your workload, but certainly we've seen performance gripes that trace to that.
Tom,
How interesting.

Re: Encoding/collation question

2019-12-11 Thread Tom Lane
Rich Shepard writes:
> My older databases have LATIN1 encoding and C collation; the newer ones have
> UTF8 encoding and en_US.UTF-8 collation. A web search taught me that I can
> change each old database by dumping it and restoring it with the desired
> encoding and collation types. My question is