Re: [GENERAL] Unicode + LC_COLLATE

2004-04-23 Thread Peter Eisentraut
On Thursday, 22 April 2004 at 16:37, Priem, Alexander wrote: > But if you use anything other than C, you can't use indexes in > LIKE clauses, right? No, see . > Would lc-collate=C be bad in combination with UNICODE encoding? What >

Re: [GENERAL] Unicode + LC_COLLATE

2004-04-22 Thread John Sidney-Woollett
Priem, Alexander said: > Would lc-collate=C be bad in combination with UNICODE encoding? What > lc-collate setting would you recommend for UNICODE encoding which will > provide good sorting for all (most) common languages? (Dutch, English, > French, German) It seems that LC_COLLATE=C is not a good

Re: [GENERAL] Unicode + LC_COLLATE

2004-04-22 Thread John Sidney-Woollett
Peter Eisentraut said: > On Thursday, 22 April 2004 at 13:17, John Sidney-Woollett wrote: > You get your strings sorted in binary order of the UTF-8 encoding, which > is probably not very interesting, but it's possible. Agreed. >> Is it also true that if LC_COLLATE != 'C' that indexes cannot be

Re: [GENERAL] Unicode + LC_COLLATE

2004-04-22 Thread Karsten Hilbert
John, > I guess if I have some time I should build some different DBs with > different combinations of encoding and collations and summarise my > findings using different types of data and sort/search commands, in case > anyone else has the same level of confusion that I do... that'd be excellent.

Re: [GENERAL] Unicode + LC_COLLATE

2004-04-22 Thread John Sidney-Woollett
Tom Lane said: > C locale basically means "sort by the byte sequence values". It'll do > something self-consistent, but maybe not what you'd like for UTF8 > characters. OK, that explains that. I guess I will need to try it out to see what the effect is on extended character sets. >> Our database

Re: [GENERAL] Unicode + LC_COLLATE

2004-04-22 Thread Peter Eisentraut
On Thursday, 22 April 2004 at 13:17, John Sidney-Woollett wrote: > Does anyone know what the effect of --lc-collate=C --encoding=UNICODE will > be for sorts (and indexes?) when a multibyte unicode character is > encountered? You get your strings sorted in binary order of the UTF-8 encoding, which

Re: [GENERAL] Unicode + LC_COLLATE

2004-04-22 Thread Tom Lane
"John Sidney-Woollett" <[EMAIL PROTECTED]> writes: > Does anyone know what the effect of --lc-collate=C --encoding=UNICODE will > be for sorts (and indexes?) when a multibyte unicode character is > encountered? C locale basically means "sort by the byte sequence values". It'll do something self-c
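
To make the "sort by the byte sequence values" point concrete, here is a minimal sketch in Python (not PostgreSQL itself) that orders strings by the bytes of their UTF-8 encoding, which is roughly what the replies above describe for LC_COLLATE=C with a Unicode database. The helper name `c_locale_sort` and the sample words are made up for illustration, and the sketch assumes precomposed (NFC) characters.

```python
def c_locale_sort(strings):
    """Sort strings by the raw bytes of their UTF-8 encoding.

    Hypothetical helper: this only approximates byte-wise (C locale)
    comparison; it is not how PostgreSQL is implemented.
    """
    return sorted(strings, key=lambda s: s.encode("utf-8"))


# Sample data chosen for illustration; the accented letters are written
# as escapes so the precomposed code points are unambiguous.
words = ["zebra", "Zebra", "\u00e9cole", "eagle", "\u00c4pfel", "apple"]

result = c_locale_sort(words)
for word in result:
    # Show each word next to the bytes that actually decide its position.
    print(word, word.encode("utf-8").hex(" "))
```

In this ordering, uppercase ASCII letters come before all lowercase ones, and every word starting with a multibyte character (lead byte 0xC3 here) lands after plain "zebra". The result is self-consistent, but it is not the dictionary order a Dutch, French, or German reader would expect.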