Re: [GENERAL] Unicode + LC_COLLATE

John Sidney-Woollett Thu, 22 Apr 2004 10:48:22 -0700

Peter Eisentraut said:
> On Thursday, 22 April 2004 13:17, John Sidney-Woollett wrote:
> You get your strings sorted in binary order of the UTF-8 encoding, which
> is probably not very interesting, but it's possible.


Agreed.
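
To make the "binary order of the UTF-8 encoding" point concrete, here is a minimal sketch in Python (not PostgreSQL itself; the word list is made up for illustration). It sorts strings by their raw UTF-8 bytes, which is roughly what a 'C' collation does:

```python
# Hypothetical sample data; any mixed-case, accented word list would do.
words = ["zebra", "Zebra", "apple", "éclair", "Apple"]

# Sort by the raw bytes of the UTF-8 encoding, ignoring any language rules.
binary_order = sorted(words, key=lambda s: s.encode("utf-8"))

print(binary_order)
# ['Apple', 'Zebra', 'apple', 'zebra', 'éclair']
```

All uppercase ASCII letters sort before all lowercase ones, and the accented word lands at the very end because its first byte (0xC3) is larger than any ASCII byte. That is consistent, just not what a dictionary would do.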

>> Is it also true that if LC_COLLATE != 'C' that indexes cannot be used
>> for LIKE comparisons (and is this also true for en_US.iso885915)?

> No, see <http://www.postgresql.org/docs/7.4/static/indexes-opclass.html>.

I wish I understood what this page was actually trying to say.

Is it saying that varchar_pattern_ops sorts according to the 'C' locale
regardless of LC_COLLATE, and that varchar_ops sorts according to the
current value of LC_COLLATE?

> This setup will result in UTF-8 characters being sorted by the system
> thinking they are actually ISO-8859-15 characters.  So the result will be
> random at best.

Actually, LC_COLLATE is currently 'C', not ISO-8859-1 as I reported.
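
For the record, the mismatch Peter describes would look roughly like the sketch below. It is plain Python rather than anything the database does internally, and the sample character is only an illustration. It shows how one UTF-8 character appears to code that assumes ISO-8859-15:

```python
text = "é"                           # one character, U+00E9

raw = text.encode("utf-8")           # stored as two bytes: 0xC3 0xA9
misread = raw.decode("iso8859_15")   # each byte treated as its own character

print(raw)       # b'\xc3\xa9'
print(misread)   # Ã©
```

A single-byte locale would therefore compare two unrelated characters where there is really one, which is why the resulting sort order would not be meaningful.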

What would be a correct LC_COLLATE value for my database if we want to
primarily serve ISO-8859-1 text, but also allow
Cyrillic/Chinese/Japanese/Korean characters and have them sort and
index correctly? We are building a multilanguage website...

ls /usr/share/locale produces:
ca  de  [EMAIL PROTECTED]  en_SE  fi  hr  ko            no     sk  zh_TW
cs  el  en_GB        en_US  fr  it  locale.alias  pl     sv
da  en  [EMAIL PROTECTED]      es     gl  ja  nl            pt_BR  tr

Thanks for any more info.

John Sidney-Woollett


