Tatsuo Ishii
> Sent: Sunday, May 08, 2005 3:41 PM
> To: John Hansen
> Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us;
> [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Patch for collation using ICU
>
> > Alvaro Herrera wrote:
> > > Sent: Sunday, May 08, 2005 2:49 PM
> > > To: John Hansen
> > > Cc: Tatsuo Ishii; pgman@candle.pha.pa.us; [EMAIL PROTECTED];
> > > pgsql-hackers@postgresql.org
> > > Subject: Re: [HACKERS] Patch for collation using ICU
> > >
> > > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
> > > > Tatsuo Ishii wrote:
> > > > > So Japanese(including ASCII)/UNICODE behavior is
> > > > > perfectly correct at this moment.
> > > >
> > > > Right, so you _never_ use accented ascii characters in Japanese?
> > > > (like è for example, whose uppercase is È)
> > >
> > > That isn't ASCII. It's latin1 or some other ASCII extension.
> >
> > Point taken...
> > But...
> >
> > If you want EUC_JP (Japanese + ASCII) then use that as your
> > backend encoding, not UTF-8 (unicode).
> > UTF-8 encoded databases are very useful for representing multiple
> > languages in the same database, but this usefulness vanishes if
> > functions like upper/lower don't work correctly.
>
> I'm just curious if German/French/Spanish mixed text can be
> sorted correctly. I think these languages need their own
> locales even with UNICODE/ICU.

No, they will not sort correctly; for that you still need the locale.

> > So optimizing for 3 languages breaks more than a hundred,
> > that doesn't seem fair!

That is a compromise I'd be willing to agree on. :)

> Why don't you add a GUC variable or some such to control the
> upper/lower behavior?
> --
> Tatsuo Ishii
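
For anyone following along, here is a minimal sketch of the two points above: è is outside ASCII but still has an uppercase mapping in Unicode, and sorting without a locale falls back to code point order. It uses Python's built-in string handling purely as an illustration; it is not the PostgreSQL or ICU code path, and the helper name `is_ascii` and the sample words are made up for this example.

```python
# Illustration only: Python's built-in Unicode handling,
# not PostgreSQL's upper()/lower() and not ICU.

def is_ascii(text: str) -> bool:
    """Return True only if every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)

accented = "\u00e8"  # è, LATIN SMALL LETTER E WITH GRAVE

# è is not ASCII, yet Unicode defines an uppercase mapping for it (È).
accented_is_ascii = is_ascii(accented)
accented_upper = accented.upper()

# Without a locale, strings sort by code point, so "ère" lands after
# "zebra" because U+00E8 is greater than U+007A ('z').
words = ["zebra", "\u00e8re", "apple"]
codepoint_order = sorted(words)

print(accented_is_ascii)   # False
print(accented_upper)      # È
print(codepoint_order)     # ['apple', 'zebra', 'ère']
```

Case mapping like this can be done from the Unicode tables alone, whereas getting "ère" to sort next to "e" words depends on which language's rules apply, which is why the locale is still needed for collation.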