Re: [HACKERS] Patch for collation using ICU

John Hansen Sun, 08 May 2005 01:49:39 -0700

Tatsuo Ishii wrote:
> Sent: Sunday, May 08, 2005 3:41 PM
> To: John Hansen
> Subject: Re: [HACKERS] Patch for collation using ICU
> 
> > Alvaro Herrera wrote:
> > > Sent: Sunday, May 08, 2005 2:49 PM
> > > To: John Hansen
> > > Cc: Tatsuo Ishii
> > > Subject: Re: [HACKERS] Patch for collation using ICU
> > > 
> > > On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
> > > > Tatsuo Ishii wrote:
> > > 
> > > > > So Japanese(including ASCII)/UNICODE behavior is perfectly
> > > > > correct at this moment.
> > > > 
> > > > Right, so you _never_ use accented ascii characters in Japanese?
> > > > (like � for example, whose uppercase is �)
> > > 
> > > That isn't ASCII.  It's latin1 or some other ASCII extension.
> > 
> > Point taken...
> > But...
> > 
> > If you want EUC_JP (Japanese + ASCII) then use that as your backend
> > encoding, not UTF-8 (unicode).
> > UTF-8 encoded databases are very useful for representing multiple
> > languages in the same database, but this usefulness vanishes if
> > functions like upper/lower don't work correctly.
> 
> I'm just curious if German/French/Spanish mixed text can be
> sorted correctly. I think these languages need their own
> locales even with UNICODE/ICU.


No, they will not sort correctly; for that you still need the locale.
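
To illustrate why one sort order cannot serve every language, here is a
minimal sketch in Python. It does not use ICU or anything from the patch.
The two key functions are hypothetical toy rules written only for this
example: one treats an accented letter as a variant of its base letter
(roughly what German or French readers expect), the other treats "ñ" as a
separate letter after "n" (as Spanish does).

```python
import unicodedata


def base_letters(text):
    """Strip combining marks so accented letters compare as their base letter."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def german_like_key(word):
    # Toy rule (an assumption for this sketch): accents are ignored at the
    # first level, and the original word only breaks ties.
    return (base_letters(word.lower()), word)


def spanish_like_key(word):
    # Toy rule (an assumption for this sketch): "ñ" (U+00F1) is its own
    # letter, sorting after every word that continues with a plain "n".
    parts = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == "\u00f1":
            parts.append("n\uffff")
        else:
            parts.append(base_letters(ch))
    return ("".join(parts), word)


# "ñu", "nube", "nota"
words = ["\u00f1u", "nube", "nota"]

german_order = sorted(words, key=german_like_key)
spanish_order = sorted(words, key=spanish_like_key)

print(german_order)
print(spanish_order)
```

The same three words come out in two different orders, so a column holding
mixed-language text has no single "correct" order until a locale is chosen.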

> 
> > So optimizing for 3 languages breaks more than a hundred;
> > that doesn't seem fair!

That is a compromise I'd be willing to agree on. :)
 
> Why don't you add a GUC variable or some such to control the 
> upper/lower behavior?
> --
> Tatsuo Ishii
> 
> 

