Tzafrir Cohen <[EMAIL PROTECTED]> wrote on 31/7/02 17:25:

>On Wed, 31 Jul 2002, Oded Arbel wrote:
>
>> BTW: I prefer storing textual data in the database as unicode
>> (preferably utf8 to facilitate easier display to the web) encoded data
>> in binary fields - it gives predictable enough sorting, and you need not
>> worry about character sets, especially when aiming for multi-lingual
>> applications (and I do consider english/hebrew being multi-lingual
>> enough to warrant unicode).
>>
>
>What if there is some English text? case insensitive search is quite
>problematic in some cases ('aaa' comes after 'ZZZ').

True, but if you are willing to accept that (and I don't think it's such a problem for
most uses), then you are in the clear, no matter what encoding you choose, as long as
you are consistent about it.
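
To make the ordering concrete, here is a minimal Python sketch of what a byte-wise sort of UTF-8 data looks like. It is only an illustration: the word list and variable names are made up, and it assumes the database compares binary fields byte by byte, which is worth checking for the database in use.

```python
# Sample strings, encoded to UTF-8 bytes the way a binary field would hold them.
words = ["aaa", "ZZZ", "Abc", "שלום", "אבא"]
encoded = [w.encode("utf-8") for w in words]

# Byte-wise comparison: uppercase ASCII (0x41-0x5A) sorts before lowercase
# ASCII (0x61-0x7A), and Hebrew letters (two bytes, lead byte 0xD7) sort
# after all ASCII text.
binary_order = [b.decode("utf-8") for b in sorted(encoded)]

# If case-insensitive order is needed, it has to be computed on a folded key,
# for example in the application or in a separate lowercased column.
folded_order = sorted(words, key=str.casefold)

print(binary_order)
print(folded_order)
```

The first list shows the 'aaa' after 'ZZZ' effect. The second shows that folding case before comparing removes it, at the cost of doing that work somewhere outside the plain binary comparison.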

>Also, if the UTF8 text includes nikud it has to be ignored-away during the
>sorting.

Why? I have never tried sorting text with nikud, but I fail to see the problem:
why should it matter if alef with 'segol' sorts after alef with 'kamatz'?
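
For reference, a small Python sketch of what a plain code-point sort does with nikud, and one possible way to ignore it. The helper name strip_nikud is made up for this example, and it assumes that dropping every combining mark (Unicode category Mn) is acceptable, which would also strip accents from Latin text.

```python
import unicodedata

def strip_nikud(text: str) -> str:
    """Drop combining marks (category Mn), which includes Hebrew points."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

alef, bet = "\u05d0", "\u05d1"
segol, kamatz = "\u05b6", "\u05b8"

# Alef with kamatz + bet, plain alef + bet, alef with segol + bet.
words = [alef + kamatz + bet, alef + bet, alef + segol + bet]

# The points (U+05B0 and up) have lower code points than the letters
# (U+05D0 and up), so a pointed alef sorts before an unpointed alef that is
# followed by any letter. UTF-8 byte order follows code-point order, so a
# binary field would sort the same way.
raw_order = sorted(words)

# Sorting on the stripped key treats all three spellings as the same word.
keys = [strip_nikud(w) for w in words]
```

So the segol versus kamatz order is probably harmless, but pointed and unpointed spellings of the same word do end up apart unless the points are stripped from the sort key.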

>IIRC proper utf collating would do the above two (this costs some extra
>cpu cycles even when this collating is not used, I believe).

Does MySQL know about UTF-8?

--
Oded

