On Wed, Nov 12, 2003 at 09:18:24PM +0000, Nicholas Clark wrote:
> On Wed, Nov 12, 2003 at 01:57:14PM -0500, Dan Sugalski wrote:
> 
> > You're going to run into problems no matter what you do, and as
> > transcoding could happen with each comparison, arguably you need to make a
> > local copy of the string for each comparison, as otherwise you run the
> > risk of significant data loss as a string gets transcoded back and forth
> > across a lossy boundary.
> 
> I think that this rules out what I was going to ask/suggest, having read
> Leo's patch. I was wondering why there wasn't a straight memcmp of the
> two strings whenever their encodings were the same. I presume that there
> are some encodings where two different binary representations are considered
> "equal", hence we can't blindly assume that a byte compare is sufficient.
    
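    yep, AFAIK there are at least two different ways to express
    the German umlaut ä (i can see it on my keyboard) in Unicode: as the
    single precomposed code point U+00E4, or as a plain 'a' followed by the
    combining diaeresis U+0308. i think Simon Cozens has a good paper
    (somewhere) about that.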
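
To make the ä case concrete, here is a minimal sketch in Python (not Parrot code, and not what Leo's patch does). It shows two spellings of ä that share an encoding but differ byte for byte, and one possible way to compare them: normalize both sides first. The helper name canonically_equal is made up for illustration, and NFC is just one reasonable choice of normalization form.

```python
import unicodedata

precomposed = "\u00e4"   # ä as a single code point (LATIN SMALL LETTER A WITH DIAERESIS)
decomposed = "a\u0308"   # 'a' followed by COMBINING DIAERESIS

# Same encoding (UTF-8), different bytes: a memcmp-style test reports "different".
bytes_equal = precomposed.encode("utf-8") == decomposed.encode("utf-8")
print(bytes_equal)


def canonically_equal(s: str, t: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC.

    normalize() returns new strings, so neither argument is modified.
    """
    return unicodedata.normalize("NFC", s) == unicodedata.normalize("NFC", t)


print(canonically_equal(precomposed, decomposed))
```

The first print shows False and the second True: the byte compare rejects a pair that the normalized compare accepts. Because normalization works on fresh copies, the original strings are left untouched.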

    re,
    tc

