Re: Constant strings - again

Dan Sugalski Wed, 21 Apr 2004 11:28:50 -0700

At 11:17 AM -0700 4/21/04, Jeff Clites wrote:
> On Apr 21, 2004, at 10:20 AM, Dan Sugalski wrote:
>> At 9:22 AM -0700 4/21/04, Jeff Clites wrote:
>>> On Apr 21, 2004, at 4:05 AM, Leopold Toetsch wrote:
>>>> ... a factor ~14 performance increase for the "not equal" case.
>>>
>>> Ah, great! (And the "not equal" case is the only one which should be showing a speed up--the "same" and "equal" cases are expected to be unaffected.)
>>
>> Just to make sure... we're making sure the strings are always properly decomposed before comparing, right?
>
> Nope, this is a literal "equal" comparison--you'd build a normalized compare on top of this. (There's 2 reasons for that: (1) You definitely need a non-normalized comparison available, because often that's what you want, and (2) For normalized comparison, you need to pick which style of normalization you want--there are at least 4 choices, each of which makes sense in different situations.)

We need to address that, then. If we're doing Unicode, we damn well need to do it right--å is å, regardless of whether it's composed or decomposed.

If people want low-level binary comparisons (and generally we *shouldn't* for most things) then they'll need to force the string to binary.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
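
A minimal sketch of the distinction argued over in this thread, written in Python rather than Parrot's own C and using only the standard unicodedata module. The helper names (literal_equal, normalized_equal, binary_equal) are hypothetical and are not Parrot APIs. The "at least 4 choices" of normalization are assumed here to be the four standard Unicode forms NFC, NFD, NFKC and NFKD.

```python
import unicodedata

# Two spellings of the same character, as discussed above.
composed = "\u00e5"      # å as one precomposed code point
decomposed = "a\u030a"   # 'a' followed by COMBINING RING ABOVE


def literal_equal(a: str, b: str) -> bool:
    """Code point by code point comparison, with no normalization."""
    return a == b


def normalized_equal(a: str, b: str, form: str = "NFD") -> bool:
    """Normalize both strings to the same form, then compare literally.

    form is one of "NFC", "NFD", "NFKC", "NFKD".
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def binary_equal(a: str, b: str, encoding: str = "utf-8") -> bool:
    """Force both strings to bytes first, then compare the raw bytes."""
    return a.encode(encoding) == b.encode(encoding)


if __name__ == "__main__":
    print(literal_equal(composed, decomposed))     # False
    print(normalized_equal(composed, decomposed))  # True
    print(binary_equal(composed, decomposed))      # False

    # The choice of form matters: the "fi" ligature (U+FB01) matches the
    # two letters "fi" only under the compatibility forms.
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, normalized_equal("\ufb01", "fi", form))
```

The literal comparison is the cheap building block; the normalized comparison is layered on top of it and has to be told which form to use, which is the reason given in the quoted text for keeping the two separate.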
