Re: [HACKERS] like/ilike improvements

Andrew Dunstan Fri, 25 May 2007 03:58:42 -0700


Zeugswetter Andreas ADI SD wrote:

You have to be on a first byte before you can meaningfully apply NextChar, and you have to use NextChar or else you don't count characters correctly (e.g. "__" must match 2 chars, not 2 bytes).

Well, for utf8 NextChar could advance to the next char even if the current byte position is in the middle of a multibyte char (skip over all 10xxxxxx).

It doesn't matter - we are satisfied that it won't happen. However, this might well be a useful optimisation of NextChar() for the UTF8 case, as something like:


 do { (t)++; (tlen)--; } while ((tlen) > 0 && (*(t) & 0xC0) == 0x80)

In fact, I'm wondering if that might make the other UTF8 stuff redundant - the whole point of what we're doing is to avoid expensive calls to NextChar.
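
To make the idea concrete, here is a minimal standalone sketch of that loop as a function. It is not the actual NextChar macro from the PostgreSQL source; the names utf8_next_char and utf8_char_count are hypothetical, and the input is assumed to be valid UTF8. It advances one byte, then keeps skipping continuation bytes (10xxxxxx), checking the remaining length before reading the next byte.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper illustrating the loop above; NOT PostgreSQL's
 * NextChar.  Advances *t by one UTF8 character and decrements *tlen by
 * the number of bytes consumed.  Assumes valid UTF8 and *tlen > 0.
 */
static void
utf8_next_char(const unsigned char **t, size_t *tlen)
{
    assert(*tlen > 0);
    do
    {
        (*t)++;
        (*tlen)--;
        /* Test the length first so we never read past the buffer. */
    } while (*tlen > 0 && (**t & 0xC0) == 0x80);
}

/*
 * Hypothetical helper: count characters (not bytes) by stepping with
 * utf8_next_char until the buffer is exhausted.
 */
static size_t
utf8_char_count(const unsigned char *t, size_t tlen)
{
    size_t      n = 0;

    while (tlen > 0)
    {
        utf8_next_char(&t, &tlen);
        n++;
    }
    return n;
}
```

Because the loop only looks at the top two bits of each byte, it also resynchronises if it starts in the middle of a multibyte character, which is the behaviour Andreas describes.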


cheers

andrew

