On 2007-05-22, Tom Lane <[EMAIL PROTECTED]> wrote:
> If "%" advances by bytes then this will find a spurious match. The
> only thing that prevents it is if "B" can't be both a leading and a
> trailing byte of validly-encoded MB characters.

Which is (by design) true in UTF8, but is not true of most other multibyte
charsets.

The %_ case is also trivially handled in UTF8 by simply ensuring that _
doesn't match a non-initial octet. This allows % to advance by bytes
without danger of losing sync.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
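
To illustrate the point about UTF8, here is a minimal Python sketch of a byte-wise LIKE matcher. It is not the PostgreSQL implementation; the names like_bytes, is_continuation and char_len are hypothetical, and it assumes that both text and pattern are validly encoded and that the pattern has no escape characters. "%" advances one byte at a time, literals compare byte by byte, and "_" refuses to start on a non-initial octet.

```python
def is_continuation(b: int) -> bool:
    """True for a UTF-8 non-initial octet (bit pattern 10xxxxxx)."""
    return (b & 0xC0) == 0x80


def char_len(text: bytes, i: int) -> int:
    """Byte length of the UTF-8 character that starts at text[i]."""
    j = i + 1
    while j < len(text) and is_continuation(text[j]):
        j += 1
    return j - i


def like_bytes(text: bytes, pat: bytes, ti: int = 0, pi: int = 0) -> bool:
    """Hypothetical byte-wise LIKE: '%' = any run of bytes, '_' = one char."""
    while pi < len(pat):
        p = pat[pi]
        if p == ord("%"):
            # Advance by bytes: try the rest of the pattern at every offset.
            return any(
                like_bytes(text, pat, k, pi + 1)
                for k in range(ti, len(text) + 1)
            )
        if ti >= len(text):
            return False
        if p == ord("_"):
            # The UTF-8 guard: '_' must not match a non-initial octet.
            if is_continuation(text[ti]):
                return False
            ti += char_len(text, ti)
        else:
            # Literal byte. In UTF-8 a lead byte never equals a trailing byte,
            # so a multibyte literal can only match on a character boundary.
            if text[ti] != p:
                return False
            ti += 1
        pi += 1
    return ti == len(text)


if __name__ == "__main__":
    # Katakana "a" in two encodings; in Shift_JIS its trailing byte is ASCII 'A'.
    print(like_bytes("ア".encode("utf-8"), b"%A"))
    print(like_bytes("ア".encode("shift_jis"), b"%A"))
    # "é" is two bytes but one character, so two underscores must not match.
    print(like_bytes("é".encode("utf-8"), b"%__"))
```

With the Shift_JIS input, '%A' matches the trailing byte of a two-byte character, which is the spurious match described in the quoted text. With the UTF-8 input it does not, and '%__' does not match the single character "é" because "_" rejects the offset that lands on its second octet.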