Re: [Patch] Lookup width of UTF-8 character is wrong

2014-03-08 Thread Nicholas Marriott
Yes, you are right... oops. Fix applied thanks.

On Sat, Mar 08, 2014 at 10:14:19PM +0900, Koga Osamu wrote:
> Hello,
>
> I found a bug in lookup width of UTF-8 data which consists of 4 bytes.
>
> The problem is in utf8_combine.
> In there the Unicode codepoint is reconstructed from UTF-8 sequen

[Patch] Lookup width of UTF-8 character is wrong

2014-03-08 Thread Koga Osamu
Hello,

I found a bug in the width lookup for UTF-8 data that consists of 4 bytes.

The problem is in utf8_combine. There, the Unicode codepoint is reconstructed from the UTF-8 sequence, but the first byte is treated incorrectly. According to the UTF-8 structure, only the last 3 bits of the first byte of the 4-by
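
For readers following the thread, here is a minimal sketch of the point about the first byte. It is not the patch itself and not the actual utf8_combine code: the function name decode_utf8_4 is hypothetical, and the sketch assumes the input is already a well-formed 4-byte sequence. A 4-byte UTF-8 sequence has the layout 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx, so the lead byte contributes only its low 3 bits (mask 0x07) and each continuation byte contributes its low 6 bits (mask 0x3f).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not the code from the patch):
 * rebuild the Unicode codepoint from a well-formed 4-byte UTF-8 sequence
 * laid out as 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
 */
uint32_t
decode_utf8_4(const unsigned char *data)
{
	uint32_t cp;

	assert(data != NULL);

	cp  = (uint32_t)(data[0] & 0x07) << 18;	/* lead byte: low 3 bits only */
	cp |= (uint32_t)(data[1] & 0x3f) << 12;	/* continuation: low 6 bits */
	cp |= (uint32_t)(data[2] & 0x3f) << 6;
	cp |= (uint32_t)(data[3] & 0x3f);
	return cp;
}
```

As a check, the bytes F0 9F 98 80 decode to U+1F600 with these masks. A wider mask on the first byte would let the fixed 1111 prefix bits leak into the result, and a width lookup keyed on that value would then use the wrong codepoint.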