Hi Pádraig,
> However, the first byte of a multibyte
> UTF-8 char is the same for a lot of characters
Yes. The last byte is equidistributed across the range 0x80..0xBF, whereas
the first byte is often the same. I'm applying the commit below to exploit
this for speed.
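
For illustration only (the commit itself is not reproduced here): one way to
take advantage of this property is to compare two multibyte characters
starting from the last byte, since a mismatch is most likely to show up
there. The following is a minimal sketch in C under that assumption, not the
committed code; the function name mb_equal_last_first is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the committed code.
   Compare two byte sequences A and B, both of length N, walking from the
   last byte towards the first.  For multibyte UTF-8 characters the trailing
   byte varies the most, so unequal characters tend to be rejected after a
   single comparison.  */
static bool
mb_equal_last_first (const char *a, const char *b, size_t n)
{
  while (n > 0)
    {
      n--;
      if (a[n] != b[n])
        return false;   /* Mismatch found, typically on the first try.  */
    }
  return true;          /* All N bytes matched (or N was 0).  */
}
```

For example, comparing "é" (0xC3 0xA9) with "è" (0xC3 0xA8) is decided by the
first comparison, whereas a front-to-back loop would first compare the
identical lead bytes 0xC3.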
> I was wondering myself about
Hello Stefano,
> Some projects (like Automake) have the policy of keeping multiple
> ChangeLog entries having the same author and date lumped together,
> preferring e.g.:
>
> 2000-01-01 Foo Bar
>
> Add foo
>
> Add bar
>
> over:
>
> 2000-01-01 Foo Bar
>
> Add foo