Re: [dev][st][patch] new utf decoder

Silvan Jegen Fri, 21 Mar 2014 02:40:30 -0700

Heyho

On Thu, Mar 20, 2014 at 5:39 PM, Damian Okrasa <dokr...@gmail.com> wrote:
> Hey,
>
> this patch replaces the current UTF-8 decoder with a new one, which is
> ~50 lines shorter and should be easier to understand. Parsing 5- and
> 6-byte sequences, if necessary, requires only a trivial modification of
> the UTF_SIZ constant and the utfbyte, utfmask, utfmin, utfmax arrays.
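
For anyone following along without the patch open, a table-driven
decoder of the kind described above might look roughly like the sketch
below. This is not the patch itself: only the names UTF_SIZ, utfbyte,
utfmask, utfmin and utfmax come from the description. The table
contents, the helper decodebyte(), the UTF_INVALID value and the
return-value convention are assumptions made for illustration.

```c
#include <stddef.h>

/* Sketch only: layout inferred from the description, not taken from the patch. */
#define UTF_SIZ     4
#define UTF_INVALID 0xFFFD /* assumed replacement value for bad input */

/* Index 0 describes continuation bytes; index n the lead byte of an n-byte sequence. */
static const unsigned char utfbyte[UTF_SIZ + 1] = {0x80, 0x00, 0xC0, 0xE0, 0xF0};
static const unsigned char utfmask[UTF_SIZ + 1] = {0xC0, 0x80, 0xE0, 0xF0, 0xF8};
static const long utfmin[UTF_SIZ + 1] = {0, 0, 0x80, 0x800, 0x10000};
static const long utfmax[UTF_SIZ + 1] = {0x10FFFF, 0x7F, 0x7FF, 0xFFFF, 0x10FFFF};

/* Classify one byte: store its table index in *type and return its payload bits. */
static long
decodebyte(unsigned char c, size_t *type)
{
	for (*type = 0; *type <= UTF_SIZ; ++*type)
		if ((c & utfmask[*type]) == utfbyte[*type])
			return c & ~utfmask[*type];
	return 0; /* no pattern matched; *type is now UTF_SIZ + 1 */
}

/* Decode one code point from s[0..n). Returns the number of bytes consumed,
 * or 0 if the input is empty or ends in the middle of a sequence. */
static size_t
utf8decode(const char *s, long *u, size_t n)
{
	size_t i, len, type;
	long cp;

	*u = UTF_INVALID;
	if (n == 0)
		return 0;
	cp = decodebyte((unsigned char)s[0], &len);
	if (len < 1 || len > UTF_SIZ)
		return 1; /* stray continuation byte or unsupported lead byte */
	for (i = 1; i < n && i < len; ++i) {
		cp = (cp << 6) | decodebyte((unsigned char)s[i], &type);
		if (type != 0)
			return i; /* sequence cut short by a non-continuation byte */
	}
	if (i < len)
		return 0; /* incomplete sequence: wait for more bytes */
	/* Reject overlong forms, out-of-range values and UTF-16 surrogates. */
	if (cp >= utfmin[len] && cp <= utfmax[len] && !(cp >= 0xD800 && cp <= 0xDFFF))
		*u = cp;
	return len;
}
```

With a layout like this, supporting longer sequences would come down to
raising UTF_SIZ and adding one entry per array, which matches the
"trivial modification" mentioned in the quoted text.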


I can't yet claim to fully understand the code, but according to my testing with

https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

the behavior of the decoder has not changed a bit, which I'll assume is
a good thing.

"Benchmarking" the decoder with

time for i in `seq 10000`; do cat UTF-8-test.txt; done;

did not seem to show any significant difference either.

I will stare at the code some more, but so far it looks good to me.


Cheers,

Silvan
