On 12/27/2021 at 4:39 PM, Bart via lazarus wrote:
> pn8^ = 11100010 // first byte
> (pn8^ shr 7) = 11111111 // <<-- I would have expected that to be 00000001 ?

That depends on whether pn8^ is signed or not; for a signed shift the
result makes sense. Declaring it as pint8 (instead of puint8) is an odd
choice.

The expression seems to be 1 when the top two bits are 10, in other words
when the byte is a UTF-8 follow byte. That is what the comment says, and
as far as I can see the signedness doesn't matter.
Basically, to me that looks like a branchless version of

  if (p[i] and %11000000)=%10000000 then
    inc(result);

...which counts all UTF-8 follow bytes; that count is then subtracted from
the number of bytes in the string to get the number of UTF-8
sequences/codepoints.
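
To make the equivalence concrete, here is a minimal sketch that counts
follow bytes three ways: with the branching test, with the branchless
expression on unsigned bytes, and with the same expression on signed
8-bit values. The function names and the sample bytes are made up for
illustration; this is not the LazUTF8 code itself.

```pascal
program Utf8FollowBytes;
{$mode objfpc}

// Hypothetical helpers for illustration only; not the LazUTF8 routine.

// Branching form: a follow byte has the bit pattern 10xxxxxx.
function CountBranching(const bytes: array of Byte): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to High(bytes) do
    if (bytes[i] and %11000000) = %10000000 then
      Inc(Result);
end;

// Branchless form on unsigned bytes: adds 1 when bit 7 is set
// and bit 6 is clear, otherwise adds 0.
function CountBranchless(const bytes: array of Byte): Integer;
var
  i: Integer;
  b: Byte;
begin
  Result := 0;
  for i := 0 to High(bytes) do
  begin
    b := bytes[i];
    Inc(Result, (b shr 7) and ((not b) shr 6));
  end;
end;

// Same expression, but each byte is read as a signed 8-bit value.
function CountBranchlessSigned(const bytes: array of Byte): Integer;
var
  i: Integer;
  v: ShortInt;
begin
  Result := 0;
  for i := 0 to High(bytes) do
  begin
    v := ShortInt(bytes[i]);
    Inc(Result, (v shr 7) and ((not v) shr 6));
  end;
end;

const
  // U+20AC (E2 82 AC) followed by 'a' (61): 4 bytes, 2 codepoints.
  Sample: array[0..3] of Byte = ($E2, $82, $AC, $61);
begin
  WriteLn('follow bytes, branching:         ', CountBranching(Sample));
  WriteLn('follow bytes, branchless:        ', CountBranchless(Sample));
  WriteLn('follow bytes, branchless signed: ', CountBranchlessSigned(Sample));
  WriteLn('codepoints: ', Length(Sample) - CountBranching(Sample));
end.
```

For this sample all three counts should come out as 2, giving 2
codepoints. In the signed case the first factor can have extra high bits
set, but the second factor is only ever 0 or 1, so the final "and" masks
them away.
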
Maybe the absolute declarations confuse things somehow? Also make sure the
input is 100% the same by printing the byte values of the input string.
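
A quick way to do that check could look like the sketch below. ShowBytes
is a hypothetical helper name and 'abc' is only a placeholder; pass the
actual test string instead.

```pascal
program DumpInput;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: prints every byte of s as two hex digits.
procedure ShowBytes(const s: RawByteString);
var
  i: SizeInt;
begin
  for i := 1 to Length(s) do
    Write(IntToHex(Ord(s[i]), 2), ' ');
  WriteLn;
end;

begin
  // Placeholder input; replace with the string under test.
  ShowBytes('abc');  // prints: 61 62 63
end.
```

Running this on both inputs and comparing the output line by line shows
right away whether the bytes really are identical.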