On 3 July 2023 9:12:03 +0200, Hairy Pixels via fpc-pascal 
<fpc-pascal@lists.freepascal.org> wrote:
>> On Jul 3, 2023, at 2:04 PM, Tomas Hajny via fpc-pascal 
>> <fpc-pascal@lists.freepascal.org> wrote:
>> 
>> No - in this case, the "header" is the highest bit of that byte being 0.
>
>Oh, it's the header BIT. Admittedly, I don't understand how this function 
>returns the highest bit using that case statement, which I think is what 
>he was suggesting.
>
>function UTF8CodepointSizeFast(p: PChar): integer;
>begin
> case p^ of
>   #0..#191   : Result := 1;
>   #192..#223 : Result := 2;
>   #224..#239 : Result := 3;
>   #240..#247 : Result := 4;
>   // An optimization + prevents compiler warning about uninitialized Result.
>   else Result := 1;
> end;
>end;

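That's why I wrote "in this case": for a single-byte (ASCII) character, the 
"header" is just the highest bit being 0. The "header" itself is not fixed 
size either: it is the run of leading 1 bits in the first byte, terminated 
by a 0 bit. The case statement above does not return that bit; it shows how 
you can derive the sequence length from the value of the first byte.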
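
To make the bit patterns behind those ranges visible, here is a minimal 
sketch that tests the leading bits directly instead of comparing character 
ranges. The name LeadByteLength and the sample bytes are made up for this 
illustration; they are not part of any existing unit. Like the quoted 
function, it falls back to 1 for continuation bytes and invalid lead bytes.

```pascal
program Utf8LeadByteDemo;
{$mode objfpc}

// Hypothetical helper (not from any library): derive the length of a
// UTF-8 sequence from the leading bits of its first byte.
function LeadByteLength(b: Byte): Integer;
begin
  if (b and $80) = $00 then
    Result := 1          // 0xxxxxxx: single byte (ASCII), #0..#127
  else if (b and $E0) = $C0 then
    Result := 2          // 110xxxxx: #192..#223
  else if (b and $F0) = $E0 then
    Result := 3          // 1110xxxx: #224..#239
  else if (b and $F8) = $F0 then
    Result := 4          // 11110xxx: #240..#247
  else
    Result := 1;         // 10xxxxxx (continuation) or invalid lead byte:
                         // same fallback as the quoted function
end;

const
  // UTF-8 encodings of U+0041, U+00E9, U+20AC and U+1F600, back to back.
  Sample: array[0..9] of Byte =
    ($41, $C3, $A9, $E2, $82, $AC, $F0, $9F, $98, $80);

var
  i: Integer;
begin
  i := 0;
  while i <= High(Sample) do
  begin
    WriteLn('$', HexStr(Sample[i], 2), ' starts a ',
            LeadByteLength(Sample[i]), '-byte sequence');
    // Skip over the continuation bytes to reach the next lead byte.
    Inc(i, LeadByteLength(Sample[i]));
  end;
end.
```

The loop only ever looks at the lead bytes $41, $C3, $E2 and $F0, for which 
the helper yields 1, 2, 3 and 4 respectively. The masks select exactly the 
same byte ranges as the case statement above.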

Tomas

_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
