On 3 July 2023 9:12:03 +0200, Hairy Pixels via fpc-pascal <fpc-pascal@lists.freepascal.org> wrote:
>> On Jul 3, 2023, at 2:04 PM, Tomas Hajny via fpc-pascal
>> <fpc-pascal@lists.freepascal.org> wrote:
>>
>> No - in this case, the "header" is the highest bit of that byte being 0.
>
>Oh it's the header BIT. Admittedly I don't understand how this function
>returns the highest bit using that case, which I think he was suggesting.
>
>function UTF8CodepointSizeFast(p: PChar): integer;
>begin
>  case p^ of
>    #0..#191   : Result := 1;
>    #192..#223 : Result := 2;
>    #224..#239 : Result := 3;
>    #240..#247 : Result := 4;
>    else Result := 1; // An optimization + prevents compiler warning about
>                      // uninitialized Result.
>  end;
>end;

That's why I wrote "in this case". The "header" itself is not fixed size
either, but the algorithm above shows how you can derive the length from
the first byte.

Tomas
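
P.S. To make the link between the character ranges and the header bits explicit, here is a minimal sketch that classifies the first byte by its leading bits instead of by ranges. The name LeadByteLength and the sample bytes are made up for illustration and are not part of the quoted function or of any FPC unit. It is meant to give the same results as the case statement: #0..#127 have the highest bit 0, #128..#191 are continuation bytes (10xxxxxx), and both are stepped over one byte at a time.

```pascal
program Utf8LeadByteDemo;
{$mode objfpc}

// Hypothetical helper (not from the thread): derive the sequence length
// from the leading bits of the first byte of a UTF-8 sequence.
function LeadByteLength(b: Byte): Integer;
begin
  if (b and $80) = $00 then
    Result := 1        // 0xxxxxxx: highest bit is 0, single byte (#0..#127)
  else if (b and $E0) = $C0 then
    Result := 2        // 110xxxxx: #192..#223
  else if (b and $F0) = $E0 then
    Result := 3        // 1110xxxx: #224..#239
  else if (b and $F8) = $F0 then
    Result := 4        // 11110xxx: #240..#247
  else
    Result := 1;       // 10xxxxxx (continuation byte) or invalid lead byte:
                       // advance one byte, as the quoted function does
end;

const
  // Sample data (an assumption for this demo): 'a', U+00E9 and U+20AC
  // written out as raw UTF-8 bytes.
  Sample: array[0..5] of Byte = ($61, $C3, $A9, $E2, $82, $AC);

var
  i, len: Integer;
begin
  i := Low(Sample);
  while i <= High(Sample) do
  begin
    len := LeadByteLength(Sample[i]);
    WriteLn('byte $', HexStr(Sample[i], 2), ' starts a sequence of ',
            len, ' byte(s)');
    Inc(i, len);       // jump to the first byte of the next sequence
  end;
end.
```

With the sample above, the loop should visit only the bytes $61, $C3 and $E2, with lengths 1, 2 and 3. Whether the bit tests or the case ranges are faster will depend on the compiler and target, so treat this as an explanation rather than as a replacement.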