> On Jul 2, 2023, at 11:16 PM, Jer Haan <jdehaan2...@gmail.com> wrote:
>
> This table is copied from Wikipedia.
> <uencoding.pas>
> Hope it’s useful for you.
> If you improve the code pls let me know.

This is perfect, thanks! Much more complicated than I thought.

I'm curious now: if you were going the other direction and parsing a string of Unicode characters whose encoded sequences have different lengths, how would you know the length of each one? For example, I started off knowing which Unicode scalar to use by looking at a table, but what if I had to find the character in a stream of text? I think UTF-8 characters can be 1-4 bytes long, so you could encounter a 1-byte character followed by 4-byte characters, all interleaved, and there's no header or terminator for each character. How is this solved?

Regards,
Ryan Joseph
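
For reference on the question above: in UTF-8 the first byte of each sequence carries its own length in its high bits, and every following byte has the form 10xxxxxx, so a decoder can tell where each character starts without a header or terminator. Below is a minimal sketch of that idea in Python. The function names `utf8_sequence_length` and `split_utf8` are made up for illustration, and it only checks lead-byte ranges and continuation bytes, not every rule a strict decoder enforces.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return how many bytes the sequence starting with this lead byte occupies."""
    if lead < 0x80:              # 0xxxxxxx: ASCII, one byte
        return 1
    if 0xC2 <= lead <= 0xDF:     # 110xxxxx: two bytes (0xC0/0xC1 would be overlong)
        return 2
    if 0xE0 <= lead <= 0xEF:     # 1110xxxx: three bytes
        return 3
    if 0xF0 <= lead <= 0xF4:     # 11110xxx: four bytes (up to U+10FFFF)
        return 4
    # 0x80..0xBF are continuation bytes; the remaining values never start a sequence
    raise ValueError(f"0x{lead:02X} cannot start a UTF-8 sequence")


def split_utf8(data: bytes) -> list:
    """Split a UTF-8 byte string into one bytes object per encoded character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunk = data[i:i + n]
        # A short slice means the input was truncated mid-character;
        # every byte after the lead byte must look like 10xxxxxx.
        if len(chunk) != n or any((b & 0xC0) != 0x80 for b in chunk[1:]):
            raise ValueError(f"malformed sequence at offset {i}")
        chunks.append(chunk)
        i += n
    return chunks


if __name__ == "__main__":
    # Mixed 1-, 3-, 4- and 2-byte characters, interleaved as in the question.
    for chunk in split_utf8("a€😀é".encode("utf-8")):
        print(len(chunk), chunk.hex(" "))
```

Because continuation bytes can never be mistaken for lead bytes, the same property lets a reader that lands in the middle of a stream resynchronize by skipping forward to the next byte that is not of the form 10xxxxxx.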