Re: Why can't D store all UTF-8 code units in char type? (not really understanding explanation)

ag0aep6g via Digitalmars-d-learn Sat, 03 Dec 2022 05:06:53 -0800

On 02.12.22 22:39, thebluepandabear wrote:

Hm, that specifically might not be. The thing is, I thought a UTF-8 code unit can store 1-4 bytes for each character, so how is it right to say that `char` is a utf-8 code unit, it seems like it's just an ASCII code unit.

You're simply not using the term "code unit" correctly. A UTF-8 code unit is just one of those 1-4 bytes. Together they form a "sequence" which encodes a "code point".
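
To make the distinction concrete, here is a minimal sketch in D. The helper names `codeUnits` and `codePoints` are made up for illustration and are not part of Phobos; the sketch only relies on `.length` counting `char` elements and on `walkLength` decoding a `string` into code points.

```d
import std.range : walkLength;
import std.stdio : writefln;

/// Hypothetical helper: number of UTF-8 code units, i.e. `char` elements.
size_t codeUnits(string s)
{
    return s.length;
}

/// Hypothetical helper: number of code points. A `string` is decoded
/// to `dchar` when iterated as a range, so walkLength counts code points.
size_t codePoints(string s)
{
    return s.walkLength;
}

void main()
{
    string s = "\u00E9"; // e with acute accent: one code point, two code units

    writefln("code units:  %s", codeUnits(s));
    writefln("code points: %s", codePoints(s));

    // Each `char` holds exactly one code unit (one byte) of the sequence.
    foreach (char c; s)
        writefln("code unit 0x%02X", cast(ubyte) c);

    // Iterating as `dchar` decodes the whole sequence into one code point.
    foreach (dchar d; s)
        writefln("code point U+%04X", cast(uint) d);
}
```

So a single `char` never holds a whole multi-byte character. It holds one byte of the sequence, and a `dchar` is what holds a decoded code point.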

And all (true) ASCII code units are indeed also valid UTF-8 code units, because UTF-8 is a superset of ASCII. If you save a file as ASCII and open it as UTF-8, that works. But it doesn't work the other way around.
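
The superset relationship can be checked with a small sketch like the following. `isAscii` and `isValidUtf8` are hypothetical helper names; `std.utf.validate` is the Phobos function that throws a `UTFException` on malformed input.

```d
import std.stdio : writefln;
import std.utf : UTFException, validate;

/// Hypothetical helper: true if every code unit is in the ASCII range 0x00-0x7F.
bool isAscii(const(char)[] text)
{
    foreach (char c; text)
        if (c >= 0x80)
            return false;
    return true;
}

/// Hypothetical helper: true if the text is well-formed UTF-8.
bool isValidUtf8(const(char)[] text)
{
    try
    {
        validate(text);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // Pure ASCII text is also valid UTF-8.
    writefln("%s %s", isAscii("hello"), isValidUtf8("hello"));

    // Valid UTF-8 with a non-ASCII code point is not ASCII.
    writefln("%s %s", isAscii("h\u00E9llo"), isValidUtf8("h\u00E9llo"));
}
```

Any text that passes `isAscii` also passes `isValidUtf8`, but not the other way around.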
