Leif Halvard Silli wrote:

> I believe that even the U+FEFF *itself* is either UTF-32LE or UTF-32BE.
> Thus, there is, per se, no implication of lack of byte-order mark in
> Martin’s statement.

By definition, data in the "UTF-nBE" or "UTF-nLE" encoding scheme (for whatever value of n) does not have a byte-order mark.

> Assuming that the label "UTF-32" is defined the
> same way as the label "UTF-16", then it is an umbrella label or a
> "macro label" (hint: macro language) which covers the two *real*
> encodings - UTF-32LE and UTF-32BE.

I've sometimes wished it were that way, that (for example) the "UTF-32BE" and "UTF-32LE" encoding schemes were defined as variations of "UTF-32" with special rules related to the BOM, not defined as completely separate encoding schemes. But that's not how the definitions are written.

The LE and BE versions are not at all "the two *real* encodings" when there is real-world data that contains an initial U+FEFF meant to be interpreted as a BOM or "signature."
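
As a rough illustration of the distinction, here is a minimal Python sketch. It assumes that Python's built-in "utf-32", "utf-32-be" and "utf-32-le" codecs are a fair stand-in for the three encoding schemes; the variable names are only for illustration. It shows that an initial U+FEFF is consumed as a signature under "utf-32" but kept as an ordinary character under "utf-32-be".

```python
import codecs

text = "A"

# The BE and LE codecs write the code units only, with no byte-order mark.
be = text.encode("utf-32-be")    # b'\x00\x00\x00A'
le = text.encode("utf-32-le")    # b'A\x00\x00\x00'

# The unmarked "utf-32" codec prepends a BOM in the platform's byte order.
plain = text.encode("utf-32")
has_bom = plain[:4] in (codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)

# The same bytes, beginning with FE FF in big-endian form, read two ways.
data = codecs.BOM_UTF32_BE + be

# As "utf-32": the initial U+FEFF is a signature and is not part of the text.
as_utf32 = data.decode("utf-32")

# As "utf-32-be": there is no signature by definition, so the initial
# U+FEFF is kept as a character (ZERO WIDTH NO-BREAK SPACE).
as_utf32be = data.decode("utf-32-be")
```

Under these assumptions, as_utf32 is "A" while as_utf32be is "\ufeffA", so the label applied to the data changes what the text contains.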

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
