Leif Halvard Silli wrote:
> I believe that even the U+FEFF *itself* is either UTF-32LE or
> UTF-32BE.
> Thus, there is, per se, no implication of lack of byte-order mark in
> Martin’s statement.

By definition, data in the "UTF-nBE" or "UTF-nLE" encoding scheme (for
whatever value of n) does not have a byte-order mark.

> Assuming that the label "UTF-32" is defined the
> same way as the label "UTF-16", then it is an umbrella label or a
> "macro label" (hint: macro language) which covers the two *real*
> encodings - UTF-32LE and UTF-32BE.

I've sometimes wished it were that way, that (for example) the
"UTF-32BE" and "UTF-32LE" encoding schemes were defined as variations of
"UTF-32" with special rules related to the BOM, not defined as
completely separate encoding schemes. But that's not how the definitions
are written.
The LE and BE versions are not at all "the two *real* encodings" when
there is real-world data that contains an initial U+FEFF meant to be
interpreted as a BOM or "signature."
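
For what it's worth, the difference is easy to see in practice. Here is a minimal sketch using Python's built-in codecs, which as far as I can tell follow these definitions. The sample bytes and variable names are made up purely for illustration:

```python
# Illustrative sketch only: the byte string and names are invented for this example.
BOM_BE = b"\x00\x00\xfe\xff"        # U+FEFF serialized in big-endian order
data = BOM_BE + b"\x00\x00\x00A"    # an initial U+FEFF followed by "A"

# "UTF-32" encoding scheme: an initial U+FEFF is a byte-order mark.
# It selects the byte order and is not part of the decoded text.
as_utf32 = data.decode("utf-32")

# "UTF-32BE" encoding scheme: the byte order is fixed by the label,
# so an initial U+FEFF is kept as an ordinary character (ZWNBSP).
as_utf32be = data.decode("utf-32-be")

print(repr(as_utf32))     # 'A'
print(repr(as_utf32be))   # '\ufeffA'
```

The same eight bytes decode to a one-character string under one label and a two-character string under the other, which is why the labels are not interchangeable for data that begins with a signature.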
--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell