Hi, I recently had to study Unicode and encodings *seriously* for a Python project, but I was left with a couple of doubts after reading the Unicode chapter of Mark Pilgrim's book Dive into Python 3.

1- Mark says: "Also (and you’ll have to trust me on this, because I’m not going to show you the math), due to the exact nature of the bit twiddling, there are no byte-ordering issues. A document encoded in UTF-8 uses the exact same stream of bytes on any computer."

Is it true that UTF-8 has no "big-endian/little-endian" issue because of its encoding method? And if it is true, why does Mark (like everyone else) write about UTF-8 with and without a BOM a few chapters later? What would the purpose of the BOM be then?

2- If it is true, can you point me to some documentation about the math that, as Mark says, demonstrates this?

Thank you,
Carlo
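
P.S. To make question 1 concrete, here is a minimal Python 3 sketch of the behaviour being asked about. The sample string and the variable names are only illustrative assumptions, not something taken from the book. It compares the bytes produced by UTF-8 with those produced by the two UTF-16 byte orders, and shows what the "utf-8-sig" codec adds in front of the data.

```python
import codecs

# Illustrative sample text: three ASCII letters plus one non-ASCII letter.
text = "caf\u00e9"

# UTF-8 is defined as a sequence of single bytes, so there is only one
# possible serialization, whatever the machine's native byte order is.
utf8 = text.encode("utf-8")

# UTF-16 works with 16-bit units, and each unit can be written with either
# byte first, so two different serializations exist.
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")

# The "utf-8-sig" codec prepends the three bytes EF BB BF. In UTF-8 they act
# as a signature marking the data as UTF-8, not as a byte-order indicator.
utf8_sig = text.encode("utf-8-sig")


def as_hex(data):
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


print("utf-8     ", as_hex(utf8))
print("utf-16-le ", as_hex(utf16_le))
print("utf-16-be ", as_hex(utf16_be))
print("utf-8-sig ", as_hex(utf8_sig))
print("signature ", as_hex(codecs.BOM_UTF8))
```

The two UTF-16 lines contain the same bytes swapped in pairs, while the UTF-8 line has no alternative ordering to choose from; the "utf-8-sig" line is the UTF-8 line with three extra bytes in front.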