Steve Horsley wrote:
It is my understanding that the BOM (U+feff) is actually the Unicode
character "Non-breaking zero-width space".
My understanding is that this used to be the case. According to
http://www.unicode.org/faq/utf_bom.html#38
the application should now specify specific processing, and both
simply dropping it, or reporting an error are both acceptable behaviour.
Applications that need the ZWNBSP behaviour (i.e. want to indicate that
there should be no break at this point) should use U+2060 (WORD JOINER).
Regards,
Martin
--
http://mail.python.org/mailman/listinfo/python-list