On Thu, Jan 16, 2014 at 1:13 PM, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
>     if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
>         return 'utf_16'
>     elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
>         return 'utf_32'

I'd swap the order of these two checks. If the file starts with FF FE 00 00 (the little-endian UTF-32 BOM), the first check matches on the leading FF FE, so your code will guess that it's UTF-16 and begins with a U+0000.

ChrisA
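
For illustration, here is a minimal sketch of the swapped order. The function name guess_encoding_from_bom and the None fallback are assumptions made for this example; only the two startswith checks come from the quoted code.

```python
def guess_encoding_from_bom(sig):
    """Guess a UTF encoding from the leading bytes of a file.

    Hypothetical helper: the name and the None fallback are not from
    the quoted code, only the two BOM checks are.
    """
    # Test the 4-byte UTF-32 BOMs first. The little-endian UTF-32 BOM
    # (FF FE 00 00) starts with the little-endian UTF-16 BOM (FF FE),
    # so checking UTF-16 first would shadow it.
    if sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
        return 'utf_32'
    elif sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
        return 'utf_16'
    # No recognised BOM at the start of the data.
    return None
```

One ambiguity remains with either order: a little-endian UTF-16 file whose first character after the BOM really is U+0000 has the same leading bytes as a UTF-32 BOM, and this sketch will report it as UTF-32.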