On 6/7/18 4:40 PM, Daniel Glus wrote:
> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263
> <https://www.python.org/dev/peps/pep-0263/> encoding specification, like
> # -*- encoding: foo -*-.
>
> Is this list the same as the list given in the documentation for the codecs
> library, under "Standard Encodings"
> <https://docs.python.org/3.6/library/codecs.html#standard-encodings>? If
> not, where can I find the actual list?
>
> (I know that list is the same as the set of unique values in CPython's
> /Lib/encodings/aliases.py
> <https://github.com/python/cpython/blob/master/Lib/encodings/aliases.py>,
> or equivalently, the set of filenames in /Lib/encodings/
> <https://github.com/python/cpython/blob/master/Lib/encodings/>, but again
> I'm not sure.)
> -Daniel
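For reference, the set described in that parenthetical can be listed directly
from the standard library. This is only a sketch of the candidate names; it
does not confirm which of them are accepted in a PEP 263 declaration, and some
entries in the alias table are byte transforms rather than text encodings.
The helper name candidate_codecs is made up for this illustration.

```python
import encodings.aliases


def candidate_codecs():
    """Return the distinct codec module names that the alias table maps to.

    Assumption: this mirrors "the set of unique values in aliases.py" from
    the question. It is not a verified list of valid source encodings.
    """
    return sorted(set(encodings.aliases.aliases.values()))


if __name__ == "__main__":
    names = candidate_codecs()
    print(len(names), "distinct codec names")
    print(names)
```

The count and the names printed depend on the Python version in use.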

Reading the proposal, I see one thing that seems worth a comment. The
proposal specifically calls out the UTF-8 'BOM' sequence (which the Unicode
standard actually doesn't recommend using, as UTF-8 doesn't have a 'byte
order problem'), but it doesn't allow the UTF-16 marks (0xFF, 0xFE or
0xFE, 0xFF) or the UCS-4 marks (0x00, 0x00, 0xFE, 0xFF or
0xFF, 0xFE, 0x00, 0x00). While files in those formats are unlikely, the ones
that do exist are very likely to carry the mark, and detecting it is very
important for recognizing those encodings: they are NOT 'ASCII compatible'
formats, so the rest of the header won't match what would be expected.

--
Richard Damon
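
To make the byte sequences above concrete, here is a minimal sketch of a BOM
check on the first few bytes of a file. It is not what CPython's tokenizer
does; the function name sniff_bom and the encoding labels it returns are
chosen for this illustration only. The BOM constants themselves come from the
standard codecs module.

```python
import codecs

# Longer marks are tested first, because the UTF-32 little-endian mark
# (FF FE 00 00) begins with the UTF-16 little-endian mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(head):
    """Return the encoding implied by a leading BOM in head, or None."""
    for mark, name in _BOMS:
        if head.startswith(mark):
            return name
    return None


if __name__ == "__main__":
    cookie = "# -*- coding: utf-16 -*-"
    raw = codecs.BOM_UTF16_LE + cookie.encode("utf-16-le")
    print(sniff_bom(raw))
    # Every ASCII character is followed by a NUL byte in UTF-16-LE, so a
    # byte-level search for the cookie text does not find it.
    print(b"coding" in raw)
```

Without the mark, a scanner that expects ASCII-compatible bytes has nothing
recognizable to match in such a file, which is the point made above.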