Nick Coghlan added the comment:

A given encoding may have multiple aliases, and also variant spellings that are
normalised before doing the codec lookup. Doing the lookup first means we run
through all of the normalisation and aliasing machinery and then compare the
*canonical* names. For example:

>>> import codecs
>>> codecs.lookup('ANSI_X3.4_1968').name
'ascii'
>>> codecs.lookup('ansi_x3.4_1968').name
'ascii'
>>> codecs.lookup('ansi-x3.4-1968').name
'ascii'
>>> codecs.lookup('ASCII').name
'ascii'
>>> codecs.lookup('ascii').name
'ascii'

A public "codecs.is_same_encoding" API might be a worthwhile and 
self-documenting addition, rather than just adding a comment that explains the 
need for the canonicalisation dance.
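
A minimal sketch of what such a helper could look like, assuming the name
"is_same_encoding" and simple "compare the canonical names" semantics. This
function is hypothetical and is not an existing codecs API; it only relies on
codecs.lookup() and the .name attribute of the CodecInfo it returns:

```python
import codecs


def is_same_encoding(name_a, name_b):
    """Return True if both names resolve to the same canonical codec name.

    Hypothetical helper: not part of the codecs module.
    """
    # codecs.lookup() runs the normalisation and alias machinery and
    # raises LookupError if either name is unknown.
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name
```

With this sketch, is_same_encoding('ANSI_X3.4_1968', 'ascii') is true and
is_same_encoding('ascii', 'utf-8') is false. Unknown names propagate the
LookupError rather than returning False, which is one possible design choice
among several.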

As far as the second question goes, for non-seekable output streams, this API 
is inherently a case of "here be dragons" - that's a large part of the reason 
why it took so long for us to accept it as a feature we really should provide. 
We need to support writing a BOM to sys.stdout and sys.stderr, and the risk of
doing so in the middle of existing output isn't really any different from the
chance of implicitly switching encodings mid-stream.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15216>
_______________________________________