zy added the comment:
I do not have any documentation on this subject. However, I found that GNU
iconv(1) behaves the same way as my proposed behavior. My reading of its source
code suggests that iconv(1) treats all encodings equally, which I think should
also be true for Python.
As for security concerns
zy added the comment:
> So the correct result for b'\xff\n'.decode('gb2312', 'replace') is u'?\n'?
I think so. This behavior does not discard any recoverable information,
has no side effects on later decoding, and should the '\n
New submission from zy:
Let s = '\xff\n'.
The expected result of s.decode('gb2312', 'ignore') is u"\n", while in 2.6.6 it
is u"".
s can be replaced with chr(m) + chr(n), where m is in the range 128 to 255 and
n is in the range 0 to 127.
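
The cases described above can be checked with a short script. This is only a
sketch: it is written with Python 3 bytes syntax rather than the 2.6.6 str
syntax used in this report, and the helper name swallowed_pairs is made up for
illustration. It lists every (m, n) pair whose second byte does not survive
decoding; the result depends on the interpreter version it is run on.

```python
def swallowed_pairs(encoding="gb2312", errors="ignore"):
    """Return the (m, n) pairs whose ASCII byte n is missing from the output.

    Hypothetical helper for illustration only, not part of any library.
    """
    missing = []
    for m in range(128, 256):      # first byte: high byte, 128..255
        for n in range(0, 128):    # second byte: plain ASCII, 0..127
            decoded = bytes([m, n]).decode(encoding, errors)
            # The reporter expects the ASCII byte to be kept in every case.
            if not decoded.endswith(chr(n)):
                missing.append((m, n))
    return missing


if __name__ == "__main__":
    # The single case from the report, then a count over the whole range.
    print(repr(b"\xff\n".decode("gb2312", "ignore")))
    print(len(swallowed_pairs()), "of", 128 * 128, "pairs lose their second byte")
```

Passing errors="replace" checks the same pairs for the 'replace' handler
discussed in the comments.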
In the above cases,