is this a unicode/string bug?

olsongt Fri, 09 Dec 2005 13:35:52 -0800

I was going to submit to sourceforge, but my unicode skills are weak.
I was trying to strip characters from a string that contained values
outside of ASCII.  I thought I could just encode as 'ascii' in 'replace'
mode but it threw an error.  Strangely enough, if I decode via the
ascii codec and then encode via the ascii codec, I get what I want.
That being said, this may be operating correctly.


>>> print 'aaa\xae'
aaa®
>>> 'aaa\xae'.encode('ascii','replace') #should return 'aaa?'
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 3:
ordinal not in range(128)
>>> 'aaa\xae'.decode('ascii','replace') #but this doesn't throw an error?
u'aaa\ufffd'
>>> 'aaa\xae'.decode('ascii','replace').encode('ascii','replace') #this does what I wanted
'aaa?'
>>>
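
A likely explanation for the session above: in Python 2, calling .encode() on a byte string (str) first decodes it implicitly with the default 'ascii' codec in strict mode, and the 'replace' argument only applies to the encode step that follows. That would be why the traceback reports a UnicodeDecodeError. Below is a minimal sketch of the explicit decode-then-encode round trip, written for Python 3, where bytes has no .encode() method at all. The helper name strip_non_ascii and its errors parameter are illustrative and not part of any library.

```python
# Python 3 sketch; strip_non_ascii is a hypothetical helper, not a stdlib function.
raw = b'aaa\xae'  # byte string containing one non-ASCII byte (0xAE)


def strip_non_ascii(data, errors='replace'):
    """Decode bytes as ASCII, then re-encode, applying `errors` at both steps."""
    # With 'replace', an undecodable byte becomes U+FFFD; with 'ignore' it is dropped.
    text = data.decode('ascii', errors)
    # With 'replace', U+FFFD (not encodable as ASCII) becomes b'?'.
    return text.encode('ascii', errors)


cleaned = strip_non_ascii(raw)            # substitutes '?' for the bad byte
dropped = strip_non_ascii(raw, 'ignore')  # drops the bad byte instead
print(cleaned, dropped)
```

Passing 'ignore' instead of 'replace' removes the offending bytes rather than substituting '?', which may be closer to "stripping" them.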

-- 
http://mail.python.org/mailman/listinfo/python-list
