[EMAIL PROTECTED] wrote:
> So I've got a problem.
>
> I've got a database of information that is encoded in Windows/CP1252.
> What I want to do is dump this to a UTF-8 encoded text file (an RSS
> feed).
>
> While the overall problem seems to be related to the conversion, the
> only error I'm getting is a
>
> "UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position
> 163: ordinal not in range(128)"
>
> So somewhere I'm missing an implicit conversion to ASCII which is
> completely aggravating my brain.
>
> So, what fundamental issue am I completely overlooking?

That nowhere in your *code* do you mention "I've got a database of
information that is encoded in Windows/CP1252". This is not recorded
anywhere in your database. Python is fantastic, but we don't expect a
readauthorsmind() function until Python 4000 :-)

>
> Code follows.
>
[snip]
>
> sql_query = "select story.subject as subject, story.content as
> content, story.summary as summary, story.sid as sid, posts.bid as
> board, posts.date_to_publish as date from story$

The above line has been mangled ... fortunately it doesn't affect the
diagnostic outcome.

[snip]
>
>
> output.write(u'<description>' + unicode(descript) +
> u'</description>\n') # this is the line that causes the error.

What is happening is that unicode(descript) has not been told what
encoding to use to decode your "Windows/CP1252" text, so it falls back
to the default encoding, "ascii", which cannot decode the byte 0x92.
You need to write unicode(descript, 'cp1252').

Cheers,
John
-- 
http://mail.python.org/mailman/listinfo/python-list
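
For illustration, here is a minimal sketch of the decode-then-encode round trip described above. It uses descript.decode('cp1252'), which should behave the same as unicode(descript, 'cp1252') on Python 2 and also runs on Python 3. The sample bytes and the names ascii_failed, text, line, buffer and utf8_bytes are made up for the example; they are not from the original script, and the in-memory buffer merely stands in for the real output file.

```python
import io

# Hypothetical sample: raw bytes as they might come back from the database.
# 0x92 is the CP1252 right single quotation mark (U+2019).
descript = b"It\x92s a story summary"

# Decoding as ASCII fails on 0x92, which is what the implicit
# conversion in the original code runs into.
try:
    descript.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Name the source encoding explicitly to get proper Unicode text.
text = descript.decode("cp1252")

# Build the element as Unicode, then encode once, on output, as UTF-8.
line = u"<description>" + text + u"</description>\n"

buffer = io.BytesIO()  # stand-in for the real output file (an assumption)
buffer.write(line.encode("utf-8"))
utf8_bytes = buffer.getvalue()
```

The point of the sketch is the ordering: decode with the known source encoding as soon as the data arrives, work with Unicode in between, and encode to UTF-8 only when writing.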