Thank you! I solved it with this:

    unicode( data.decode('latin_1') )

and when I write it to the file:

    f = codecs.open(path, encoding='utf-8', mode='w+')
    f.write(self.__rssDoc.toxml())

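For anyone finding this thread later, the same approach can be sketched as a small self-contained script. This is only a sketch under assumptions: it targets Python 3 (where unicode() no longer exists and decode() already returns text), and the sample bytes, the 'title' element and the temporary file path are made up for illustration rather than taken from the original script.

```python
import codecs
import os
import tempfile
from xml.dom import minidom

# Hypothetical scraped bytes: "Å ä ö" as a Latin-1 encoded page would deliver them.
data = b'\xc5 \xe4 \xf6'

# Decode the raw bytes to text, using the encoding of the source page.
text = data.decode('latin_1')

# Build a tiny document (a placeholder element, not a complete RSS feed).
doc = minidom.Document()
title = doc.createElement('title')
title.appendChild(doc.createTextNode(text))
doc.appendChild(title)

# Write the document as text; codecs.open encodes it to UTF-8 on the way out.
path = os.path.join(tempfile.mkdtemp(), 'feed.xml')
f = codecs.open(path, encoding='utf-8', mode='w+')
f.write(doc.toxml())
f.close()

# Read the raw bytes back to confirm the file really holds UTF-8.
with open(path, 'rb') as fh:
    written = fh.read()
```

The point is to decode once, as soon as the bytes arrive, keep everything as text while building the document, and encode once when writing the file.
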
Diez B. Roggisch wrote:
> Niclas wrote:
>> Hi
>>
>> I'm having trouble working with the special characters in Swedish
>> (Å Ä Ö å ä ö). The script is parsing and extracting information
>> from a webpage. This works fine and I get all the data correctly.
>> The information is then added to an RSS file (using
>> xml.dom.minidom.Document() to create the file), and this is where
>> it goes wrong. Letters like Å ä ö get messed up and the RSS file
>> does not validate. How can I convert the data to UTF-8 without
>> losing the special letters?
>
> Show us code, and example text (albeit I know it is difficult to get
> that right using news/mail)
>
> The basic idea is this:
>
> scrapped_byte_string = scrap_the_website()
>
> output = scrapped_byte_string.decode('website-encoding').encode('utf-8')
>
> Diez
> --
> http://mail.python.org/mailman/listinfo/python-list