Siegfried Heintze wrote:

Does someone have a little python script that will read a file in UTF-8/UTF-16/UTF-32 (my choice) and search for all the characters between 0x7f-0xffffff and convert them to an ASCII digit string that begins with "&#" and ends with ";" and output the whole thing? If not, could someone tell me how to write one?

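    # Python 2: read the raw bytes, decode them, then re-encode as ASCII.
    f = open("filename.txt", "rb")
    text = f.read()
    f.close()
    text = unicode(text, "utf-8")  # use "utf-16" or "utf-32" to match the file
    # Characters outside ASCII become numeric references such as &#233;
    text = text.encode("ascii", "xmlcharrefreplace")
    print text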

Tweak as necessary.
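
On Python 3, where unicode() and the print statement are gone, the same decode-then-encode idea might look like the sketch below. The function name to_char_refs and the sample string are made up for illustration; only bytes.decode, str.encode, and the "xmlcharrefreplace" error handler come from the standard library.

```python
def to_char_refs(data, encoding="utf-8"):
    """Decode bytes and return ASCII text with non-ASCII characters as &#N;."""
    text = data.decode(encoding)
    # "xmlcharrefreplace" writes each character that ASCII cannot represent
    # (code points above 0x7F) as a decimal numeric character reference.
    ascii_bytes = text.encode("ascii", "xmlcharrefreplace")
    return ascii_bytes.decode("ascii")


# Hypothetical sample input: e-acute, euro sign, and an emoji outside the BMP.
sample = "caf\u00e9 \u20ac5 \U0001F600".encode("utf-8")
print(to_char_refs(sample))  # caf&#233; &#8364;5 &#128512;
```

To process a file, read it in binary mode (for example, with open("filename.txt", "rb") as f: data = f.read()) and pass the bytes along with the codec name that matches the file, such as "utf-16" or "utf-32".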

</F>

--
http://mail.python.org/mailman/listinfo/python-list
