Hi,

How does internationalization work in Python 2.6?
I have an input string to the script, and I do not know its encoding. I want to write that string to an XML file. I am trying different encodings in turn to decode the input string into unicode, and then I use the encoding that succeeded when creating the XML string and writing it to a file.

Is that approach fine? Or is there another way to support internationalization when the encoding of the input string is unknown?

I get the XML string without any issues, and print xml_string works fine. But when it is written to a file, the tag value is changed, even though I used the same encoding for writing that I used for decoding.

I have written sample code like the following:

    import os
    import codecs
    from xml.dom.minidom import Document

    def write_to_xml(output_string, encod_fmt):
        doc = Document()
        root = doc.createElement('root')
        doc.appendChild(root)
        tag_key = doc.createElement('output_string')
        tag_value = output_string
        tag_key.appendChild(doc.createTextNode((tag_value)))
        root.appendChild(tag_key)
        xml_string = doc.toprettyxml(indent=" ",encoding=encod_fmt)
        print xml_string
        fname = os.path.join('/root/output.xml')
        doc.writexml(codecs.open(fname,'wb',encod_fmt), encoding=encod_fmt)

    def convert_string(input_string):
        try:
            input_string_unicode = input_string.decode('utf-8')
            encoding = 'utf-8'
        except UnicodeDecodeError:
            try:
                input_string_unicode = input_string.decode('Latin-1')
                encoding = 'Latin-1'
            except UnicodeDecodeError:
                try:
                    input_string_unicode = input_string.decode('iso-8859-1')
                    encoding = 'iso-8859-1'
                except UnicodeDecodeError:
                    raise
        #output_string = input_string_unicode.encode(encoding)
        write_to_xml(input_string_unicode, encoding)

    if __name__ == '__main__':
        input_string = raw_input()
        convert_string(input_string)

Output
---------

    [root] python i18n_test.py
    Étest
    <?xml version="1.0" encoding="Latin-1"?>
    <root>
     <output_string>
      Étest
     </output_string>
    </root>

But the file content is as below.
    <?xml version="1.0" encoding="Latin-1"?><root><output_string><C9>test</output_string></root>

--
Regards
D.S. DIleep
_______________________________________________
BangPypers mailing list
BangPypers@python.org
https://mail.python.org/mailman/listinfo/bangpypers
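
P.S. Here is a minimal sketch of the decode-and-fall-back idea. It only illustrates the approach and is not taken from the script above. The names decode_with_fallback and CANDIDATE_ENCODINGS are made up for this example. It relies on two facts: 'Latin-1' and 'iso-8859-1' are aliases for the same codec, and that codec accepts every possible byte, so a chain that ends in Latin-1 never raises UnicodeDecodeError. It also shows that the character U+00C9 encodes to the single byte 0xC9 in Latin-1, which may be what a viewer expecting UTF-8 displays as <C9>.

```python
# Sketch only: decode_with_fallback and CANDIDATE_ENCODINGS are hypothetical
# names introduced for illustration.

# 'latin-1' and 'iso-8859-1' are the same codec, so listing both adds nothing.
CANDIDATE_ENCODINGS = ('utf-8', 'latin-1')


def decode_with_fallback(raw_bytes, encodings=CANDIDATE_ENCODINGS):
    """Return (unicode_text, encoding) for the first encoding that decodes."""
    for encoding in encodings:
        try:
            return raw_bytes.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Unreachable while 'latin-1' is in the list, because it accepts any byte.
    raise ValueError('none of the candidate encodings matched')


# b'\xc9test' is U+00C9 followed by "test" in Latin-1; it is not valid UTF-8,
# so the UTF-8 attempt fails and the Latin-1 attempt succeeds.
text, encoding = decode_with_fallback(b'\xc9test')

# Encoding back with the detected codec reproduces the original bytes.
round_trip = text.encode(encoding)
```

If the round trip returns the original bytes, the file is probably correct for the declared encoding and only the display differs. That is an assumption worth checking with a hex dump of the file.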