Hello,

I have a Python app that parses XML files and then writes to text files.

However, the output text file "sometimes" appears to be encoded in some Asian language.

Here is my code:

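import nltk

encoding = "iso-8859-1"
# strip HTML markup from the sentence text
clean_sent = nltk.clean_html(sent.text)
# encode to bytes, silently dropping characters the target encoding cannot represent
clean_sent = clean_sent.encode(encoding, "ignore")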

I also tried "UTF-8" as the encoding, but got the same result.
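
The step that writes the file is not shown above. As a minimal sketch of one way it could look (an illustration, not my actual code): the names write_sentences and read_back and the file name in the usage note are hypothetical, and it assumes the text stays unicode and io.open does the encoding, instead of calling .encode() on each string.

```python
import io


def write_sentences(path, sentences, encoding="utf-8"):
    """Write unicode sentences to path, one per line, in the given encoding."""
    # io.open encodes on write, so the strings stay unicode until this point.
    # errors="ignore" drops characters the target encoding cannot represent.
    with io.open(path, "w", encoding=encoding, errors="ignore") as out:
        for sentence in sentences:
            out.write(sentence + u"\n")


def read_back(path, encoding="utf-8"):
    """Read the file using the same encoding it was written with."""
    with io.open(path, "r", encoding=encoding) as src:
        return src.read().splitlines()
```

For example, write_sentences("out.txt", [clean_sent], encoding="iso-8859-1") followed by read_back("out.txt", encoding="iso-8859-1") should give back the same text. Opening the file with a different encoding than the one it was written in would show the wrong characters.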

 

Any suggestions?

Thanks.

-- 
http://mail.python.org/mailman/listinfo/python-list
