Re: minidom and unicode errors

Abhimanyu Seth Mon, 06 Mar 2006 22:51:28 -0800

On 3/7/06, Fredrik Lundh <[EMAIL PROTECTED]> wrote:

Abhimanyu Seth wrote:

> > I have the following line in my xml file:
> > <target>Exception beim Löschen des Audit-Moduls aufgetreten. Exception Stack lautet: %1.</target>
> > ExpatError: not well-formed (invalid token): line 8, column 27

> I've specified utf-8 in the xml header
> <?xml version="1.0" encoding="utf-8"?>

are you sure you're using utf-8 in the XML file? the ö you pasted into
your mail is an iso-8859-1 code, not an utf-8 code.

> Anyway,
> >> f = codecs.open ("c:/test.txt", "r", "latin-1")
> >> dom = minidom.parseString (codecs.encode (f.read(), "utf-8"))
> works.

which means that you've labelled the file as utf-8, but that it actually
contains iso-8859-1. fixing the file should fix this.
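
A minimal sketch of that mismatch, assuming a current Python 3 and a shortened version of the <target> line. The helper name parses_ok is hypothetical and only there for illustration. The same text is encoded twice: once as iso-8859-1 (where ö is the single byte 0xF6) and once as utf-8 (where ö is the two bytes 0xC3 0xB6). Both carry the utf-8 declaration, so only the second should get past expat.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Shortened stand-in for the line from the original file (assumption).
text = '<?xml version="1.0" encoding="utf-8"?>\n<target>Löschen</target>'

latin1_bytes = text.encode("iso-8859-1")  # ö stored as the single byte 0xF6
utf8_bytes = text.encode("utf-8")         # ö stored as the two bytes 0xC3 0xB6


def parses_ok(data):
    """Hypothetical helper: True if minidom accepts the bytes, False on ExpatError."""
    try:
        minidom.parseString(data)
        return True
    except ExpatError:
        return False


mislabelled_ok = parses_ok(latin1_bytes)  # declared utf-8, actually iso-8859-1
correct_ok = parses_ok(utf8_bytes)        # declared utf-8, actually utf-8
```

Checking the raw bytes of the saved file for 0xF6 versus 0xC3 0xB6 is a quick way to tell which encoding the editor really used.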

</F>

--
http://mail.python.org/mailman/listinfo/python-list

Sorry, my mistake. The file was not saved as utf-8. Saving it as utf-8 solves my problems.
>>> import codecs
>>> from xml.dom import minidom
>>> f = codecs.open("c:/test.txt", "r", "utf-8")
>>> dom = minidom.parseString(codecs.encode(f.read(), "utf-8"))

However, I still need to encode the string returned by f.read() before passing it to parseString; otherwise I get an exception.
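
One way to sidestep the decode-then-encode round trip might be to give the parser the raw bytes and let it apply the encoding named in the XML declaration. The traceback is not shown here, so treating the decoded string as the cause of the exception is an assumption. This is only a sketch, written for a current Python 3, and it uses a temporary file and a shortened <target> line in place of c:/test.txt. minidom.parse accepts a filename or an open file, and minidom.parseString accepts the bytes read in binary mode.

```python
import os
import tempfile
from xml.dom import minidom

# Shortened stand-in for the real file contents (assumption).
xml_text = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<target>Exception beim Löschen</target>"
)

# Save the sample as genuine UTF-8, as the fixed file would be.
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write(xml_text)

try:
    # Option 1: let minidom open and decode the file itself.
    dom_from_path = minidom.parse(path)

    # Option 2: read raw bytes and pass them on without decoding first.
    with open(path, "rb") as f:
        dom_from_bytes = minidom.parseString(f.read())
finally:
    os.remove(path)

# The text node comes back as a decoded string either way.
target_text = dom_from_path.documentElement.firstChild.data
```

Either option leaves the decoding to the parser, so the file's declared encoding and its actual bytes are the only two things that have to agree.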

Thanks anyway for all the help.

--
Regards,
Abhimanyu
