Re: ignoring chinese characters parsing xml file

Fabian López Mon, 22 Oct 2007 14:31:08 -0700

Thanks Marc, the code is like this. The "name" attribute is the problem:

from lxml import etree


context = etree.iterparse("file.xml")
for action, elem in context:
    if elem.tag == "weblog":
        print action, elem.tag, elem.attrib["name"], elem.attrib["url"], \
            elem.attrib["rssUrl"]

And the xml file like:
<weblog name="xxxxxx" url="http://weblogli.com " when="4" />
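
A note on where the error most likely arises: the parse itself (decoding) appears to succeed, and the failure probably comes from print encoding the attribute for a console that uses a 'charmap' codec. The sketch below illustrates that split. It rests on several assumptions not in the original post: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml so that it runs without extra packages, it assumes the console codec is cp1252, the sample XML and URL are made up, and safe_text and weblog_names are hypothetical helper names.

```python
# -*- coding: utf-8 -*-
import io
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Hypothetical sample data modelled on the <weblog> element above.
XML = u'<weblogs><weblog name="ヘアアイロン" url="http://example.com" /></weblogs>'.encode("utf-8")


def safe_text(value, encoding="cp1252"):
    """Return value with characters the target encoding lacks replaced by '?'."""
    return value.encode(encoding, "replace").decode(encoding)


def weblog_names(source):
    """Yield the name attribute of every <weblog> element in source."""
    for action, elem in ET.iterparse(source):
        if elem.tag == "weblog":
            # .get() returns None instead of raising KeyError for a missing attribute
            yield elem.get("name")


names = list(weblog_names(io.BytesIO(XML)))

# Parsing (decoding) worked: the attribute is a Unicode string.
# Encoding it for a 'charmap' console codec is the step that fails.
try:
    names[0].encode("cp1252")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Printing the replaced text avoids the exception, at the cost of losing the characters.
print(safe_text(names[0]))
```

Replacing unencodable characters is only one option. If the output goes to a file rather than a console, writing it as UTF-8 would keep the original characters intact.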


22 Oct 2007 20:20:16 GMT, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]>:
>
> On Mon, 22 Oct 2007 21:24:40 +0200, Fabian López wrote:
>
> > I am parsing an XML file that includes Chinese characters, like ^
> > uu啖啖才是w.扉L锍才是 or ヘアアイロン... The problem is that I get an error like:
> > UnicodeEncodeError: 'charmap' codec can't encode characters in
> > position..
>
> You say you are *parsing* the file but this is an *encode* error.  Parsing
> means *decoding*.
>
> You have to show some code and the actual traceback to get help.  Crystal
> balls are not that reliable.  ;-)
>
> Ciao,
>         Marc 'BlackJack' Rintsch
> --
> http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list
