I'm new to xml mongering so forgive me if there's an obvious well-known answer to this. It's not real obvious from the library documentation I've looked at so far. Basically I have to munch a bunch of xml files which contain character entities like &uacute; which are apparently nonstandard. They appear in the w3.org entity tables, but both xml.etree.cElementTree.ElementTree.parse and xmllint barf at them.
Basically I want to know if there's a way to supply the regular parser (preferably xml.etree, but I guess I can switch to another one if necessary) with some kind of entity table, and/or if the info is supposed to be found in the DTD or someplace like that. Right now I'm ignoring the DTD and simply figuring out the doc structure by eyeballing the xml files, which is maybe not a perfectly approved method but seems to be what most people do. Thanks
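
For what it's worth, here is a rough sketch of one possible workaround, not necessarily the approved one. XML itself predefines only five entities (lt, gt, amp, quot, apos); anything else is expected to be declared in the DTD, which is why a parser that never sees those declarations rejects a reference like &uacute;. The sketch sidesteps the DTD by rewriting HTML-style named entities as numeric character references before parsing. It assumes Python 3, where the entity table lives in html.entities (Python 2 called the module htmlentitydefs), and the helper names replace_named_entities and parse_with_html_entities are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser already understands; leave them alone.
XML_PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}

# Matches a named entity reference such as &uacute;
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_named_entities(text):
    """Rewrite known HTML named entities as numeric character references.

    Hypothetical helper: unknown names and the XML predefined entities
    are returned unchanged, so the parser still reports genuine errors.
    """
    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return _ENTITY_RE.sub(substitute, text)


def parse_with_html_entities(xml_text):
    """Parse an XML string after expanding HTML-style named entities."""
    return ET.fromstring(replace_named_entities(xml_text))


if __name__ == "__main__":
    root = parse_with_html_entities("<doc><name>Jos&uacute;</name></doc>")
    print(repr(root.find("name").text))
```

One caveat: this is a plain text substitution, so it also rewrites entity-looking strings inside CDATA sections and comments. If the files carry a usable DTD, letting a DTD-aware parser load the entity declarations from it is probably the cleaner route.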