Petr Muller wrote:
Hi,

Thanks for response and sorry for I wasn't clear first time. I have a
heap of data (logs), from which I build a XML document using
xml.dom.minidom. In this data, some xml invalid characters may occur -
form feed (\x0c) character is one example.

Is this a hypothetical problem or do you actually have such chars?
If so, are they random hiccups from sick loggers, storage or transmission errors, or informative markers intentionally inserted? When you find them, do you want to silently ignore, ignore but raise a flag (metalog the log error), or act on them as part of the parsing/structuring process?

Silently deleting chars is easy with str.translate.

I don't know what else is illegal in xml, so I've searched if there's
some method how to prepare strings for insertion to a xml doc before I
start research on a xml spec and write such function on my own.

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to