On Apr 11, 10:33 am, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> Hi again,
>
> Stefan Behnel wrote:
> > Silfheed wrote:
> >> So first off I know that CDATA is generally hated and just shouldn't
> >> be done, but I'm simply required to parse it and spit it back out.
> >> Parsing is pretty easy with lxml, but it's the spitting back out
> >> that's giving me issues. The fact that lxml strips all the CDATA
> >> stuff off isn't really a big issue either, so long as I can create
> >> CDATA blocks later with <>&'s showing up instead of &lt;&gt;&amp;.
> >> I've scoured through the lxml docs, but probably not hard enough, so
> >> anyone know the page I'm looking for or have a quick how to?
>
> > There's nothing in the docs because lxml doesn't allow you to create CDATA
> > sections. You're not the first one asking that, but so far, no one really
> > had a take on this.
>
> So I gave it a try, then. In lxml 2.1, you will be able to do this:
>
>     >>> root = Element("root")
>     >>> root.text = CDATA('test')
>     >>> tostring(root)
>     '<root><![CDATA[test]]></root>'
>
> This does not work for .tail content, only for .text content (no technical
> reason, I just don't see why that should be enabled).
>
> There's also a parser option "strip_cdata" now that allows you to leave CDATA
> sections in the tree. However, they will *not* behave any differently from
> normal text, so you can't even see at the API level that you are dealing with
> CDATA. If you want to be really, really sure, you can always do this:
>
>     >>> root.text = CDATA(root.text)
>
> Hope that helps,
>
> Stefan

That is immensely cool. Do you plan to stick it into svn soon?

Thanks!
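
For anyone finding this thread later, here is a minimal sketch of how the two features described above fit together. It assumes lxml 2.1 or later is installed; the sample strings are made up for illustration and are not from the original posts.

```python
from lxml import etree

# Wrap the text in CDATA so markup characters are written out unescaped.
root = etree.Element("root")
root.text = etree.CDATA("a < b && c > d")
serialized = etree.tostring(root, encoding="unicode")
print(serialized)

# Keep CDATA sections in the tree when parsing by turning off strip_cdata.
parser = etree.XMLParser(strip_cdata=False)
parsed = etree.fromstring("<root><![CDATA[x < y]]></root>", parser)

# At the API level the content is just ordinary text.
print(parsed.text)

# On output, the preserved section is written back as CDATA.
roundtrip = etree.tostring(parsed, encoding="unicode")
print(roundtrip)
```

As noted in the quoted message, this only applies to .text, not .tail, so text that follows a child element would still be escaped the usual way.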