On Sep 17, 4:01 pm, Paul Boddie <[EMAIL PROTECTED]> wrote:
> On 17 Sep, 22:31, [EMAIL PROTECTED] wrote:
>
> > What's the best way to get at the XML? Do I need to somehow parse it
> > using the HTMLParser and then parse that with minidom or what?
>
> Probably easiest is to use an XML processing toolkit or library which
> supports HTML parsing. Since the libxml2 library (written in C) makes
> a fairly good job of HTML parsing, I would suggest either libxml2dom
> (for a DOM-like API) or lxml (for an ElementTree-like API) as suitable
> Python wrappers of libxml2. Of course, HTMLParser or SGMLParser should
> work, but the programming style is a bit more convoluted unless you're
> used to XML processing using a SAX-like API.
>
> Paul
>
> P.S. I'm biased towards libxml2dom, being the developer, but I use it
> routinely and it generally does the job for me.
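
For reference, the SAX-like style mentioned above looks roughly like
this with the standard library alone. This is only a minimal sketch,
assuming Python 3, where the module is html.parser (in 2.x it was
HTMLParser). LinkCollector, extract_links and SAMPLE are names made up
for illustration; they are not part of any library.

```python
from html.parser import HTMLParser  # Python 3 location of the old HTMLParser module


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags as the parser fires events."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Feed the whole document to the parser and return the hrefs found."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample input: not well-formed XML (unclosed <p>, uppercase tag).
SAMPLE = (
    '<html><body><p>See <a href="http://example.org/a">A</a>'
    '<p><A HREF="b.html">B</A></body></html>'
)

if __name__ == "__main__":
    for link in extract_links(SAMPLE):
        print(link)
```

The state has to be carried by hand in the handler methods, which is
the "convoluted" part; a DOM-like or ElementTree-like wrapper would
hand back a tree to query instead.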

I have lxml installed, and I also appear to have libxml2dom installed. I
know lxml has decent docs, but I don't see much for yours. Is this the
only place to go: http://www.boddie.org.uk/python/libxml2dom.html ?

Mike