Andy Lester wrote:
>
> On Jul 8, 2006, at 10:31 PM, Michael G Schwern wrote:
>
>> If your XPath parser balks at non-XHTML HTML then just run it through
>> HTML::Tidy->clean which will convert it to XHTML.
>
> Usually.

If "usually" isn't good enough, you can always write your own HTML converter with HTML::TreeBuilder. I do this in my blog software:

http://trac.jrock.us/trac/blog_software/browser/lib/Blog/Format/HTML.pm

This has the added advantage of allowing you to remove "nasty" HTML, if that's relevant in your application.

Regards,
Jonathan Rockway
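
For anyone who wants a starting point, here is a minimal sketch of the TreeBuilder approach described above. It is not the code from the linked Blog::Format::HTML module; the helper name html_to_xhtml and the default list of "nasty" tags are assumptions made up for illustration. It parses whatever HTML it is given, deletes the unwanted elements, and serializes the tree with as_XML, which should give an XPath parser well-formed input (though not necessarily valid XHTML with a doctype).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper, not taken from the linked module:
# parse possibly sloppy HTML, drop unwanted elements, re-serialize as XML.
sub html_to_xhtml {
    my ($html, @unwanted) = @_;

    # Assumed default list of "nasty" tags; adjust for your application.
    @unwanted = qw(script iframe object embed) unless @unwanted;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse_content($html);    # parses the string and signals end of input

    # Remove each unwanted element together with its children.
    for my $tag (@unwanted) {
        $_->delete for $tree->look_down(_tag => $tag);
    }

    my $xml = $tree->as_XML;        # serialization with every element closed
    $tree->delete;                  # free the tree explicitly
    return $xml;
}

print html_to_xhtml('<p>Hello<br><script>alert(1)</script><b>world'), "\n";
```

A blacklist like this is the simplest thing that works. If the input is untrusted, a whitelist of allowed tags and attributes is probably the safer design.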