Simon Courtenage wrote:
I also use htmlparser, which is rather good. I've had to customize it,
though, to parse strings containing
html source rather than accept urls of resources to fetch etc. Also it
crashes on meta tags that don't have
name attributes (something I discovered only a couple of days ago).
Actually, it already accepts strings without modifying the library:
String htmlSource = "<html>...</html>";
Parser parser = new Parser(new Lexer(htmlSource));
I will have to watch out for those meta tags though. Time to go test it.
Daniel
--
Daniel Noll
Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia Ph: +61 2 9280 0699
Web: http://www.nuix.com.au/ Fax: +61 2 9212 6902
This message is intended only for the named recipient. If you are not
the intended recipient you are notified that disclosing, copying,
distributing or taking any action in reliance on the contents of this
message or attachment is strictly prohibited.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]