Hi all,

I've seen several modules, such as HTML::... and others, that understand the
structure of an HTML document.

If I want to write a web spider that parses many web pages, how can I parse
them if they are in different formats?
Some of them might use older HTML, others XHTML, and so on.

Do these modules (HTML::...) understand the structure of all those
formats?

If not, is there a module that does?
If not, should I use several modules and combine methods from all of them?

I am a little bit confused.

Thank you for any hints.

Teddy,
Teddy's Center: http://teddy.fcc.ro/
Email: [EMAIL PROTECTED]


