The related modules WWW::Mechanize::Firefox and WWW::Mechanize::PhantomJS are worth a look too, depending on how JavaScript-heavy the pages you're parsing are and what your setup looks like (fully headless, running on your own machine, etc.).
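For example, a minimal sketch of the PhantomJS route, assuming PhantomJS is installed and on your PATH; the URL here is just a placeholder:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Fetch a JavaScript-heavy page via a headless PhantomJS browser.
    use WWW::Mechanize::PhantomJS;

    my $mech = WWW::Mechanize::PhantomJS->new();
    $mech->get('http://example.com/page-with-js-tables.html');   # placeholder URL

    # ->content returns the DOM as rendered *after* JavaScript has run,
    # so it can be handed off to any HTML parser from here.
    my $html = $mech->content;
    print $html;

The point is that you get the post-JavaScript DOM back, which you can then feed into the same table-extraction code you'd use for static pages.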
On Fri, Mar 3, 2017 at 1:39 AM, Lars Noodén <lars.noo...@gmail.com> wrote:
> On 03/03/2017 02:15 AM, kavita kulkarni wrote:
> > Hello,
> >
> > Can you suggest some effective ways to parse multiple web pages from
> > the web site? I cannot use web crawling as the format of the pages is
> > not the same. I am interested in the data from a specific table on
> > each page.
> >
> > Thanks in advance.
> > Kavita
>
> Once you have acquired the page using WWW::Mechanize, LWP, or even
> just wget, you can extract the table.
>
> The modules HTML::TreeBuilder and HTML::TreeBuilder::XPath handle the
> extraction rather easily if there is some consistent way to identify
> the table.
>
> Regards,
> Lars
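To illustrate Lars's suggestion, here's a minimal sketch of the LWP + HTML::TreeBuilder::XPath approach; the URL and the table's id attribute are placeholders for whatever consistently identifies the target table on your pages:

    #!/usr/bin/perl
    use strict;
    use warnings;

    use LWP::Simple qw(get);
    use HTML::TreeBuilder::XPath;

    my $url  = 'http://example.com/report.html';       # placeholder URL
    my $html = get($url) or die "Could not fetch $url";

    my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

    # Pull every row out of the table matched by the XPath expression,
    # then print its cells joined by tabs.
    for my $row ( $tree->findnodes('//table[@id="data"]//tr') ) {
        my @cells = map { $_->as_trimmed_text } $row->findnodes('./td | ./th');
        print join( "\t", @cells ), "\n";
    }

    $tree->delete;    # HTML::TreeBuilder trees must be deleted explicitly to free memory

If the tables can't be matched by id or class, you can fall back on positional XPath (e.g. the nth table on the page) or match on header text, but anything consistent across the pages will do.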