On Aug 1, 10:51 am, rob.di...@gmx.com (Rob Dixon) wrote:
> On 01/08/2011 11:03, VinoRex.E wrote:
>
> > Hi everyone, I am a beginner in Perl. Can you give me pseudocode and
> > sample code for a spider program? It will be helpful in understanding web
> > interfaces. Thank you.
>
> If you can't write your own pseudocode for a web spider then check
> Bharathiar University for a more appropriate course. One version goes
>
>    function fetchall(URL)
>      content = get(URL)
>      loop for it over findlinks(content)
>        content = content + fetchall(it)
>      return content
>    end
>
> Since the purpose of your efforts is to learn Perl, I think a module
> like WWW::Mechanize is the wrong choice. To write a program that
> accesses the internet, you should install and study the LWP library.

LWP::RobotUA can be used in conjunction with the other modules
in the LWP library suite, too. It provides methods to ensure
appropriate spidering behavior, i.e., not hitting sites too fast and
heeding a site's 'robots.txt' guidelines. This is very important for
any spidering program you write.
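
As a rough illustration only, here is a minimal sketch of how the quoted
pseudocode might look in Perl with LWP::RobotUA doing the fetching. The
names crawl, find_links and robot_fetcher are made up for this example,
and the 10-second delay is an arbitrary choice. Unlike the pseudocode, it
keeps a set of already-seen URLs and a page limit so that pages linking
to each other do not cause endless fetching. The regex link extraction is
a deliberate simplification that only picks up absolute http/https links;
a real parser such as HTML::LinkExtor would be more robust.

```perl
use strict;
use warnings;

# Simplified link finder: returns absolute http/https hrefs from <a> tags.
# Relative links are ignored in this sketch.
sub find_links {
    my ($html) = @_;
    my @links = $html =~ /<a\s[^>]*?href\s*=\s*["'](https?:[^"'#\s]+)/gi;
    return @links;
}

# Breadth-first crawl. $fetch is a code ref that takes a URL and returns
# the page's HTML, or undef if the page could not be fetched.
sub crawl {
    my ($start, $fetch, $max_pages) = @_;
    my %seen  = ($start => 1);    # URLs already queued
    my @queue = ($start);
    my @visited;

    while (@queue and @visited < $max_pages) {
        my $url  = shift @queue;
        my $html = $fetch->($url);
        next unless defined $html;
        push @visited, $url;
        for my $link (find_links($html)) {
            push @queue, $link unless $seen{$link}++;
        }
    }
    return @visited;
}

# Builds a fetcher backed by LWP::RobotUA, which consults robots.txt and
# waits between requests to the same host. The module is loaded at run
# time so the crawl logic above can be tried without network access.
sub robot_fetcher {
    my ($agent, $from) = @_;
    require LWP::RobotUA;
    my $ua = LWP::RobotUA->new(agent => $agent, from => $from);
    $ua->delay(10 / 60);          # delay is given in minutes: 10 seconds
    return sub {
        my ($url) = @_;
        my $res = $ua->get($url);
        return $res->is_success ? $res->decoded_content : undef;
    };
}
```

To try it against a real site, something like
crawl('http://www.example.com/', robot_fetcher('my-spider/0.1',
'you@example.com'), 20) would visit at most 20 pages. The URL, agent name
and address there are placeholders, not recommendations.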

--
Charles DeRykus



