bryan rasmussen wrote:
> Hi,
>
> Sorry, I was imprecise: I meant one that does not save the downloaded
> pages locally. There probably isn't one, though, so I should build one
> myself. I probably just need a good crawler that can be set to dump all
> links into a dataset that I can analyse with R.
>
> Cheers,
> Bryan Rasmussen
>
There are quite a few already: webchecker, Orchid, mechanize, mygale:

http://codesnipers.com/?q=node/223&&title=Detecting-Dead-Links

http://pxr.openlook.org/pxr/source/Tools/webchecker/

http://sig.levillage.org/?p=599
http://www.robertblum.com/articles/2005/11/21/challenge-map-i-python-web-scraping
http://www.rexx.com/~dkuhlman/quixote_htmlscraping.html
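
If you do end up building one yourself, a minimal single-page sketch
needs little beyond the standard library (Python 3 here, and the start
URL and output filename are placeholders): fetch a page, collect every
href, and dump source/link pairs to a CSV that R's read.csv() can load.
Turning it into a real crawler is then mostly a matter of queueing the
collected links and remembering which pages you've already visited.

import csv
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on the page."""
    def __init__(self, base_url):
        HTMLParser.__init__(self)
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))

start = "http://example.com/"  # placeholder starting page
html = urllib.request.urlopen(start).read().decode("utf-8", "replace")
collector = LinkCollector(start)
collector.feed(html)

# One (source, link) pair per row; load in R with read.csv("links.csv").
with open("links.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["source", "link"])
    for link in collector.links:
        writer.writerow([start, link])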
