On Sat, Dec 13, 2008 at 10:08:43AM -0800, Erik Wickstrom wrote:
> Hi all,
> 
> I have an application that is doing some web spidering.  Right now I'm
> using urllib to retrieve the URLs, but it is painfully slow.  I was
> wondering if it's feasible to swap out urllib with a twisted client
> that uses Deferreds so I can process URLs in a more "parallel" fashion?
> 
> I've done a bunch of Googling, but I haven't come across anything
> that I can use as a drop-in replacement.  If you can point me in the
> right direction I'd really appreciate it!

Well, twisted.web.client is your path to DoS fame. Actually, if you
intend to use it for spidering third-party websites, I'd recommend a
small dispatcher object that limits the number of concurrent
connections per target server.

Andreas

> 
> Thanks for your help!
> Erik
> 
> _______________________________________________
> Twisted-Python mailing list
> [email protected]
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
