On Sat, Dec 13, 2008 at 10:08:43AM -0800, Erik Wickstrom wrote:
> Hi all,
>
> I have an application that is doing some web spidering. Right now I'm
> using urllib to retrieve the URLs, but it is painfully slow. I was
> wondering if it's feasible to swap out urllib with a Twisted client
> that uses Deferreds so I can process URLs in a more "parallel" fashion?
>
> I've done a bunch of Googling, but I haven't come across anything
> that I can use as a drop-in replacement. If you can point me in the
> right direction I'd really appreciate it!

Well, twisted.web.client is your path to DoS fame. Actually, if you are
intending to use it for spidering third-party websites, I'd recommend a
small dispatcher object that limits the number of concurrent connections
per target server.

Andreas

> Thanks for your help!
> Erik
>
> _______________________________________________
> Twisted-Python mailing list
> [email protected]
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
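
For what it's worth, here is a rough sketch of the kind of dispatcher
meant above: one semaphore per target host, so no single server ever sees
more than a fixed number of requests at once. To keep it runnable without
extra dependencies it uses asyncio from the standard library rather than
Twisted; in Twisted the same role could be played by
twisted.internet.defer.DeferredSemaphore. The names PerHostDispatcher,
crawl, fetch and max_per_host are made up for illustration, and fetch is
assumed to be whatever coroutine actually downloads a URL.

```python
import asyncio
from urllib.parse import urlsplit


class PerHostDispatcher:
    """Cap the number of in-flight requests per target host (illustrative)."""

    def __init__(self, fetch, max_per_host=2):
        # fetch is a hypothetical coroutine function: fetch(url) -> result
        self._fetch = fetch
        self._max_per_host = max_per_host
        self._semaphores = {}  # host -> asyncio.Semaphore

    def _semaphore_for(self, url):
        # Group URLs by network location so each server gets its own limit.
        host = urlsplit(url).netloc.lower()
        if host not in self._semaphores:
            self._semaphores[host] = asyncio.Semaphore(self._max_per_host)
        return self._semaphores[host]

    async def get(self, url):
        # Wait for a free slot on this host, then run the real fetch.
        async with self._semaphore_for(url):
            return await self._fetch(url)


async def crawl(urls, fetch, max_per_host=2):
    """Fetch all URLs concurrently, throttled per host; results keep input order."""
    dispatcher = PerHostDispatcher(fetch, max_per_host)
    return await asyncio.gather(*(dispatcher.get(url) for url in urls))
```

Calling asyncio.run(crawl(urls, fetch)) still runs requests to different
hosts in parallel; only requests to the same host queue up behind each
other. A polite spider would probably also want a delay between requests
and robots.txt handling, which this sketch leaves out.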
