Re: Web Crawling/Threading and Things That Go Bump in the Night

2006-08-04 Thread [EMAIL PROTECTED]
Rem, what OS are you trying this on? Windows XP SP2 has a limit of around 40 TCP connections per second...

Remarkable wrote:
> Hello all
>
> I am trying to write a reliable web-crawler. I tried to write my own
> using recursion and found I quickly hit the "too many sockets" open
> problem. So I lo

Web Crawling/Threading and Things That Go Bump in the Night

2006-08-04 Thread Remarkable
Hello all

I am trying to write a reliable web-crawler. I tried to write my own using recursion and found I quickly hit the "too many sockets" open problem. So I looked for a threaded version that I could easily extend. The simplest/most reliable I found was called Spider.py (see attached). At
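
For readers following the thread, here is a minimal sketch of one common way to avoid the "too many sockets" problem described above: replace recursion with a work queue and a fixed number of worker threads, and close each connection as soon as the page is read. This is not Spider.py (the attachment is not reproduced here); the names crawl, fetch, LinkParser, max_workers and max_pages are hypothetical, and the code assumes Python 3 with only the standard library.

```python
import queue
import threading
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url, timeout=10):
    # The with-block closes the socket as soon as the body has been read,
    # so open sockets never exceed the number of worker threads.
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_workers=4, max_pages=50):
    """Crawl from start_url with a fixed pool of threads; return the URLs seen."""
    seen = {start_url}
    lock = threading.Lock()
    todo = queue.Queue()
    todo.put(start_url)

    def worker():
        while True:
            url = todo.get()
            try:
                if url is None:  # sentinel: no more work
                    return
                parser = LinkParser()
                parser.feed(fetch_page(url))
                for href in parser.links:
                    link = urljoin(url, href)
                    if not link.startswith(("http://", "https://")):
                        continue
                    with lock:
                        if link in seen or len(seen) >= max_pages:
                            continue
                        seen.add(link)
                    todo.put(link)
            except Exception:
                pass  # a real crawler would log the failure here
            finally:
                todo.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(max_workers)]
    for t in threads:
        t.start()
    todo.join()            # wait until every queued URL has been processed
    for _ in threads:
        todo.put(None)     # one sentinel per worker
    for t in threads:
        t.join()
    return seen
```

Calling crawl("http://example.com/", max_workers=4) would hold at most four connections open at a time, whatever the depth of the site. The max_pages cap is only there to keep the sketch from running unbounded.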