Re: Finding sentinel text when using a thread pool...

2017-05-20 Thread Christopher Reimer via Python-list
On 5/20/2017 1:19 AM, dieter wrote:
> If your (590) pages are linked together (such that you must fetch a page to get the following one) and page fetching is the limiting factor, then this would limit the parallelizability.
The pages are not linked together. The URL requires a page number. If I

Re: Finding sentinel text when using a thread pool...

2017-05-20 Thread dieter
Christopher Reimer writes:
> I'm developing a web scraper script. It takes 25 minutes to process
> 590 pages and ~9,000 comments. I've been told that the script is
> taking too long.
>
> The way the script currently works is that the page requester is a
> generator function that requests a page, c

Finding sentinel text when using a thread pool...

2017-05-19 Thread Christopher Reimer
Greetings, I'm developing a web scraper script. It takes 25 minutes to process 590 pages and ~9,000 comments. I've been told that the script is taking too long. The way the script currently works is that the page requester is a generator function that requests a page, checks if the page cont
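For readers following the thread, here is a minimal sketch of one way to combine a thread pool with a sentinel check when pages are addressed by number. It is not the poster's script. The names `fetch_pages_until_sentinel`, `fake_fetch_page`, the `SENTINEL` string, and the batch size are all assumptions made for illustration, and the fake fetcher stands in for a real HTTP request so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import count

# Hypothetical sentinel text; the real marker depends on the site being scraped.
SENTINEL = "No comments found"


def fetch_pages_until_sentinel(fetch_page, sentinel=SENTINEL, batch_size=8):
    """Fetch numbered pages in concurrent batches until one contains the sentinel.

    fetch_page is any callable that takes a page number and returns page text.
    Returns the text of every page that precedes the sentinel page, in order.
    """
    pages = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in count(1, batch_size):
            batch = range(start, start + batch_size)
            # pool.map() runs the fetches concurrently but yields the results
            # in page order, so the first sentinel seen marks the true end.
            for text in pool.map(fetch_page, batch):
                if sentinel in text:
                    return pages
                pages.append(text)


def fake_fetch_page(page_number):
    """Stand-in for an HTTP request: pages 1 to 20 hold content, later ones do not."""
    if page_number <= 20:
        return f"comments for page {page_number}"
    return SENTINEL


if __name__ == "__main__":
    result = fetch_pages_until_sentinel(fake_fetch_page)
    print(len(result), "pages fetched before the sentinel")
```

The trade-off in this sketch is that up to `batch_size - 1` pages past the end may be requested and discarded, which is usually cheap compared with fetching every page one at a time.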