[Twisted-Python] Scrapy spiders waiting in reactor thread when callFromThread gets call repeatedly

Adi Lavi Sun, 21 Dec 2014 03:50:37 -0800

Hi,
I am using Pika's asynchronous consumer implementation with Scrapy and
Twisted. The Twisted reactor runs on the main thread, and the RabbitMQ
consumer runs on a background thread. When a message arrives and I want to
start my spider, I use 'callFromThread' to wake the reactor thread,
initialize the spider, and start crawling.


However, under a high load of queue messages, 'callFromThread' is called
almost continuously, and Scrapy does not start downloading until there is a
pause in these calls.

What is the best approach to scaling Scrapy, Twisted, and RabbitMQ
together? Should I keep the current design and add buffering or batching
to reduce the 'callFromThread' frequency, or should I switch to a
synchronous design?
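
To make the buffering idea concrete, here is a minimal sketch of what I have
in mind. It is untested against the real setup, and I do not know whether it
fixes the stall. The class name 'CoalescingDispatcher', the 'handle_batch'
callback, and the 'max_batch' limit are all hypothetical names made up for
illustration. The consumer thread appends each message to a buffer and wakes
the reactor only when no drain is already scheduled, so many messages share
one wake-up.

```python
import threading
from collections import deque


class CoalescingDispatcher:
    """Buffer messages from a consumer thread; wake the reactor once per batch.

    Hypothetical helper, not part of Twisted, Scrapy, or Pika.
    """

    def __init__(self, call_from_thread, handle_batch, max_batch=100):
        # call_from_thread is assumed to be something like
        # reactor.callFromThread: it schedules a callable on the reactor thread.
        self._call_from_thread = call_from_thread
        self._handle_batch = handle_batch  # runs on the reactor thread
        self._max_batch = max_batch
        self._pending = deque()
        self._lock = threading.Lock()
        self._drain_scheduled = False

    def submit(self, message):
        """Called on the consumer thread for every incoming message."""
        with self._lock:
            self._pending.append(message)
            if self._drain_scheduled:
                return  # a wake-up is already queued; do not add another
            self._drain_scheduled = True
        self._call_from_thread(self._drain)

    def _drain(self):
        """Runs on the reactor thread; handles at most max_batch messages."""
        with self._lock:
            count = min(self._max_batch, len(self._pending))
            batch = [self._pending.popleft() for _ in range(count)]
            more = bool(self._pending)
            self._drain_scheduled = more
        if batch:
            self._handle_batch(batch)
        if more:
            # Reschedule rather than loop, so control returns to the reactor
            # between batches.
            self._call_from_thread(self._drain)
```

With Twisted this would presumably be constructed as
CoalescingDispatcher(reactor.callFromThread, start_spiders), where
'start_spiders' stands for whatever function currently initializes the
spiders for a list of messages, and the Pika message callback would call
submit() instead of calling 'callFromThread' directly.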

Thanks

_______________________________________________
Twisted-Python mailing list
[email protected]
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
