Hi,
I am using Pika's asynchronous consumer implementation with Scrapy and
Twisted. The Twisted reactor runs on the main thread, and the RabbitMQ
consumer runs on a background thread. When a message arrives and I want to
start my spider, I use 'callFromThread' to wake the reactor thread, which
then initializes the spider and starts crawling.

However, under a high load of queue messages, 'callFromThread' is called
almost continuously, and Scrapy does not start downloading until there is
a pause in these calls.

What is the best approach to achieving high throughput with Scrapy,
Twisted and RabbitMQ? Should I keep the current design and simply add
some buffering or batching to reduce the 'callFromThread' frequency, or
should I switch to a synchronous design?
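
To make the batching option concrete, here is a rough sketch of what I have
in mind. It is only an illustration under assumptions: BatchingBridge, put,
handle_batch and max_batch are hypothetical names of my own, not part of
Pika, Twisted or Scrapy. In real use call_from_thread would be
reactor.callFromThread, and handle_batch would be whatever starts the
spiders on the reactor thread. The idea is that at most one wake-up is
pending at any time, however many messages arrive.

```python
import threading
from collections import deque


class BatchingBridge:
    """Hypothetical helper: buffers messages from the consumer thread and
    hands them to the reactor thread in batches."""

    def __init__(self, call_from_thread, handle_batch, max_batch=100):
        # Assumption: call_from_thread is reactor.callFromThread (or any
        # callable that schedules a function on the reactor thread).
        self._call_from_thread = call_from_thread
        # Assumption: handle_batch takes a list of messages and runs on
        # the reactor thread, e.g. to schedule crawls.
        self._handle_batch = handle_batch
        self._max_batch = max_batch
        self._pending = deque()
        self._lock = threading.Lock()
        self._wakeup_scheduled = False

    def put(self, message):
        """Called on the consumer thread for every delivered message."""
        with self._lock:
            self._pending.append(message)
            if self._wakeup_scheduled:
                # A drain is already queued; it will pick this message up.
                return
            self._wakeup_scheduled = True
        self._call_from_thread(self._drain)

    def _drain(self):
        """Runs on the reactor thread and processes one batch per call."""
        with self._lock:
            count = min(self._max_batch, len(self._pending))
            batch = [self._pending.popleft() for _ in range(count)]
            more = bool(self._pending)
            self._wakeup_scheduled = more
        if batch:
            self._handle_batch(batch)
        if more:
            # Reschedule instead of looping, so the reactor can service
            # other events (such as downloads) between batches.
            self._call_from_thread(self._drain)
```

The consumer callback would then call bridge.put(body) instead of calling
'callFromThread' directly, so a burst of N messages costs roughly
N / max_batch reactor wake-ups rather than N.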

Thanks
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
