Hi there,

Recently I've been going through a round of (attempted) performance tuning 
of our Pyramid/Waitress application (we use it as an API server, with 
SQLAlchemy as the ORM and nginx sitting in front of Waitress as an SSL 
proxy), and this has left me with a few questions about the nature of the 
Pyramid/Waitress threading model. These questions are more about Waitress 
than about Pyramid, but I don't see a dedicated Waitress discussion list, 
so I thought I'd try here. Please feel free to redirect me if I'm asking 
in the wrong place.

When loading complex pages, browser clients will send as many as 10 API 
requests in parallel. I've noticed that when this happens, requests that I 
know return quickly on their own get "blocked" behind requests that take 
longer: the first byte of the response for the later requests arrives only 
after the earlier requests have finished downloading.
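To make the symptom concrete, here's a toy model of what I believe is happening (this is my reconstruction, not actual Waitress internals): a fixed pool of worker threads, each blocking on simulated database I/O, with a cheap "request" queued behind expensive ones.

```python
# Toy model of a fixed worker pool (like Waitress's default of 4 threads,
# shrunk to 2 here) where "requests" block on simulated database I/O.
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2

def handle_request(io_seconds):
    time.sleep(io_seconds)  # stand-in for a blocking SQLAlchemy query
    return io_seconds

with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    start = time.monotonic()
    # Fill every worker thread with a slow request first...
    slow = [pool.submit(handle_request, 0.2) for _ in range(POOL_SIZE)]
    # ...then submit a request that only needs ~10 ms of work.
    fast = pool.submit(handle_request, 0.01)
    fast.result()
    fast_elapsed = time.monotonic() - start

# The fast request waited in the queue until a slow one freed a thread,
# so it took roughly 0.2 s end to end instead of ~0.01 s.
print(f"fast request finished after {fast_elapsed:.2f}s")
```

The fast "request" ends up taking about as long as a slow one, which matches the time-to-first-byte pattern I'm seeing.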

According to the info I can find on the Waitress design [1], it has a fixed 
thread pool to service requests (defaulting to 4 threads). My theory is 
that if the threads get tied up with a few slow requests, the server can no 
longer service the faster ones. Bumping the number of Waitress threads to 
20 (more than the number of requests we ever make in parallel) seems to 
mitigate the issue from a single client: the faster requests no longer 
block behind the slower ones.
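For concreteness, this is the only change I made (our real ini has more in it, and host/port here are placeholders):

```ini
[server:main]
use = egg:waitress#main
host = 127.0.0.1
port = 6543
threads = 20
```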

However, this "solution" leaves me with more questions than answers. That 
same design document [1] indicates that Waitress worker threads never do 
any I/O. But our application logic does lots of I/O to talk to the database 
server on another machine (through SQLAlchemy). So...

- Am I misunderstanding the Waitress design? Or are we doing it wrong?
- Is the Pyramid initialization code (setting up routes, etc.) run only 
once, or once per worker thread? We initialize a number of our own services 
at the same time as route registration. We try to run them as singletons, 
and it all seems to work, but now I'm in doubt about when and where this 
code is executed (is it in the Waitress master process?). I read through 
the only other topic I could find discussing this [2], but it mostly covers 
manually spinning up threads for slow tasks; I'd like to avoid doing that 
for every database operation if at all possible.
- Other than increased memory consumption, are there any significant 
downsides to increasing the number of threads? I thought I read somewhere 
to set the number of worker threads to the number of CPU cores available, 
which would make sense if the workload were CPU-bound, but our workload is 
about 50% CPU and 50% database (i.e. I/O) by wall time.
- Is it possible I'm looking in the wrong place entirely, and nginx is 
actually what's serializing requests? We're using 4 nginx worker processes 
with the default limit of 512 concurrent connections each, so my assumption 
is that nginx is not the bottleneck.
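To clarify the second question, here's a stripped-down sketch of the initialization pattern I'm asking about, and of what I *assume* happens: startup code runs once per process, and all worker threads share the objects it creates. (ReportService is a made-up name; our real services get attached during route setup.)

```python
# Sketch of my assumption: "configuration time" code runs once per
# process, and every worker thread sees the same singleton instance.
import threading

class ReportService:  # hypothetical stand-in for one of our services
    pass

SERVICE = ReportService()  # created once, at startup

seen = set()

def worker():
    # each worker thread records the identity of the service it sees
    seen.add(id(SERVICE))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"distinct service instances seen: {len(seen)}")
```

If that assumption is wrong and initialization actually runs per thread (or per process under some multi-process setup), I'd love to know.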

Any guidance, insight, or further documentation references greatly 
appreciated!

Thanks,
Tom

[1] http://docs.pylonsproject.org/projects/waitress/en/latest/design.html
[2] https://groups.google.com/d/topic/pylons-discuss/cC5Thn4fvyE/discussion

-- 
You received this message because you are subscribed to the Google Groups 
"pylons-discuss" group.