I've been trying to figure this one out and can't come up with an answer;
hopefully someone can enlighten me.
The simple picture is that I have a spout, a CPU-bound task, and an
IO-bound task that's sending data out over the network. The IO-bound task
is making batches and doing async calls with the batches, so it's not
blocking (and indeed its capacity is very low). However, when sending this
data (instead of a no-op on that task) I only get about half the throughput
I would get otherwise.
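For reference, here is a minimal sketch of the batch-then-send-async pattern described above. It is plain Java with no topology framework dependency, and it is not the actual code from the task: the class name AsyncBatcher, its methods, and the sender callback are all hypothetical names made up for illustration. The only point is to show that the calling thread hands a full batch to an executor and returns immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical helper: buffers items and hands full batches to an executor. */
class AsyncBatcher<T> {
    private final int batchSize;
    private final Executor executor;          // assumed: the thread pool used for the async calls
    private final Consumer<List<T>> sender;   // assumed: whatever performs the network send
    private List<T> buffer = new ArrayList<>();

    AsyncBatcher(int batchSize, Executor executor, Consumer<List<T>> sender) {
        this.batchSize = batchSize;
        this.executor = executor;
        this.sender = sender;
    }

    /**
     * Called from the task's execute path. Only appends to the buffer unless
     * the batch is full, in which case the send is dispatched to the executor.
     * The returned future completes when the dispatched send (if any) finishes;
     * that is presumably the point where the batch's tuples would be acked.
     */
    synchronized CompletableFuture<Void> add(T item) {
        buffer.add(item);
        if (buffer.size() < batchSize) {
            return CompletableFuture.completedFuture(null);
        }
        return flush();
    }

    /** Dispatches whatever is buffered, even if the batch is not full. */
    synchronized CompletableFuture<Void> flush() {
        if (buffer.isEmpty()) {
            return CompletableFuture.completedFuture(null);
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>();   // swap buffers so the caller never waits on the send
        return CompletableFuture.runAsync(() -> sender.accept(batch), executor);
    }
}
```

With this shape, every item in a batch stays outstanding until that batch's future completes, so the number of in-flight items grows with batch size and send latency even though the calling thread itself never blocks.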
What I've looked into:
- the thread pool used to execute the async call has plenty of capacity
- network IO is not saturated
- max spout pending is not in play
- execute latency is very low on the network-bound task
Does anyone have any ideas? Is there a config that could possibly be
rate-limiting incoming data to that task since it's juggling many unacked
tuples?