As the Flink Async IO operator is designed for external API or DB calls, are
there any specific guidelines / tips for scaling it up? In particular, for
use cases where incoming events are ingested at very high speed and the
Async IO operator in orderedWait mode cannot keep up with that rate, even
though the target API endpoint it calls has been load tested to sustain much
higher throughput with very low latency. In our case, adding the Async IO
operator to the pipeline *reduced the throughput by 88% to 90%*, which is a
huge performance hit!
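
For context, the operator is wired into the pipeline roughly as in the
sketch below. It is simplified: the stream, type and function names are
placeholders for our actual ones.

import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;

DataStream<EnrichedEvent> enriched =
        AsyncDataStream.orderedWait(
                events,                         // high-speed input stream
                new HttpEnrichAsyncFunction(),  // AsyncFunction calling the REST endpoint
                30, TimeUnit.SECONDS,           // per-request timeout
                100)                            // async buffer capacity (max in-flight requests)
        .setParallelism(8);                     // bounded by the cores on our machines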

We tried a few things:

   1. Increasing the async buffer capacity parameter, thereby increasing
   the number of concurrent requests waiting for a response at any given
   point in time. This proved counter-productive beyond fairly small
   values such as 50 or 100.
   2. Increasing the operator parallelism. This does not help much, as the
   number of cores on our machines is limited (8 or 12).
   3. Tuning the AsyncHTTPClient configuration (keepAlive=true,
   maxConnections, maxConnectionsPerHost) and the size of the
   FixedThreadPool used by the listener of its Future (a simplified sketch
   of this setup follows the list). Again, without much improvement.
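
For reference, the relevant part of our AsyncFunction looks roughly like
the sketch below. It is simplified: Event, EnrichedEvent, the endpoint URL
and the exact connection/pool sizes are placeholders rather than our
production values.

import java.util.Collections;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;
import org.asynchttpclient.AsyncHttpClient;
import org.asynchttpclient.Dsl;

public class HttpEnrichAsyncFunction extends RichAsyncFunction<Event, EnrichedEvent> {

    private transient AsyncHttpClient client;
    private transient ExecutorService listenerPool;

    @Override
    public void open(Configuration parameters) {
        // Shared keep-alive HTTP client; the connection limits are the knobs we tuned.
        client = Dsl.asyncHttpClient(Dsl.config()
                .setKeepAlive(true)
                .setMaxConnections(200)
                .setMaxConnectionsPerHost(200));
        // Fixed pool that runs the completion listeners of the client's futures.
        listenerPool = Executors.newFixedThreadPool(4);
    }

    @Override
    public void asyncInvoke(Event event, ResultFuture<EnrichedEvent> resultFuture) {
        client.prepareGet("http://enrichment-service/lookup?key=" + event.getKey())
                .execute()
                .toCompletableFuture()
                .whenCompleteAsync((response, error) -> {
                    if (error != null) {
                        resultFuture.completeExceptionally(error);
                    } else {
                        resultFuture.complete(Collections.singleton(
                                new EnrichedEvent(event, response.getResponseBody())));
                    }
                }, listenerPool);
    }

    @Override
    public void close() throws Exception {
        client.close();
        listenerPool.shutdown();
    }
}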

Our observation is that although the Async IO operator invokes the async
function one stream element at a time, the operator and its underlying HTTP
client are multithreaded and need machines with more cores for high-speed
stream processing. If the only machines available for this kind of
application have 8 to 16 cores, we struggle to meet the required throughput
and latency SLAs.

Are there any micro-benchmarks or tuning guidelines for using Async IO for
high-speed stream processing, so we know how much throughput to expect from
it?
Thanks & regards,
Arti
