I think Seed is correct that we don't properly report backpressure from an
AsyncWaitOperator. The problem is that the Emitter thread, not the Task's main
execution thread, emits the elements and thus is the one stuck in the
`requestBufferBuilderBlocking` method. This, however, does not mean that
th
> On Mar 20, 2019, at 6:49 PM, Seed Zeng wrote:
>
> Hey Andrey and Ken,
> Sorry about the late reply. I might not have been clear in my question.
> The performance of writing to Cassandra is the same in both cases, only that
> the source rate was higher when the async function is prese
Hi Seed,
When you create `AsyncDataStream.(un)orderedWait`, which capacity do you
pass in, or do you use the default one (100)?
Best,
Andrey
On Thu, Mar 21, 2019 at 2:49 AM Seed Zeng wrote:
> Hey Andrey and Ken,
> Sorry about the late reply. I might not have been clear in my question.
> The performa
Hey Andrey and Ken,
Sorry about the late reply. I might not have been clear in my question.
The performance of writing to Cassandra is the same in both cases; only
the source rate was higher when the async function is
present.
Something is "buffering" and not propagating backpressure
Hi Seed,
Sorry for the confusion, I see now that it is separate. Back pressure should
still be created because the internal async queue has a limited capacity,
but I am not sure about the reporting problem; Ken and Till probably have a
better idea.
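To illustrate the queue-capacity argument, here is a toy Python model (not Flink code). The names `run_pipeline`, `capacity`, `n_items` and `sink_delay` are made up for this sketch. It only assumes that a producer hands elements to a bounded buffer and that a separate emitter thread forwards them to a slow downstream. Once the buffer is full the producer has to wait, which is the back pressure described above, even though the thread that is actually stuck on the slow downstream is the emitter.

```python
import queue
import threading
import time


def run_pipeline(capacity, n_items, sink_delay):
    """Toy model: a producer feeds a bounded buffer drained by a slow emitter thread.

    Returns (blocked, emitted), where `blocked` counts how often the producer
    found the buffer full and had to wait for a free slot.
    """
    buf = queue.Queue(maxsize=capacity)
    emitted = []

    def emitter():
        while True:
            item = buf.get()
            if item is None:          # sentinel: no more input
                return
            time.sleep(sink_delay)    # slow downstream, e.g. a busy sink
            emitted.append(item)

    worker = threading.Thread(target=emitter)
    worker.start()

    blocked = 0
    for i in range(n_items):
        try:
            buf.put_nowait(i)
        except queue.Full:
            blocked += 1
            buf.put(i)                # waits until the emitter frees a slot
    buf.put(None)
    worker.join()
    return blocked, emitted
```

With a small `capacity` the producer is throttled to the emitter's pace; with a capacity larger than the input it never waits, so the source appears to run faster while the elements simply sit in the buffer.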
As for the consumption speed-up, the async operator creates another thread to
collect the r
Hi Seed,
I was assuming the Cassandra sink was separate from and after your async
function.
I was trying to come up with an explanation as to why adding the async function
would improve your performance.
The only (very unlikely) reason I could think of was that the async function somehow
caused data
Hi Ken and Andrey,
Thanks for the response. I think there is a misunderstanding that the writes to
Cassandra are happening within the async function.
In my test, the async function is just a pass-through without doing any
work.
So any Cassandra-related batching or buffering should not be the cause of
t
Hi Seed,
It’s a known issue that Flink doesn’t report back pressure properly for
AsyncFunctions, due to how it monitors the output collector to gather back
pressure statistics.
But that wouldn’t explain how you get faster processing with the
AsyncFunction inserted into your workflow.
I have
Hi Seed,
I think the back pressure should emerge by blocking
the `AsyncFunction.asyncInvoke` call.
So it depends on how the ResultFuture is generated from the Cassandra client,
that is, whether or not submitting a request blocks when the number of
pending requests is too large.
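As a rough sketch of what "blocking on submit" could look like, the Python model below caps the number of in-flight requests with a semaphore. It is not the Cassandra client or the Flink API; `BoundedClient`, `max_pending` and `latency` are hypothetical names, and the write is simulated with a sleep. The point is only that the caller of `submit` is held back once too many requests are pending, which is where back pressure would originate.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BoundedClient:
    """Hypothetical async client that allows at most `max_pending` requests in flight."""

    def __init__(self, max_pending, latency):
        self._slots = threading.BoundedSemaphore(max_pending)
        self._pool = ThreadPoolExecutor(max_workers=max_pending)
        self._latency = latency
        self._lock = threading.Lock()
        self.pending = 0
        self.peak_pending = 0

    def submit(self, value):
        self._slots.acquire()         # caller waits here when too many requests are pending
        with self._lock:
            self.pending += 1
            self.peak_pending = max(self.peak_pending, self.pending)
        future = self._pool.submit(self._write, value)
        future.add_done_callback(self._release)
        return future

    def _write(self, value):
        time.sleep(self._latency)     # simulated remote write
        return value

    def _release(self, _future):
        with self._lock:
            self.pending -= 1
        self._slots.release()         # free a slot for the next caller

    def close(self):
        self._pool.shutdown(wait=True)
```

If `submit` instead returned immediately regardless of how many requests were pending, the caller would never be slowed down and the backlog would grow inside the client.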
Maybe, AsyncFunction.asy