Hi Galen,
i will tell from my experience as a Flink user and developer of Flink jobs.



*"if the input to an AsyncFunction is a keyed stream, can I assume that all
input elements with the same key will be handled by the same instance of
the async operator"*
>From what I know (and someone can correct me if I'm wrong) this is
possible. However you have to make sure that there is no Re-balance or
re-shuffle between those operators. For example operators after first
.keyBy(..) call must have same parallelism level.

Regarding:
" I have a situation where I would like to enforce that async operations
associated with a particular key happen sequentially,"

This is also possible as far as I know. In fact I was implementing
streaming pipeline with similar requirements like
*"maintaining order of events withing keyBy group across multiple operators
including Async operators". *
We achieved that with same thing -> making sure that all operators in
entire pipeline except Source and Sink had exact same parallelism level.
Additional thing to remember here is that if you call .keyBy(...) again but
with different key extractor, then original order might not be preserved
since keyBy will execute re-shuffle/re-balance.

We were also using reinterpretAsKeyedStream feature [1] after async
operators to avoid calling ".keyBay" multiple times in pipeline. Calling
.keyBy always has negative impact on performance.
With reinterpretAsKeyedStream we were able to use keyed operators with
access to keyed state after Async operators.

Hope that helped.

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/experimental/

Regards,
Krzysztof Chmielewski







pt., 14 paź 2022 o 19:11 Galen Warren <ga...@cvillewarrens.com> napisał(a):

> I have a question about Flink's Async IO support: Async I/O | Apache Flink
> <https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/asyncio/>
> .
>
> I understand that access to state is not supported in an AsyncFunction.
> However, if the input to an AsyncFunction is a keyed stream, can I assume
> that all input elements with the same key will be handled by the same
> instance of the async operator, as would normally be the case with keyed
> streams/operators?
>
> I'm asking because I have a situation where I would like to enforce that
> async operations associated with a particular key happen sequentially, i.e.
> if two elements come through with the same key, I need  the async operation
> for the second to happen after the async operation for the first one
> completes. I think I can achieve this using a local map of "in flight"
> async operations in the operator itself, but only if I can rely on all
> input elements with the same key being processed by the same async operator.
>
> If anyone can confirm how this works, I'd appreciate it. Thanks.
>

Reply via email to