Thanks a bunch!
> For example, the Flink Kafka source operator's parallel instances maintain
> as operator state a mapping of partitions to offsets for the partitions
> that it is assigned to.
I think this clarifies things. This is literally state the operator needs to
do its job, not really row data. T
Hi!
Operator state is bound to a single parallel operator instance; there is no
partitioning happening here.
It is typically used in Flink source and sink operators. For example, the
Flink Kafka source operator's parallel instances maintain as operator state
a mapping of partitions to offsets for
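
To make the partition-to-offset idea concrete, here is a rough sketch in plain Python. It is not Flink's API: the class `SourceInstance` and its methods `read`, `snapshot`, and `restore` are hypothetical names made up for illustration. It only assumes what is described above, namely that each parallel instance keeps its own mapping of assigned partitions to offsets and that this mapping is not partitioned by record key.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceInstance:
    """Toy stand-in for ONE parallel instance of a partitioned source.

    Hypothetical model, not a Flink class. The `offsets` dict plays the
    role of operator state: it belongs to this instance as a whole.
    """
    assigned: List[int]
    offsets: Dict[int, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Start every assigned partition at offset 0 unless already known.
        for partition in self.assigned:
            self.offsets.setdefault(partition, 0)

    def read(self, partition: int, count: int) -> None:
        """Pretend to consume `count` records and advance the offset."""
        if partition not in self.offsets:
            raise KeyError(f"partition {partition} is not assigned here")
        self.offsets[partition] += count

    def snapshot(self) -> Dict[int, int]:
        """Return a copy of the state, as a checkpoint would capture it."""
        return dict(self.offsets)

    @classmethod
    def restore(cls, snap: Dict[int, int]) -> "SourceInstance":
        """Rebuild an instance from a previously taken snapshot."""
        return cls(assigned=sorted(snap), offsets=dict(snap))


# Two parallel instances, each owning a disjoint set of partitions.
first = SourceInstance(assigned=[0, 1])
second = SourceInstance(assigned=[2, 3])

first.read(0, 5)
first.read(1, 2)
checkpoint = first.snapshot()

first.read(0, 3)  # progress made after the checkpoint
recovered = SourceInstance.restore(checkpoint)
print(recovered.offsets)
```

After a restore, the recovered instance resumes from the offsets captured in the snapshot, so reads made after the checkpoint would be replayed. The state here is bookkeeping for the operator itself rather than the records flowing through it.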
This is so helpful, thank you!
So just to clarify (3): operator state has a partitioning scheme, but it's
simply not by key; it's something else that's special under the hood? In
which case, what data is stored in an operator? I assumed it must be the
input data for e.g. a join, so that it can rea
Hi,
On Fri, Sep 4, 2020 at 1:37 PM Rex Fenley wrote:
> Hello!
>
> I've been digging into State Storage documentation, but it's left me
> scratching my head with a few questions. Any help will be much appreciated.
>
> Qs:
> 1. Is there a way to use RocksDB state backend for Flink on AWS EMR?
> Po
Hello!
I've been digging into State Storage documentation, but it's left me
scratching my head with a few questions. Any help will be much appreciated.
Qs:
1. Is there a way to use RocksDB state backend for Flink on AWS EMR?
Possibly with S3 backed savepoints for recovery (or maybe hdfs for
savep