Hi Peter,
Building the state for a DataStream job in a DataSet (batch) job is
currently not possible.
You can, however, implement a DataStream job that reads the batch data and
builds the state. Once all the data has been processed, you'd save the state
as a savepoint and resume the streaming job from that savepoint.
Hi,
We have a Flink streaming pipeline (1.4.2) that reads from Kafka, uses
mapWithState with RocksDB, and writes the updated states to Cassandra.
We would also like to reprocess the ingested records from HDFS. For this we
are considering computing the latest state of the records over the whole
dataset in a batch (DataSet) job.
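For what it's worth, the batch-side computation itself is just a reduce by key on the event timestamp. Here is a minimal sketch in plain Java (no Flink dependencies; the Record type, its fields, and the method names are made up for illustration, not taken from the pipeline):

```java
import java.util.*;
import java.util.stream.*;

public class LatestState {
    // Hypothetical record type: assumes each ingested record carries a
    // key and an event timestamp (names are illustrative only).
    static final class Record {
        final String key;
        final long ts;
        final String value;
        Record(String key, long ts, String value) {
            this.key = key;
            this.ts = ts;
            this.value = value;
        }
    }

    // Compute the latest state per key over the whole dataset -- the
    // result a keyed stateful operator like mapWithState would converge
    // to after replaying all records.
    static Map<String, Record> latestByKey(List<Record> records) {
        return records.stream().collect(Collectors.toMap(
                r -> r.key,
                r -> r,
                // On key collision, keep the record with the later timestamp.
                (a, b) -> a.ts >= b.ts ? a : b));
    }

    public static void main(String[] args) {
        List<Record> data = List.of(
                new Record("a", 1L, "v1"),
                new Record("a", 3L, "v3"),
                new Record("b", 2L, "v2"));
        System.out.println(latestByKey(data).get("a").value); // prints "v3"
    }
}
```

The open question is not this computation but how to hand the result over to the streaming job as its initial state, which is what the reply above addresses.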