Re: Persisting state in RocksDB

2021-06-10 Thread Arvid Heise
Hi Paul, You can leave operators dangling. So no need to add fake sinks. If you write to HTTP, the best option is actually asyncIO. [1] This will run much much faster. AsyncIO however has no state access (we want to change that eventually but for now it's to avoid too many antipatterns). For me

Re: Persisting state in RocksDB

2021-06-10 Thread Paul K Moore
Hi Arvid - thanks for the welcome :) The SinkFunction is custom (extends RichSinkFunction), and writes to a legacy web service over HTTP. I’ll investigate the keyBy+KeyedProcessFunction further - thanks. Frankly I looked at this but I think I was confusing myself between working with KV store

Re: Persisting state in RocksDB

2021-06-09 Thread Arvid Heise
Hi Paul, Welcome to the club! What's your SinkFunction? Is it custom? If so, you could also implement CheckpointedFunction to read and write data. Here you could use OperatorStateStore and with it the BroadcastState. However, broadcast state is quite heavy (it sends all data to all instances, so

Persisting state in RocksDB

2021-06-08 Thread Paul K Moore
Hi all, First post here, so please be kind :) Firstly some context; I have the following high-level job topology: (1) FlinkPulsarSource -> (2) RichAsyncFunction -> (3) SinkFunction 1. The FlinkPulsarSource reads event notifications about article updates from a Pulsar topic 2. The RichAsyncFunc