Greetings, Flink developers.

I would like to open a discussion on a proposal [1] to add a new kind of
state to Flink.

The goal here is to optimize a fairly common pattern, which is using

MapState<Long, List<Event>>

to store lists of events associated with timestamps. This pattern is used
internally in quite a few operators that implement sorting and joins, and
it also shows up in user code, for example, when implementing custom
windowing in a KeyedProcessFunction.
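
To make the pattern concrete, here is a minimal sketch of it in plain Java.
It is only a stand-in and not Flink code: a TreeMap plays the role of
MapState<Long, List<Event>> for a single key, and the Event class, the
TimestampedEventBuffer class, and the add/drainUpTo methods are hypothetical
names chosen for illustration. I use a TreeMap purely so that the drain loop
can stop early; I would not assume that MapState iterates in timestamp order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java stand-in for the MapState<Long, List<Event>> pattern (not Flink API). */
final class TimestampedEventBuffer {

    /** Hypothetical event type; the pattern only needs a timestamp. */
    static final class Event {
        final long timestamp;
        final String payload;

        Event(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // Assumption: stands in for MapState<Long, List<Event>> scoped to one key.
    private final TreeMap<Long, List<Event>> eventsByTimestamp = new TreeMap<>();

    /** Appends an event: read the whole list, add one element, write the list back. */
    void add(Event event) {
        List<Event> bucket = eventsByTimestamp.get(event.timestamp);
        if (bucket == null) {
            bucket = new ArrayList<>();
        }
        bucket.add(event);
        eventsByTimestamp.put(event.timestamp, bucket);
    }

    /** Removes and returns all events with timestamp <= watermark, oldest first. */
    List<Event> drainUpTo(long watermark) {
        List<Event> ready = new ArrayList<>();
        Iterator<Map.Entry<Long, List<Event>>> it = eventsByTimestamp.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, List<Event>> entry = it.next();
            if (entry.getKey() > watermark) {
                break; // TreeMap iterates in ascending key order, so nothing later is ready
            }
            ready.addAll(entry.getValue());
            it.remove();
        }
        return ready;
    }

    public static void main(String[] args) {
        TimestampedEventBuffer buffer = new TimestampedEventBuffer();
        buffer.add(new Event(20L, "late"));
        buffer.add(new Event(5L, "first"));
        buffer.add(new Event(5L, "second"));
        System.out.println("ready events: " + buffer.drainUpTo(10L).size());
    }
}
```

The step to notice is in add(): every append reads the whole list for a
timestamp and writes it back. With a state backend that serializes values,
that round trip covers the entire list each time.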

Nico Kruber, Seth Wiesman, and I have implemented a POC that more than
doubles throughput for these operations by making better use of the
capabilities of the RocksDB state backend.

See FLIP-220 [1] for details.

Best,
David

[1] https://cwiki.apache.org/confluence/x/Xo_FD
