Hi all!

I would like to suggest changing the current default for where timers are
stored. A bit of background:

  - Timers (for windows, process functions, etc.) are state that is managed
and checkpointed as well.
  - When using the MemoryStateBackend and the FsStateBackend, timers are
kept on the JVM heap, like regular state.
  - When using the RocksDBStateBackend, timers can be kept in RocksDB (like
other state) or on the JVM heap. The JVM heap is the default though!

I find this a bit unintuitive and propose changing the default so that the
RocksDBStateBackend stores all state, including timers, in RocksDB.
The rationale is that when there is a tradeoff (like here), safe and
scalable should be the default, and the faster but unsafe option should be
an explicit choice.

This sentiment seems to be shared by various users as well, see
https://twitter.com/StephanEwen/status/1214590846168903680 and
https://twitter.com/StephanEwen/status/1214594273565388801
We would of course keep the switch and mention in the performance tuning
section that this is an option.
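
For reference, a minimal sketch of what the switch looks like in
flink-conf.yaml. I am assuming the existing option key
state.backend.rocksdb.timer-service.factory with the values HEAP and ROCKSDB;
please double-check the key against the configuration docs for your version.

```yaml
# Keep timers in RocksDB together with the rest of the state
# (the default proposed in this mail).
state.backend.rocksdb.timer-service.factory: ROCKSDB

# Explicit opt-in to heap-based timers, trading safety for speed:
# state.backend.rocksdb.timer-service.factory: HEAP
```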

# RocksDB State Backend Timers on Heap
  - Pro: faster
  - Con: not memory safe, GC overhead, longer synchronous checkpoint time,
no incremental checkpoints

# RocksDB State Backend Timers in RocksDB
  - Pro: safe and scalable, asynchronously and incrementally checkpointed
  - Con: performance overhead.

Please chime in and let me know what you think.

Best,
Stephan
