Hi Stephan,

Thanks for starting this discussion.
+1 for storing timers in RocksDB by default.
In the past, when Flink did not store timers in RocksDB, this gave me a
headache: I always had to tune parameters carefully to make sure there was
no risk of running out of memory.

Just curious: how large is the performance difference between keeping timers
on the heap and keeping them in RocksDB?
- If there is no order-of-magnitude difference between heap and RocksDB,
there is no problem in using RocksDB.
- If there is, maybe we should improve our documentation to let users know
about this option. (It looks like a lot of users do not know about it.)
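
For reference, a minimal sketch of what the switch looks like in
flink-conf.yaml. This assumes the option key is
state.backend.rocksdb.timer-service.factory with the values HEAP and ROCKSDB,
as in the RocksDB state backend configuration; please check it against the
documentation for the Flink version in use.

```yaml
# flink-conf.yaml (sketch; verify the key names against your Flink version)

# Use the RocksDB state backend for keyed state.
state.backend: rocksdb

# Where the RocksDB state backend keeps timers:
#   HEAP    - on the JVM heap (faster, but not memory safe)
#   ROCKSDB - in RocksDB, together with the other state
state.backend.rocksdb.timer-service.factory: ROCKSDB
```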

Best,
Jingsong Lee

On Fri, Jan 17, 2020 at 3:18 AM Yun Tang <myas...@live.com> wrote:

> Hi Stephan,
>
> I am +1 for the change which stores timers in RocksDB by default.
>
> Some users want checkpoints to complete as fast as possible, which also
> requires timers to be stored in RocksDB so that they do not affect the
> synchronous part of the checkpoint.
>
> Best
> Yun Tang
> ------------------------------
> *From:* Andrey Zagrebin <azagre...@apache.org>
> *Sent:* Friday, January 17, 2020 0:07
> *To:* Stephan Ewen <se...@apache.org>
> *Cc:* dev <d...@flink.apache.org>; user <user@flink.apache.org>
> *Subject:* Re: [DISCUSS] Change default for RocksDB timers: Java Heap =>
> in RocksDB
>
> Hi Stephan,
>
> Thanks for starting this discussion. I am +1 for this change.
> In general, the number of timer state keys can be of the same order as the
> number of main state keys.
> So if RocksDB is used for the main state for scalability, it makes sense to
> have timers there as well,
> unless timers are used for only a very limited subset of keys that fits
> into memory.
>
> Best,
> Andrey
>
> On Thu, Jan 16, 2020 at 4:27 PM Stephan Ewen <se...@apache.org> wrote:
>
> Hi all!
>
> I would suggest a change of the current default for timers. A bit of
> background:
>
>   - Timers (for windows, process functions, etc.) are state that is
> managed and checkpointed as well.
>   - When using the MemoryStateBackend and the FsStateBackend, timers are
> kept on the JVM heap, like regular state.
>   - When using the RocksDBStateBackend, timers can be kept in RocksDB
> (like other state) or on the JVM heap. The JVM heap is the default though!
>
> I find this a bit unintuitive and would propose changing it so that the
> RocksDBStateBackend stores all state in RocksDB by default.
> The rationale is that if there is a tradeoff (like here), safe and
> scalable should be the default, and unsafe but faster should be an explicit
> choice.
>
> This sentiment seems to be shared by various users as well, see
> https://twitter.com/StephanEwen/status/1214590846168903680 and
> https://twitter.com/StephanEwen/status/1214594273565388801
> We would of course keep the switch and mention in the performance tuning
> section that this is an option.
>
> # RocksDB State Backend Timers on Heap
>   - Pro: faster
>   - Con: not memory safe, GC overhead, longer synchronous checkpoint time,
> no incremental checkpoints
>
> # RocksDB State Backend Timers in RocksDB
>   - Pro: safe and scalable, asynchronously and incrementally checkpointed
>   - Con: performance overhead.
>
> Please chime in and let me know what you think.
>
> Best,
> Stephan
>
>

-- 
Best, Jingsong Lee
