Hi Vijay,

I think David has provided enough background on the differences between 
these two state backends.
Regarding the OOM problem with RocksDB: since Flink 1.10, RocksDB's memory 
usage is bounded by managed memory. However, RocksDB will use as much of the 
managed memory as it can, and whatever other native memory is needed has to 
fit within the JVM overhead space [1], so you could consider increasing the 
memory for that part.
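
As a rough sketch of how that JVM overhead budget is derived (it is one of the 
capped fractionated components described in [1]): the size is a fraction of the 
total process memory, clamped between a min and a max. The function names below 
are hypothetical, and the defaults (fraction 0.1, min 192mb, max 1gb) are what I 
believe the Flink 1.10 defaults to be for the taskmanager.memory.jvm-overhead.* 
options, so please check them against your own configuration.

```python
def parse_mb(size: str) -> int:
    """Parse a size such as '192mb' or '1gb' into mebibytes (hypothetical helper)."""
    units = {"mb": 1, "gb": 1024}
    for suffix, factor in units.items():
        if size.endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    raise ValueError(f"unsupported size: {size}")


def jvm_overhead_mb(process_mb: int,
                    fraction: float = 0.1,     # assumed default for jvm-overhead.fraction
                    min_size: str = "192mb",   # assumed default for jvm-overhead.min
                    max_size: str = "1gb") -> int:  # assumed default for jvm-overhead.max
    """Fraction of total process memory, clamped to [min, max]."""
    derived = int(process_mb * fraction)
    return max(parse_mb(min_size), min(derived, parse_mb(max_size)))


if __name__ == "__main__":
    for total in (1024, 4096, 20480):
        print(f"process={total}mb -> jvm overhead={jvm_overhead_mb(total)}mb")
```

With these assumed defaults, a 4 GB TaskManager process gets only about 400 MB 
of JVM overhead for native memory outside the managed part, which is why raising 
the fraction (or the max) is worth considering if the pod is still killed.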


[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/ops/memory/mem_setup.html#capped-fractionated-components

Best,
Yun Tang
________________________________
From: David Anderson <da...@alpinegizmo.com>
Sent: Saturday, July 18, 2020 19:34
To: Vijay Bhaskar <bhaskar.eba...@gmail.com>
Cc: user <user@flink.apache.org>
Subject: Re: Flink FsStatebackend is giving better performance than RocksDB

You should be able to tune your setup to avoid the OOM problem you have run 
into with RocksDB. It will grow to use all of the memory available to it, but 
shouldn't leak. Perhaps something is misconfigured.

As for performance, with the FSStateBackend you can expect:

* much better throughput and average latency
* possibly worse worst-case latency, due to GC pauses

Large, multi-slot TMs can be more of a problem with the FSStateBackend because 
of the increased scope for GC. You may want to run with more TMs, each with 
fewer slots.

You'll also be giving up the possibility of taking advantage of quick, local 
recovery [1] that comes for free when using incremental checkpoints with 
RocksDB. Local recovery can still be used with the FSStateBackend, but it comes 
at more of a cost.

As for migrating your state, you may be able to use the State Processor API [2] 
to rewrite a savepoint taken from RocksDB so that it can be loaded by the 
FSStateBackend, so long as you aren't using any windows or ListCheckpointed 
state.

[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery
[2] 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/state_processor_api.html

Best,
David

On Fri, Jul 17, 2020 at 1:53 PM Vijay Bhaskar 
<bhaskar.eba...@gmail.com> wrote:
Hi
While doing scale testing, we observed that FSStatebackend is outperforming 
RocksDB.
When using RocksDB, off-heap memory keeps growing over time, and after a day 
the pod got terminated with OOM.

Whereas with the same data pattern, FSStatebackend has been running for days 
without any memory spike or OOM.

Based on the documentation, this is what I understood:

When the state is too large to keep in memory, RocksDB is preferred. So, 
indirectly, that means RocksDB is preferred once FSStatebackend starts 
performing poorly as the state size grows, right?

Another question: we have run production using RocksDB for quite some time. If 
we switch to FSStatebackend, what are the consequences?
Following is what I can think of:
The very first time, I'll lose the state.
Thereafter, w.r.t. savepoints and checkpoints, the behavior is the same (I know 
there is no incremental checkpoint, but that only matters for performance).

Other than that, will I see any issues?

Regards
Bhaskar
