Hello there,

In Flink 1.10 the configuration parameter state.backend.rocksdb.memory.managed
defaults to true. This is great, since it makes it easier to stay within a
memory budget, e.g. when running in a container environment. However, I've
noticed performance issues when the switch is enabled.
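
For reference, these are the settings I understand to be involved, as they
would appear in flink-conf.yaml (the first line is the new default; the ratio
values are just the documented defaults, quoted here from memory):

    state.backend.rocksdb.memory.managed: true
    taskmanager.memory.managed.fraction: 0.4
    state.backend.rocksdb.memory.write-buffer-ratio: 0.5
    state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1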

To isolate where the issue is I've written some test code.

If I create a Flink job that has a single "heavy" operator (call it X) that
just keeps a simple piece of state per user, things work fast when I test how
many events per second the job can process. However, if I add the simplest
possible window operator downstream of X, things can get slow, especially
when I increase the parallelism. By slow I mean up to 90% fewer events per
second. The bad part is that throughput keeps degrading as parallelism is
increased.
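
To give a feel for the shape of the job, here's a stripped-down sketch (this
is not the actual tester; Event, EventSource and HeavyStateFn are made-up
placeholder names):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class TesterSketch {

    // Minimal event type; the real tester's records are richer.
    public static class Event {
        public String userId;
        public Event() {}
        public Event(String userId) { this.userId = userId; }
    }

    // Unbounded generator standing in for the real load generator.
    public static class EventSource implements SourceFunction<Event> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<Event> ctx) {
            long i = 0;
            while (running) {
                ctx.collect(new Event("user-" + (i++ % 1000)));
            }
        }

        @Override
        public void cancel() { running = false; }
    }

    // Operator X: touches a tiny per-user ValueState on every event.
    public static class HeavyStateFn
            extends KeyedProcessFunction<String, Event, Event> {
        private transient ValueState<Long> counter;

        @Override
        public void open(Configuration parameters) {
            counter = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("counter", Long.class));
        }

        @Override
        public void processElement(Event event, Context ctx,
                                   Collector<Event> out) throws Exception {
            Long c = counter.value();
            counter.update(c == null ? 1L : c + 1);
            out.collect(event);
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.addSource(new EventSource())
           .keyBy(e -> e.userId)
           .process(new HeavyStateFn())      // operator X
           .keyBy(e -> e.userId)
           .timeWindow(Time.seconds(10))     // "simplest possible" window
           .reduce((a, b) -> b)              // trivial reduce: keep last event
           .addSink(new DiscardingSink<>());

        env.execute("managed-memory-window-sketch");
    }
}

The point is just that X and the window operator end up sharing the same
RocksDB memory budget when the managed switch is on.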

What seems to happen is that the window operator constantly flushes its
memtable(s) because X fills up the shared memtable memory. This naturally
also causes the window operator to compact its RocksDB instance. I can see
the constant flushing / compaction in the RocksDB log and in the fact that
there are new SST files all the time. This flushing is (theoretically)
unnecessary, since the size of the state is < 1 KB and it really should fit
in the memtable.
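
For anyone who wants to reproduce the observation without digging through the
RocksDB log, Flink can also export RocksDB's native metrics. If I remember
the option names correctly (please double-check them against the docs),
something like this in flink-conf.yaml should expose the flush/compaction
activity:

    state.backend.rocksdb.metrics.mem-table-flush-pending: true
    state.backend.rocksdb.metrics.compaction-pending: true
    state.backend.rocksdb.metrics.num-immutable-mem-table: true
    state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true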

If I disable the managed memory switch, things are fast (even if I increase
the parallelism). There are orders of magnitude fewer flushes and
compactions, I assume because now the state fits nicely in the memtable.
Also, if I downgrade to Flink 1.9, things are fast (there's no shared memory
there).
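
To be explicit, by disabling the switch I mean this single change in
flink-conf.yaml:

    state.backend.rocksdb.memory.managed: false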

I have a tester program that clearly illustrates the issue, and test results
in a Google Sheet. The tester is too complex to be included inline here.
Should I file a JIRA ticket, or where else should I put the test code?

Regards,
Juha
