Hello there,

In Flink 1.10 the configuration parameter state.backend.rocksdb.memory.managed
defaults to true. This is great, since it makes it easier to stay within a
memory budget, e.g. when running in a container environment. However, I've
noticed performance issues when the switch is enabled.
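
For reference, these are the settings I understand to be involved, as they
would appear in flink-conf.yaml (the first line is the new default; the ratio
values are just the documented defaults, quoted here from memory):

    state.backend.rocksdb.memory.managed: true
    taskmanager.memory.managed.fraction: 0.4
    state.backend.rocksdb.memory.write-buffer-ratio: 0.5
    state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1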

To isolate where the issue is I've written some test code.

If I create a Flink job that has a single "heavy" operator (call it X) that
just keeps a simple piece of state per user, things work fast when I test how
many events per second the job can process. However, if I add the simplest
possible window operator downstream of X, things can get slow, especially
when I increase the parallelism. By slow I mean up to 90% fewer events per
second. The bad part is that throughput keeps degrading as parallelism is
increased.
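
To give a feel for the shape of the job, here's a stripped-down sketch (this
is not the actual tester; Event, EventSource and HeavyStateFn are made-up
placeholder names):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class TesterSketch {

    // Minimal event type; the real tester's records are richer.
    public static class Event {
        public String userId;
        public Event() {}
        public Event(String userId) { this.userId = userId; }
    }

    // Unbounded generator standing in for the real load generator.
    public static class EventSource implements SourceFunction<Event> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<Event> ctx) {
            long i = 0;
            while (running) {
                ctx.collect(new Event("user-" + (i++ % 1000)));
            }
        }

        @Override
        public void cancel() { running = false; }
    }

    // Operator X: touches a tiny per-user ValueState on every event.
    public static class HeavyStateFn
            extends KeyedProcessFunction<String, Event, Event> {
        private transient ValueState<Long> counter;

        @Override
        public void open(Configuration parameters) {
            counter = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("counter", Long.class));
        }

        @Override
        public void processElement(Event event, Context ctx,
                                   Collector<Event> out) throws Exception {
            Long c = counter.value();
            counter.update(c == null ? 1L : c + 1);
            out.collect(event);
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.addSource(new EventSource())
           .keyBy(e -> e.userId)
           .process(new HeavyStateFn())      // operator X
           .keyBy(e -> e.userId)
           .timeWindow(Time.seconds(10))     // "simplest possible" window
           .reduce((a, b) -> b)              // trivial reduce: keep last event
           .addSink(new DiscardingSink<>());

        env.execute("managed-memory-window-sketch");
    }
}

The point is just that X and the window operator end up sharing the same
RocksDB memory budget when the managed switch is on.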

What seems to happen is that the window operator constantly flushes its
memtable(s) because X fills up the shared memtable memory. This naturally
also causes the window operator to compact its RocksDB instance. I can see
the constant flushing / compaction in the RocksDB log and in the fact that
there are new SST files all the time. This flushing is (theoretically)
unnecessary, since the size of the state is < 1 KB and it really should fit
in the memtable.
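
For anyone who wants to reproduce the observation without digging through the
RocksDB log, Flink can also export RocksDB's native metrics. If I remember
the option names correctly (please double-check them against the docs),
something like this in flink-conf.yaml should expose the flush/compaction
activity:

    state.backend.rocksdb.metrics.mem-table-flush-pending: true
    state.backend.rocksdb.metrics.compaction-pending: true
    state.backend.rocksdb.metrics.num-immutable-mem-table: true
    state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true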

If I disable the managed memory switch, things are fast (even if I increase
the parallelism). There are orders of magnitude fewer flushes and
compactions, I assume because now the state fits nicely in the memtable.
Also, if I downgrade to Flink 1.9, things are fast (there's no shared memory
there).
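
To be explicit, by disabling the switch I mean this single change in
flink-conf.yaml:

    state.backend.rocksdb.memory.managed: false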

I have a tester program that clearly illustrates the issue, and test results
in a Google Sheet. The tester is too complex to be included inline here.
Should I file a JIRA ticket, or where else should I put the test code?

Regards,
Juha
