[jira] [Commented] (FLINK-15368) Add end-to-end test for controlling RocksDB memory usage

Yu Li (Jira) Tue, 07 Jan 2020 08:39:40 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009885#comment-17009885
 ]


Yu Li commented on FLINK-15368:
-------------------------------

bq. The new model can lead to situations where out of the box less memory being 
dedicated to RocksDB, compared to before... Again, something to mention in the 
release notes.
True, I will remember to mention this in the release notes and the documentation.

bq. If we increase the managed memory to match the previously dedicated memory, 
would we still expect a performance regression.
I think there won't be any performance regression if users tune up the 
managed memory accordingly, since we didn't modify the flushing policy in our 
new frocksdb version, and in the previous version the total size of the write 
buffers could also go up to as much as 150% of the {{WriteBufferManager}} limit. 
Actually, in container environments it's quite possible that users dedicated 
more memory than required to RocksDB, and now we supply a way to test and 
control it in a more fine-grained manner.

bq. Is there a reason why we do not set the write buffer manager ratio to a 
higher value by default (0.9)? The way I understand it, it will not reserve 
that much memory from the cache eagerly
I agree that it won't reserve that much memory eagerly. However, according to 
the deduced formula ({{size_of_cache=3N/(3+R)}}), the higher the write buffer 
ratio (R) is, the smaller the cache size will be (i.e. less memory dedicated to 
RocksDB out of the box). On the other hand, according to the LSM design, as long 
as the write buffer is not too small (which would lead to too many merge reads 
before compaction), giving more memory to the block cache helps more. That 
said, I think it should be a case-by-case decision, and I suggest we decide the 
default ratio based on the most common access patterns in Flink (considering 
slot sharing vs. non-slot-sharing, the common number of CFs/states per slot, 
etc.).
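
To make the effect of R concrete, here is a minimal sketch that only evaluates 
the formula quoted above. It is an illustration, not Flink or RocksDB code: the 
function name {{cache_size}} is hypothetical, and N is assumed to be the total 
memory budget available to RocksDB, which the comment does not spell out.

```python
def cache_size(total_memory: float, write_buffer_ratio: float) -> float:
    """Evaluate size_of_cache = 3N / (3 + R).

    total_memory (N): assumed to be the total RocksDB memory budget, in any unit.
    write_buffer_ratio (R): the write buffer ratio, expected to lie in [0, 1].
    """
    if not 0.0 <= write_buffer_ratio <= 1.0:
        raise ValueError("write_buffer_ratio must be between 0 and 1")
    return 3.0 * total_memory / (3.0 + write_buffer_ratio)


if __name__ == "__main__":
    budget_mib = 1024.0  # hypothetical budget, chosen only for illustration
    for ratio in (0.1, 0.5, 0.9):
        size = cache_size(budget_mib, ratio)
        print(f"R={ratio:.1f} -> cache size {size:.1f} MiB")
```

Under these assumptions the computed cache size decreases monotonically as R 
grows, which is the trade-off described above.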

bq. After the release, should we pick the discussion again to have one column 
family per task, see if that could help us
Totally agreed. We could list the pros and cons in detail and have a 
thorough discussion. I suggest opening a dedicated JIRA for this and moving 
further discussion of this topic there. What do you think?

Thanks.

> Add end-to-end test for controlling RocksDB memory usage
> --------------------------------------------------------
>
>                 Key: FLINK-15368
>                 URL: https://issues.apache.org/jira/browse/FLINK-15368
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>    Affects Versions: 1.10.0
>            Reporter: Yu Li
>            Assignee: Yun Tang
>            Priority: Critical
>             Fix For: 1.10.0
>
>
> We need to add an end-to-end test to make sure the RocksDB memory usage 
> control works well, especially under the slot sharing case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
