[ 
https://issues.apache.org/jira/browse/FLINK-28390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564890#comment-17564890
 ] 

ming li commented on FLINK-28390:
---------------------------------

Hi, [~masteryhx], [~Zhanghao Chen]

Yes, although we currently have a FIFO compaction configuration option, it is 
effectively unusable because the TTL and MAX_SIZE of FIFO cannot be configured. 
In addition, we do not recommend it to users, since it can potentially lose 
data. So I think we have the following work to do:
1. Add the FIFO-related JNI bindings; we can refer to 
https://github.com/facebook/rocksdb/wiki/FIFO-compaction-style;
2. Add documentation and precautions for using FIFO.

In addition, when we used RocksDB's FIFO compaction internally, we found a 
potential bug, which also needs to be fixed on Flink's RocksDB branch. See 
https://github.com/facebook/rocksdb/issues/10133
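
To make the TTL and MAX_SIZE settings mentioned above concrete, here is a 
minimal sketch of how FIFO compaction drops data. It is a toy model only, not 
RocksDB or Flink code: the class FifoCompactionSketch and all of its fields 
and methods are hypothetical names, and the model assumes files are dropped 
whole, oldest first, once they exceed the TTL or once the total size exceeds 
the limit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of FIFO compaction. NOT RocksDB code: all names are hypothetical.
class FifoCompactionSketch {
    // One SST file, reduced to its size and creation time.
    private static final class SstFile {
        final long sizeBytes;
        final long createdAtSec;

        SstFile(long sizeBytes, long createdAtSec) {
            this.sizeBytes = sizeBytes;
            this.createdAtSec = createdAtSec;
        }
    }

    private final Deque<SstFile> files = new ArrayDeque<>(); // oldest first
    private final long maxSizeBytes; // stands in for the FIFO MAX_SIZE
    private final long ttlSec;       // stands in for the FIFO TTL; 0 disables it
    private long totalBytes = 0;

    FifoCompactionSketch(long maxSizeBytes, long ttlSec) {
        this.maxSizeBytes = maxSizeBytes;
        this.ttlSec = ttlSec;
    }

    // Registers a newly flushed file; callers must pass non-decreasing times.
    void addFile(long sizeBytes, long nowSec) {
        files.addLast(new SstFile(sizeBytes, nowSec));
        totalBytes += sizeBytes;
    }

    // Drops whole files, oldest first, while the oldest file is expired or the
    // total size is over the limit. Returns how many files were dropped.
    int compact(long nowSec) {
        int dropped = 0;
        while (!files.isEmpty()
                && (expired(files.peekFirst(), nowSec) || totalBytes > maxSizeBytes)) {
            totalBytes -= files.pollFirst().sizeBytes;
            dropped++;
        }
        return dropped;
    }

    private boolean expired(SstFile f, long nowSec) {
        return ttlSec > 0 && nowSec - f.createdAtSec > ttlSec;
    }

    int fileCount() {
        return files.size();
    }

    long totalBytes() {
        return totalBytes;
    }
}
```

In this model the size limit drops the oldest file regardless of its age, 
which is where the potential data loss comes from. The TTL path is only safe 
if the FIFO TTL is at least as long as the state TTL, so that every dropped 
file holds only state that has already expired.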

> Allows RocksDB to configure FIFO Compaction to reduce CPU overhead.
> -------------------------------------------------------------------
>
>                 Key: FLINK-28390
>                 URL: https://issues.apache.org/jira/browse/FLINK-28390
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: ming li
>            Priority: Major
>
> We know that the FIFO compaction strategy may silently delete data, which 
> can mean data loss for the business. But in some scenarios, FIFO compaction 
> can be a very effective way to reduce CPU usage.
>  
> Flink's TaskManagers are usually small processes, for example allocated 
> 4 CPUs and 16G of memory. When the state size is small, the CPU overhead of 
> RocksDB is not high, but as the state grows, RocksDB may compact frequently, 
> which occupies a large amount of CPU and affects the computation.
>  
> We usually configure a TTL for the state, so when using FIFO we can set the 
> FIFO TTL to be slightly longer than the state TTL, so that the upper layer 
> behaves the same as before. 
>  
> Although the FIFO compaction strategy may bring space amplification, disk is 
> cheaper than CPU after all, so the overall cost is reduced.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
