Hi!

I'd try re-running the SSD test with the following config options:

state.backend.rocksdb.thread.num: 4
state.backend.rocksdb.predefined-options: FLASH_SSD_OPTIMIZED
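
As a rough sanity check on the numbers in your message, here is a minimal sketch that puts the fio result next to the job runtimes. The variable names are made up for illustration, and I am assuming "M/s" means MB/s. If the job were bound by local random-write throughput, you would expect the job speedup to move at least somewhat toward the raw disk speedup; here it does not, which hints (but does not prove) that the bottleneck is elsewhere.

```python
# Figures copied from the quoted message below; names are hypothetical.
sata_fio_mb_s = 4.8   # fio random-write throughput on SATA (MB/s assumed)
ssd_fio_mb_s = 48.2   # fio random-write throughput on SSD (MB/s assumed)
sata_job_min = 41     # Flink SQL job runtime with localdir on SATA
ssd_job_min = 45      # Flink SQL job runtime with localdir on SSD

# Ratio > 1 means the SSD is faster than SATA.
disk_speedup = ssd_fio_mb_s / sata_fio_mb_s
job_speedup = sata_job_min / ssd_job_min

print(f"raw disk speedup (SSD vs SATA): {disk_speedup:.1f}x")
print(f"job speedup (SSD vs SATA):      {job_speedup:.2f}x")
```

A job speedup below 1 means the SSD run was actually slower, so it is worth re-testing with the options above before drawing conclusions about the disk.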


On Thu, Jul 21, 2022 at 4:11 AM vtygoss <vtyg...@126.com> wrote:

> Hi, community!
>
>
> I am running some performance tests based on my scenario.
>
>
> 1. Environment
>
> - Flink: 1.13.5
>
> - StateBackend: RocksDB, incremental
>
> - use case: a complex SQL job with 7 joins and 2 aggregations; input is
> 30,000,000 records and output is 60,000,000 records, about 80GB.
>
> - resource: flink on yarn. JM 2G, one TM 24G(8G on-heap, 16G off-heap). 3
> slots per TM
>
> - only difference: 'state.backend.rocksdb.localdir' points to either one
> SATA disk or one SSD disk.
>
>
> 2. Random write performance difference between SATA and SSD
>
>    4.8M/s is achieved using SATA, while 48.2M/s is achieved using SSD.
>
>    ```
>    fio -direct=1 -iodepth 64 -thread -rw=randwrite -ioengine=sync \
>      -fsync=1 -runtime=300 -group_reporting -name=xxx -size=100G \
>      --allow_mounted_write=1 -bs=8k -numjobs=64 -filename=/mnt/disk11/xx
>    ```
>
>
> 3. In my use case, the Flink SQL application finished in 41 minutes using
> SATA, but took 45 minutes using SSD.
>
>
> Does this comparison suggest that using an SSD is not an effective way to
> improve RocksDB performance?
>
> The operator directly downstream of the backpressured operator is HdfsSink.
> Does that mean HDFS is the best target for improving application performance?
>
>
> Thanks for any replies or suggestions.
>
>
> Best Regards!
>
