Hi! I'd try re-running the SSD test with the following config options:
state.backend.rocksdb.thread.num: 4
state.backend.rocksdb.predefined-options: FLASH_SSD_OPTIMIZED

On Thu, Jul 21, 2022 at 4:11 AM vtygoss <vtyg...@126.com> wrote:

> Hi, community!
>
> I am doing some performance tests based on my scenario.
>
> 1. Environment
>
> - Flink: 1.13.5
> - StateBackend: RocksDB, incremental
> - Use case: a complex SQL job containing 7 joins and 2 aggregations; input
>   data of 30,000,000 records and output of 60,000,000 records, about 80GB.
> - Resources: Flink on YARN. JM 2G, one TM 24G (8G on-heap, 16G off-heap),
>   3 slots per TM.
> - Only difference: the config 'state.backend.rocksdb.localdir', which points
>   to either one SATA disk or one SSD disk.
>
> 2. Random write performance difference between SATA and SSD
>
> 4.8M/s is achieved using SATA, while 48.2M/s is achieved using SSD.
>
> ```
> fio -direct=1 -iodepth 64 -thread -rw=randwrite -ioengine=sync -fsync=1 -runtime=300 -group_reporting -name=xxx -size=100G --allow_mounted_write=1 -bs=8k -numjobs=64 -filename=/mnt/disk11/xx
> ```
>
> 3. In my use case, the Flink SQL application finished in 41 minutes using
>    SATA, while it took 45 minutes using SSD.
>
> Does this comparison suggest that using SSD is not an effective way to
> improve RocksDB performance?
>
> The direct downstream of the backpressured operator is HdfsSink. Does that
> mean the best target for improving application performance is HDFS?
>
> Thanks for any replies or suggestions.
>
> Best Regards!
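
Put together with the setup described in the quoted message, the relevant part of flink-conf.yaml might look roughly like the fragment below. This is only a sketch: the localdir path is a placeholder I made up, so substitute the actual SSD mount point, and the first two entries simply restate the "RocksDB, incremental" setup from the original post.

```yaml
# flink-conf.yaml (sketch; the localdir path is a hypothetical placeholder)
state.backend: rocksdb
state.backend.incremental: true
state.backend.rocksdb.localdir: /path/to/ssd/rocksdb
# Suggested for the SSD re-run:
state.backend.rocksdb.thread.num: 4
state.backend.rocksdb.predefined-options: FLASH_SSD_OPTIMIZED
```

The thread.num option sets the maximum number of concurrent background flush and compaction jobs, and FLASH_SSD_OPTIMIZED selects Flink's predefined RocksDB options for flash storage. If the SSD run is still no faster with these set, the bottleneck is probably somewhere other than local state I/O.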