Re: Flink RocksDB Performance

2021-07-20 Thread Robert Metzger
Your understanding of the problem is correct: the serialization cost is the reason for the high CPU usage. You can also try to optimize the serializers you are using (by using data types that are efficient to serialize). See also this blog post: https://flink.apache.org/news/2020/04/15/f
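
As a rough illustration of why the choice of data type matters for serialization cost, the Python sketch below compares a fixed-layout binary encoding with a generic object serializer. It is only an analogy: `struct` and `pickle` stand in for a specialized serializer and a generic fallback serializer, and the record fields (`user_id`, `amount`) are hypothetical, not taken from the thread or from Flink.

```python
import pickle
import struct

# Hypothetical record: a 64-bit integer id and a 64-bit float amount.
record = {"user_id": 42, "amount": 19.99}

# Fixed layout: little-endian int64 followed by float64, always 16 bytes.
compact = struct.pack("<qd", record["user_id"], record["amount"])

# Generic serializer: also has to encode the container type and field names.
generic = pickle.dumps(record)


def decode_compact(payload: bytes) -> dict:
    """Rebuild the record from the fixed 16-byte layout."""
    user_id, amount = struct.unpack("<qd", payload)
    return {"user_id": user_id, "amount": amount}


# The generic payload is larger because it carries structural metadata.
print(len(compact), len(generic))
```

The fixed layout works only because both sides agree on the schema in advance, which is roughly the trade a type-specific serializer makes in exchange for fewer bytes and less CPU per record.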

Re: Flink RocksDB Performance

2021-07-16 Thread Vijay Bhaskar
Yes, absolutely. Unless we need very large state, on the order of GBs, RocksDB is not required. RocksDB is good only because the filesystem backend is very bad at large state. In other words, the filesystem backend performs much better than RocksDB up to GBs of state. After that, the filesystem backend degrades compared to RocksDB. It's not that

Re: Flink RocksDB Performance

2021-07-16 Thread Zakelly Lan
Hi Li Jim, the filesystem backend performs much better than RocksDB (by multiple times), but it is only suitable for small state. RocksDB consumes more CPU on background tasks, cache management, serialization/deserialization, and compression/decompression. In most cases, the performance of RocksDB will mee
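
The serialization overhead described in this thread can be sketched with a toy model in Python. This is not how Flink or RocksDB is implemented; `HeapState` and `SerializedState` are hypothetical classes that only mimic the difference between keeping live objects on the heap and keeping bytes that must be converted on every access.

```python
import pickle


class HeapState:
    """Keeps live objects in a dict; reads and writes touch them directly."""

    def __init__(self):
        self._data = {}

    def get(self, key, default):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class SerializedState:
    """Keeps only bytes; every access serializes the key, and values are
    serialized on write and deserialized on read."""

    def __init__(self):
        self._data = {}
        self.serde_calls = 0  # counts every serialize/deserialize call

    def _dumps(self, obj):
        self.serde_calls += 1
        return pickle.dumps(obj)

    def _loads(self, raw):
        self.serde_calls += 1
        return pickle.loads(raw)

    def get(self, key, default):
        raw = self._data.get(self._dumps(key))
        return default if raw is None else self._loads(raw)

    def put(self, key, value):
        self._data[self._dumps(key)] = self._dumps(value)


def count_words(state, words):
    """Read-modify-write per element, as a keyed counter would do."""
    for word in words:
        state.put(word, state.get(word, 0) + 1)
    return {word: state.get(word, 0) for word in set(words)}


words = ["a", "b", "a", "c", "a", "b"]
heap_state = HeapState()
serialized_state = SerializedState()
heap_counts = count_words(heap_state, words)
serialized_counts = count_words(serialized_state, words)
```

Both variants produce the same counts, but the serialized one performs several conversions per element, which is extra CPU work that grows with the number of state accesses rather than with the state size.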