Hi,

Here is some information from my side:

1. Performance with RocksDB is mainly determined by (de)serialization and disk reads/writes.
2. MapState only needs to (de)serialize the single map key/map value being read or written, whereas ValueState needs to (de)serialize the whole state value on every read or write.
3. Disk read/write cost depends, to some extent, on the total state size.
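To make point 2 concrete, here is a rough sketch in plain Java. It does not use Flink's serializers or the RocksDB key layout; the class and method names are hypothetical, and the encoding (DataOutputStream with a UTF string key and a long value) is only an assumption chosen for illustration. It compares how many bytes get serialized when one entry is updated under a ValueState-like layout (the whole map is one value) and under a MapState-like layout (each entry is its own record).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: not Flink's actual serialization path.
class StateSerializationSketch {

    // ValueState-like layout: the whole map is a single value,
    // so any update re-serializes every entry.
    static int valueStyleWriteBytes(Map<String, Long> wholeState) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(wholeState.size());
            for (Map.Entry<String, Long> entry : wholeState.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeLong(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // MapState-like layout: each map entry is a separate record,
    // so an update serializes only the touched key and value.
    static int mapStyleWriteBytes(String mapKey, long mapValue) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(mapKey);
            out.writeLong(mapValue);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        Map<String, Long> state = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            state.put("k" + i, (long) i);
        }
        // Updating one entry: whole-map cost vs. single-entry cost.
        System.out.println("whole-state write:  " + valueStyleWriteBytes(state) + " bytes");
        System.out.println("single-entry write: " + mapStyleWriteBytes("k1", 1L) + " bytes");
    }
}
```

In this sketch the whole-state cost grows with the number of entries, while the single-entry cost stays constant. That is the reason a large per-key state is usually better held in MapState than as one big ValueState.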
Best,
Congxian

KristoffSC <krzysiek.chmielew...@gmail.com> wrote on Wed, Apr 8, 2020 at 2:41 AM:

> Hi,
> I would like to ask which has the larger memory footprint and which could
> be more efficient:
> fewer keys with a bigger keyState vs. many keys with a smaller keyState.
>
> For this use case I'm using the RocksDB StateBackend and the state TTL is,
> well... infinite. So I'm keeping the state forever in Flink.
>
> The use case:
> I have a stream of messages that I have to process in some custom way.
> I can take one of two approaches:
>
> 1. Use a keyBy that gives me some number of distinct keys, but for each
> key the state size will be significant. It will be MapState in this case.
> The keyBy I use will still let me spread operations across operator
> instances.
>
> 2. Use a different keyBy, where I would have a huge number of distinct
> keys, but each keyState will be very small, and it will be a ValueState
> in this case.
>
> To sum up:
> a "reasonable" number of keys with a very big keyState vs. a huge number
> of keys with a very small state each.
>
> What are the pros and cons of both?
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/