Re: question on ValueState

2021-02-10 Thread Roman Khachatryan
Right, in this case FileSystemStateBackend is the right choice. The state size is limited by TM memory as you said.
Regards, Roman
On Tue, Feb 9, 2021 at 8:54 AM yidan zhao wrote:
> What I am interested in is whether I should use rocksDB to replace fileBackend.
> RocksDB's performance is not

Re: question on ValueState

2021-02-08 Thread yidan zhao
I have a related question. Since fileStateBackend uses the heap as the state storage and the checkpoint is ultimately stored in the filesystem, will the JobManager/TaskManager memory limit the state size? Is the state size limited by the TM's memory * number of TMs, or by the JM's memory? Kha

Re: question on ValueState

2021-02-08 Thread yidan zhao
What I am interested in is whether I should use rocksDB to replace fileBackend. RocksDB's performance is not good, while its state size can be very large. Currently, my job's state is about 10GB, and I use 10 TaskManagers on different machines, each with 100G of memory. I do not think I should use rocksDB
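The figures in this message can be turned into a rough per-TaskManager estimate. The sketch below is only back-of-the-envelope arithmetic, assuming the keyed state is spread evenly across the TaskManagers; the class and method names are hypothetical. It ignores that objects on the heap usually take more space than their checkpointed size, and that not all of a machine's 100G is necessarily available as JVM heap.

```java
public class StateSizeEstimate {

    // Hypothetical helper: average state per TaskManager,
    // assuming keys (and therefore state) are evenly distributed.
    static double perTaskManagerStateGb(double totalStateGb, int taskManagers) {
        return totalStateGb / taskManagers;
    }

    public static void main(String[] args) {
        double totalStateGb = 10.0;   // "my job's state is about 10GB"
        int taskManagers = 10;        // "10 TaskManagers"
        double memoryPerTmGb = 100.0; // "each with 100G of memory"

        double perTm = perTaskManagerStateGb(totalStateGb, taskManagers);
        System.out.println("State per TaskManager: " + perTm + " GB");
        System.out.println("Memory per TaskManager: " + memoryPerTmGb + " GB");
    }
}
```

Under these assumptions each TaskManager holds about 1 GB of state, well below the memory on each machine, which is consistent with the reply that a heap-based backend is a reasonable choice here.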

Re: question on ValueState

2021-02-08 Thread Khachatryan Roman
Hi, I think Yun Tang is right, HeapStateBackend doesn't (de)serialize the value on update. As for "value()", it may (de)serialize it and return a copy if an async snapshot is in progress (to protect from modifications). This shouldn't happen often though. Regards, Roman On Mon, Fe

Re: question on ValueState

2021-02-07 Thread Yun Tang
Hi, MemoryStateBackend and FsStateBackend both hold keyed state in HeapKeyedStateBackend [1], and the main structure to store data is StateTable [2] which holds POJO format objects. That is to say, the object would not be serialized when calling update(). On the other hand, RocksDB statebackend
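The difference described in this message can be illustrated with a toy model. The sketch below is not Flink code: `HeapLikeState` and `SerializingState` are hypothetical stand-ins, and plain Java serialization is used only for illustration (Flink uses its own type serializers). It assumes that a heap-style store keeps the object reference on update(), while a serializing store, which is how the RocksDB state backend behaves, keeps bytes and rebuilds the object on value().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StateAccessSketch {

    // Hypothetical user value that would be kept in a ValueState.
    static class Counter implements Serializable {
        long count;
        Counter(long count) { this.count = count; }
    }

    // Toy model of a heap-based store: update() keeps the reference, no serialization.
    static class HeapLikeState {
        private Counter stored;
        void update(Counter c) { stored = c; }
        Counter value() { return stored; }
    }

    // Toy model of a serializing store: update() writes bytes, value() reads them back.
    static class SerializingState {
        private byte[] bytes;

        void update(Counter c) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(c);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            bytes = buf.toByteArray();
        }

        Counter value() {
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return (Counter) in.readObject();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Mutate the object after update() and read it back from the heap-like store.
    static long heapReadAfterMutation() {
        HeapLikeState state = new HeapLikeState();
        Counter c = new Counter(1);
        state.update(c);
        c.count = 2; // visible through the store, since it holds the same object
        return state.value().count;
    }

    // Same steps against the serializing store.
    static long serializedReadAfterMutation() {
        SerializingState state = new SerializingState();
        Counter c = new Counter(1);
        state.update(c);
        c.count = 2; // not visible, the store holds a byte snapshot from update()
        return state.value().count;
    }

    public static void main(String[] args) {
        System.out.println("heap-like store:   " + heapReadAfterMutation());       // prints 2
        System.out.println("serializing store: " + serializedReadAfterMutation()); // prints 1
    }
}
```

In this model the heap-like store pays no serialization cost on update() or value(), which matches the performance point raised in the thread, but it also means a later mutation of the object is visible through the store.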