Right, in this case FsStateBackend is the right choice.
The state size is limited by TM memory, as you said.
Regards,
Roman
On Tue, Feb 9, 2021 at 8:54 AM yidan zhao wrote:
> What I am interested in is whether I should use rocksDB to replace
> fileBackend.
> RocksDB's performance is not
I have a related question.
FsStateBackend keeps working state on the heap and only writes the
checkpoint to the filesystem. Does that mean JobManager/TaskManager memory
limits the state size? Is the limit TM memory * number of TMs, or is it
the JM's memory?
Kha
What I am interested in is whether I should use rocksDB to replace
fileBackend.
RocksDB's performance is not good, while its state size can be very large.
Currently, my job's state is about 10 GB, and I use 10 TaskManagers on
different machines, each with 100 GB of memory. I do not think I should use rocksDB
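A rough back-of-the-envelope check of those numbers, as a minimal sketch. It assumes the keyed state is spread evenly across the TaskManagers and that the 10 GB figure approximates the on-heap size; in practice heap objects are often larger than their checkpointed form, so treat the result as a lower bound. The class and method names are made up for illustration and are not part of Flink.

```java
public class StateSizeEstimate {

    /** Average state per TaskManager, assuming keys are evenly distributed. */
    static double stateGbPerTaskManager(double totalStateGb, int taskManagers) {
        if (taskManagers <= 0) {
            throw new IllegalArgumentException("taskManagers must be positive");
        }
        return totalStateGb / taskManagers;
    }

    /** Share of one TaskManager's memory taken by its state, in percent. */
    static double percentOfTmMemory(double perTmStateGb, double tmMemoryGb) {
        if (tmMemoryGb <= 0) {
            throw new IllegalArgumentException("tmMemoryGb must be positive");
        }
        return 100.0 * perTmStateGb / tmMemoryGb;
    }

    public static void main(String[] args) {
        // Figures from the message above: 10 GB state, 10 TMs, 100 GB each.
        double perTm = stateGbPerTaskManager(10.0, 10);
        double percent = percentOfTmMemory(perTm, 100.0);
        System.out.println("State per TM (GB): " + perTm);
        System.out.println("Share of TM memory (%): " + percent);
    }
}
```

Under these assumptions each TaskManager holds roughly 1 GB of state, a small fraction of its memory. Skewed keys or a much larger in-memory footprint would change the picture.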
Hi,
I think Yun Tang is right: HeapStateBackend doesn't (de)serialize the value
on update.
As for "value()", it may (de)serialize it and return a copy if there is an
async snapshot in progress (to protect the snapshot from modifications). This
shouldn't happen often though.
Regards,
Roman
On Mon, Fe
Hi,
MemoryStateBackend and FsStateBackend both hold keyed state in
HeapKeyedStateBackend [1], and the main structure that stores the data is
StateTable [2], which holds plain Java objects (POJOs). That is to say, the
object is not serialized when calling update().
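To make the difference concrete, here is a toy model in plain Java. It is not Flink code: HeapStore and SerializingStore are hypothetical stand-ins, and Java serialization is used only for illustration. The heap-style store keeps the object reference on update(), while the serializing store converts the value to bytes on update() and back on value().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class StateStoreSketch {

    /** Simple mutable value standing in for a user state object. */
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        long count;
        Counter(long count) { this.count = count; }
    }

    /** Heap-style store: update() keeps the reference, nothing is serialized. */
    static class HeapStore {
        private final Map<String, Counter> table = new HashMap<>();
        void update(String key, Counter value) { table.put(key, value); }
        Counter value(String key) { return table.get(key); }
    }

    /** Serializing store: update() writes bytes, value() reads a fresh copy. */
    static class SerializingStore {
        private final Map<String, byte[]> table = new HashMap<>();

        void update(String key, Counter value) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            table.put(key, bytes.toByteArray());
        }

        Counter value(String key) {
            byte[] raw = table.get(key);
            if (raw == null) {
                return null;
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(raw))) {
                return (Counter) in.readObject();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter(1);

        HeapStore heap = new HeapStore();
        heap.update("k", c);
        System.out.println(heap.value("k") == c);   // prints "true": same object

        SerializingStore ser = new SerializingStore();
        ser.update("k", c);
        System.out.println(ser.value("k") == c);    // prints "false": a copy
    }
}
```

One consequence of the heap model is that mutating the object after update() is visible through the store, which is why a copy may be needed while a snapshot is running.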
On the other hand, RocksDB statebackend