Thanks, everyone. I will increase the parallelism to solve the
problem. Besides, I am looking forward to support for remote state.
Best,
Hjw
------------------ Original Message ------------------
From: "Yuan Mei" <[email protected]>;
Sent: Thursday, September 8, 2022, 11:26
To: "Alexander Fedulov"<[email protected]>;
Cc: "hjw"<[email protected]>;"user"<[email protected]>;
Subject: Re: Where will the state be stored in the taskmanager when using
rocksdbstatebend?
Hey Hjw,
Under the current Flink architecture (i.e., task states are stored locally and
periodically uploaded to remote durable storage during checkpointing), there is
no way other than scaling out your application to solve the problem.
This is equivalent to making the state size in each task smaller so that it can
fit into a single container.
We have seen similar issues from other users/customers, and have plans to solve
this problem in a more fundamental way by supporting remote state as well (when
the local quota is used up, state can also be written directly to remote
storage).
For now, I would suggest increasing the parallelism of your job to solve
this problem.
Best
Yuan
On Tue, Sep 6, 2022 at 7:59 PM Alexander Fedulov <[email protected]> wrote:
Well, in that case, it is similar to the situation of hitting the limits of
vertical scaling - you'll have to scale out horizontally. You could consider
reducing the amount of CPU and RAM you allocate to each task manager, and
increasing their count (and your job's parallelism) instead.
It might come with its own downsides, so measure as you go. This might also be
problematic if you have significant key skew for some of your key ranges.
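As a rough illustration of the sizing arithmetic behind this advice, here is a
minimal Python sketch. It is not part of Flink; the function name, the
parameters, and the numbers are all hypothetical. It assumes one subtask per
task manager, a known estimate of total state size, and a skew factor that
stays constant as parallelism grows (which does not hold if a single hot key
dominates, since that key's state stays on one subtask no matter how far you
scale out).

```python
import math


def min_parallelism(total_state_gib, quota_gib, skew_factor=1.0, headroom=0.7):
    """Estimate the smallest parallelism at which the busiest subtask fits.

    total_state_gib: estimated total state of the job (assumption: known).
    quota_gib:       local disk quota of one task manager container.
    skew_factor:     busiest subtask's state relative to an even split
                     (1.0 means perfectly even key distribution).
    headroom:        fraction of the quota budgeted for state; the rest is
                     left free for compaction and temporary files.
    """
    if total_state_gib <= 0 or quota_gib <= 0:
        raise ValueError("sizes must be positive")
    if skew_factor < 1.0 or not 0.0 < headroom <= 1.0:
        raise ValueError("need skew_factor >= 1.0 and 0 < headroom <= 1")
    usable_gib = quota_gib * headroom
    return max(1, math.ceil(total_state_gib * skew_factor / usable_gib))


if __name__ == "__main__":
    # Hypothetical numbers: 500 GiB of state, 50 GiB quota per container.
    print(min_parallelism(500, 50))                   # even keys, 30% headroom
    print(min_parallelism(500, 50, skew_factor=2.0))  # busiest subtask holds 2x
```

Treat the result as a starting point only and measure the actual per-subtask
state size after rescaling.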
Best,
Alex
On Tue, Sep 6, 2022 at 8:09 AM hjw <[email protected]> wrote:
Hi Alexander,
When a Flink job is deployed on native k8s, each taskmanager is a Pod. The data
directory size of a single container is limited in our company. Is there any way
to deal with this?
Best,
Hjw
------------------ Original Message ------------------
From: "Alexander Fedulov" <[email protected]>;
Sent: Tuesday, September 6, 2022, 3:19
To: "hjw"<[email protected]>;
Cc: "user"<[email protected]>;
Subject: Re: Where will the state be stored in the taskmanager when using
rocksdbstatebend?
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#state-backend-rocksdb-localdir
Make sure to use a local SSD disk (not NFS/EBS).
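Before pointing state.backend.rocksdb.localdir at a directory, it can help to
check how much space the underlying filesystem actually offers. The following
is a minimal Python sketch using only the standard library; the helper name is
hypothetical and the temp directory merely stands in for your real candidate
path. Note the assumption: this reports filesystem capacity, which is not
necessarily the same as a per-container storage quota enforced by Kubernetes.

```python
import shutil
import tempfile

GIB = 1024 ** 3


def free_space_report(path):
    """Return total and free space (in GiB) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    free_ratio = usage.free / usage.total if usage.total else 0.0
    return {
        "total_gib": usage.total / GIB,
        "free_gib": usage.free / GIB,
        "free_ratio": free_ratio,
    }


if __name__ == "__main__":
    # Placeholder path: replace with the directory you plan to use.
    candidate_dir = tempfile.gettempdir()
    print(candidate_dir, free_space_report(candidate_dir))
```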
Best,
Alexander Fedulov
On Mon, Sep 5, 2022 at 7:24 PM hjw <[email protected]> wrote:
The EmbeddedRocksDBStateBackend holds in-flight data in
a RocksDB database that is (per default) stored in the TaskManager
local data directories.
Which path in the operating system do the TaskManager local data directories
that store the RocksDB database point to?
If the job state is very large, I think I should take some measures to
deal with it (mount a volume for the local data directories that store the
RocksDB database, etc.).
Thanks.
Best,
Hjw