Thanks, everyone. I will increase the parallelism to solve the
problem. Besides, I am looking forward to remote state support.

Best,
Hjw


 




------------------ Original Message ------------------
From: "Yuan Mei" <yuanmei.w...@gmail.com>
Sent: Thursday, Sep 8, 2022, 11:26
To: "Alexander Fedulov" <a...@deltastream.io>
Cc: "hjw" <1010445...@qq.com>; "user" <user@flink.apache.org>
Subject: Re: Where will the state be stored in the taskmanager when using
rocksdbstatebend?



Hey Hjw,

Under the current Flink architecture (i.e., task states are stored locally and
periodically uploaded to remote durable storage during checkpointing), there is
no way other than scaling out your application to solve the problem.
Scaling out makes the state size in each task smaller so that it can
fit into a single container.
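
As a rough illustration of that reasoning, the sketch below estimates the smallest parallelism at which each subtask's share of state fits in a container's disk quota. It assumes state is spread evenly across subtasks and that each TaskManager runs a single slot; the function name, the headroom factor, and the numbers are hypothetical and not taken from Flink.

```python
import math


def min_parallelism(total_state_gb: float, quota_gb: float, headroom: float = 0.7) -> int:
    """Estimate the smallest parallelism whose per-subtask state fits the quota.

    Assumptions (not Flink behavior): state is evenly distributed, one slot
    per TaskManager, and only `headroom` of the quota is usable for state.
    """
    if total_state_gb < 0 or quota_gb <= 0 or not 0 < headroom <= 1:
        raise ValueError("sizes must be positive and headroom must be in (0, 1]")
    usable_gb = quota_gb * headroom
    # Round up: a fractional subtask still needs a whole container.
    return max(1, math.ceil(total_state_gb / usable_gb))


if __name__ == "__main__":
    # Hypothetical numbers: 500 GB of state, 50 GB quota per container.
    print(min_parallelism(500, 50))
```

Real jobs rarely distribute state perfectly evenly, so treat the result as a lower bound and measure actual disk usage per TaskManager.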


We have seen similar issues from other users/customers, and have plans to solve
this problem in a more fundamental way by supporting remote state as well (when
the local quota is used up, state can also be written directly to remote
storage).


For now, I would suggest increasing the parallelism of your job to solve
this problem.


Best
Yuan


On Tue, Sep 6, 2022 at 7:59 PM Alexander Fedulov <a...@deltastream.io> wrote:

Well, in that case, it is similar to hitting the limits of vertical scaling:
you'll have to scale out horizontally. You could consider reducing the CPU and
RAM you allocate to each task manager and instead increasing their count (and
your job's parallelism).
It might come with its own downsides, so measure as you go. This might also be
problematic if you have significant key skew for some of your key ranges.
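
To make the key-skew caveat concrete, here is a minimal sketch that computes the largest per-subtask state size for a given parallelism. It assigns keys with a plain modulo, which is a simplification and not Flink's actual key-group assignment; the function name and the sizes are hypothetical.

```python
from typing import Mapping


def max_subtask_state(key_sizes_gb: Mapping[int, float], parallelism: int) -> float:
    """Return the state size of the most loaded subtask.

    Keys are assigned with `key % parallelism`, a stand-in for illustration
    only; Flink's real key-group assignment works differently.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    loads = [0.0] * parallelism
    for key, size_gb in key_sizes_gb.items():
        loads[key % parallelism] += size_gb
    return max(loads)


if __name__ == "__main__":
    # Hypothetical data: ten keys of 1 GB each, except one hot key of 40 GB.
    sizes = {k: 1.0 for k in range(10)}
    sizes[0] = 40.0
    for p in (2, 10):
        print(p, max_subtask_state(sizes, p))
```

With these numbers, going from parallelism 2 to 10 only lowers the largest subtask from 44 GB to 40 GB, because a single hot key cannot be split across subtasks.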


Best,
Alex


On Tue, Sep 6, 2022 at 8:09 AM hjw <1010445...@qq.com> wrote:

Hi Alexander,


When a Flink job is deployed on native k8s, the taskmanager is a Pod. The data
directory size of a single container is limited in our company. Are there any
ideas for dealing with this?




Best,
Hjw






------------------ Original Message ------------------
From: "Alexander Fedulov" <a...@deltastream.io>
Sent: Tuesday, Sep 6, 2022, 3:19
To: "hjw" <1010445...@qq.com>
Cc: "user" <user@flink.apache.org>
Subject: Re: Where will the state be stored in the taskmanager when using
rocksdbstatebend?



https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#state-backend-rocksdb-localdir
Make sure to use a local SSD disk (not NFS/EBS).


Best,
Alexander Fedulov


On Mon, Sep 5, 2022 at 7:24 PM hjw <1010445...@qq.com> wrote:

The EmbeddedRocksDBStateBackend holds in-flight data in
a RocksDB database that is (per default) stored in the TaskManager
local data directories.
Which path in the operating system do the TaskManager local data directories
that store the RocksDB database point to?
If the job state is very large, I think I should take some measures to
deal with it (for example, mount a volume for the local data directories that
store the RocksDB database).


thx.



Best,
Hjw
