Hey Hjw,

Under the current Flink architecture (i.e., task states are stored locally
and periodically uploaded to remote durable storage during checkpointing),
there is no way to solve the problem other than scaling out your
application. Scaling out makes the state held by each task smaller so that
it can fit into a single container.

We have seen similar issues from other users/customers, and we plan to
solve this problem in a more fundamental way by supporting remote state as
well (when the local quota is used up, state can also be written directly
to remote storage).

For now, I would suggest increasing the parallelism of your job to solve
this problem.
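
As a rough way to pick a starting parallelism, the sizing can be sketched as
below. This is only a back-of-the-envelope estimate, not something Flink
computes for you: the function name, the `headroom` parameter, and the
example numbers are hypothetical, and it assumes state is spread evenly
across subtasks (significant key skew breaks that assumption).

```python
import math


def min_parallelism(total_state_gib: float,
                    per_task_quota_gib: float,
                    headroom: float = 0.5) -> int:
    """Estimate the smallest parallelism at which each subtask's share of
    the state fits within its local disk quota.

    Assumptions (not from Flink itself):
      * state is evenly distributed across subtasks (no key skew);
      * one subtask per container, so the quota applies per subtask;
      * `headroom` is the fraction of the quota that state may occupy,
        leaving the rest as spare space on the local disk.
    """
    if total_state_gib < 0:
        raise ValueError("total_state_gib must be non-negative")
    if per_task_quota_gib <= 0:
        raise ValueError("per_task_quota_gib must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")

    usable_gib = per_task_quota_gib * headroom
    # Round up: a fractional subtask still needs a whole extra subtask.
    return max(1, math.ceil(total_state_gib / usable_gib))


if __name__ == "__main__":
    # Hypothetical numbers: 100 GiB of state, 10 GiB quota per container,
    # half of the quota reserved as headroom.
    print(min_parallelism(100, 10, 0.5))  # 20
```

Treat the result as a lower bound and measure actual per-subtask disk usage
after rescaling.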

Best
Yuan

On Tue, Sep 6, 2022 at 7:59 PM Alexander Fedulov <a...@deltastream.io>
wrote:

> Well, in that case, it is similar to the situation of hitting the limits
> of vertical scaling - you'll have to scale out horizontally.
> You could consider reducing the CPU and RAM you allocate to
> each task manager and instead increasing their count (and your job's
> parallelism).
> It might come with its own downsides, so measure as you go. This might
> also be problematic if you have significant key skew for some of your key
> ranges.
>
> Best,
> Alex
>
> On Tue, Sep 6, 2022 at 8:09 AM hjw <1010445...@qq.com> wrote:
>
>> Hi Alexander,
>>
>> When a Flink job is deployed on native Kubernetes, each TaskManager is a
>> Pod. In our company, the size of a single container's data directory is
>> limited. Is there any way to deal with this?
>>
>> ------------------------------
>> Best,
>> Hjw
>>
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "Alexander Fedulov" <a...@deltastream.io>;
>> *Sent:* Tuesday, September 6, 2022, 3:19 AM
>> *To:* "hjw"<1010445...@qq.com>;
>> *Cc:* "user"<user@flink.apache.org>;
>> *Subject:* Re: Where will the state be stored in the taskmanager when using
>> rocksdbstatebend?
>>
>>
>> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#state-backend-rocksdb-localdir
>> Make sure to use a local SSD disk (not NFS/EBS).
>>
>> Best,
>> Alexander Fedulov
>>
>> On Mon, Sep 5, 2022 at 7:24 PM hjw <1010445...@qq.com> wrote:
>>
>>> The EmbeddedRocksDBStateBackend holds in-flight data in a RocksDB
>>> <http://rocksdb.org/> database that is (per default) stored in the
>>> TaskManager local data directories.
>>> Which path in the operating system do the TaskManager local data
>>> directories, where the RocksDB database is stored, point to?
>>> If the job state is very large, I think I should take some measures to
>>> deal with it (e.g., mount a volume for the local data directories that
>>> store the RocksDB database).
>>>
>>> thx.
>>>
>>> ------------------------------
>>> Best,
>>> Hjw
>>>
>>
