Re: ForSt State backend seem to try to download all state locally

Zakelly Lan Fri, 04 Apr 2025 04:44:59 -0700

(I forgot to cc the user mailing list, so I am re-sending this.)

Hi Gyula,


This is simply because the sync mode is basically inherited from RocksDB,
whereas the async mode is a completely new code path. You are right, the state
backend should support remote storage even for synchronous state access. I'll
open a ticket for this.
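
For anyone who wants to confirm which mode a job is running in, here is a
minimal sketch that scans a TaskManager log for the 'ForStSync' marker
mentioned in the quoted message below. It assumes the log is available as a
plain-text file; the function name and the command-line usage are hypothetical
and not part of Flink.

```python
import sys
from pathlib import Path

# Marker named in this thread: its presence in the TaskManager log is said to
# mean ForSt is running in sync mode with purely local state.
MARKER = "ForStSync"


def find_sync_mode_lines(log_text: str, marker: str = MARKER) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for every log line containing the marker."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if marker in line
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_forst_mode.py /path/to/taskmanager.log
    text = Path(sys.argv[1]).read_text(errors="replace")
    hits = find_sync_mode_lines(text)
    if hits:
        print(f"Found {len(hits)} line(s) mentioning {MARKER}:")
        for number, line in hits:
            print(f"  {number}: {line}")
    else:
        print(f"No {MARKER} entries found in this log.")
```

An empty result only means the marker did not appear in that particular log
file; it is not proof that state is being served from remote storage.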


Best,
Zakelly

> On Fri, Apr 4, 2025 at 7:36 PM <[email protected]> wrote:
>
>> Hi!
>>
>> This job is definitely using the old, sync data access.
>>
>> Where is this limitation mentioned in the docs? It sounds a bit strange
>> that a fundamental behavior of the state backend depends on this. I assumed
>> that without the new async API it would be slower, but the general
>> characteristics of remote storage would remain the same.
>>
>> Thanks
>> Gyula
>>
>> Sent from my iPhone
>>
>> On 4 Apr 2025, at 13:24, Zakelly Lan <[email protected]> wrote:
>>
>> 
>> Hi Gyula,
>>
>> It seems ForSt is downloading even for a no-rescale start.
>>
>> It occurred to me that there is a limitation: ForSt won't store state
>> files remotely if the synchronous state APIs are in use. So is it a
>> DataStream job using the old state APIs (not State V2), or a SQL job
>> without asynchronous state support (listed in [1])? Would you please check
>> the taskmanager log and see if 'ForStSync' shows up, which means
>> ForSt is running in sync mode with purely local state.
>>
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/ops/state/disaggregated_state/#for-sql-jobs
>>
>> Best,
>> Zakelly
>>
>> On Fri, Apr 4, 2025 at 6:41 PM Gyula Fóra <[email protected]> wrote:
>>
>>> This is the flame graph during the no-rescale restart. I couldn't attach
>>> it to the mailing list.
>>>
>>> On Fri, Apr 4, 2025 at 12:24 PM Zakelly Lan <[email protected]>
>>> wrote:
>>>
>>>> Hi Gyula,
>>>>
>>>>> I assumed it will only download at most 10GB and just start reading
>>>>> from remote and the job should start up "immediately".
>>>>
>>>>
>>>> It won't start up immediately; instead, it clips the state before
>>>> running. This clipping process is primarily performed on the remote
>>>> side. It may involve writing new state files, which could be cached on
>>>> the local disk, but it should not exceed the 10GB limit.
>>>>
>>>> May I ask which checkpoint storage you are using? And would you please
>>>> try to start the job without a rescale and see if it can start
>>>> running immediately? It would also be great if you could provide some logs
>>>> from the taskmanager during the restore. I suspect that state clipping may
>>>> involve too much file rewriting, which affects the speed. I'll do a similar
>>>> experiment.
>>>>
>>>>
>>>> Best,
>>>> Zakelly
>>>>
>>>> On Fri, Apr 4, 2025 at 4:28 PM Gyula Fóra <[email protected]> wrote:
>>>>
>>>>> Hi All!
>>>>>
>>>>> I am experimenting with the ForSt state backend on 2.0.0 and I noticed
>>>>> the following thing.
>>>>>
>>>>> If I have a job with a larger state, let's say 500GB, and I want to
>>>>> start the job with a lower parallelism on a single TaskManager, the job
>>>>> will simply not start, as the ForStIncrementalRestoreOperation tries to
>>>>> download all state locally (there is not enough disk space).
>>>>>
>>>>> I have these configs:
>>>>>
>>>>> "state.backend.type": "forst"
>>>>> "state.backend.forst.cache.size-based-limit": "10GB"
>>>>>
>>>>> I assumed it will only download at most 10GB and just start reading
>>>>> from remote and the job should start up "immediately".
>>>>>
>>>>> What am I missing?
>>>>>
>>>>> Gyula
>>>>>
>>>>
