Re: ForSt State backend seem to try to download all state locally

2025-04-05 Thread Gyula Fóra
Hi Zakelly! The backend is S3. I have performed a simpler experiment: a single TaskManager with parallelism 6. I let it accumulate about 40 GB of state, then restarted it with a smaller local disk (15 GB). The job takes a very long time trying to download everything, but eventually the TM crashes before th

Re: ForSt State backend seem to try to download all state locally

2025-04-04 Thread gyula . fora
Hi! This job is definitely using the old, synchronous data access. Where is this limitation mentioned in the docs? It sounds a bit strange that a fundamental behavior of the state backend depends on this. I assumed that without the new async API it would be slower, but the general characteristics of remote stor

Re: ForSt State backend seem to try to download all state locally

2025-04-04 Thread Zakelly Lan
Hi Gyula, It seems ForSt is downloading even for a start without rescaling. It occurred to me that there is a limitation: ForSt won't store state files remotely if the synchronous state APIs are used. So is it a DataStream job using the old state APIs (not State V2), or is it a SQL job without asynchr

Re: ForSt State backend seem to try to download all state locally

2025-04-04 Thread Zakelly Lan
(I forgot to cc the user ML, so I am re-sending this.) Hi Gyula, This is simply because the sync mode is basically inherited from RocksDB, while the async mode is a completely new code path. You are right, the state backend should support remote storage even for synchronous state access. I'll open a ticket for this.

Re: ForSt State backend seem to try to download all state locally

2025-04-04 Thread Zakelly Lan
Hi Gyula, > I assumed it will only download at most 10GB and just start reading from remote and the job should start up "immediately". It won't start up immediately; instead, it clips the state before running. This clipping process is primarily performed on the remote side. This may involve writi

ForSt State backend seem to try to download all state locally

2025-04-04 Thread Gyula Fóra
Hi All! I am experimenting with the ForSt state backend on 2.0.0 and I noticed the following. If I have a job with large state, let's say 500 GB, and I want to start the job with a lower parallelism on a single TaskManager, the job will simply not start as the ForStIncrementalRestoreOp