Hello David,

Thanks for the information! So the two main takeaways from your email are to

- Move to something supporting application mode. Is
  https://github.com/apache/flink-kubernetes-operator already ready enough
  for production deployments?
- Wait for Flink 1.15.

Thanks!
Frank

On Mon, Mar 28, 2022 at 9:16 AM David Morávek <d...@apache.org> wrote:

> Hi Frank,
>
> I'm not really familiar with the internal workings of Spotify's
> operator, but here are a few general notes:
>
> - You only need the JM process for the REST API to become available (TMs
>   can join in asynchronously). I'd personally aim for < 1m for this step; if
>   it takes longer, it could signal a problem with your infrastructure (e.g.
>   images taking a long time to pull, incorrect setup of liveness / readiness
>   probes, not enough resources).
>
>> The job is packaged as a fat jar, but it is already baked in the docker
>> images we use (so technically there would be no need to "submit" it from a
>> separate pod).
>
> That's where the application mode comes in. Please note that this might
> also be one of the reasons for the previous steps taking too long (as all
> pods are pulling an image with your fat jar that might not be cached).
>
>> Then the application needs to start up and load its state from the latest
>> savepoint, which again takes a couple of minutes
>
> This really depends on the state size, the state backend (e.g. a rocksdb
> restore might take longer), and the object store throughput / rate limit.
> The native-savepoint feature that will come out with 1.15 might help to
> shave off some time here, as there is no conversion into the state backend
> structures.
>
> Best,
> D.
>
> On Fri, Mar 25, 2022 at 9:46 AM Frank Dekervel <fr...@kapernikov.com>
> wrote:
>
>> Hello,
>>
>> We run flink using the spotify flink Kubernetes operator (job cluster
>> mode). Everything works fine, including upgrades and crash recovery. We do
>> not run the job manager in HA mode.
>>
>> One of the problems we have is that upon upgrades (or during testing),
>> the startup of the flink cluster takes a very long time:
>>
>> - First the operator needs to create the cluster (JM+TM), and wait
>>   for it to respond to API requests. This already takes a couple of
>>   minutes.
>> - Then the operator creates a job-submitter pod that submits the job
>>   to the cluster. The job is packaged as a fat jar, but it is already baked
>>   in the docker images we use (so technically there would be no need to
>>   "submit" it from a separate pod). The submission goes rather fast though
>>   (the time between the job submitter seeing the cluster is online and the
>>   "hello" log from the main program is <1min).
>> - Then the application needs to start up and load its state from the
>>   latest savepoint, which again takes a couple of minutes.
>>
>> All steps take quite some time, and we are looking to reduce the startup
>> time to allow for easier testing but also less downtime during upgrades. So
>> I have some questions:
>>
>> - I wonder if the situation is the same for all kubernetes
>>   operators. I really need some kind of operator because otherwise I have
>>   to set which savepoint to load from myself on every startup.
>> - What cluster startup time is considered to be acceptable / best
>>   practice?
>> - If there are other tricks to reduce startup time, I would be very
>>   interested in knowing them :-)
>>
>> There is also a discussion ongoing on running flink on spot nodes. I
>> guess the startup time is relevant there too.
>>
>> Thanks already
>> Frank

--
Frank Dekervel
+32 473 94 34 21
www.kapernikov.com <https://kapernikov.com/>
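
For anyone wanting to put a number on the first step discussed in this thread (the time until the JobManager REST API answers), here is a minimal sketch, not something taken from either operator. It assumes the REST endpoint is reachable from where the script runs; the function name, the host name in the usage comment, and the timeout values are all hypothetical. It polls a URL (for example Flink's `/overview` REST endpoint) and reports how long it took to get an HTTP 200.

```python
import time
import urllib.error
import urllib.request


def _http_ok(url):
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, HTTP error status, timeout, ...
        return False


def wait_for_rest_api(url, timeout_s=120.0, poll_s=2.0, probe=None,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll `url` until it answers.

    Returns the number of seconds waited, or None if `timeout_s` elapsed
    first. `probe`, `clock` and `sleep` are injectable so the loop can be
    exercised without a running cluster.
    """
    if probe is None:
        probe = _http_ok
    start = clock()
    while True:
        if probe(url):
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(poll_s)


# Example usage (hypothetical service name; 8081 is Flink's default REST port):
#   waited = wait_for_rest_api("http://my-flink-jobmanager:8081/overview")
#   print("REST API not ready in time" if waited is None
#         else f"REST API ready after {waited:.1f}s")
```

Starting the script right after the cluster is created and comparing the result with the < 1m target mentioned above gives a rough indication of whether image pulls, probes, or resources are worth investigating.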