Hi Yang,

Thank you for the clarification. In general, I think we will have time to experiment with this until it is removed completely, and to migrate our solution to application mode.

Regards,
F

On 2022/01/26 02:42:24 Yang Wang wrote:
> Hi all,
>
> I remember the application mode was initially named "cluster mode". By
> contrast, the per-job mode is the "client mode".
> So I believe application mode should cover all the functionalities of
> per-job mode except where we are running the user main code.
> In the containerized or the Kubernetes world, the application mode is more
> native and easy to use since all the Flink and user
> jars are bundled in the image. I am also in favor of deprecating and
> removing per-job mode in the long run.
>
> @Ferenc
> IIRC, the YARN application mode could ship user jars and dependencies via
> the "yarn.ship-files" config option. The only
> limitation is that we cannot ship the user dependencies and have them
> loaded by the user classloader instead of the parent classloader.
> FLINK-24897 is trying to fix this by supporting the "usrlib" directory
> automatically.
>
> Best,
> Yang
>
> Ferenc Csaky <fe...@pm.me.invalid> wrote on Tue, Jan 25, 2022 at 22:05:
>
> > Hi Konstantin,
> >
> > First of all, sorry for the delay. We at Cloudera currently rely on
> > per-job mode for deploying Flink applications over YARN.
> >
> > Specifically, we allow users to upload connector jars and other artifacts.
> > There are also some default jars that we need to ship. These are all stored
> > on the local file system of our service’s node. The Flink job is submitted
> > on the users’ behalf by our service, which also specifies the jars to ship.
> > The service runs on a single node, not on all nodes with Flink TM/JM. It
> > would thus be difficult to manage the jars on every node.
> >
> > We are not familiar with the reasoning behind why application mode
> > currently doesn’t ship the user jars, besides the deployment being faster
> > this way. Would it be possible for the application mode to (optionally,
> > enabled by some config) distribute these, or are there some technical
> > limitations?
> >
> > For us it would be crucial to achieve the functionality we have at the
> > moment over YARN. We started to track
> > https://issues.apache.org/jira/browse/FLINK-24897 that Biao Geng
> > mentioned as well.
> >
> > Considering the above, a near-term removal does not sound good to us.
> > We can of course live with this feature being deprecated, but it
> > would be nice to have some time to figure out exactly how we can use
> > Application Mode and make the necessary changes if required.
> >
> > Thank you,
> > F
> >
> > On 2022/01/13 08:30:48 Konstantin Knauf wrote:
> > > Hi everyone,
> > >
> > > I would like to discuss and understand if the benefits of having Per-Job
> > > Mode in Apache Flink outweigh its drawbacks.
> > >
> > >
> > > *# Background: Flink's Deployment Modes*
> > > Flink currently has three deployment modes. They differ in the following
> > > dimensions:
> > > * main() method executed on JobManager or Client
> > > * dependencies shipped by client or bundled with all nodes
> > > * number of jobs per cluster & relationship between job and cluster
> > >   lifecycle
> > > * (supported resource providers)
> > >
> > > ## Application Mode
> > > * main() method executed on JobManager
> > > * dependencies already need to be available on all nodes
> > > * dedicated cluster for all jobs executed from the same main() method
> > >   (Note: applications with more than one job currently still have
> > >   significant limitations, such as missing high availability).
> > >   Technically, a session cluster dedicated to all jobs submitted from
> > >   the same main() method.
> > > * supported by standalone, Native Kubernetes, YARN
> > >
> > > ## Session Mode
> > > * main() method executed in client
> > > * dependencies are distributed from and by the client to all nodes
> > > * cluster is shared by multiple jobs submitted from different clients,
> > >   independent lifecycle
> > > * supported by standalone, Native Kubernetes, YARN
> > >
> > > ## Per-Job Mode
> > > * main() method executed in client
> > > * dependencies are distributed from and by the client to all nodes
> > > * dedicated cluster for a single job
> > > * supported by YARN only
> > >
> > >
> > > *# Reasons to Keep*
> > > * There are use cases where you might need the combination of a single
> > >   job per cluster, but main() method execution in the client. This
> > >   combination is only supported by per-job mode.
> > > * It currently exists. Existing users will need to migrate to either
> > >   session or application mode.
> > >
> > >
> > > *# Reasons to Drop*
> > > * With Per-Job Mode and Application Mode we have two modes that for most
> > >   users probably do the same thing. Specifically, for those users that
> > >   don't care where the main() method is executed and want to submit a
> > >   single job per cluster. Having two ways to do the same thing is
> > >   confusing.
> > > * Per-Job Mode is only supported by YARN anyway. If we keep it, we should
> > >   work towards support in Kubernetes and Standalone, too, to reduce
> > >   special casing.
> > > * Dropping per-job mode would reduce complexity in the code and allow us
> > >   to dedicate more resources to the other two deployment modes.
> > > * I believe with session mode and application mode we have two easily
> > >   distinguishable and understandable deployment modes that cover Flink's
> > >   use cases:
> > >   * session mode: OLAP-style, interactive jobs/queries, short-lived batch
> > >     jobs, very small jobs, traditional cluster-centric deployment mode
> > >     (fits the "Hadoop world")
> > >   * application mode: long-running streaming jobs, large-scale &
> > >     heterogeneous jobs (resource isolation!), application-centric
> > >     deployment mode (fits the "Kubernetes world")
> > >
> > >
> > > *# Call to Action*
> > > * Do you use per-job mode? If so, why & would you be able to migrate to
> > >   one of the other methods?
> > > * Am I missing any pros/cons?
> > > * Are you in favor of dropping per-job mode midterm?
> > >
> > > Cheers and thank you,
> > >
> > > Konstantin
> > >
> > > --
> > >
> > > Konstantin Knauf
> > >
> > > https://twitter.com/snntrable
> > >
> > > https://github.com/knaufk