Re: Re: [DISCUSS] Future of Per-Job Mode

Jark Wu Wed, 16 Feb 2022 04:00:02 -0800

I think this mode would still be limited and probably not easy to extend.
Could the application mode provide an interface to execute instead,
so that clients can implement the interface and pass arbitrary parameters
(e.g. SQL scripts)?
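
To make the idea concrete, here is a minimal sketch in plain Java of what
such an interface could look like. Nothing in it is existing Flink API:
`ApplicationEntryPoint`, `SqlScriptEntryPoint` and the `sql.script`
parameter key are hypothetical names chosen for illustration, and the
statement splitting is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical contract that an application-mode cluster could call instead of a jar's main(). */
interface ApplicationEntryPoint {
    /** Receives whatever parameters the client passed along and returns the statements it would run. */
    List<String> execute(Map<String, String> parameters);
}

/** Hypothetical implementation that treats one parameter as a semicolon-separated SQL script. */
final class SqlScriptEntryPoint implements ApplicationEntryPoint {
    /** Assumed parameter key; not an existing Flink config option. */
    static final String SCRIPT_KEY = "sql.script";

    @Override
    public List<String> execute(Map<String, String> parameters) {
        String script = parameters.getOrDefault(SCRIPT_KEY, "");
        // A real implementation would hand each statement to a table environment here.
        // This sketch only returns the parsed statements.
        return splitStatements(script);
    }

    /** Naive split on ';'. It does not handle semicolons inside string literals or comments. */
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }
}
```

With something along these lines, a SQL platform would only need to ship
the script text as a parameter, and the cluster side would decide how to
execute it.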


Best,
Jark

On Wed, 16 Feb 2022 at 18:54, Konstantin Knauf <kna...@apache.org> wrote:

> Hi Jark,
>
> I think you are raising a very good point. I think we need an application
> mode for SQL that would work along the lines of executing a SQL script
> (incl. init scripts) located in a particular directory in the Docker Image.
> Details to be discussed.
>
> Do you think Zeppelin/SQL CLI could work with such a mode for
> non-interactive queries (interactive queries would use a session cluster)?
>
> Best,
>
> Konstantin
>
>
> On Sat, Feb 12, 2022 at 4:31 AM Jark Wu <imj...@gmail.com> wrote:
>
> > Hi David,
> >
> > Zeppelin and SQL CLI also support submitting long-running streaming SQL
> > jobs. So the session cluster is not a good fit for them.
> >
> > Best,
> > Jark
> >
> > On Fri, 11 Feb 2022 at 22:42, David Morávek <d...@apache.org> wrote:
> >
> > > Hi Jark, can you please elaborate on the current need for per-job
> > > mode for interactive clients (e.g. Zeppelin, which you mentioned)?
> > > Aren't these a natural fit for the session cluster?
> > >
> > > D.
> > >
> > > On Fri, Feb 11, 2022 at 3:25 PM Jark Wu <imj...@gmail.com> wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > I'm not very familiar with the implementation of per-job mode and
> > > > application mode.
> > > > But are there any instructions for users about how to migrate
> > > > platforms/jobs to application mode?
> > > > IIUC, the biggest difference between the two modes is where the
> > > > main() method is executed.
> > > > However, SQL jobs are not jar applications and don't have a main()
> > > > method.
> > > > For example, SQL CLI submits SQL jobs by invoking
> > > > `StreamExecutionEnvironment#executeAsync(StreamGraph)`.
> > > > How would SQL Client and SQL platforms (e.g. Zeppelin) support
> > > > application mode?
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > > On Fri, 28 Jan 2022 at 23:33, Konstantin Knauf <kna...@apache.org> wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > Thank you for sharing your perspectives. I was not aware of
> > > > > these limitations of per-job mode on YARN. It seems that there is
> > > > > general agreement to deprecate per-job mode and to drop it once the
> > > > > limitations around YARN are resolved. I've started a corresponding
> > > > > vote in [1].
> > > > >
> > > > > Thanks again,
> > > > >
> > > > > Konstantin
> > > > >
> > > > >
> > > > > [1] https://lists.apache.org/thread/v6oz92dfp95qcox45l0f8393089oyjv4
> > > > >
> > > > > On Fri, Jan 28, 2022 at 1:53 PM Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote:
> > > > >
> > > > > > Hi Yang,
> > > > > >
> > > > > > Thank you for the clarification. In general, I think we will have
> > > > > > time to experiment with this before it is removed completely, and
> > > > > > to migrate our solution to application mode.
> > > > > >
> > > > > > Regards,
> > > > > > F
> > > > > >
> > > > > > On 2022/01/26 02:42:24 Yang Wang wrote:
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I remember the application mode was initially named "cluster
> > > > > > > mode". By contrast, the per-job mode is the "client mode".
> > > > > > > So I believe application mode should cover all the functionality
> > > > > > > of per-job mode, except for where the user main code runs.
> > > > > > > In the containerized or Kubernetes world, application mode is
> > > > > > > more native and easier to use, since all the Flink and user
> > > > > > > jars are bundled in the image. I am also in favor of deprecating
> > > > > > > and removing per-job mode in the long run.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > @Ferenc
> > > > > > > IIRC, the YARN application mode can ship user jars and
> > > > > > > dependencies via the "yarn.ship-files" config option. The only
> > > > > > > limitation is that the shipped user dependencies cannot be
> > > > > > > loaded with the user classloader; they are loaded by the parent
> > > > > > > classloader instead.
> > > > > > > FLINK-24897 is trying to fix this by supporting the "usrlib"
> > > > > > > directory automatically.
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yang
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Ferenc Csaky <fe...@pm.me.invalid> wrote on Tue, Jan 25, 2022 at 22:05:
> > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > >
> > > > > > > > First of all, sorry for the delay. We at Cloudera currently
> > > > > > > > rely on per-job mode for deploying Flink applications on YARN.
> > > > > > > >
> > > > > > > > Specifically, we allow users to upload connector jars and
> > > > > > > > other artifacts. There are also some default jars that we need
> > > > > > > > to ship. These are all stored on the local file system of our
> > > > > > > > service’s node. The Flink job is submitted on the users’
> > > > > > > > behalf by our service, which also specifies the jars to ship.
> > > > > > > > The service runs on a single node, not on all nodes with Flink
> > > > > > > > TM/JM. It would thus be difficult to manage the jars on every
> > > > > > > > node.
> > > > > > > >
> > > > > > > > We are not familiar with the reasoning behind why application
> > > > > > > > mode currently doesn’t ship the user jars, besides the
> > > > > > > > deployment being faster this way. Would it be possible for the
> > > > > > > > application mode to (optionally, enabled by some config)
> > > > > > > > distribute these, or are there some technical limitations?
> > > > > > > >
> > > > > > > > For us it would be crucial to keep the functionality we have
> > > > > > > > at the moment on YARN. We have started to track
> > > > > > > > https://issues.apache.org/jira/browse/FLINK-24897, which Biao
> > > > > > > > Geng mentioned as well.
> > > > > > > >
> > > > > > > > Considering the above, a removal in the near future does not
> > > > > > > > sound good to us. We can of course live with this feature
> > > > > > > > being deprecated, but it would be nice to have some time to
> > > > > > > > figure out exactly how we can use Application Mode and to make
> > > > > > > > the necessary changes if required.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > F
> > > > > > > >
> > > > > > > > On 2022/01/13 08:30:48 Konstantin Knauf wrote:
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > I would like to discuss and understand if the benefits of
> > > > > > > > > having Per-Job Mode in Apache Flink outweigh its drawbacks.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > *# Background: Flink's Deployment Modes*
> > > > > > > > > Flink currently has three deployment modes. They differ in
> > > > > > > > > the following dimensions:
> > > > > > > > > * main() method executed on Jobmanager or Client
> > > > > > > > > * dependencies shipped by client or bundled with all nodes
> > > > > > > > > * number of jobs per cluster & relationship between job and
> > > > > > > > > cluster lifecycle
> > > > > > > > > * (supported resource providers)
> > > > > > > > >
> > > > > > > > > ## Application Mode
> > > > > > > > > * main() method executed on Jobmanager
> > > > > > > > > * dependencies already need to be available on all nodes
> > > > > > > > > * dedicated cluster for all jobs executed from the same
> > > > > > > > > main() method (Note: applications with more than one job
> > > > > > > > > currently still have significant limitations, such as
> > > > > > > > > missing high availability). Technically, a session cluster
> > > > > > > > > dedicated to all jobs submitted from the same main() method.
> > > > > > > > > * supported by standalone, Native Kubernetes, YARN
> > > > > > > > >
> > > > > > > > > ## Session Mode
> > > > > > > > > * main() method executed in client
> > > > > > > > > * dependencies are distributed from and by the client to
> > > > > > > > > all nodes
> > > > > > > > > * cluster is shared by multiple jobs submitted from
> > > > > > > > > different clients, independent lifecycle
> > > > > > > > > * supported by standalone, Native Kubernetes, YARN
> > > > > > > > >
> > > > > > > > > ## Per-Job Mode
> > > > > > > > > * main() method executed in client
> > > > > > > > > * dependencies are distributed from and by the client to
> > > > > > > > > all nodes
> > > > > > > > > * dedicated cluster for a single job
> > > > > > > > > * supported by YARN only
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > *# Reasons to Keep*
> > > > > > > > > * There are use cases where you might need the combination
> > > > > > > > > of a single job per cluster, but main() method execution in
> > > > > > > > > the client. This combination is only supported by per-job
> > > > > > > > > mode.
> > > > > > > > > * It currently exists. Existing users will need to migrate
> > > > > > > > > to either session or application mode.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > *# Reasons to Drop*
> > > > > > > > > * With Per-Job Mode and Application Mode we have two modes
> > > > > > > > > that for most users probably do the same thing, specifically
> > > > > > > > > for those users who don't care where the main() method is
> > > > > > > > > executed and want to submit a single job per cluster. Having
> > > > > > > > > two ways to do the same thing is confusing.
> > > > > > > > > * Per-Job Mode is only supported by YARN anyway. If we keep
> > > > > > > > > it, we should work towards support in Kubernetes and
> > > > > > > > > Standalone, too, to reduce special casing.
> > > > > > > > > * Dropping per-job mode would reduce complexity in the code
> > > > > > > > > and allow us to dedicate more resources to the other two
> > > > > > > > > deployment modes.
> > > > > > > > > * I believe with session mode and application mode we have
> > > > > > > > > two easily distinguishable and understandable deployment
> > > > > > > > > modes that cover Flink's use cases:
> > > > > > > > >   * session mode: OLAP-style, interactive jobs/queries,
> > > > > > > > >     short-lived batch jobs, very small jobs, traditional
> > > > > > > > >     cluster-centric deployment mode (fits the "Hadoop world")
> > > > > > > > >   * application mode: long-running streaming jobs,
> > > > > > > > >     large-scale & heterogeneous jobs (resource isolation!),
> > > > > > > > >     application-centric deployment mode (fits the
> > > > > > > > >     "Kubernetes world")
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > *# Call to Action*
> > > > > > > > > * Do you use per-job mode? If so, why & would you be able to
> > > > > > > > > migrate to one of the other methods?
> > > > > > > > > * Am I missing any pros/cons?
> > > > > > > > > * Are you in favor of dropping per-job mode midterm?
> > > > > > > > >
> > > > > > > > > Cheers and thank you,
> > > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Konstantin Knauf
> > > > > > > > >
> > > > > > > > > https://twitter.com/snntrable
> > > > > > > > >
> > > > > > > > > https://github.com/knaufk
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Konstantin Knauf
> > > > >
> > > > > https://twitter.com/snntrable
> > > > >
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
>
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>
