I am not sure whether a SQL script could also be submitted the way a
Python job is. We will need a sql-runner jar, which acts as the user jar
and takes the SQL script as its argument.

./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.cluster-id=<ClusterId> \
    -Dkubernetes.container.image=<FlinkImageName> \
    --sqlFiles /opt/flink/examples/sql/word_count.sql
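
To make the idea a bit more concrete, here is a rough sketch of what the
main class of such a sql-runner jar might look like. Everything in it is an
assumption for illustration: the class name SqlRunnerSketch, the naive
line-based statement splitting, and the executor callback are hypothetical
and not part of Flink. In a real runner the callback would presumably
delegate to TableEnvironment#executeSql; here it only prints each statement
so that the sketch runs without Flink on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical entry point of a "sql-runner" user jar; not an existing Flink class. */
class SqlRunnerSketch {

    /**
     * Naive splitter: skips blank lines and "--" comment lines, and ends a
     * statement when a line ends with ';'. It does not handle semicolons
     * inside string literals or block comments.
     */
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : script.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("--")) {
                continue;
            }
            current.append(trimmed);
            if (trimmed.endsWith(";")) {
                // Drop the trailing ';' and emit the finished statement.
                statements.add(current.substring(0, current.length() - 1).trim());
                current.setLength(0);
            } else {
                current.append(' ');
            }
        }
        String rest = current.toString().trim();
        if (!rest.isEmpty()) {
            statements.add(rest); // last statement without a terminating ';'
        }
        return statements;
    }

    /** Passes every statement to the executor in order; returns how many were passed. */
    static int run(String script, Consumer<String> executor) {
        List<String> statements = splitStatements(script);
        statements.forEach(executor);
        return statements.size();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: SqlRunnerSketch <path-to-sql-script>");
            return;
        }
        String script =
                new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        // Assumption: a real runner would call tableEnv.executeSql(stmt) here.
        run(script, stmt -> System.out.println("would execute: " + stmt));
    }
}
```

Since main() of the user jar runs on the JobManager in application mode,
the script path would have to point to a file inside the image, as in the
command above.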

Best,
Yang

Jark Wu <imj...@gmail.com> wrote on Wed, Feb 16, 2022 at 20:00:

> I think this mode is still limited and maybe not easy to extend.
> Could the application mode provide an interface to execute?
> So that clients can implement the interface and pass arbitrary parameters
> (e.g. SQL scripts) ?
>
> Best,
> Jark
>
> On Wed, 16 Feb 2022 at 18:54, Konstantin Knauf <kna...@apache.org> wrote:
>
> > Hi Jark,
> >
> > I think you are raising a very good point. I think we need an application
> > mode for SQL that would work along the lines of executing a SQL script
> > (incl. init scripts) located in a particular directory in the Docker
> Image.
> > Details to be discussed.
> >
> > Do you think Zeppelin/SQL CLI could work with such a mode for
> > non-interactive queries (interactive queries would use a session
> cluster)?
> >
> > Best,
> >
> > Konstantin
> >
> >
> > On Sat, Feb 12, 2022 at 4:31 AM Jark Wu <imj...@gmail.com> wrote:
> >
> > > Hi David,
> > >
> > > Zeppelin and SQL CLI also support submitting long-running streaming SQL
> > > jobs. So the session cluster is not a fit mode.
> > >
> > > Best,
> > > Jark
> > >
> > > On Fri, 11 Feb 2022 at 22:42, David Morávek <d...@apache.org> wrote:
> > >
> > > > Hi Jark, can you please elaborate about the current need of the
> per-job
> > > > mode for interactive clients (eg. Zeppelin that you've mentioned)?
> > Aren't
> > > > these a natural fit for the session cluster?
> > > >
> > > > D.
> > > >
> > > > On Fri, Feb 11, 2022 at 3:25 PM Jark Wu <imj...@gmail.com> wrote:
> > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > I'm not very familiar with the implementation of per-job mode and
> > > > > application mode.
> > > > > > But is there any instruction for users about how to migrate
> > > platforms/jobs
> > > > > to application mode?
> > > > > IIUC, the biggest difference between the two modes is where the
> > main()
> > > > > method is executed.
> > > > > However, SQL jobs are not jar applications and don't have the
> main()
> > > > > method.
> > > > > For example, SQL CLI submits SQL jobs by invoking
> > > > > `StreamExecutionEnvironment#executeAsync(StreamGraph)`.
> > > > > > How would SQL Client and SQL platforms (e.g. Zeppelin) support
> application
> > > > mode?
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > >
> > > > > > On Fri, 28 Jan 2022 at 23:33, Konstantin Knauf <kna...@apache.org> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > Thank you for sharing your perspectives. I was not aware of
> > > > > > these limitations of per-job mode on YARN. It seems that there
> is a
> > > > > general
> > > > > > agreement to deprecate per-job mode and to drop it once the
> > > limitations
> > > > > > around YARN are resolved. I've started a corresponding vote in
> [1].
> > > > > >
> > > > > > Thanks again,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > >
> > > > > > [1]
> > https://lists.apache.org/thread/v6oz92dfp95qcox45l0f8393089oyjv4
> > > > > >
> > > > > > > On Fri, Jan 28, 2022 at 1:53 PM Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote:
> > > > > >
> > > > > > > Hi Yang,
> > > > > > >
> > > > > > > Thank you for the clarification. In general I think we will
> have
> > > time
> > > > > to
> > > > > > > experiment with this until it will be removed totally and
> migrate
> > > our
> > > > > > > solution to use application mode.
> > > > > > >
> > > > > > > Regards,
> > > > > > > F
> > > > > > >
> > > > > > > On 2022/01/26 02:42:24 Yang Wang wrote:
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I remember the application mode was initially named "cluster
> > > mode".
> > > > > As
> > > > > > a
> > > > > > > > contrast, the per-job mode is the "client mode".
> > > > > > > > So I believe application mode should cover all the
> > > functionalities
> > > > of
> > > > > > > > per-job except where we are running the user main code.
> > > > > > > > In the containerized or the Kubernetes world, the application
> > > mode
> > > > is
> > > > > > > more
> > > > > > > > native and easy to use since all the Flink and user
> > > > > > > > jars are bundled in the image. I am also in favor of
> > deprecating
> > > > and
> > > > > > > > removing the per-job in the long run.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > @Ferenc
> > > > > > > > IIRC, the YARN application mode could ship user jars and
> > > > dependencies
> > > > > > via
> > > > > > > > "yarn.ship-files" config option. The only
> > > > > > > > > limitation is that we cannot ship the user dependencies and load them
> > > > > > > > > with the user classloader; they are loaded by the parent classloader.
> > > > > > > > FLINK-24897 is trying to fix this via supporting "usrlib"
> > > directory
> > > > > > > > automatically.
> > > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yang
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > Ferenc Csaky <fe...@pm.me.invalid> wrote on Tue, Jan 25, 2022 at 22:05:
> > > > > > > >
> > > > > > > > > Hi Konstantin,
> > > > > > > > >
> > > > > > > > > First of all, sorry for the delay. We at Cloudera are
> > currently
> > > > > > > relying on
> > > > > > > > > per-job mode deploying Flink applications over YARN.
> > > > > > > > >
> > > > > > > > > Specifically, we allow users to upload connector jars and
> > other
> > > > > > > artifacts.
> > > > > > > > > There are also some default jars that we need to ship.
> These
> > > are
> > > > > all
> > > > > > > stored
> > > > > > > > > on the local file system of our service’s node. The Flink
> job
> > > is
> > > > > > > submitted
> > > > > > > > > on the users’ behalf by our service, which also specifies
> the
> > > > jars
> > > > > to
> > > > > > > ship.
> > > > > > > > > The service runs on a single node, not on all nodes with
> > Flink
> > > > > TM/JM.
> > > > > > > It
> > > > > > > > > would thus be difficult to manage the jars on every node.
> > > > > > > > >
> > > > > > > > > We are not familiar with the reasoning behind why
> application
> > > > mode
> > > > > > > > > currently doesn’t ship the user jars, besides the
> deployment
> > > > being
> > > > > > > faster
> > > > > > > > > this way. Would it be possible for the application mode to
> > > > > > (optionally,
> > > > > > > > > enabled by some config) distribute these, or are there some
> > > > > technical
> > > > > > > > > limitations?
> > > > > > > > >
> > > > > > > > > For us it would be crucial to achieve the functionality we
> > have
> > > > at
> > > > > > the
> > > > > > > > > moment over YARN. We started to track
> > > > > > > > > https://issues.apache.org/jira/browse/FLINK-24897 that
> Biao
> > > Geng
> > > > > > > > > mentioned as well.
> > > > > > > > >
> > > > > > > > > > Considering the above, an early removal would not work well for us. We
> > > > > > > > > > can of course live with this feature being deprecated, but it
> > > > > > > > > would be nice to have some time to figure out how we can
> > > utilize
> > > > > > > > > Application Mode exactly and make necessary changes if
> > > required.
> > > > > > > > >
> > > > > > > > > Thank you,
> > > > > > > > > F
> > > > > > > > >
> > > > > > > > > On 2022/01/13 08:30:48 Konstantin Knauf wrote:
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I would like to discuss and understand if the benefits of
> > > > having
> > > > > > > Per-Job
> > > > > > > > > > Mode in Apache Flink outweigh its drawbacks.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > *# Background: Flink's Deployment Modes*
> > > > > > > > > > Flink currently has three deployment modes. They differ
> in
> > > the
> > > > > > > following
> > > > > > > > > > dimensions:
> > > > > > > > > > * main() method executed on Jobmanager or Client
> > > > > > > > > > * dependencies shipped by client or bundled with all
> nodes
> > > > > > > > > > > * number of jobs per cluster & relationship between job and cluster
> > > > > > > > > > > lifecycle
> > > > > > > > > > > * (supported resource providers)
> > > > > > > > > >
> > > > > > > > > > ## Application Mode
> > > > > > > > > > * main() method executed on Jobmanager
> > > > > > > > > > * dependencies already need to be available on all nodes
> > > > > > > > > > * dedicated cluster for all jobs executed from the same
> > > > > > main()-method
> > > > > > > > > > > (Note: applications with more than one job currently still have
> > > > > > > > > > > significant limitations, like missing high-availability.) Technically, a
> > > > > > > > > > > session cluster dedicated to all jobs submitted from the same main()
> > > > > > > > > > > method.
> > > > > > > > > > * supported by standalone, native kubernetes, YARN
> > > > > > > > > >
> > > > > > > > > > ## Session Mode
> > > > > > > > > > * main() method executed in client
> > > > > > > > > > * dependencies are distributed from and by the client to
> > all
> > > > > nodes
> > > > > > > > > > * cluster is shared by multiple jobs submitted from
> > different
> > > > > > > clients,
> > > > > > > > > > independent lifecycle
> > > > > > > > > > * supported by standalone, Native Kubernetes, YARN
> > > > > > > > > >
> > > > > > > > > > ## Per-Job Mode
> > > > > > > > > > * main() method executed in client
> > > > > > > > > > * dependencies are distributed from and by the client to
> > all
> > > > > nodes
> > > > > > > > > > * dedicated cluster for a single job
> > > > > > > > > > * supported by YARN only
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > *# Reasons to Keep*
> > > > > > > > > > > * There are use cases where you might need the combination of a single
> > > > > > > > > > > job per cluster, but main() method execution in the client. This
> > > > > > > > > > > combination is only supported by per-job mode.
> > > > > > > > > > * It currently exists. Existing users will need to
> migrate
> > to
> > > > > > either
> > > > > > > > > > session or application mode.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > *# Reasons to Drop*
> > > > > > > > > > > * With Per-Job Mode and Application Mode we have two modes that for most
> > > > > > > > > > > users probably do the same thing. Specifically, for those users that
> > > > > > > > > > > don't care where the main() method is executed and want to submit a
> > > > > > > > > > > single job per cluster. Having two ways to do the same thing is
> > > > > > > > > > > confusing.
> > > > > > > > > > * Per-Job Mode is only supported by YARN anyway. If we
> keep
> > > it,
> > > > > we
> > > > > > > should
> > > > > > > > > > work towards support in Kubernetes and Standalone, too,
> to
> > > > reduce
> > > > > > > special
> > > > > > > > > > casing.
> > > > > > > > > > * Dropping per-job mode would reduce complexity in the
> code
> > > and
> > > > > > > allow us
> > > > > > > > > to
> > > > > > > > > > dedicate more resources to the other two deployment
> modes.
> > > > > > > > > > > * I believe with session mode and application mode we have two easily
> > > > > > > > > > > distinguishable and understandable deployment modes that cover Flink's
> > > > > > > > > > > use cases:
> > > > > > > > > > * session mode: olap-style, interactive jobs/queries,
> short
> > > > lived
> > > > > > > batch
> > > > > > > > > > jobs, very small jobs, traditional cluster-centric
> > deployment
> > > > > mode
> > > > > > > (fits
> > > > > > > > > > the "Hadoop world")
> > > > > > > > > > * application mode: long-running streaming jobs, large
> > scale
> > > &
> > > > > > > > > > > heterogeneous jobs (resource isolation!),
> > application-centric
> > > > > > > deployment
> > > > > > > > > > mode (fits the "Kubernetes world")
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > *# Call to Action*
> > > > > > > > > > * Do you use per-job mode? If so, why & would you be able
> > to
> > > > > > migrate
> > > > > > > to
> > > > > > > > > one
> > > > > > > > > > of the other methods?
> > > > > > > > > > * Am I missing any pros/cons?
> > > > > > > > > > * Are you in favor of dropping per-job mode midterm?
> > > > > > > > > >
> > > > > > > > > > Cheers and thank you,
> > > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Konstantin Knauf
> > > > > > > > > >
> > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > >
> > > > > > > > > > https://github.com/knaufk
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Konstantin Knauf
> > > > > >
> > > > > > https://twitter.com/snntrable
> > > > > >
> > > > > > https://github.com/knaufk
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable
> >
> > https://github.com/knaufk
> >
>
