Hi, Gyula.

Thanks a lot for your response.

> In a production environment the gateway would hold the credentials to the
> different catalogs and may even contain temporary tables etc.

With FLIP-295[1], the SQL Gateway already supports using catalog stores to
initialize catalogs lazily. I think we can fetch the required credentials
on demand rather than putting all credentials into the JobManager.

I agree we need two different runners or designs here. But I don't think we
should support only the JSON plan in the sql-gateway, because a JSON plan is
much more limited than a script, even if we support shipping the temporary
object definitions:

1. A JSON plan can contain only one job, but a script can submit multiple
jobs. Running multiple jobs in a cluster is useful in batch mode because
users often run a simple query first and use its result as a broadcast
variable to speed up a later job. See [2] for how this works in Spark.
In Flink, users may run ANALYZE TABLE[3] to collect statistics and then use
the collected statistics to get a better execution plan for the later job.

2. A JSON plan does not work for some DMLs. For example, the CREATE TABLE AS
SELECT syntax doesn't have a valid JSON plan because the planner doesn't
support generating a JSON plan if the target table is not present.
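
To make point 1 concrete, here is a toy sketch in plain Python (not Flink
code). It splits a script on semicolons and counts the statements that I
assume would each submit at least one job: INSERT, SELECT, ANALYZE TABLE and
CREATE TABLE ... AS SELECT. The function names, the table names and that
classification are made up for illustration only; the real planner does not
work this way.

```python
import re
from typing import List

# Hypothetical script: one DDL followed by three job-submitting statements.
SCRIPT = """
CREATE TEMPORARY TABLE orders (id INT) WITH ('connector' = 'datagen');
ANALYZE TABLE orders COMPUTE STATISTICS;
CREATE TABLE orders_copy WITH ('connector' = 'blackhole') AS SELECT * FROM orders;
INSERT INTO sink SELECT id FROM orders
"""


def split_statements(script: str) -> List[str]:
    """Naive split on ';' (ignores semicolons inside string literals)."""
    return [part.strip() for part in script.split(";") if part.strip()]


def submits_job(statement: str) -> bool:
    """Assumption: only these statement kinds submit a job when executed."""
    normalized = re.sub(r"\s+", " ", statement.strip()).upper()
    if normalized.startswith("CREATE TABLE"):
        # Plain DDL runs no query; CREATE TABLE ... AS SELECT does.
        return " AS SELECT " in normalized + " "
    return normalized.startswith(("INSERT ", "SELECT ", "ANALYZE TABLE "))


def count_jobs(script: str) -> int:
    """Number of statements in the script assumed to submit a job."""
    return sum(submits_job(s) for s in split_statements(script))


if __name__ == "__main__":
    print(count_jobs(SCRIPT))
```

Under these assumptions the script above yields three job-submitting
statements, which is more than a single JSON plan can describe.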

Many frameworks support compiling scripts in the cluster, e.g. Spark
(Driver) or Trino (Coordinator). I don't think the Flink JM is limited to
managing the job only.

Finally, FLIP-480 does not conflict with FLIP-316, and we can support both
in the sql-gateway. In our internal implementation, we use a script to
initialize the table environment (registering the required temporary
objects) and then use the JSON plan to create the job. I think we can
continue our discussion about the compiled plan in FLIP-316.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
[2]
https://medium.com/@ARishi/broadcast-variables-in-spark-work-like-they-sound-c08097b80ac4
[3]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/analyze/

On Fri, Nov 8, 2024 at 18:11, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hey!
>
> I am a bit concerned about the design for gateway based submission model.
>
> I think it's not a good model to access catalog information on the
> JobManager side. In a production environment the gateway would hold the
> credentials to the different catalogs and may even contain temporary tables
> etc.
> The JobManager (user environment) would then need to get access to all the
> catalog credentials and what not, we would also need to ship temporary
> tables from the users current gateway session somehow as part of the
> script.
>
> I think we should leverage the table environment compilePlan feature to
> generate the plan on the gateway side and execute that on the JM, then
> catalogs wouldn't need to be accessed from the job.
>
> In other words, I think we need 2 different runners/designs. The direct app
> submission should be able to run the script as is, but in the gateway I
> think we need to have a runner based on the compiled sql plan instead.
>
> Cheers,
> Gyula
>
>
> On Thu, Nov 7, 2024 at 2:49 AM Shengkai Fang <fskm...@gmail.com> wrote:
>
> > Hi, everyone.
> >
> > The discussion hasn't received any response for a while. I will close the
> > discussion and start the vote tomorrow if there are no further responses
> > today. Thanks for all your responses!
> >
> > Best,
> > Shengkai
> >
> > On Tue, Nov 5, 2024 at 14:00, Shengkai Fang <fskm...@gmail.com> wrote:
> >
> > > Hi, Lincoln. Thanks for your response.
> > >
> > > > since both scriptPath and script(statements) can be null, we need to
> > > > clarify the behavior when both are empty, such as throwing an error
> > >
> > > Yes, you are correct. I have updated the FLIP about this. When these
> > > fields are both empty, the server throws an exception to notify users.
> > >
> > > > use a unified name for script vs statements, like 'script'?
> > >
> > > Updated.
> > >
> > > > Regarding Python UDFs, should we change it to a description of
> > > > "Additional Python resources," corresponding to "Additional Jar
> > > > Resources"
> > >
> > > Updated.
> > >
> > > Best,
> > > Shengkai
> > >
> > >
> > >
> > > > On Tue, Nov 5, 2024 at 11:17, Lincoln Lee <lincoln.8...@gmail.com> wrote:
> > >
> > > >> Thanks Shengkai for driving this! Overall, looks good!
> > >>
> > >> I have two minor questions:
> > >> 1. Regarding the interface parameters (including REST API
> > >> & Java interfaces), since both scriptPath and script(statements)
> > >> can be null, we need to clarify the behavior when both are
> > >> empty, such as throwing an error?
> > >> Also use a unified name for script vs statements, like 'script'?
> > >>
> > >> 2. Regarding Python UDFs, should we change it to a
> > >> description of "Additional Python resources," corresponding
> > >> to "Additional Jar Resources"?
> > >>
> > >>
> > >> Best,
> > >> Lincoln Lee
> > >>
> > >>
> > >> On Tue, Nov 5, 2024 at 10:16, Shengkai Fang <fskm...@gmail.com> wrote:
> > >>
> > >> > Hi, Ferenc.
> > >> >
> > >> > Thanks for your clarification. We can hard code these different options
> > >> > in the sql-gateway module. I have updated the FLIP and PoC branch about
> > >> > this part. But I think we should provide a unified API to ship artifacts
> > >> > to different deployments.
> > >> >
> > >> > Best,
> > >> > Shengkai
> > >> >
> > >> >
> > >> >
> > >> > On Mon, Nov 4, 2024 at 21:05, Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote:
> > >> >
> > >> > > Hi Shengkai,
> > >> > >
> > >> > > Thank you for driving this FLIP! I think this is a good way to
> > >> > > close this gap on the short-term until FLIP-316 can be finished.
> > >> > >
> > >> > > I would only like to add one thing: YARN has a `yarn.ship-files`
> > >> > > config option that ships local or DFS files/directories to the
> > >> > > YARN cluster [1].
> > >> > >
> > >> > > Best,
> > >> > > Ferenc
> > >> > >
> > >> > > [1]
> > >> > > https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/deployment/config/
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Monday, November 4th, 2024 at 10:11, Xuyang <xyzhong...@163.com> wrote:
> > >> > >
> > >> > > >
> > >> > > >
> > >> > > > Hi, Shengkai.
> > >> > > >
> > >> > > > Thank you for your answer. I have no further questions.
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > >
> > >> > > > Best!
> > >> > > > Xuyang
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > At 2024-11-04 10:00:32, "Shengkai Fang" fskm...@gmail.com wrote:
> > >> > > >
> > >> > > > > Hi, Xuyang. Thanks a lot for your response!
> > >> > > > >
> > >> > > > > > Does that mean we will support multi DMLs, multi DQLs, mixed DMLs &
> > >> > > > > > DQLs in one sql script?
> > >> > > > >
> > >> > > > > According to the doc[1], application mode only supports one job in HA
> > >> > > > > mode[2]. If users submit multiple jobs, the dispatcher throws a
> > >> > > > > DuplicateJobSubmissionException to notify users.
> > >> > > > >
> > >> > > > > In non-HA mode, application mode has no limit on the number of jobs.
> > >> > > > > The SQL driver runs the statements one by one, which is similar to
> > >> > > > > submitting jobs to a session cluster. But just as the doc says, when
> > >> > > > > any of multiple running jobs in Application Mode (submitted for example
> > >> > > > > using executeAsync()) gets cancelled, all jobs will be stopped and the
> > >> > > > > JobManager will shut down.
> > >> > > > >
> > >> > > > > [1]
> > >> > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> > >> > > > > [2]
> > >> > > > > https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L218
> > >> > > > >
> > >> > > > > Best,
> > >> > > > > Shengkai
> > >> > > > >
> > >> > > > > On Thu, Oct 31, 2024 at 17:10, Xuyang xyzhong...@163.com wrote:
> > >> > > > >
> > >> > > > > > Hi, Shengkai.
> > >> > > > > >
> > >> > > > > > Thanks for driving this great work. LGTM overall, I just have one
> > >> > > > > > question.
> > >> > > > > >
> > >> > > > > > IIUC, application mode supports running multiple executions in a single
> > >> > > > > > `main` function[1]. Does that mean we will support multi DMLs, multi
> > >> > > > > > DQLs, mixed DMLs & DQLs in one sql script? If yes, can you explain a
> > >> > > > > > little about how they work?
> > >> > > > > >
> > >> > > > > > [1]
> > >> > > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > >
> > >> > > > > > Best!
> > >> > > > > > Xuyang
> > >> > > > > >
> > >> > > > > > At 2024-10-31 10:18:13, "Ron Liu" ron9....@gmail.com wrote:
> > >> > > > > >
> > >> > > > > > > Hi, Shengkai
> > >> > > > > > >
> > >> > > > > > > Thanks for your quick response. It looks good to me.
> > >> > > > > > >
> > >> > > > > > > Best
> > >> > > > > > > Ron
> > >> > > > > > >
> > >> > > > > > > On Thu, Oct 31, 2024 at 10:08, Shengkai Fang fskm...@gmail.com wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi, Ron!
> > >> > > > > > > >
> > >> > > > > > > > > I noticed that you say this FLIP focuses on supporting deploy sql
> > >> > > > > > > > > scripts to the application cluster, does it mean that it only
> > >> > > > > > > > > supports non-interactive gateway mode?
> > >> > > > > > > >
> > >> > > > > > > > Yes. This FLIP only supports deploying a script in non-interactive
> > >> > > > > > > > mode.
> > >> > > > > > > >
> > >> > > > > > > > > Whether all SQL commands such as DDL & DML & SELECT are supported.
> > >> > > > > > > >
> > >> > > > > > > > We support all SQL commands, and the execution results are visible in
> > >> > > > > > > > the JM log. But an application cluster has the limitation that only
> > >> > > > > > > > one job is allowed to run in the dedicated cluster.
> > >> > > > > > > >
> > >> > > > > > > > > How to dynamically download the JAR specified by the user when
> > >> > > > > > > > > submitting the sql script, and whether it is possible to specify a
> > >> > > > > > > > > local jar?
> > >> > > > > > > >
> > >> > > > > > > > This is a good question. I think it's totally up to the deployment
> > >> > > > > > > > API. For example, the Kubernetes deployment provides the option
> > >> > > > > > > > `kubernetes-artifacts-local-upload-enabled`[1] to upload the artifact
> > >> > > > > > > > to the DFS, but the YARN deployment doesn't support shipping
> > >> > > > > > > > artifacts to DFS in application mode. If the runtime API can provide
> > >> > > > > > > > a unified interface, I think we can use the unified API to upload
> > >> > > > > > > > local artifacts. Alternatively, we can provide a special service
> > >> > > > > > > > that allows the sql-gateway to support pulling jars. You can read
> > >> > > > > > > > the future work section for more details.
> > >> > > > > > > >
> > >> > > > > > > > [1]
> > >> > > > > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-artifacts-local-upload-enabled
> > >> > > > > >
> > >> > > > > > > > On Thu, Oct 31, 2024 at 09:30, Shengkai Fang fskm...@gmail.com wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Hi, Feng!
> > >> > > > > > > > >
> > >> > > > > > > > > > if only clusterID is available, it may not be very convenient to
> > >> > > > > > > > > > connect to this application later on.
> > >> > > > > > > > >
> > >> > > > > > > > > If FLIP-479 is accepted, I think we can just adapt the sql-gateway
> > >> > > > > > > > > behaviour to the behaviour that FLIP-479 mentioned.
> > >> > > > > > > > >
> > >> > > > > > > > > Best,
> > >> > > > > > > > > Shengkai
