Hi, Gyula. Thanks a lot for your response.
> In a production environment the gateway would hold the credentials to the different catalogs and may even contain temporary tables etc.

With FLIP-295[1], the SQL Gateway already supports using catalog stores to initialize catalogs lazily. I think we can fetch the required credentials on demand rather than put all credentials into the JobManager.

I agree we need 2 different runners or designs here. But I don't think we should only support the JSON plan in the sql-gateway, because the JSON plan is much more limited than a script, even if we support shipping the definitions of temporary objects:

1. A JSON plan can only contain one job, but a script can submit multiple jobs. Running multiple jobs in a cluster is useful in batch mode, because users often run a simple query first and use its result like a broadcast variable to speed up the later job. You can refer to the Spark document for more details[2]. In Flink, users may run ANALYZE TABLE[3] to calculate statistics and then use the collected statistics to get a better execution plan for the later job.

2. A JSON plan does not work for some DMLs. For example, a CREATE TABLE AS SELECT statement has no valid JSON plan, because the planner cannot produce a JSON plan when the target table does not exist yet.

Many frameworks support compiling scripts in the cluster, e.g. Spark (Driver) or Trino (Coordinator). I don't think the Flink JM is limited to managing the job only.

Finally, FLIP-480 does not conflict with FLIP-316, and we can support both in the sql-gateway. In our internal implementation, we use a script to initialize the table environment (register the required temporary objects) and then use the JSON plan to create the job. I think we can continue our discussion about the compiled plan in FLIP-316.
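To make point 1 a bit more concrete, here is a rough, self-contained sketch in plain Python. It is not Flink code and not the FLIP-480 driver: the helper names (`split_statements`, `submits_job`, `job_statements`), the table names, and the rule for which statements submit a job are all assumptions made for illustration. It only shows that a single script, walked statement by statement, can contain several job-submitting statements, whereas one compiled plan describes one job.

```python
import re

# Assumption for this sketch only: statements starting with these keywords
# submit a job; SET and plain DDL do not. This is not Flink's parser.
JOB_PREFIXES = ("INSERT", "SELECT", "ANALYZE TABLE")

# CREATE TABLE ... AS SELECT also submits a job, unlike a plain CREATE TABLE.
CTAS_PATTERN = re.compile(
    r"^CREATE\s+TABLE\s+.+\s+AS\s+SELECT\b", re.IGNORECASE | re.DOTALL
)


def split_statements(script):
    """Naively split a script on ';' (ignores semicolons inside literals)."""
    return [part.strip() for part in script.split(";") if part.strip()]


def submits_job(statement):
    """Return True if this toy model treats the statement as a job."""
    normalized = " ".join(statement.upper().split())
    return normalized.startswith(JOB_PREFIXES) or bool(
        CTAS_PATTERN.match(statement)
    )


def job_statements(script):
    """Walk the script one statement at a time and collect the jobs."""
    return [s for s in split_statements(script) if submits_job(s)]


# Hypothetical batch script: collect statistics first, then run the real work.
SCRIPT = """
SET 'execution.runtime-mode' = 'batch';
ANALYZE TABLE orders COMPUTE STATISTICS;
CREATE TABLE big_orders AS SELECT * FROM orders WHERE amount > 100;
INSERT INTO report SELECT id, amount FROM big_orders;
"""

if __name__ == "__main__":
    for job in job_statements(SCRIPT):
        print("job:", job)
```

In this toy model the script yields three jobs (the ANALYZE TABLE, the CREATE TABLE AS SELECT, and the INSERT), which is the case a single JSON plan cannot express.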

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
[2] https://medium.com/@ARishi/broadcast-variables-in-spark-work-like-they-sound-c08097b80ac4
[3] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/analyze/

On Fri, Nov 8, 2024 at 18:11, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hey!
>
> I am a bit concerned about the design for the gateway-based submission model.
>
> I think it's not a good model to access catalog information on the
> JobManager side. In a production environment the gateway would hold the
> credentials to the different catalogs and may even contain temporary tables
> etc.
> The JobManager (user environment) would then need to get access to all the
> catalog credentials and what not, we would also need to ship temporary
> tables from the user's current gateway session somehow as part of the
> script.
>
> I think we should leverage the table environment compilePlan feature to
> generate the plan on the gateway side and execute that on the JM, then
> catalogs wouldn't need to be accessed from the job.
>
> In other words, I think we need 2 different runners/designs. The direct app
> submission should be able to run the script as is, but in the gateway I
> think we need to have a runner based on the compiled sql plan instead.
>
> Cheers,
> Gyula
>
> On Thu, Nov 7, 2024 at 2:49 AM Shengkai Fang <fskm...@gmail.com> wrote:
>
> > Hi, everyone.
> >
> > The discussion hasn't received any response for a while. I will close the
> > discussion and start the vote tomorrow if it doesn't receive
> > any response today. Thanks for all your responses!
> >
> > Best,
> > Shengkai
> >
> > On Tue, Nov 5, 2024 at 14:00, Shengkai Fang <fskm...@gmail.com> wrote:
> >
> > > Hi, Lincoln. Thanks for your response.
> > >
> > > > since both scriptPath and script(statements) can be null, we need to
> > > > clarify the behavior when both are empty, such as throwing an error
> > >
> > > Yes, you are correct. I have updated the FLIP about this. When these
> > > fields are both empty, the server throws an exception to notify users.
> > >
> > > > use a unified name for script vs statements, like 'script'?
> > >
> > > Updated.
> > >
> > > > Regarding Python UDFs, should we change it to a description of
> > > > "Additional Python resources," corresponding to "Additional Jar
> > > > Resources"
> > >
> > > Updated.
> > >
> > > Best,
> > > Shengkai
> > >
> > > On Tue, Nov 5, 2024 at 11:17, Lincoln Lee <lincoln.8...@gmail.com> wrote:
> > >
> > > > Thanks Shengkai for driving this! Overall, looks good!
> > > >
> > > > I have two minor questions:
> > > > 1. Regarding the interface parameters (including REST API
> > > > & Java interfaces), since both scriptPath and script(statements)
> > > > can be null, we need to clarify the behavior when both are
> > > > empty, such as throwing an error?
> > > > Also use a unified name for script vs statements, like 'script'?
> > > >
> > > > 2. Regarding Python UDFs, should we change it to a
> > > > description of "Additional Python resources," corresponding
> > > > to "Additional Jar Resources"?
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > > On Tue, Nov 5, 2024 at 10:16, Shengkai Fang <fskm...@gmail.com> wrote:
> > > >
> > > > > Hi, Ferenc.
> > > > >
> > > > > Thanks for your clarification. We can hard-code these different options
> > > > > in the sql-gateway module. I have updated the FLIP and PoC branch for
> > > > > this part. But I think we should provide a unified API to ship artifacts
> > > > > to different deployments.
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > > On Mon, Nov 4, 2024 at 21:05, Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote:
> > > > >
> > > > > > Hi Shengkai,
> > > > > >
> > > > > > Thank you for driving this FLIP!
> > > > > > I think this is a good way to close this gap in the short term
> > > > > > until FLIP-316 can be finished.
> > > > > >
> > > > > > I would only like to add one thing: YARN has a `yarn.ship-files`
> > > > > > config option that ships local or DFS files/directories to the
> > > > > > YARN cluster [1].
> > > > > >
> > > > > > Best,
> > > > > > Ferenc
> > > > > >
> > > > > > [1] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/deployment/config/
> > > > > >
> > > > > > On Monday, November 4th, 2024 at 10:11, Xuyang <xyzhong...@163.com> wrote:
> > > > > >
> > > > > > > Hi, Shengkai.
> > > > > > >
> > > > > > > Thank you for your answer. I have no further questions.
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Best!
> > > > > > > Xuyang
> > > > > > >
> > > > > > > At 2024-11-04 10:00:32, "Shengkai Fang" fskm...@gmail.com wrote:
> > > > > > >
> > > > > > > > Hi, Xuyang. Thanks a lot for your response!
> > > > > > > >
> > > > > > > > > Does that mean we will support multi DMLs, multi DQLs, mixed DMLs &
> > > > > > > > > DQLs in one sql script?
> > > > > > > >
> > > > > > > > According to the doc[1], application mode only supports one job in HA
> > > > > > > > mode[2]. If users submit multiple jobs, the dispatcher throws a
> > > > > > > > DuplicateJobSubmissionException to notify users.
> > > > > > > >
> > > > > > > > In non-HA mode, application mode has no limit on the number of jobs.
> > > > > > > > The SQL driver runs the statements one by one, which is similar to
> > > > > > > > submitting jobs to a session cluster.
> > > > > > > > But just as the doc says, when any of multiple
> > > > > > > > running jobs in Application Mode (submitted for example using
> > > > > > > > executeAsync()) gets cancelled, all jobs will be stopped and the
> > > > > > > > JobManager will shut down.
> > > > > > > >
> > > > > > > > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> > > > > > > > [2] https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L218
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Shengkai
> > > > > > > >
> > > > > > > > On Thu, Oct 31, 2024 at 17:10, Xuyang xyzhong...@163.com wrote:
> > > > > > > >
> > > > > > > > > Hi, Shengkai.
> > > > > > > > >
> > > > > > > > > Thanks for driving this great work. LGTM overall, I just have one
> > > > > > > > > question.
> > > > > > > > >
> > > > > > > > > IIUC, application mode supports running multiple executes in a single
> > > > > > > > > `main` function[1]. Does that mean we will support multi DMLs, multi
> > > > > > > > > DQLs, mixed DMLs & DQLs in one sql script? If yes, can you explain
> > > > > > > > > a little about how they work?
> > > > > > > > >
> > > > > > > > > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Best!
> > > > > > > > > Xuyang
> > > > > > > > >
> > > > > > > > > At 2024-10-31 10:18:13, "Ron Liu" ron9....@gmail.com wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Shengkai
> > > > > > > > > >
> > > > > > > > > > Thanks for your quick response. It looks good to me.
> > > > > > > > > >
> > > > > > > > > > Best
> > > > > > > > > > Ron
> > > > > > > > > >
> > > > > > > > > > On Thu, Oct 31, 2024 at 10:08, Shengkai Fang fskm...@gmail.com wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Ron!
> > > > > > > > > > >
> > > > > > > > > > > > I noticed that you say this FLIP focuses on supporting deploy sql
> > > > > > > > > > > > scripts to the application cluster, does it mean that it only
> > > > > > > > > > > > supports non-interactive gateway mode?
> > > > > > > > > > >
> > > > > > > > > > > Yes. This FLIP only supports deploying a script in non-interactive
> > > > > > > > > > > mode.
> > > > > > > > > > >
> > > > > > > > > > > > Whether all SQL commands such as DDL & DML & SELECT are supported.
> > > > > > > > > > >
> > > > > > > > > > > We support all SQL commands, and the execution results are visible in
> > > > > > > > > > > the JM log. But the application cluster has the limitation that only
> > > > > > > > > > > one job is allowed to run in the dedicated cluster.
> > > > > > > > > > >
> > > > > > > > > > > > How to dynamically download the JAR specified by the user when
> > > > > > > > > > > > submitting the sql script, and whether it is possible to specify a
> > > > > > > > > > > > local jar?
> > > > > > > > > > >
> > > > > > > > > > > This is a good question. I think it's totally up to the deployment
> > > > > > > > > > > API. For example, the Kubernetes deployment provides the option
> > > > > > > > > > > `kubernetes-artifacts-local-upload-enabled`[1] to upload the artifact
> > > > > > > > > > > to the DFS, but the YARN deployment doesn't support shipping the
> > > > > > > > > > > artifacts to DFS in application mode.
> > > > > > > > > > > If the runtime API can provide a unified interface, I think
> > > > > > > > > > > we can use the unified API to upload local artifacts.
> > > > > > > > > > > Alternatively, we can provide a special service that allows the
> > > > > > > > > > > sql-gateway to support pulling jars.
> > > > > > > > > > > You can read the future work for more details.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-artifacts-local-upload-enabled
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Oct 31, 2024 at 09:30, Shengkai Fang fskm...@gmail.com wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Feng!
> > > > > > > > > > > >
> > > > > > > > > > > > > if only clusterID is available, it may not be very convenient to
> > > > > > > > > > > > > connect to this application later on.
> > > > > > > > > > > >
> > > > > > > > > > > > If FLIP-479 is accepted, I think we can just adapt the sql-gateway
> > > > > > > > > > > > behaviour to the behaviour that FLIP-479 describes.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Shengkai