Hey! My point regarding the credentials wasn't about lazy/non-lazy catalog initialization. The problem is that JM may run in an env where you don't necessarily want to have the same credentials at all.
> I agree we need 2 different runners or designs here. But I don't think we
> should only support the json plan in the sql-gateway, because the json plan
> is much more limited than a script, even though we support shipping the
> temporary object definitions:

It seems like there is a lot of overlapping / conflicting design here with
FLIP-316. It would be great to have non-overlapping scopes for these FLIPs,
because it's going to be strange to implement FLIP-480 first and then
completely separate logic in FLIP-316 for the same thing.

I think this FLIP (FLIP-480) should not cover the SQL Gateway and probably
should not change it in any way. The gateway-related changes should be
consolidated in FLIP-316 and implemented as part of that for Flink 2.0.

Cheers,
Gyula

On Mon, Nov 11, 2024 at 4:26 AM Shengkai Fang <fskm...@gmail.com> wrote:

> Hi, Gyula.
>
> Thanks a lot for your response.
>
> > In a production environment the gateway would hold the credentials to the
> > different catalogs and may even contain temporary tables etc.
>
> In FLIP-295[1], the SQL Gateway already supports using catalog stores to
> initialize catalogs lazily. I think we can fetch the required credentials
> on demand rather than put all credentials into the JobManager.
>
> I agree we need 2 different runners or designs here. But I don't think we
> should only support the json plan in the sql-gateway, because the json plan
> is much more limited than a script, even though we support shipping the
> temporary object definitions:
>
> 1. A json plan can only contain one job, but a script can submit multiple
> jobs. Running multiple jobs in a cluster is useful in batch mode because
> users usually run a simple query as broadcast variables to speed up a later
> job. You can refer to the Spark documentation for more details[2].
> In Flink, users may run ANALYZE TABLE[3] to calculate statistics and then
> use the collected statistics to get a better execution plan for a later job.
>
> 2. The json plan does not work for some DMLs. For example, the CREATE TABLE
> AS SELECT syntax doesn't have a valid json plan, because the planner doesn't
> support getting a json plan if the target table is not present.
>
> Many frameworks support compiling scripts in the cluster, e.g. Spark
> (Driver) or Trino (Coordinator). I don't think the Flink JM is limited to
> managing jobs only.
>
> Finally, FLIP-480 does not conflict with FLIP-316, and we can support both
> in the sql-gateway. In our internal implementation, we use the script to
> initialize the table environment (register the required temporary objects)
> and then use the json plan to create the job. I think we can continue our
> discussion about the compiled plan in FLIP-316.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> [2]
> https://medium.com/@ARishi/broadcast-variables-in-spark-work-like-they-sound-c08097b80ac4
> [3]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/analyze/
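For concreteness, Shengkai's first point above (a script that runs ANALYZE TABLE as one job and then a statement that benefits from the collected statistics as a second job) could look roughly like the sketch below. The table names and the batch TableEnvironment setup are assumptions made for the illustration, not anything defined by FLIP-480.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

// Illustration of a two-job batch script: ANALYZE TABLE collects statistics
// as its own job, and the following INSERT is planned with those statistics.
// The tables "orders" and "daily_revenue" are assumed to be registered.
public class MultiJobScriptExample {

    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());

        // Job 1: compute table and column statistics for the source table.
        tEnv.executeSql("ANALYZE TABLE orders COMPUTE STATISTICS FOR ALL COLUMNS");

        // Job 2: the actual query; the planner can now use the collected
        // statistics when it chooses the execution plan.
        tEnv.executeSql(
                        "INSERT INTO daily_revenue "
                                + "SELECT order_date, SUM(amount) FROM orders GROUP BY order_date")
                .await();
    }
}
```

A single JSON plan describes exactly one of these jobs, which is the limitation point 1 is getting at.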
> On Fri, Nov 8, 2024 at 18:11, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey!
> >
> > I am a bit concerned about the design of the gateway-based submission
> > model.
> >
> > I think it's not a good model to access catalog information on the
> > JobManager side. In a production environment the gateway would hold the
> > credentials to the different catalogs and may even contain temporary
> > tables etc. The JobManager (user environment) would then need to get
> > access to all the catalog credentials and whatnot, and we would also need
> > to ship temporary tables from the user's current gateway session somehow
> > as part of the script.
> >
> > I think we should leverage the table environment compilePlan feature to
> > generate the plan on the gateway side and execute that on the JM; then
> > catalogs wouldn't need to be accessed from the job.
> >
> > In other words, I think we need 2 different runners/designs. The direct
> > app submission should be able to run the script as is, but in the gateway
> > I think we need a runner based on the compiled SQL plan instead.
> >
> > Cheers,
> > Gyula
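The compilePlan split Gyula describes above maps onto the Table API's compiled-plan support. Below is a minimal sketch, assuming a streaming INSERT INTO statement and tables that are already registered in the gateway's session; the method names, the table names, and the gateway/JobManager split are illustrative only, not the actual design of FLIP-480 or FLIP-316.

```java
import org.apache.flink.table.api.CompiledPlan;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

public class CompiledPlanHandoffSketch {

    // Gateway side: the session has the catalogs and credentials, so it can
    // plan the statement and hand over only the serialized plan.
    static String compileOnGateway(TableEnvironment gatewaySessionEnv) {
        CompiledPlan plan =
                gatewaySessionEnv.compilePlanSql(
                        "INSERT INTO sink_table SELECT * FROM source_table");
        return plan.asJsonString();
    }

    // JobManager side: no catalog access is needed, because the plan already
    // contains the resolved tables and their connector options.
    static void executeOnJobManager(String planJson) {
        TableEnvironment jmEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        jmEnv.loadPlan(PlanReference.fromJsonString(planJson)).execute();
    }
}
```

This is also where the CREATE TABLE AS SELECT caveat from Shengkai's earlier mail shows up: compiling a plan requires the target table to already exist, so a CTAS statement has no plan to hand over in this fashion.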
> > On Thu, Nov 7, 2024 at 2:49 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >
> > > Hi, everyone.
> > >
> > > The discussion hasn't received any response for a while. I will close
> > > the discussion and start the vote tomorrow if the discussion doesn't
> > > receive any response today. Thanks for all your responses!
> > >
> > > Best,
> > > Shengkai
> > >
> > > On Tue, Nov 5, 2024 at 14:00, Shengkai Fang <fskm...@gmail.com> wrote:
> > >
> > > > Hi, Lincoln. Thanks for your response.
> > > >
> > > > > since both scriptPath and script(statements) can be null, we need to
> > > > > clarify the behavior when both are empty, such as throwing an error
> > > >
> > > > Yes, you are correct. I have updated the FLIP about this. When these
> > > > fields are both empty, the server throws an exception to notify users.
> > > >
> > > > > use a unified name for script vs statements, like 'script'?
> > > >
> > > > Updated.
> > > >
> > > > > Regarding Python UDFs, should we change it to a description of
> > > > > "Additional Python resources," corresponding to "Additional Jar
> > > > > Resources"
> > > >
> > > > Updated.
> > > >
> > > > Best,
> > > > Shengkai
> > > >
> > > > On Tue, Nov 5, 2024 at 11:17, Lincoln Lee <lincoln.8...@gmail.com> wrote:
> > > >
> > > > > Thanks Shengkai for driving this! Overall, it looks good!
> > > > >
> > > > > I have two minor questions:
> > > > >
> > > > > 1. Regarding the interface parameters (including the REST API
> > > > > & Java interfaces), since both scriptPath and script(statements)
> > > > > can be null, we need to clarify the behavior when both are
> > > > > empty, such as throwing an error?
> > > > > Also, use a unified name for script vs statements, like 'script'?
> > > > >
> > > > > 2. Regarding Python UDFs, should we change it to a
> > > > > description of "Additional Python resources," corresponding
> > > > > to "Additional Jar Resources"?
> > > > >
> > > > > Best,
> > > > > Lincoln Lee
> > > > >
> > > > > On Tue, Nov 5, 2024 at 10:16, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > >
> > > > > > Hi, Ferenc.
> > > > > >
> > > > > > Thanks for your clarification. We can hard-code these different
> > > > > > options in the sql-gateway module. I have updated the FLIP and the
> > > > > > PoC branch for this part. But I think we should provide a unified
> > > > > > API to ship artifacts to the different deployments.
> > > > > >
> > > > > > Best,
> > > > > > Shengkai
> > > > > >
> > > > > > On Mon, Nov 4, 2024 at 21:05, Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote:
> > > > > >
> > > > > > > Hi Shengkai,
> > > > > > >
> > > > > > > Thank you for driving this FLIP! I think this is a good way to
> > > > > > > close this gap in the short term until FLIP-316 can be finished.
> > > > > > >
> > > > > > > I would only like to add one thing: YARN has a `yarn.ship-files`
> > > > > > > config option that ships local or DFS files/directories to the
> > > > > > > YARN cluster [1].
> > > > > > >
> > > > > > > Best,
> > > > > > > Ferenc
> > > > > > >
> > > > > > > [1]
> > > > > > > https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/deployment/config/
> > > > > > >
> > > > > > > On Monday, November 4th, 2024 at 10:11, Xuyang <xyzhong...@163.com> wrote:
> > > > > > >
> > > > > > > > Hi, Shengkai.
> > > > > > > >
> > > > > > > > Thank you for your answer. I have no further questions.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best!
> > > > > > > > Xuyang
> > > > > > > >
> > > > > > > > At 2024-11-04 10:00:32, "Shengkai Fang" <fskm...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi, Xuyang. Thanks a lot for your response!
> > > > > > > > >
> > > > > > > > > > Does that mean we will support multiple DMLs, multiple DQLs,
> > > > > > > > > > and mixed DMLs & DQLs in one SQL script?
> > > > > > > > >
> > > > > > > > > According to the doc[1], application mode only supports one
> > > > > > > > > job in HA mode[2]. If users submit multiple jobs, the
> > > > > > > > > dispatcher throws a DuplicateJobSubmissionException to notify
> > > > > > > > > users.
> > > > > > > > >
> > > > > > > > > In non-HA mode, application mode doesn't have a job number
> > > > > > > > > limitation. The SQL driver runs statements one by one, which
> > > > > > > > > is similar to submitting jobs to a session cluster. But just
> > > > > > > > > as the doc says, when any of multiple running jobs in
> > > > > > > > > Application Mode (submitted for example using executeAsync())
> > > > > > > > > gets cancelled, all jobs will be stopped and the JobManager
> > > > > > > > > will shut down.
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> > > > > > > > > [2]
> > > > > > > > > https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L218
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Shengkai
> > > > > > > > >
> > > > > > > > > On Thu, Oct 31, 2024 at 17:10, Xuyang <xyzhong...@163.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Shengkai.
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this great work. LGTM overall, I just
> > > > > > > > > > have one question.
> > > > > > > > > >
> > > > > > > > > > IIUC, application mode supports running multiple executions
> > > > > > > > > > in a single `main` function[1]. Does that mean we will
> > > > > > > > > > support multiple DMLs, multiple DQLs, and mixed DMLs & DQLs
> > > > > > > > > > in one SQL script? If yes, can you explain a little about
> > > > > > > > > > how they work?
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best!
> > > > > > > > > > Xuyang
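The "runs statements one by one" behaviour Shengkai describes above can be pictured as a loop like the one below. This is a hypothetical sketch, not the driver class proposed in FLIP-480; statement splitting, execution mode selection, and error handling are all simplified away.

```java
import java.util.List;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

// Hypothetical sketch of a sequential script runner. In practice the
// execution mode and the statement list would come from the shipped script
// and its configuration.
public class ScriptRunnerSketch {

    public static void run(List<String> statements) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());

        for (String statement : statements) {
            TableResult result = tEnv.executeSql(statement);
            // Print the statement's result ("OK" for DDL, affected row counts
            // for DML, the actual rows for a SELECT). In application mode this
            // output ends up in the JobManager log.
            result.print();
            // A real driver would also wait for submitted jobs to finish
            // before moving on (e.g. via result.await()) and handle failures.
        }
    }
}
```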
> > > > > > > > > > At 2024-10-31 10:18:13, "Ron Liu" <ron9....@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Shengkai
> > > > > > > > > > >
> > > > > > > > > > > Thanks for your quick response. It looks good to me.
> > > > > > > > > > >
> > > > > > > > > > > Best
> > > > > > > > > > > Ron
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Oct 31, 2024 at 10:08, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Ron!
> > > > > > > > > > > >
> > > > > > > > > > > > > I noticed that you say this FLIP focuses on supporting
> > > > > > > > > > > > > the deployment of SQL scripts to the application
> > > > > > > > > > > > > cluster, does it mean that it only supports the
> > > > > > > > > > > > > non-interactive gateway mode?
> > > > > > > > > > > >
> > > > > > > > > > > > Yes. This FLIP only supports deploying a script in
> > > > > > > > > > > > non-interactive mode.
> > > > > > > > > > > >
> > > > > > > > > > > > > Whether all SQL commands such as DDL & DML & SELECT
> > > > > > > > > > > > > are supported.
> > > > > > > > > > > >
> > > > > > > > > > > > We support all SQL commands, and the execution results
> > > > > > > > > > > > are visible in the JM log. But an application cluster
> > > > > > > > > > > > has the limitation that only one job is allowed to run
> > > > > > > > > > > > in the dedicated cluster.
> > > > > > > > > > > >
> > > > > > > > > > > > > How to dynamically download the JAR specified by the
> > > > > > > > > > > > > user when submitting the SQL script, and whether it is
> > > > > > > > > > > > > possible to specify a local jar?
> > > > > > > > > > > >
> > > > > > > > > > > > This is a good question. I think it's totally up to the
> > > > > > > > > > > > deployment API. For example, the Kubernetes deployment
> > > > > > > > > > > > provides the option
> > > > > > > > > > > > `kubernetes-artifacts-local-upload-enabled`[1] to upload
> > > > > > > > > > > > the artifact to the DFS, but the YARN deployment doesn't
> > > > > > > > > > > > support shipping artifacts to the DFS in application
> > > > > > > > > > > > mode. If the runtime API can provide a unified
> > > > > > > > > > > > interface, I think we can use that unified API to upload
> > > > > > > > > > > > local artifacts. Alternatively, we can provide a special
> > > > > > > > > > > > service that allows the sql-gateway to support pulling
> > > > > > > > > > > > jars. You can read the future work section for more
> > > > > > > > > > > > details.
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > > > > > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-artifacts-local-upload-enabled
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Oct 31, 2024 at 09:30, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Feng!
> > > > > > > > > > > > >
> > > > > > > > > > > > > > if only clusterID is available, it may not be very
> > > > > > > > > > > > > > convenient to connect to this application later on.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If FLIP-479 is accepted, I think we can just adapt the
> > > > > > > > > > > > > sql-gateway behaviour to the behaviour that FLIP-479
> > > > > > > > > > > > > mentions.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Shengkai
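On the artifact question, the two mechanisms mentioned in the thread (the `yarn.ship-files` option Ferenc points out and the Kubernetes local-upload option Shengkai links to) are plain configuration options. The sketch below shows how a client might set them; the option keys are the ones referenced in the linked configuration docs, and the jar paths and upload target are invented for the example, so verify both against the Flink version you actually deploy.

```java
import org.apache.flink.configuration.Configuration;

// Sketch only: keys and values should be checked against the deployment
// configuration documentation linked in the thread.
public class ArtifactShippingConfigSketch {

    static Configuration forKubernetesApplicationMode() {
        Configuration conf = new Configuration();
        // Extra jars the job needs on its classpath (example path).
        conf.setString("pipeline.jars", "file:///opt/flink/usrlib/my-udfs.jar");
        // Upload local artifacts to a DFS target instead of baking them into
        // the image. The linked doc anchor spells the key with dashes; the
        // config key itself appears to be the dotted form shown here.
        conf.setString("kubernetes.artifacts.local-upload-enabled", "true");
        conf.setString("kubernetes.artifacts.local-upload-target", "s3://my-bucket/flink-artifacts/");
        return conf;
    }

    static Configuration forYarnApplicationMode() {
        Configuration conf = new Configuration();
        // Ship extra local or DFS files/directories to the YARN cluster.
        conf.setString("yarn.ship-files", "/path/to/udf-jars;/path/to/conf-dir");
        return conf;
    }
}
```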