Hi Mason,

I get your point. I'm increasingly feeling the need to introduce a built-in file distribution mechanism in the flink-kubernetes module, much like Spark does with `spark.kubernetes.file.upload.path` [1].
I'm assuming the workflow is as follows:

- KubernetesClusterDescriptor uploads all local resources to remote storage via a Flink filesystem (skipped if the resources are already remote).
- KubernetesApplicationClusterEntrypoint downloads the resources and puts them on the classpath during startup.

I wouldn't mind splitting this into another FLIP to ensure that everything is done correctly. cc'ed @Yang to gather more opinions.

[1] https://spark.apache.org/docs/latest/running-on-kubernetes.html#dependency-management

Best,
Paul Lam

> On Jun 8, 2023, at 12:15, Mason Chen <mas.chen6...@gmail.com> wrote: > > Hi Paul, > > Thanks for your response! > > I agree that utilizing SQL Drivers in Java applications is equally important >> as employing them in SQL Gateway. WRT init containers, I think most >> users use them just as a workaround. For example, wget a jar from the >> maven repo. >> >> We could implement the functionality in SQL Driver in a more graceful >> way and the flink-supported filesystem approach seems to be a >> good choice. >> > > My main point is: can we solve the problem with a design agnostic of SQL > and Stream API? I mentioned a use case where this ability is useful for > Java or Stream API applications. Maybe this is even a non-goal to your FLIP > since you are focusing on the driver entrypoint. > > Jark mentioned some optimizations: > >> This allows SQLGateway to leverage some metadata caching and UDF JAR >> caching for better compiling performance. >> > It would be great to see this even outside the SQLGateway (i.e. UDF JAR > caching). > > Best, > Mason > > On Wed, Jun 7, 2023 at 2:26 AM Shengkai Fang <fskm...@gmail.com> wrote: > >> Hi, Paul. Thanks for your update and the update makes me understand the >> design much better. >> >> But I still have some questions about the FLIP. >> >>> For SQL Gateway, only DMLs need to be delegated to the SQL >>> Driver. I would think about the details and update the FLIP. Do you have >> some >>> ideas already?
>> >> If the application mode cannot support library mode, I think we should >> only execute INSERT INTO and UPDATE/DELETE statements in the application >> mode. AFAIK, we cannot support ANALYZE TABLE and CALL PROCEDURE >> statements. The ANALYZE TABLE syntax needs to register the statistics to the >> catalog after the job finishes, and the CALL PROCEDURE statement doesn't >> generate the ExecNodeGraph. >> >> * Introduce storage via option `sql-gateway.application.storage-dir` >> >> If we cannot support submitting the jars through web submission, +1 to >> introduce the options to upload the files. However, I think the uploader >> should be responsible for removing the uploaded jars. Can we remove the jars >> while the job is running, or when the gateway exits? >> >> * JobID is not available >> >> Can we use the rest client returned by ApplicationDeployer to query the job >> id? I am concerned that users don't know which job is related to the >> submitted SQL. >> >> * Do we need to introduce a new module named flink-table-sql-runner? >> >> It seems we need to introduce a new module. Will the new module be >> available in the distribution package? I agree with Jark that we don't need >> to introduce this for Table API users, since these users have their own main >> class. If we want to make it easier for users to use the k8s operator, I think >> we should modify the k8s operator repo. If we don't need to support SQL >> files, can we make this jar only visible in the sql-gateway like we do in >> the planner loader? [1] >> >> [1] >> >> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-loader/src/main/java/org/apache/flink/table/planner/loader/PlannerModule.java#L95 >> >> Best, >> Shengkai >> >> Weihua Hu <huweihua....@gmail.com> wrote on Wed, Jun 7, 2023 at 10:52: >> >>> Hi, >>> >>> Thanks for updating the FLIP. >>> >>> I have two cents on the distribution of SQLs and resources. >>> 1. Should we support a common file distribution mechanism for k8s >>> application mode?
>>> I have seen some issues and requirements on the mailing list. >>> In our production environment, we implement the download command in the >>> CliFrontend, >>> and automatically add an init container to the POD for file >> downloading. >>> The advantage of this >>> is that we can use all Flink-supported file systems to store files. >>> >>> This needs more discussion. I would appreciate hearing more opinions. >>> >>> 2. In this FLIP, we distribute files in two different ways in YARN and >>> Kubernetes. Can we combine them into one way? >>> If we don't want to implement a common file distribution for k8s >>> application mode, could we use the SQLDriver >>> to download the files both in YARN and K8s? IMO, this can reduce the >> cost >>> of code maintenance. >>> >>> Best, >>> Weihua >>> >>> >>> On Wed, Jun 7, 2023 at 10:18 AM Paul Lam <paullin3...@gmail.com> wrote: >>> >>>> Hi Mason, >>>> >>>> Thanks for your input! >>>> >>>>> +1 for init containers or a more generalized way of obtaining >> arbitrary >>>>> files. File fetching isn't specific to just SQL--it also matters for >>> Java >>>>> applications if the user doesn't want to rebuild a Flink image and >> just >>>>> wants to modify the user application fat jar. >>>> >>>> I agree that utilizing SQL Drivers in Java applications is equally >>>> important >>>> as employing them in SQL Gateway. WRT init containers, I think most >>>> users use them just as a workaround. For example, wget a jar from the >>>> maven repo. >>>> >>>> We could implement the functionality in SQL Driver in a more graceful >>>> way, and the Flink-supported filesystem approach seems to be a >>>> good choice. >>>> >>>>> Also, what do you think about prefixing the config options with >>>>> `sql-driver` instead of just `sql` to be more specific? >>>> >>>> LGTM, since SQL Driver is a public interface and the options are >>>> specific to it.
>>>> >>>> Best, >>>> Paul Lam >>>> >>>>> On Jun 6, 2023, at 06:30, Mason Chen <mas.chen6...@gmail.com> wrote: >>>>> >>>>> Hi Paul, >>>>> >>>>> +1 for this feature and supporting SQL file + JSON plans. We get a >> lot >>> of >>>>> requests to just be able to submit a SQL file, but the JSON plan >>>>> optimizations make sense. >>>>> >>>>> +1 for init containers or a more generalized way of obtaining >> arbitrary >>>>> files. File fetching isn't specific to just SQL--it also matters for >>> Java >>>>> applications if the user doesn't want to rebuild a Flink image and >> just >>>>> wants to modify the user application fat jar. >>>>> >>>>> Please note that we could reuse the checkpoint storage like S3/HDFS, >>>> which >>>>>> should >>>>> >>>>> be required to run Flink in production, so I guess that would be >>>> acceptable >>>>>> for most >>>>> >>>>> users. WDYT? >>>>> >>>>> >>>>> If you do go this route, it would be nice to support writing these >>> files >>>> to >>>>> S3/HDFS via Flink. This makes access control and policy management >>>> simpler. >>>>> >>>>> Also, what do you think about prefixing the config options with >>>>> `sql-driver` instead of just `sql` to be more specific? >>>>> >>>>> Best, >>>>> Mason >>>>> >>>>> On Mon, Jun 5, 2023 at 2:28 AM Paul Lam <paullin3...@gmail.com> wrote: >>>>> >>>>>> Hi Jark, >>>>>> >>>>>> Thanks for your input! Please see my comments inline. >>>>>> >>>>>>> Isn't Table API the same way as DataStream jobs to submit Flink SQL? >>>>>>> DataStream API also doesn't provide a default main class for users, >>>>>>> why do we need to provide such one for SQL? >>>>>> >>>>>> Sorry for the confusion I caused. By DataStream jobs, I mean jobs >>>> submitted >>>>>> via the Flink CLI, which actually could be DataStream/Table jobs. >>>>>> >>>>>> I think a default main class would be user-friendly, which eliminates >>> the >>>>>> need >>>>>> for users to write a main class like SQLRunner in the Flink K8s operator >>> [1].
>>>>>> >>>>>>> I thought the proposed SqlDriver was a dedicated main class >> accepting >>>>>> SQL files, is >>>>>>> that correct? >>>>>> >>>>>> Both JSON plans and SQL files are accepted. SQL Gateway should use >>> JSON >>>>>> plans, >>>>>> while CLI users may use either JSON plans or SQL files. >>>>>> >>>>>> Please see the updated FLIP [2] for more details. >>>>>> >>>>>>> Personally, I prefer the way of init containers which doesn't >> depend >>> on >>>>>>> additional components. >>>>>>> This can reduce the moving parts of a production environment. >>>>>>> Depending on a distributed file system makes the testing, demo, and >>>> local >>>>>>> setup harder than init containers. >>>>>> >>>>>> Please note that we could reuse the checkpoint storage like S3/HDFS, >>>> which >>>>>> should >>>>>> be required to run Flink in production, so I guess that would be >>>>>> acceptable for most >>>>>> users. WDYT? >>>>>> >>>>>> WRT testing, demo, and local setups, I think we could support the >>> local >>>>>> filesystem >>>>>> scheme, i.e. file://**, as the state backends do. It works as long as >>> SQL >>>>>> Gateway >>>>>> and JobManager (or SQL Driver) can access the resource directory >>>> (specified >>>>>> via >>>>>> `sql-gateway.application.storage-dir`). >>>>>> >>>>>> Thanks! >>>>>> >>>>>> [1] >>>>>> https://github.com/apache/flink-kubernetes-operator/blob/main/examples/flink-sql-runner-example/src/main/java/org/apache/flink/examples/SqlRunner.java >>>>>> [2] >>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-316:+Introduce+SQL+Driver >>>>>> [3] >>>>>> https://github.com/apache/flink/blob/3245e0443b2a4663552a5b707c5c8c46876c1f6d/flink-runtime/src/test/java/org/apache/flink/runtime/state/filesystem/AbstractFileCheckpointStorageAccessTestBase.java#L161 >>>>>> >>>>>> Best, >>>>>> Paul Lam >>>>>> >>>>>>> On Jun 3, 2023, at 12:21, Jark Wu <imj...@gmail.com> wrote: >>>>>>> >>>>>>> Hi Paul, >>>>>>> >>>>>>> Thanks for your reply.
I left my comments inline. >>>>>>> >>>>>>>> As the FLIP said, it’s good to have a default main class for Flink >>>> SQLs, >>>>>>>> which allows users to submit Flink SQLs in the same way as >>> DataStream >>>>>>>> jobs, or else users need to write their own main class. >>>>>>> >>>>>>> Isn't Table API the same way as DataStream jobs to submit Flink SQL? >>>>>>> DataStream API also doesn't provide a default main class for users, >>>>>>> why do we need to provide such one for SQL? >>>>>>> >>>>>>>> With the help of ExecNodeGraph, do we still need the serialized >>>>>>>> SessionState? If not, we could make SQL Driver accept two >>> serialized >>>>>>>> formats: >>>>>>> >>>>>>> No, ExecNodeGraph doesn't need to serialize SessionState. I thought >>> the >>>>>>> proposed SqlDriver was a dedicated main class accepting SQL files, >> is >>>>>>> that correct? >>>>>>> If true, we have to ship the SessionState for this case, which is a >>>> large >>>>>>> amount of work. >>>>>>> I think we just need a JsonPlanDriver, which is a main class that >>>> accepts >>>>>>> a JsonPlan as the parameter. >>>>>>> >>>>>>> >>>>>>>> The common solutions I know are to use distributed file systems or >>> use >>>>>>>> init containers to localize the resources. >>>>>>> >>>>>>> Personally, I prefer the way of init containers which doesn't >> depend >>> on >>>>>>> additional components. >>>>>>> This can reduce the moving parts of a production environment. >>>>>>> Depending on a distributed file system makes the testing, demo, and >>>> local >>>>>>> setup harder than init containers. >>>>>>> >>>>>>> Best, >>>>>>> Jark >>>>>>> >>>>>>> On Fri, 2 Jun 2023 at 18:10, Paul Lam <paullin3...@gmail.com> wrote: >>>>>>> >>>>>>>> The FLIP is in the early phase and some details are not included, >>> but >>>>>>>> fortunately, we got lots of valuable ideas from the discussion.
>>>>>>>> >>>>>>>> Thanks to everyone who joined the discussion! >>>>>>>> @Weihua @Shanmon @Shengkai @Biao @Jark >>>>>>>> >>>>>>>> This weekend I’m gonna revisit and update the FLIP, adding more >>>>>>>> details. Hopefully, we can further align our opinions. >>>>>>>> >>>>>>>> Best, >>>>>>>> Paul Lam >>>>>>>> >>>>>>>>> On Jun 2, 2023, at 18:02, Paul Lam <paullin3...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> Hi Jark, >>>>>>>>> >>>>>>>>> Thanks a lot for your input! >>>>>>>>> >>>>>>>>>> If we decide to submit ExecNodeGraph instead of SQL file, is it >>>> still >>>>>>>>>> necessary to support SQL Driver? >>>>>>>>> >>>>>>>>> I think so. Apart from usage in SQL Gateway, SQL Driver could >>>> simplify >>>>>>>>> Flink SQL execution with the Flink CLI. >>>>>>>>> >>>>>>>>> As the FLIP said, it’s good to have a default main class for >> Flink >>>>>> SQLs, >>>>>>>>> which allows users to submit Flink SQLs in the same way as >>> DataStream >>>>>>>>> jobs, or else users need to write their own main class. >>>>>>>>> >>>>>>>>>> SQL Driver needs to serialize SessionState, which is very >>> challenging >>>>>>>>>> but not covered in detail in the FLIP. >>>>>>>>> >>>>>>>>> With the help of ExecNodeGraph, do we still need the serialized >>>>>>>>> SessionState? If not, we could make SQL Driver accept two >>> serialized >>>>>>>>> formats: >>>>>>>>> >>>>>>>>> - SQL files for user-facing public usage >>>>>>>>> - ExecNodeGraph for internal usage >>>>>>>>> >>>>>>>>> It’s kind of similar to the relationship between job jars and >>>>>> jobgraphs. >>>>>>>>> >>>>>>>>>> Regarding "K8S doesn't support shipping multiple jars", is that >>>> true? >>>>>>>> Is it >>>>>>>>>> possible to support it? >>>>>>>>> >>>>>>>>> Yes, K8s doesn’t distribute any files. It’s the users’ >>> responsibility >>>>>> to >>>>>>>> make >>>>>>>>> sure the resources are accessible in the containers.
The common >>>>>> solutions >>>>>>>>> I know are to use distributed file systems or use init containers >> to >>>>>>>> localize the >>>>>>>>> resources. >>>>>>>>> >>>>>>>>> Now I lean toward introducing a filesystem to do the distribution job. >>> WDYT? >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Paul Lam >>>>>>>>> >>>>>>>>>> On Jun 1, 2023, at 20:33, Jark Wu <imj...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>> Hi Paul, >>>>>>>>>> >>>>>>>>>> Thanks for starting this discussion. I like the proposal! This >> is >>> a >>>>>>>>>> frequently requested feature! >>>>>>>>>> >>>>>>>>>> I agree with Shengkai that ExecNodeGraph as the submission >> object >>>> is a >>>>>>>>>> better idea than a SQL file. To be more specific, it should be >>>>>>>> JsonPlanGraph >>>>>>>>>> or CompiledPlan, which is the serializable representation. >>>> CompiledPlan >>>>>>>> is a >>>>>>>>>> clear separation between compiling/optimization/validation and >>>>>>>> execution. >>>>>>>>>> This can keep the validation and metadata accessing still on the >>>>>>>> SQLGateway >>>>>>>>>> side. This allows SQLGateway to leverage some metadata caching >> and >>>> UDF >>>>>>>> JAR >>>>>>>>>> caching for better compiling performance. >>>>>>>>>> >>>>>>>>>> If we decide to submit ExecNodeGraph instead of SQL file, is it >>>> still >>>>>>>>>> necessary to support SQL Driver? Regarding non-interactive SQL >>> jobs, >>>>>>>> users >>>>>>>>>> can use the Table API program for application mode. SQL Driver >>> needs >>>>>> to >>>>>>>>>> serialize SessionState, which is very challenging but not covered >>>>>>>> in detail >>>>>>>>>> in the FLIP. >>>>>>>>>> >>>>>>>>>> Regarding "K8S doesn't support shipping multiple jars", is that >>>> true? >>>>>>>> Is it >>>>>>>>>> possible to support it?
>>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Jark >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, 1 Jun 2023 at 16:58, Paul Lam <paullin3...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Weihua, >>>>>>>>>>> >>>>>>>>>>> You’re right. Distributing the SQLs to the TMs is one of the >>>>>>>> challenging >>>>>>>>>>> parts of this FLIP. >>>>>>>>>>> >>>>>>>>>>> Web submission is not enabled in application mode currently, as >>> you >>>>>>>> said, >>>>>>>>>>> but it could be changed if we have good reasons. >>>>>>>>>>> >>>>>>>>>>> What do you think about introducing a distributed storage for >> SQL >>>>>>>> Gateway? >>>>>>>>>>> >>>>>>>>>>> We could make use of Flink file systems [1] to distribute the >> SQL >>>>>>>> Gateway- >>>>>>>>>>> generated resources, which should solve the problem at its root >>>> cause. >>>>>>>>>>> >>>>>>>>>>> Users could specify Flink-supported file systems to ship files. >>>> It’s >>>>>>>> only >>>>>>>>>>> required when using SQL Gateway with K8s application mode.
>>>>>>>>>>> >>>>>>>>>>> [1] >>>>>>>>>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/ >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Paul Lam >>>>>>>>>>> >>>>>>>>>>>> On Jun 1, 2023, at 13:55, Weihua Hu <huweihua....@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Thanks Paul for your reply. >>>>>>>>>>>> >>>>>>>>>>>> SQLDriver looks good to me. >>>>>>>>>>>> >>>>>>>>>>>> 2. Do you mean passing the SQL string as a configuration or a >>> program >>>>>>>>>>> argument? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I brought this up because we were unable to pass the SQL file >> to >>>>>> Flink >>>>>>>>>>>> using Kubernetes mode. >>>>>>>>>>>> For DataStream/Python users, they need to prepare their images >>> for >>>>>> the >>>>>>>>>>> jars >>>>>>>>>>>> and dependencies. >>>>>>>>>>>> But for SQL users, they can use a common image to run >> different >>>> SQL >>>>>>>>>>> queries >>>>>>>>>>>> if there are no other UDF requirements.
>>>>>>>>>>>> It would be great if the SQL query and image were not bound. >>>>>>>>>>>> >>>>>>>>>>>> Using strings is a way to decouple these, but just as you >>>> mentioned, >>>>>>>> it's >>>>>>>>>>>> not easy to pass complex SQL. >>>>>>>>>>>> >>>>>>>>>>>>> use web submission >>>>>>>>>>>> AFAIK, we cannot use web submission in the Application mode. >>>> Please >>>>>>>>>>>> correct me if I'm wrong. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Weihua >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, May 31, 2023 at 9:37 PM Paul Lam <paullin3...@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Biao, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for your comments! >>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Scope: is this FLIP only targeted for non-interactive >> Flink >>>> SQL >>>>>>>> jobs >>>>>>>>>>>>> in >>>>>>>>>>>>>> Application mode? More specifically, if we use SQL >>>> client/gateway >>>>>> to >>>>>>>>>>>>>> execute some interactive SQLs like a SELECT query, can we >> ask >>>>>> Flink >>>>>>>> to >>>>>>>>>>>>> use >>>>>>>>>>>>>> Application mode to execute those queries after this FLIP? >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for pointing it out. I think only DMLs would be >> executed >>>> via >>>>>>>> SQL >>>>>>>>>>>>> Driver. >>>>>>>>>>>>> I'll add the scope to the FLIP. >>>>>>>>>>>>> >>>>>>>>>>>>>> 2. Deployment: I believe in YARN mode, the implementation is >>>>>>>> trivial, as >>>>>>>>>>>>> we >>>>>>>>>>>>>> can ship files via YARN's tool easily, but for K8s, things >> can >>> be >>>>>>>> more >>>>>>>>>>>>>> complicated, as Shengkai said. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Your input is very informative. I’m thinking about using web >>>>>>>> submission, >>>>>>>>>>>>> but it requires exposing the JobManager port, which could also >>> be >>>> a >>>>>>>>>>> problem >>>>>>>>>>>>> on K8s.
>>>>>>>>>>>>> >>>>>>>>>>>>> Another approach is to explicitly require a distributed >> storage >>>> to >>>>>>>> ship >>>>>>>>>>>>> files, >>>>>>>>>>>>> but we may need a new deployment executor for that. >>>>>>>>>>>>> >>>>>>>>>>>>> What do you think of these two approaches? >>>>>>>>>>>>> >>>>>>>>>>>>>> 3. Serialization of SessionState: in SessionState, there are >>>> some >>>>>>>>>>>>>> unserializable fields >>>>>>>>>>>>>> like >>>>>>>> org.apache.flink.table.resource.ResourceManager#userClassLoader. >>>>>>>>>>> It >>>>>>>>>>>>>> may be worthwhile to add more details about the >> serialization >>>>>> part. >>>>>>>>>>>>> >>>>>>>>>>>>> I agree. That’s a missing part. But if we use ExecNodeGraph >> as >>>>>>>> Shengkai >>>>>>>>>>>>> mentioned, do we eliminate the need for serialization of >>>>>>>> SessionState? >>>>>>>>>>>>> >>>>>>>>>>>>> Best, >>>>>>>>>>>>> Paul Lam >>>>>>>>>>>>> >>>>>>>>>>>>>> On May 31, 2023, at 13:07, Biao Geng <biaoge...@gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks Paul for the proposal! I believe it would be very >> useful >>>> for >>>>>>>>>>> Flink >>>>>>>>>>>>>> users. >>>>>>>>>>>>>> After reading the FLIP, I have some questions: >>>>>>>>>>>>>> 1. Scope: is this FLIP only targeted for non-interactive >> Flink >>>> SQL >>>>>>>> jobs >>>>>>>>>>>>> in >>>>>>>>>>>>>> Application mode? More specifically, if we use SQL >>>> client/gateway >>>>>> to >>>>>>>>>>>>>> execute some interactive SQLs like a SELECT query, can we >> ask >>>>>> Flink >>>>>>>> to >>>>>>>>>>>>> use >>>>>>>>>>>>>> Application mode to execute those queries after this FLIP? >>>>>>>>>>>>>> 2. Deployment: I believe in YARN mode, the implementation is >>>>>>>> trivial, as >>>>>>>>>>>>> we >>>>>>>>>>>>>> can ship files via YARN's tool easily, but for K8s, things >> can >>> be >>>>>>>> more >>>>>>>>>>>>>> complicated, as Shengkai said.
I have implemented a simple >> POC >>>>>>>>>>>>>> <https://github.com/bgeng777/flink/commit/5b4338fe52ec343326927f0fc12f015dd22b1133> >>>>>>>>>>>>>> based on the SQL client before (i.e. consider the SQL client >> which >>>>>>>> supports >>>>>>>>>>>>>> executing a SQL file as the SQL driver in this FLIP). One >>>> problem >>>>>> I >>>>>>>>>>> have >>>>>>>>>>>>>> met is how we ship SQL files (or the JobGraph) to the k8s >>> side. >>>>>>>>>>> Without >>>>>>>>>>>>>> such support, users have to modify the initContainer or >>> rebuild >>>> a >>>>>>>> new >>>>>>>>>>> K8s >>>>>>>>>>>>>> image every time to fetch the SQL file. Like the Flink k8s >>>>>> operator, >>>>>>>>>>> one >>>>>>>>>>>>>> workaround is to utilize the Flink config (transforming the >> SQL >>>>>> file >>>>>>>> to >>>>>>>>>>> an >>>>>>>>>>>>>> escaped string like Weihua mentioned), which will be >> converted >>>> to a >>>>>>>>>>>>>> ConfigMap, but K8s has a size limit on ConfigMaps (no larger >> than >>>> 1MB, >>>>>>>>>>>>>> see https://kubernetes.io/docs/concepts/configuration/configmap/). >>>>>>>>>>>>>> Not sure >>>>>>>>>>>>>> if we have better solutions. >>>>>>>>>>>>>> 3. Serialization of SessionState: in SessionState, there are >>>> some >>>>>>>>>>>>>> unserializable fields >>>>>>>>>>>>>> like >>>>>>>> org.apache.flink.table.resource.ResourceManager#userClassLoader.
>>>>>>>>>>> It >>>>>>>>>>>>>> may be worthwhile to add more details about the >> serialization >>>>>> part. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best, >>>>>>>>>>>>>> Biao Geng >>>>>>>>>>>>>> >>>>>>>>>>>>>> Paul Lam <paullin3...@gmail.com> wrote on Wed, May 31, 2023 at 11:49: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Weihua, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks a lot for your input! Please see my comments inline. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Is SQLRunner the better name? We use this to run a SQL >>> Job. >>>>>> (Not >>>>>>>>>>>>>>> strong, >>>>>>>>>>>>>>>> the SQLDriver is fine for me) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I’ve thought about SQL Runner but picked SQL Driver for the >>>>>>>> following >>>>>>>>>>>>>>> reasons, FYI: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. I have a PythonDriver doing the same job for PyFlink [1]. >>>>>>>>>>>>>>> 2. A Flink program's main class is sort of like a Driver in >> JDBC, >>>>>> which >>>>>>>>>>>>>>> translates SQLs into >>>>>>>>>>>>>>> database-specific languages. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In general, I’m +1 for SQL Driver and +0 for SQL Runner. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Could we run SQL jobs using SQL in strings? Otherwise, >> we >>>> need >>>>>>>> to >>>>>>>>>>>>>>> prepare >>>>>>>>>>>>>>>> a SQL file in an image for Kubernetes application mode, >>> which >>>>>> may >>>>>>>> be >>>>>>>>>>> a >>>>>>>>>>>>>>> bit >>>>>>>>>>>>>>>> cumbersome. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Do you mean passing the SQL string as a configuration or a >>> program >>>>>>>>>>> argument? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I thought it might be convenient for testing purposes, but >> not >>>>>>>>>>>>> recommended >>>>>>>>>>>>>>> for production, >>>>>>>>>>>>>>> because Flink SQLs could be complicated and involve lots of >>>>>>>> characters >>>>>>>>>>>>> that >>>>>>>>>>>>>>> need escaping. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - I noticed that we don't specify the SQLDriver jar in the >>>>>>>>>>>>>>> "run-application" >>>>>>>>>>>>>>>> command. Does that mean we need to perform automatic >>> detection >>>>>> in >>>>>>>>>>>>> Flink? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Yes! It’s like running a PyFlink job with the following >>>> command: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> ./bin/flink run \ >>>>>>>>>>>>>>> --pyModule table.word_count \ >>>>>>>>>>>>>>> --pyFiles examples/python/table >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The CLI determines whether it’s a SQL job, and if so, applies the SQL >>>> Driver >>>>>>>>>>>>>>> automatically. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonDriver.java >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Paul Lam >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On May 30, 2023, at 21:56, Weihua Hu <huweihua....@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks Paul for the proposal. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> +1 for this. It is valuable in improving ease of use. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have a few questions. >>>>>>>>>>>>>>>> - Is SQLRunner the better name? We use this to run a SQL >>> Job.
>>>>>> (Not >>>>>>>>>>>>>>> strong, >>>>>>>>>>>>>>>> the SQLDriver is fine for me) >>>>>>>>>>>>>>>> - Could we run SQL jobs using SQL in strings? Otherwise, >> we >>>> need >>>>>>>> to >>>>>>>>>>>>>>> prepare >>>>>>>>>>>>>>>> a SQL file in an image for Kubernetes application mode, >>> which >>>>>> may >>>>>>>> be >>>>>>>>>>> a >>>>>>>>>>>>>>> bit >>>>>>>>>>>>>>>> cumbersome. >>>>>>>>>>>>>>>> - I noticed that we don't specify the SQLDriver jar in the >>>>>>>>>>>>>>> "run-application" >>>>>>>>>>>>>>>> command. Does that mean we need to perform automatic >>> detection >>>>>> in >>>>>>>>>>>>> Flink? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>> Weihua >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, May 29, 2023 at 7:24 PM Paul Lam <paullin3...@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi team, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I’d like to start a discussion about FLIP-316 [1], which >>>>>>>> introduces >>>>>>>>>>> a >>>>>>>>>>>>>>> SQL >>>>>>>>>>>>>>>>> driver as the >>>>>>>>>>>>>>>>> default main class for Flink SQL jobs. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Currently, Flink SQL could be executed out of the box >>> either >>>>>> via >>>>>>>> SQL >>>>>>>>>>>>>>>>> Client/Gateway >>>>>>>>>>>>>>>>> or embedded in a Flink Java/Python program. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> However, each one has its drawback: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - SQL Client/Gateway doesn’t support the application >>>> deployment >>>>>>>> mode >>>>>>>>>>>>> [2] >>>>>>>>>>>>>>>>> - Flink Java/Python program requires extra work to write >> a >>>>>>>> non-SQL >>>>>>>>>>>>>>> program >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Therefore, I propose adding a SQL driver to act as the >>>> default >>>>>>>> main >>>>>>>>>>>>>>> class >>>>>>>>>>>>>>>>> for SQL jobs. >>>>>>>>>>>>>>>>> Please see the FLIP docs for details and feel free to >>>> comment.
>>>>>>>>>> Thanks! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-316%3A+Introduce+SQL+Driver >>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/FLINK-26541 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>> Paul Lam