Hi Yang,

Thanks a lot for your input!
It’s great that FLINK-28915 has covered the file download part. I’ve created a ticket for the file upload part [1]. It's a prerequisite for supporting K8s application mode for SQL Gateway.

[1] https://issues.apache.org/jira/browse/FLINK-32315

Best,
Paul Lam

> On Jun 12, 2023, at 11:40, Yang Wang <wangyang0...@apache.org> wrote:
>
> Sorry for the late reply. I am in favor of introducing such a built-in resource localization mechanism based on Flink FileSystem. Then FLINK-28915 [1] could be the second step, which will download the jars and dependencies to the JobManager/TaskManager local directory before working.
>
> The first step could be done in another ticket in Flink. Or some external Flink job management system could also take care of this.
>
> [1] https://issues.apache.org/jira/browse/FLINK-28915
>
> Best,
> Yang
>
> On Fri, Jun 9, 2023 at 17:39, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi Mason,
>>
>> I get your point. I'm increasingly feeling the need to introduce a built-in file distribution mechanism for the flink-kubernetes module, just like Spark does with `spark.kubernetes.file.upload.path` [1].
>>
>> I’m assuming the workflow is as follows:
>>
>> - KubernetesClusterDescriptor uploads all local resources to a remote storage via a Flink filesystem (skipped if the resources are already remote).
>> - KubernetesApplicationClusterEntrypoint downloads the resources and puts them on the classpath during startup.
>>
>> I wouldn't mind splitting it into another FLIP to ensure that everything is done correctly.
>>
>> cc'ed @Yang to gather more opinions.
>>
>> [1] https://spark.apache.org/docs/latest/running-on-kubernetes.html#dependency-management
>>
>> Best,
>> Paul Lam
>>
>> On Jun 8, 2023, at 12:15, Mason Chen <mas.chen6...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your response!
>>
>> I agree that utilizing SQL Drivers in Java applications is equally important as employing them in SQL Gateway. WRT init containers, I think most users use them just as a workaround. For example, wget a jar from the Maven repo.
>>
>> We could implement the functionality in SQL Driver in a more graceful way, and the Flink-supported filesystem approach seems to be a good choice.
>>
>> My main point is: can we solve the problem with a design agnostic of SQL and Stream API? I mentioned a use case where this ability is useful for Java or Stream API applications. Maybe this is even a non-goal of your FLIP since you are focusing on the driver entrypoint.
>>
>> Jark mentioned some optimizations:
>>
>> This allows SQLGateway to leverage some metadata caching and UDF JAR caching for better compiling performance.
>>
>> It would be great to see this even outside the SQLGateway (i.e. UDF JAR caching).
>>
>> Best,
>> Mason
>>
>> On Wed, Jun 7, 2023 at 2:26 AM Shengkai Fang <fskm...@gmail.com> wrote:
>>
>> Hi, Paul. Thanks for your update; it makes me understand the design much better.
>>
>> But I still have some questions about the FLIP.
>>
>> For SQL Gateway, only DMLs need to be delegated to the SQL Driver. I would think about the details and update the FLIP. Do you have some ideas already?
>>
>> If the application mode can not support library mode, I think we should only execute INSERT INTO and UPDATE/DELETE statements in the application mode. AFAIK, we can not support ANALYZE TABLE and CALL PROCEDURE statements.
>> The ANALYZE TABLE syntax needs to register the statistics to the catalog after the job finishes, and the CALL PROCEDURE statement doesn't generate the ExecNodeGraph.
>>
>> * Introduce storage via option `sql-gateway.application.storage-dir`
>>
>> If we can not support submitting the jars through web submission, +1 to introduce the options to upload the files. However, I think the uploader should be responsible for removing the uploaded jars. Can we remove the jars while the job is running or when the gateway exits?
>>
>> * JobID is not available
>>
>> Can we use the rest client returned by ApplicationDeployer to query the job id? I am concerned that users don't know which job is related to the submitted SQL.
>>
>> * Do we need to introduce a new module named flink-table-sql-runner?
>>
>> It seems we need to introduce a new module. Will the new module be available in the distribution package? I agree with Jark that we don't need to introduce this for Table API users, and these users have their own main class. If we want to make it easier for users to use the K8s operator, I think we should modify the K8s operator repo. If we don't need to support SQL files, can we make this jar only visible in the sql-gateway, like we do in the planner loader? [1]
>>
>> [1] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-loader/src/main/java/org/apache/flink/table/planner/loader/PlannerModule.java#L95
>>
>> Best,
>> Shengkai
>>
>> On Wed, Jun 7, 2023 at 10:52, Weihua Hu <huweihua....@gmail.com> wrote:
>>
>> Hi,
>>
>> Thanks for updating the FLIP.
>>
>> I have two cents on the distribution of SQLs and resources.
>>
>> 1. Should we support a common file distribution mechanism for K8s application mode? I have seen some issues and requirements on the mailing list. In our production environment, we implement the download command in the CliFrontend and automatically add an init container to the pod for file downloading. The advantage of this is that we can use all Flink-supported file systems to store files. This needs more discussion. I would appreciate hearing more opinions.
>>
>> 2. In this FLIP, we distribute files in two different ways in YARN and Kubernetes. Can we combine them into one way? If we don't want to implement a common file distribution for K8s application mode, could we use the SQLDriver to download the files in both YARN and K8s? IMO, this can reduce the cost of code maintenance.
>>
>> Best,
>> Weihua
>>
>> On Wed, Jun 7, 2023 at 10:18 AM Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Mason,
>>
>> Thanks for your input!
>>
>> +1 for init containers or a more generalized way of obtaining arbitrary files. File fetching isn't specific to just SQL--it also matters for Java applications if the user doesn't want to rebuild a Flink image and just wants to modify the user application fat jar.
>>
>> I agree that utilizing SQL Drivers in Java applications is equally important as employing them in SQL Gateway. WRT init containers, I think most users use them just as a workaround. For example, wget a jar from the Maven repo.
>>
>> We could implement the functionality in SQL Driver in a more graceful way, and the Flink-supported filesystem approach seems to be a good choice.
>>
>> Also, what do you think about prefixing the config options with `sql-driver` instead of just `sql` to be more specific?
>> LGTM, since SQL Driver is a public interface and the options are specific to it.
>>
>> Best,
>> Paul Lam
>>
>> On Jun 6, 2023, at 06:30, Mason Chen <mas.chen6...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> +1 for this feature and supporting SQL files + JSON plans. We get a lot of requests to just be able to submit a SQL file, but the JSON plan optimizations make sense.
>>
>> +1 for init containers or a more generalized way of obtaining arbitrary files. File fetching isn't specific to just SQL--it also matters for Java applications if the user doesn't want to rebuild a Flink image and just wants to modify the user application fat jar.
>>
>> Please note that we could reuse the checkpoint storage like S3/HDFS, which should be required to run Flink in production, so I guess that would be acceptable for most users. WDYT?
>>
>> If you do go this route, it would be nice to support writing these files to S3/HDFS via Flink. This makes access control and policy management simpler.
>>
>> Also, what do you think about prefixing the config options with `sql-driver` instead of just `sql` to be more specific?
>>
>> Best,
>> Mason
>>
>> On Mon, Jun 5, 2023 at 2:28 AM Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Jark,
>>
>> Thanks for your input! Please see my comments inline.
>>
>> Isn't Table API the same way as DataStream jobs to submit Flink SQL? DataStream API also doesn't provide a default main class for users, why do we need to provide one for SQL?
>>
>> Sorry for the confusion I caused. By DataStream jobs, I mean jobs submitted via the Flink CLI, which actually could be DataStream/Table jobs.
>>
>> I think a default main class would be user-friendly, which eliminates the need for users to write a main class like SqlRunner in the Flink K8s operator [1].
>>
>> I thought the proposed SqlDriver was a dedicated main class accepting SQL files, is that correct?
>>
>> Both JSON plans and SQL files are accepted. SQL Gateway should use JSON plans, while CLI users may use either JSON plans or SQL files.
>>
>> Please see the updated FLIP [2] for more details.
>>
>> Personally, I prefer the way of init containers which doesn't depend on additional components. This can reduce the moving parts of a production environment. Depending on a distributed file system makes the testing, demo, and local setup harder than init containers.
>>
>> Please note that we could reuse the checkpoint storage like S3/HDFS, which should be required to run Flink in production, so I guess that would be acceptable for most users. WDYT?
>>
>> WRT testing, demo, and local setups, I think we could support the local filesystem scheme, i.e. file://, as the state backends do. It works as long as SQL Gateway and the JobManager (or SQL Driver) can access the resource directory (specified via `sql-gateway.application.storage-dir`).
>>
>> Thanks!
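For illustration only, a minimal sketch of what the gateway-side upload could look like on top of Flink's FileSystem abstraction, so that s3://, hdfs://, and file:// storage directories are handled uniformly. The helper class below, its name, and the example paths are assumptions made for the sake of the example, not part of the FLIP:

```java
// Illustration only: copy a local resource into the storage directory
// (hypothetically configured via `sql-gateway.application.storage-dir`)
// using Flink's FileSystem abstraction, so that s3://, hdfs://, and
// file:// schemes all work through the same code path.
import org.apache.flink.core.fs.FSDataInputStream;
import org.apache.flink.core.fs.FSDataOutputStream;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

import java.io.IOException;

public final class GatewayResourceUploader {

    /** Copies a local file into the storage dir and returns the remote path. */
    public static Path upload(Path localFile, Path storageDir) throws IOException {
        Path remoteFile = new Path(storageDir, localFile.getName());
        FileSystem localFs = localFile.getFileSystem();
        FileSystem remoteFs = remoteFile.getFileSystem();

        try (FSDataInputStream in = localFs.open(localFile);
             FSDataOutputStream out =
                     remoteFs.create(remoteFile, FileSystem.WriteMode.OVERWRITE)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return remoteFile;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: upload a local UDF jar to the configured storage dir.
        Path storageDir = new Path("s3://my-bucket/sql-gateway/resources");
        Path remote = upload(new Path("file:///tmp/my-udf.jar"), storageDir);
        System.out.println("Uploaded to " + remote);
    }
}
```

Because the copy goes through Flink's pluggable FileSystem, the same sketch also covers the local-filesystem (file://) setup mentioned above for testing and demos, as long as the gateway and the JobManager can both reach the directory.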
>> [1] https://github.com/apache/flink-kubernetes-operator/blob/main/examples/flink-sql-runner-example/src/main/java/org/apache/flink/examples/SqlRunner.java
>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-316:+Introduce+SQL+Driver
>> [3] https://github.com/apache/flink/blob/3245e0443b2a4663552a5b707c5c8c46876c1f6d/flink-runtime/src/test/java/org/apache/flink/runtime/state/filesystem/AbstractFileCheckpointStorageAccessTestBase.java#L161
>>
>> Best,
>> Paul Lam
>>
>> On Jun 3, 2023, at 12:21, Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your reply. I left my comments inline.
>>
>> As the FLIP said, it’s good to have a default main class for Flink SQLs, which allows users to submit Flink SQLs in the same way as DataStream jobs, or else users need to write their own main class.
>>
>> Isn't Table API the same way as DataStream jobs to submit Flink SQL? DataStream API also doesn't provide a default main class for users, why do we need to provide one for SQL?
>>
>> With the help of ExecNodeGraph, do we still need the serialized SessionState? If not, we could make SQL Driver accept two serialized formats:
>>
>> No, ExecNodeGraph doesn't need to serialize SessionState. I thought the proposed SqlDriver was a dedicated main class accepting SQL files, is that correct? If true, we have to ship the SessionState for this case, which is a large amount of work. I think we just need a JsonPlanDriver, which is a main class that accepts a JsonPlan as the parameter.
>>
>> The common solutions I know of are to use distributed file systems or use init containers to localize the resources.
>>
>> Personally, I prefer the way of init containers which doesn't depend on additional components. This can reduce the moving parts of a production environment. Depending on a distributed file system makes the testing, demo, and local setup harder than init containers.
>>
>> Best,
>> Jark
>>
>> On Fri, 2 Jun 2023 at 18:10, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> The FLIP is in the early phase and some details are not included, but fortunately, we got lots of valuable ideas from the discussion.
>>
>> Thanks to everyone who joined the discussion! @Weihua @Shanmon @Shengkai @Biao @Jark
>>
>> This weekend I’m gonna revisit and update the FLIP, adding more details. Hopefully, we can further align our opinions.
>>
>> Best,
>> Paul Lam
>>
>> On Jun 2, 2023, at 18:02, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Jark,
>>
>> Thanks a lot for your input!
>>
>> If we decide to submit ExecNodeGraph instead of SQL file, is it still necessary to support SQL Driver?
>>
>> I think so. Apart from usage in SQL Gateway, SQL Driver could simplify Flink SQL execution with the Flink CLI.
>>
>> As the FLIP said, it’s good to have a default main class for Flink SQLs, which allows users to submit Flink SQLs in the same way as DataStream jobs, or else users need to write their own main class.
>>
>> SQL Driver needs to serialize SessionState which is very challenging but not covered in detail in the FLIP.
>> With the help of ExecNodeGraph, do we still need the serialized SessionState? If not, we could make SQL Driver accept two serialized formats:
>>
>> - SQL files for user-facing public usage
>> - ExecNodeGraph for internal usage
>>
>> It’s kind of similar to the relationship between job jars and jobgraphs.
>>
>> Regarding "K8S doesn't support shipping multiple jars", is that true? Is it possible to support it?
>>
>> Yes, K8s doesn’t distribute any files. It’s the users’ responsibility to make sure the resources are accessible in the containers. The common solutions I know of are to use distributed file systems or use init containers to localize the resources.
>>
>> Now I lean toward introducing a filesystem to do the distribution job. WDYT?
>>
>> Best,
>> Paul Lam
>>
>> On Jun 1, 2023, at 20:33, Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for starting this discussion. I like the proposal! This is a frequently requested feature!
>>
>> I agree with Shengkai that ExecNodeGraph as the submission object is a better idea than a SQL file. To be more specific, it should be JsonPlanGraph or CompiledPlan, which is the serializable representation. CompiledPlan is a clear separation between compiling/optimization/validation and execution. This can keep the validation and metadata access still on the SQLGateway side. This allows SQLGateway to leverage some metadata caching and UDF JAR caching for better compiling performance.
>>
>> If we decide to submit ExecNodeGraph instead of SQL file, is it still necessary to support SQL Driver? Regarding non-interactive SQL jobs, users can use the Table API program for application mode. SQL Driver needs to serialize SessionState which is very challenging but not covered in detail in the FLIP.
>>
>> Regarding "K8S doesn't support shipping multiple jars", is that true? Is it possible to support it?
>>
>> Best,
>> Jark
>>
>> On Thu, 1 Jun 2023 at 16:58, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Weihua,
>>
>> You’re right. Distributing the SQLs to the TMs is one of the challenging parts of this FLIP.
>>
>> Web submission is not enabled in application mode currently as you said, but it could be changed if we have good reasons.
>>
>> What do you think about introducing a distributed storage for SQL Gateway? We could make use of Flink file systems [1] to distribute the SQL Gateway-generated resources, which should solve the problem at its root cause.
>>
>> Users could specify Flink-supported file systems to ship files. It’s only required when using SQL Gateway with K8s application mode.
>> [1] https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/
>>
>> Best,
>> Paul Lam
>>
>> On Jun 1, 2023, at 13:55, Weihua Hu <huweihua....@gmail.com> wrote:
>>
>> Thanks Paul for your reply.
>>
>> SQLDriver looks good to me.
>>
>> 2. Do you mean passing the SQL string as a configuration or a program argument?
>>
>> I brought this up because we were unable to pass the SQL file to Flink using Kubernetes mode. For DataStream/Python users, they need to prepare their images for the jars and dependencies. But for SQL users, they can use a common image to run different SQL queries if there are no other UDF requirements. It would be great if the SQL query and image were not bound.
>>
>> Using strings is a way to decouple these, but just as you mentioned, it's not easy to pass complex SQL.
>>
>> use web submission
>>
>> AFAIK, we can not use web submission in the Application mode. Please correct me if I'm wrong.
>>
>> Best,
>> Weihua
>>
>> On Wed, May 31, 2023 at 9:37 PM Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Biao,
>>
>> Thanks for your comments!
>>
>> 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs in Application mode? More specifically, if we use SQL client/gateway to execute some interactive SQLs like a SELECT query, can we ask Flink to use Application mode to execute those queries after this FLIP?
>>
>> Thanks for pointing it out. I think only DMLs would be executed via SQL Driver. I'll add the scope to the FLIP.
>>
>> 2. Deployment: I believe in YARN mode, the implementation is trivial as we can ship files via YARN's tool easily, but for K8s, things can be more complicated as Shengkai said.
>>
>> Your input is very informative. I’m thinking about using web submission, but it requires exposing the JobManager port, which could also be a problem on K8s. Another approach is to explicitly require a distributed storage to ship files, but we may need a new deployment executor for that. What do you think of these two approaches?
>> 3. Serialization of SessionState: in SessionState, there are some unserializable fields like org.apache.flink.table.resource.ResourceManager#userClassLoader. It may be worthwhile to add more details about the serialization part.
>>
>> I agree. That’s a missing part. But if we use ExecNodeGraph as Shengkai mentioned, do we eliminate the need for serialization of SessionState?
>>
>> Best,
>> Paul Lam
>>
>> On May 31, 2023, at 13:07, Biao Geng <biaoge...@gmail.com> wrote:
>>
>> Thanks Paul for the proposal! I believe it would be very useful for Flink users. After reading the FLIP, I have some questions:
>>
>> 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs in Application mode? More specifically, if we use SQL client/gateway to execute some interactive SQLs like a SELECT query, can we ask Flink to use Application mode to execute those queries after this FLIP?
>>
>> 2. Deployment: I believe in YARN mode, the implementation is trivial as we can ship files via YARN's tool easily, but for K8s, things can be more complicated as Shengkai said. I have implemented a simple POC (https://github.com/bgeng777/flink/commit/5b4338fe52ec343326927f0fc12f015dd22b1133) based on the SQL client before (i.e. consider the SQL client which supports executing a SQL file as the SQL driver in this FLIP). One problem I have met is how we ship SQL files (or the JobGraph) to the K8s side. Without such support, users have to modify the initContainer or rebuild a new K8s image every time to fetch the SQL file. Like the Flink K8s operator, one workaround is to utilize the Flink config (transforming the SQL file to an escaped string like Weihua mentioned), which will be converted to a ConfigMap, but K8s has a size limit on ConfigMaps (no larger than 1 MB, see https://kubernetes.io/docs/concepts/configuration/configmap/). Not sure if we have better solutions.
>>
>> 3. Serialization of SessionState: in SessionState, there are some unserializable fields like org.apache.flink.table.resource.ResourceManager#userClassLoader. It may be worthwhile to add more details about the serialization part.
>>
>> Best,
>> Biao Geng
>>
>> On Wed, May 31, 2023 at 11:49, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Weihua,
>>
>> Thanks a lot for your input! Please see my comments inline.
>> - Is SQLRunner the better name? We use this to run a SQL job. (Not strong, the SQLDriver is fine for me)
>>
>> I’ve thought about SQL Runner but picked SQL Driver for the following reasons, FYI:
>>
>> 1. I have a PythonDriver doing the same job for PyFlink [1].
>> 2. A Flink program's main class is sort of like a Driver in JDBC, which translates SQLs into database-specific languages.
>>
>> In general, I’m +1 for SQL Driver and +0 for SQL Runner.
>>
>> - Could we run SQL jobs using SQL in strings? Otherwise, we need to prepare a SQL file in an image for Kubernetes application mode, which may be a bit cumbersome.
>>
>> Do you mean passing the SQL string as a configuration or a program argument? I thought it might be convenient for testing purposes, but not recommended for production, because Flink SQLs could be complicated and involve lots of characters that need escaping. WDYT?
>>
>> - I noticed that we don't specify the SQLDriver jar in the "run-application" command. Does that mean we need to perform automatic detection in Flink?
>>
>> Yes! It’s like running a PyFlink job with the following command:
>>
>> ```
>> ./bin/flink run \
>>   --pyModule table.word_count \
>>   --pyFiles examples/python/table
>> ```
>>
>> The CLI determines if it’s a SQL job, and if so, applies the SQL Driver automatically.
>>
>> [1] https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonDriver.java
>>
>> Best,
>> Paul Lam
>>
>> On May 30, 2023, at 21:56, Weihua Hu <huweihua....@gmail.com> wrote:
>>
>> Thanks Paul for the proposal.
>>
>> +1 for this. It is valuable in improving ease of use.
>>
>> I have a few questions.
>>
>> - Is SQLRunner the better name? We use this to run a SQL job. (Not strong, the SQLDriver is fine for me)
>> - Could we run SQL jobs using SQL in strings? Otherwise, we need to prepare a SQL file in an image for Kubernetes application mode, which may be a bit cumbersome.
>> - I noticed that we don't specify the SQLDriver jar in the "run-application" command. Does that mean we need to perform automatic detection in Flink?
>>
>> Best,
>> Weihua
>>
>> On Mon, May 29, 2023 at 7:24 PM Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi team,
>>
>> I’d like to start a discussion about FLIP-316 [1], which introduces a SQL driver as the default main class for Flink SQL jobs.
>>
>> Currently, Flink SQL could be executed out of the box either via SQL Client/Gateway or embedded in a Flink Java/Python program.
>> However, each one has its drawback:
>>
>> - SQL Client/Gateway doesn’t support the application deployment mode [2]
>> - A Flink Java/Python program requires extra work to write a non-SQL program
>>
>> Therefore, I propose adding a SQL driver to act as the default main class for SQL jobs. Please see the FLIP docs for details and feel free to comment. Thanks!
>>
>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-316%3A+Introduce+SQL+Driver
>> [2] https://issues.apache.org/jira/browse/FLINK-26541
>>
>> Best,
>> Paul Lam
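To make the idea discussed in this thread more concrete, here is a minimal sketch of what such a default main class could look like, accepting either a SQL file or a compiled JSON plan. The class name, argument handling, and the naive statement splitting are illustrative assumptions only, not the FLIP's design; the Table API calls used (executeSql, loadPlan, PlanReference) are the existing Flink 1.15+ APIs:

```java
// Illustrative sketch of a default main class for SQL jobs. It accepts either
// a plain SQL file (statements separated by ';') or a compiled JSON plan and
// executes it with the Table API. Names and argument handling are hypothetical.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

import java.nio.file.Files;
import java.nio.file.Paths;

public final class SqlDriverExample {

    public static void main(String[] args) throws Exception {
        // args[0]: path to a *.sql file or a *.json compiled plan
        String file = args[0];
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        if (file.endsWith(".json")) {
            // Internal usage: execute a plan compiled by SQL Gateway.
            tableEnv.loadPlan(PlanReference.fromFile(file)).execute().await();
        } else {
            // Public usage: execute the statements of a SQL file one by one.
            // Splitting on ';' is a simplification; real SQL parsing is more involved.
            String script = Files.readString(Paths.get(file));
            for (String statement : script.split(";")) {
                if (!statement.isBlank()) {
                    tableEnv.executeSql(statement).await();
                }
            }
        }
    }
}
```

With something along these lines on the classpath, the CLI could point at the driver class and pass the SQL file or JSON plan as a program argument, analogous to the PyFlink `--pyModule` example earlier in the thread.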