Hi Ferenc,

Sorry for my late reply.

> Is any active work happening on this FLIP? As far as I see there
> are blockers that need to be resolved first regarding artifact
> distribution.

You’re right. There’s a blocker in K8s application mode, but none in YARN
application mode. I’m doing a POC on YARN application mode before starting
a vote thread.

I’ve been busy lately, but the FLIP is active for sure. The situation will
change in a couple of weeks.

Thank you for reaching out! I’ll let you know when the POC is completed.

Best,
Paul Lam

> On Nov 21, 2023, at 06:01, Ferenc Csaky <ferenc.cs...@pm.me.INVALID> wrote:
>
> Hello devs,
>
> Is any active work happening on this FLIP? As far as I see there
> are blockers that need to be resolved first regarding artifact
> distribution.
>
> Is this work halted completely, or are some efforts going into
> resolving the blockers first?
>
> Our platform would benefit from this feature a lot. We have a kind of
> working custom implementation at the moment, but it is uniquely
> adapted to our app and platform.
>
> I could help out to move this forward.
>
> Best,
> Ferenc
>
>
> On Friday, June 30th, 2023 at 04:53, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi Jing,
>>
>> Thanks for your input!
>>
>>> Would you like to add
>>> one section to describe (better with a script/code example) how to use it in
>>> these two scenarios from users' perspective?
>>
>> OK. I’ll update the FLIP with the code snippet after I get the POC branch
>> done.
>>
>>> NIT: the pictures have a transparent background when readers click on them. It
>>> would be great if you could replace them with pictures with a white background.
>>
>> Fixed. Thanks for pointing that out :)
>>
>> Best,
>> Paul Lam
>>
>>> On Jun 27, 2023, at 06:51, Jing Ge <j...@ververica.com.INVALID> wrote:
>>>
>>> Hi Paul,
>>>
>>> Thanks for driving it and thank you all for the informative discussion! The
>>> FLIP is in good shape now.
>>> As described in the FLIP, SQL Driver will be mainly used to run Flink SQLs
>>> in two scenarios: 1. SQL client/gateway in application mode and 2. external
>>> system integration. Would you like to add one section to describe (better
>>> with a script/code example) how to use it in these two scenarios from
>>> users' perspective?
>>>
>>> NIT: the pictures have a transparent background when readers click on them.
>>> It would be great if you could replace them with pictures with a white
>>> background.
>>>
>>> Best regards,
>>> Jing
>>>
>>> On Mon, Jun 26, 2023 at 1:31 PM Paul Lam <paullin3...@gmail.com> wrote:
>>>
>>>> Hi Shengkai,
>>>>
>>>>> * How can we ship the json plan to the JobManager?
>>>>
>>>> The Flink K8s module should be responsible for file distribution. We could
>>>> introduce an option like `kubernetes.storage.dir`. For each Flink cluster,
>>>> there would be a dedicated subdirectory, with a pattern like
>>>> `${kubernetes.storage.dir}/${cluster-id}`.
>>>>
>>>> All resources-related options (e.g. pipeline jars, json plans) that are
>>>> configured with the scheme `file://` would be uploaded to the resource
>>>> directory and downloaded to the jobmanager, before SQL Driver accesses
>>>> the files with the original filenames.
>>>>
>>>>> * Classloading strategy
>>>>
>>>> We could directly specify the SQL Gateway jar as the jar file in
>>>> PackagedProgram. It would be treated like a normal user jar, and the
>>>> SQL Driver is loaded into the user classloader. WDYT?
>>>>
>>>>> * Option `$internal.sql-gateway.driver.sql-config` is string type
>>>>> I think it's better to use Map type here
>>>>
>>>> By Map-type configuration, do you mean a nested map that contains all
>>>> configurations?
>>>>
>>>> I hope I've explained myself well: it’s a file that contains the extra SQL
>>>> configurations, which would be shipped to the jobmanager.
>>>>
>>>>> * PoC branch
>>>>
>>>> Sure. I’ll let you know once I get the job done.
>>>>
>>>> Best,
>>>> Paul Lam
>>>>
>>>>> On Jun 26, 2023, at 14:27, Shengkai Fang <fskm...@gmail.com> wrote:
>>>>>
>>>>> Hi, Paul.
>>>>>
>>>>> Thanks for your update. I have a few questions about the new design:
>>>>>
>>>>> * How can we ship the json plan to the JobManager?
>>>>>
>>>>> The current design only exposes an option about the URL of the json
>>>>> plan. It seems the gateway is responsible for uploading it to an external
>>>>> storage. Can we reuse PipelineOptions.JARS to ship it to the remote
>>>>> filesystem?
>>>>>
>>>>> * Classloading strategy
>>>>>
>>>>> Currently, the Driver is in the sql-gateway package. It means the Driver
>>>>> is not in the JM's classpath directly, because the sql-gateway jar is now
>>>>> in the opt directory rather than the lib directory. It may need to add
>>>>> the external dependencies as Python does [1]. BTW, I think it's better
>>>>> to move the Driver into the flink-table-runtime package, which is much
>>>>> easier to find (sorry for the wrong opinion before).
>>>>>
>>>>> * Option `$internal.sql-gateway.driver.sql-config` is string type
>>>>>
>>>>> I think it's better to use Map type here
>>>>>
>>>>> * PoC branch
>>>>>
>>>>> Because this FLIP involves many modules, do you have a PoC branch to
>>>>> verify it does work?
>>>>>
>>>>> Best,
>>>>> Shengkai
>>>>>
>>>>> [1] https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L940
>>>>>
>>>>> On Mon, Jun 19, 2023 at 14:09, Paul Lam <paullin3...@gmail.com> wrote:
>>>>> Hi Shengkai,
>>>>>
>>>>> Sorry for my late reply. It took me some time to update the FLIP.
>>>>>
>>>>> In the latest FLIP design, SQL Driver is placed in the flink-sql-gateway
>>>>> module. PTAL.
>>>>>
>>>>> The FLIP does not cover details about the K8s file distribution, but its
>>>>> general usage would be very much the same as in YARN setups. We could
>>>>> make follow-up discussions in the jira tickets.
>>>>>
>>>>> Best,
>>>>> Paul Lam
>>>>>
>>>>>> On Jun 12, 2023, at 15:29, Shengkai Fang <fskm...@gmail.com> wrote:
>>>>>>
>>>>>>> If it’s the case, I’m good with introducing a new module and making
>>>>>>> SQL Driver an internal class that accepts JSON plans only.
>>>>>>
>>>>>> I rethought this again and again. I think it's better to move the
>>>>>> SqlDriver into the sql-gateway module because the sql client relies on
>>>>>> the sql-gateway to submit the sql, and the sql-gateway has the ability
>>>>>> to generate the ExecNodeGraph now. +1 to support accepting JSON plans only.
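The `$internal.sql-gateway.driver.sql-config` exchange above (a string option pointing at a shipped file vs. a Map-typed option) can be made concrete with a small sketch of the file-based variant. The class and method names below are illustrative assumptions, not Flink code: the gateway serializes the extra SQL configuration to `java.util.Properties` text that would be shipped to the jobmanager, and the driver parses it back; file I/O is elided so only the round trip is shown.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/**
 * Illustrative sketch (not Flink code): the extra SQL configuration is
 * rendered to properties text on the gateway side, shipped to the
 * jobmanager as a file, and parsed back by the driver.
 */
public class SqlConfigSketch {

    /** Renders the configuration to the text that would be shipped. */
    static String render(Map<String, String> sqlConfig) {
        Properties props = new Properties();
        props.putAll(sqlConfig);
        StringWriter out = new StringWriter();
        try {
            props.store(out, "extra SQL configuration for the SQL Driver");
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for StringWriter
        }
        return out.toString();
    }

    /** Parses the shipped text back into a flat key/value map. */
    static Map<String, String> parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for StringReader
        }
        Map<String, String> config = new HashMap<>();
        props.forEach((k, v) -> config.put(k.toString(), v.toString()));
        return config;
    }
}
```

A Map-typed option would skip the file round trip entirely, at the cost of pushing a potentially large nested structure through the Flink configuration.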
>>>>>>
>>>>>> * Upload configuration through command line parameter
>>>>>>
>>>>>> ExecNodeGraph only contains the job's information, but it doesn't
>>>>>> contain the checkpoint dir, checkpoint interval, execution mode, and
>>>>>> so on. So I think we should also upload the configuration.
>>>>>>
>>>>>> * KubernetesClusterDescriptor and KubernetesApplicationClusterEntrypoint
>>>>>> are responsible for the jar upload/download
>>>>>>
>>>>>> +1 for the change.
>>>>>>
>>>>>> Could you update the FLIP about the current discussion?
>>>>>>
>>>>>> Best,
>>>>>> Shengkai
>>>>>>
>>>>>> On Mon, Jun 12, 2023 at 11:41, Yang Wang <wangyang0...@apache.org> wrote:
>>>>>> Sorry for the late reply. I am in favor of introducing such a built-in
>>>>>> resource localization mechanism based on Flink FileSystem. Then
>>>>>> FLINK-28915 [1] could be the second step, which will download the jars
>>>>>> and dependencies to the JobManager/TaskManager local directory before
>>>>>> working.
>>>>>>
>>>>>> The first step could be done in another ticket in Flink. Or some
>>>>>> external Flink jobs management system could also take care of this.
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-28915
>>>>>>
>>>>>> Best,
>>>>>> Yang
>>>>>>
>>>>>> On Fri, Jun 9, 2023 at 17:39, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Mason,
>>>>>>>
>>>>>>> I get your point. I'm increasingly feeling the need to introduce a
>>>>>>> built-in file distribution mechanism for the flink-kubernetes module,
>>>>>>> just like Spark does with `spark.kubernetes.file.upload.path` [1].
>>>>>>>
>>>>>>> I’m assuming the workflow is as follows:
>>>>>>>
>>>>>>> - KubernetesClusterDescriptor uploads all local resources to a remote
>>>>>>> storage via a Flink filesystem (skips if the resources are already
>>>>>>> remote).
>>>>>>> - KubernetesApplicationClusterEntrypoint downloads the resources
>>>>>>> and puts them on the classpath during startup.
>>>>>>>
>>>>>>> I wouldn't mind splitting it into another FLIP to ensure that
>>>>>>> everything is done correctly.
>>>>>>>
>>>>>>> cc'ed @Yang to gather more opinions.
>>>>>>>
>>>>>>> [1] https://spark.apache.org/docs/latest/running-on-kubernetes.html#dependency-management
>>>>>>>
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On Jun 8, 2023, at 12:15, Mason Chen <mas.chen6...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> Thanks for your response!
>>>>>>>
>>>>>>> I agree that utilizing SQL Drivers in Java applications is equally
>>>>>>> important as employing them in SQL Gateway. WRT init containers, I
>>>>>>> think most users use them just as a workaround, for example, to wget
>>>>>>> a jar from the maven repo.
>>>>>>>
>>>>>>> We could implement the functionality in SQL Driver in a more graceful
>>>>>>> way, and the Flink-supported filesystem approach seems to be a
>>>>>>> good choice.
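The upload half of the two-step workflow above can be sketched roughly as follows. This is an illustrative sketch only; the class and method names are assumptions, not the actual flink-kubernetes code. Local `file://` resources are mapped to a per-cluster directory on remote storage, while resources that are already remote (`s3://`, `hdfs://`, ...) are left untouched, mirroring the "skip if already remote" rule.

```java
import java.net.URI;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of the upload step: rewrite every local resource
 * URI to a path under the per-cluster remote storage directory, keeping
 * the original file name so the driver can refer to it unchanged after
 * download. Already-remote resources need no upload.
 */
public class ResourceLocalizationSketch {

    static List<String> resolveUploadTargets(
            List<String> resources, String storageDir, String clusterId) {
        List<String> resolved = new ArrayList<>();
        for (String resource : resources) {
            URI uri = URI.create(resource);
            boolean local = uri.getScheme() == null || "file".equals(uri.getScheme());
            if (local) {
                // Keep the original file name for the download side.
                String name = Paths.get(uri.getPath()).getFileName().toString();
                resolved.add(storageDir + "/" + clusterId + "/" + name);
            } else {
                resolved.add(resource); // already remote: skip the upload
            }
        }
        return resolved;
    }
}
```

The download half would be the inverse: the entrypoint lists the per-cluster directory and copies everything to a local directory on the classpath before the user code starts.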
>>>>>>>
>>>>>>> My main point is: can we solve the problem with a design agnostic of
>>>>>>> SQL and Stream API? I mentioned a use case where this ability is
>>>>>>> useful for Java or Stream API applications. Maybe this is even a
>>>>>>> non-goal for your FLIP since you are focusing on the driver entrypoint.
>>>>>>>
>>>>>>> Jark mentioned some optimizations:
>>>>>>>
>>>>>>> This allows SQLGateway to leverage some metadata caching and UDF JAR
>>>>>>> caching for better compiling performance.
>>>>>>>
>>>>>>> It would be great to see this even outside the SQLGateway (i.e. UDF
>>>>>>> JAR caching).
>>>>>>>
>>>>>>> Best,
>>>>>>> Mason
>>>>>>>
>>>>>>> On Wed, Jun 7, 2023 at 2:26 AM Shengkai Fang <fskm...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi, Paul. Thanks for your update; it makes me understand the design
>>>>>>> much better.
>>>>>>>
>>>>>>> But I still have some questions about the FLIP.
>>>>>>>
>>>>>>> For SQL Gateway, only DMLs need to be delegated to the SQL Driver.
>>>>>>> I would think about the details and update the FLIP. Do you have
>>>>>>> some ideas already?
>>>>>>>
>>>>>>> If the application mode cannot support library mode, I think we should
>>>>>>> only execute INSERT INTO and UPDATE/DELETE statements in application
>>>>>>> mode. AFAIK, we cannot support ANALYZE TABLE and CALL PROCEDURE
>>>>>>> statements. The ANALYZE TABLE syntax needs to register the statistics
>>>>>>> with the catalog after the job finishes, and the CALL PROCEDURE
>>>>>>> statement doesn't generate the ExecNodeGraph.
>>>>>>>
>>>>>>> * Introduce storage via option `sql-gateway.application.storage-dir`
>>>>>>>
>>>>>>> If we cannot support submitting the jars through web submission, +1 to
>>>>>>> introduce the options to upload the files. However, I think the
>>>>>>> uploader should be responsible for removing the uploaded jars. Can we
>>>>>>> remove the jars while the job is running or when the gateway exits?
>>>>>>>
>>>>>>> * JobID is not available
>>>>>>>
>>>>>>> Can we use the rest client returned by ApplicationDeployer to query
>>>>>>> the job id? I am concerned that users won't know which job is related
>>>>>>> to the submitted SQL.
>>>>>>>
>>>>>>> * Do we need to introduce a new module named flink-table-sql-runner?
>>>>>>>
>>>>>>> It seems we need to introduce a new module. Will the new module be
>>>>>>> available in the distribution package? I agree with Jark that we
>>>>>>> don't need to introduce this for Table API users, as these users have
>>>>>>> their own main class. If we want to make writing jobs for the k8s
>>>>>>> operator easier, I think we should modify the k8s operator repo. If
>>>>>>> we don't need to support SQL files, can we make this jar only visible
>>>>>>> in the sql-gateway, like we do in the planner loader? [1]
>>>>>>>
>>>>>>> [1] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-loader/src/main/java/org/apache/flink/table/planner/loader/PlannerModule.java#L95
>>>>>>>
>>>>>>> Best,
>>>>>>> Shengkai
>>>>>>>
>>>>>>> On Wed, Jun 7, 2023 at 10:52, Weihua Hu <huweihua....@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for updating the FLIP.
>>>>>>>
>>>>>>> I have two cents on the distribution of SQLs and resources.
>>>>>>>
>>>>>>> 1. Should we support a common file distribution mechanism for k8s
>>>>>>> application mode? I have seen some issues and requirements on the
>>>>>>> mailing list. In our production environment, we implement the download
>>>>>>> command in the CliFrontend and automatically add an init container to
>>>>>>> the POD for file downloading. The advantage of this is that we can use
>>>>>>> all Flink-supported file systems to store files.
>>>>>>>
>>>>>>> This needs more discussion. I would appreciate hearing more opinions.
>>>>>>>
>>>>>>> 2. In this FLIP, we distribute files in two different ways on YARN and
>>>>>>> Kubernetes. Can we combine them into one way? If we don't want to
>>>>>>> implement a common file distribution for k8s application mode, could
>>>>>>> we use the SQLDriver to download the files both on YARN and K8s? IMO,
>>>>>>> this can reduce the cost of code maintenance.
>>>>>>>
>>>>>>> Best,
>>>>>>> Weihua
>>>>>>>
>>>>>>> On Wed, Jun 7, 2023 at 10:18 AM Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Mason,
>>>>>>>
>>>>>>> Thanks for your input!
>>>>>>>
>>>>>>> +1 for init containers or a more generalized way of obtaining
>>>>>>> arbitrary files. File fetching isn't specific to just SQL--it also
>>>>>>> matters for Java applications if the user doesn't want to rebuild a
>>>>>>> Flink image and just wants to modify the user application fat jar.
>>>>>>>
>>>>>>> I agree that utilizing SQL Drivers in Java applications is equally
>>>>>>> important as employing them in SQL Gateway. WRT init containers, I
>>>>>>> think most users use them just as a workaround, for example, to wget
>>>>>>> a jar from the maven repo.
>>>>>>>
>>>>>>> We could implement the functionality in SQL Driver in a more graceful
>>>>>>> way, and the Flink-supported filesystem approach seems to be a
>>>>>>> good choice.
>>>>>>>
>>>>>>> Also, what do you think about prefixing the config options with
>>>>>>> `sql-driver` instead of just `sql` to be more specific?
>>>>>>>
>>>>>>> LGTM, since SQL Driver is a public interface and the options are
>>>>>>> specific to it.
>>>>>>>
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On Jun 6, 2023, at 06:30, Mason Chen <mas.chen6...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> +1 for this feature and supporting SQL file + JSON plans. We get a
>>>>>>> lot of requests to just be able to submit a SQL file, but the JSON
>>>>>>> plan optimizations make sense.
>>>>>>>
>>>>>>> +1 for init containers or a more generalized way of obtaining
>>>>>>> arbitrary files. File fetching isn't specific to just SQL--it also
>>>>>>> matters for Java applications if the user doesn't want to rebuild a
>>>>>>> Flink image and just wants to modify the user application fat jar.
>>>>>>>
>>>>>>> Please note that we could reuse the checkpoint storage like S3/HDFS,
>>>>>>> which should be required to run Flink in production, so I guess that
>>>>>>> would be acceptable for most users. WDYT?
>>>>>>>
>>>>>>> If you do go this route, it would be nice to support writing these
>>>>>>> files to S3/HDFS via Flink.
>>>>>>> This makes access control and policy management simpler.
>>>>>>>
>>>>>>> Also, what do you think about prefixing the config options with
>>>>>>> `sql-driver` instead of just `sql` to be more specific?
>>>>>>>
>>>>>>> Best,
>>>>>>> Mason
>>>>>>>
>>>>>>> On Mon, Jun 5, 2023 at 2:28 AM Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Jark,
>>>>>>>
>>>>>>> Thanks for your input! Please see my comments inline.
>>>>>>>
>>>>>>> Isn't Table API the same way as DataStream jobs to submit Flink SQL?
>>>>>>> DataStream API also doesn't provide a default main class for users,
>>>>>>> why do we need to provide such one for SQL?
>>>>>>>
>>>>>>> Sorry for the confusion I caused. By DataStream jobs, I mean jobs
>>>>>>> submitted via Flink CLI, which actually could be DataStream/Table jobs.
>>>>>>>
>>>>>>> I think a default main class would be user-friendly, as it eliminates
>>>>>>> the need for users to write a main class such as SqlRunner in the
>>>>>>> Flink K8s operator [1].
>>>>>>>
>>>>>>> I thought the proposed SqlDriver was a dedicated main class accepting
>>>>>>> SQL files, is that correct?
>>>>>>>
>>>>>>> Both JSON plans and SQL files are accepted. SQL Gateway should use
>>>>>>> JSON plans, while CLI users may use either JSON plans or SQL files.
>>>>>>>
>>>>>>> Please see the updated FLIP [2] for more details.
>>>>>>>
>>>>>>> Personally, I prefer the way of init containers, which doesn't depend
>>>>>>> on additional components.
>>>>>>> This can reduce the moving parts of a production environment.
>>>>>>> Depending on a distributed file system makes the testing, demo, and
>>>>>>> local setup harder than init containers.
>>>>>>>
>>>>>>> Please note that we could reuse the checkpoint storage like S3/HDFS,
>>>>>>> which should be required to run Flink in production, so I guess that
>>>>>>> would be acceptable for most users. WDYT?
>>>>>>>
>>>>>>> WRT testing, demo, and local setups, I think we could support the
>>>>>>> local filesystem scheme, i.e. `file://`, as the state backends do [3].
>>>>>>> It works as long as SQL Gateway and the JobManager (or SQL Driver)
>>>>>>> can access the resource directory (specified via
>>>>>>> `sql-gateway.application.storage-dir`).
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> [1] https://github.com/apache/flink-kubernetes-operator/blob/main/examples/flink-sql-runner-example/src/main/java/org/apache/flink/examples/SqlRunner.java
>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-316:+Introduce+SQL+Driver
>>>>>>> [3] https://github.com/apache/flink/blob/3245e0443b2a4663552a5b707c5c8c46876c1f6d/flink-runtime/src/test/java/org/apache/flink/runtime/state/filesystem/AbstractFileCheckpointStorageAccessTestBase.java#L161
>>>>>>>
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On Jun 3, 2023, at 12:21, Jark Wu <imj...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> Thanks for your reply. I left my comments inline.
>>>>>>>
>>>>>>> As the FLIP said, it’s good to have a default main class for Flink
>>>>>>> SQLs, which allows users to submit Flink SQLs in the same way as
>>>>>>> DataStream jobs, or else users need to write their own main class.
>>>>>>>
>>>>>>> Isn't Table API the same way as DataStream jobs to submit Flink SQL?
>>>>>>> DataStream API also doesn't provide a default main class for users,
>>>>>>> why do we need to provide such one for SQL?
>>>>>>>
>>>>>>> With the help of ExecNodeGraph, do we still need the serialized
>>>>>>> SessionState? If not, we could make SQL Driver accept two serialized
>>>>>>> formats:
>>>>>>>
>>>>>>> No, ExecNodeGraph doesn't need to serialize SessionState.
I thought
>>>>>>> the proposed SqlDriver was a dedicated main class accepting SQL
>>>>>>> files, is that correct? If true, we have to ship the SessionState
>>>>>>> for this case, which is a large work. I think we just need a
>>>>>>> JsonPlanDriver, which is a main class that accepts a JsonPlan as
>>>>>>> the parameter.
>>>>>>>
>>>>>>> The common solutions I know are to use distributed file systems or
>>>>>>> to use init containers to localize the resources.
>>>>>>>
>>>>>>> Personally, I prefer the way of init containers, which doesn't depend
>>>>>>> on additional components.
>>>>>>> This can reduce the moving parts of a production environment.
>>>>>>> Depending on a distributed file system makes the testing, demo, and
>>>>>>> local setup harder than init containers.
>>>>>>>
>>>>>>> Best,
>>>>>>> Jark
>>>>>>>
>>>>>>> On Fri, 2 Jun 2023 at 18:10, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>> The FLIP is in the early phase and some details are not included, but
>>>>>>> fortunately, we got lots of valuable ideas from the discussion.
>>>>>>> Thanks to everyone who joined the discussion!
>>>>>>> @Weihua @Shanmon @Shengkai @Biao @Jark
>>>>>>>
>>>>>>> This weekend I’m gonna revisit and update the FLIP, adding more
>>>>>>> details. Hopefully, we can further align our opinions.
>>>>>>>
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On Jun 2, 2023, at 18:02, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Jark,
>>>>>>>
>>>>>>> Thanks a lot for your input!
>>>>>>>
>>>>>>> If we decide to submit ExecNodeGraph instead of SQL file, is it still
>>>>>>> necessary to support SQL Driver?
>>>>>>>
>>>>>>> I think so. Apart from its usage in SQL Gateway, SQL Driver could
>>>>>>> simplify Flink SQL execution with Flink CLI.
>>>>>>>
>>>>>>> As the FLIP said, it’s good to have a default main class for Flink
>>>>>>> SQLs, which allows users to submit Flink SQLs in the same way as
>>>>>>> DataStream jobs, or else users need to write their own main class.
>>>>>>>
>>>>>>> SQL Driver needs to serialize SessionState, which is very challenging
>>>>>>> but not covered in detail in the FLIP.
>>>>>>>
>>>>>>> With the help of ExecNodeGraph, do we still need the serialized
>>>>>>> SessionState? If not, we could make SQL Driver accept two serialized
>>>>>>> formats:
>>>>>>>
>>>>>>> - SQL files for user-facing public usage
>>>>>>> - ExecNodeGraph for internal usage
>>>>>>>
>>>>>>> It’s kind of similar to the relationship between job jars and
>>>>>>> jobgraphs.
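The two accepted formats above suggest a driver entrypoint that dispatches on its input. A minimal sketch follows; the file-extension convention and all names here are illustrative assumptions, not the FLIP's actual design:

```java
/**
 * Illustrative sketch of how a SQL Driver main class might dispatch
 * between the two serialized formats: SQL scripts for public usage,
 * and serialized ExecNodeGraphs (JSON plans) for internal usage.
 */
public class SqlDriverDispatchSketch {

    enum InputKind {
        /** A plain SQL script, the user-facing public format. */
        SQL_SCRIPT,
        /** A serialized ExecNodeGraph (JSON plan), the internal format. */
        JSON_PLAN
    }

    static InputKind classify(String path) {
        if (path.endsWith(".sql")) {
            return InputKind.SQL_SCRIPT;
        }
        if (path.endsWith(".json")) {
            return InputKind.JSON_PLAN;
        }
        throw new IllegalArgumentException("Unsupported driver input: " + path);
    }
}
```

In the jar/jobgraph analogy, the SQL script plays the role of the user jar (compiled on the cluster), while the JSON plan plays the role of the jobgraph (compiled ahead of time by the gateway).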
>>>>>>> >>>>>>> Regarding "K8S doesn't support shipping multiple jars", is that >>>>>>> >>>>>>> true? >>>>>>> >>>>>>> Is it >>>>>>> >>>>>>> possible to support it? >>>>>>> >>>>>>> Yes, K8s doesn’t distribute any files. It’s the users’ >>>>>>> >>>>>>> responsibility >>>>>>> >>>>>>> to >>>>>>> >>>>>>> make >>>>>>> >>>>>>> sure the resources are accessible in the containers. The common >>>>>>> >>>>>>> solutions >>>>>>> >>>>>>> I know is to use distributed file systems or use init containers >>>>>>> >>>>>>> to >>>>>>> >>>>>>> localize the >>>>>>> >>>>>>> resources. >>>>>>> >>>>>>> Now I lean toward introducing a fs to do the distribution job. >>>>>>> >>>>>>> WDYT? >>>>>>> >>>>>>> Best, >>>>>>> Paul Lam >>>>>>> >>>>>>> 2023年6月1日 20:33,Jark Wu <imj...@gmail.com <mailto:imj...@gmail.com> >>>>>>> mailto:imj...@gmail.com <mailto:imj...@gmail.com >>>>>>> mailto:imj...@gmail.com> >>>>>>> <mailto: >>>>>>> >>>>>>> imj...@gmail.com <mailto:imj...@gmail.com> mailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com>> >>>>>>> >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com> <mailto: >>>>>>> imj...@gmail.com <mailto:imj...@gmail.com> mailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com>>> >>>>>>> >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com> <mailto: >>>>>>> imj...@gmail.com <mailto:imj...@gmail.com> mailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com>> <mailto: >>>>>>> >>>>>>> imj...@gmail.com <mailto:imj...@gmail.com> mailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com> >>>>>>> <mailto:imjark@gmail.commailto:imj...@gmail.com >>>>>>> <mailto:imj...@gmail.com mailto:imj...@gmail.com>>>>> >>>>>>> >>>>>>> 写道: >>>>>>> >>>>>>> Hi Paul, >>>>>>> >>>>>>> Thanks for starting this discussion. I like the proposal! 
This >>>>>>> >>>>>>> is >>>>>>> >>>>>>> a >>>>>>> >>>>>>> frequently requested feature! >>>>>>> >>>>>>> I agree with Shengkai that ExecNodeGraph as the submission >>>>>>> >>>>>>> object >>>>>>> >>>>>>> is a >>>>>>> >>>>>>> better idea than SQL file. To be more specific, it should be >>>>>>> >>>>>>> JsonPlanGraph >>>>>>> >>>>>>> or CompiledPlan which is the serializable representation. >>>>>>> >>>>>>> CompiledPlan >>>>>>> >>>>>>> is a >>>>>>> >>>>>>> clear separation between compiling/optimization/validation and >>>>>>> >>>>>>> execution. >>>>>>> >>>>>>> This can keep the validation and metadata accessing still on the >>>>>>> >>>>>>> SQLGateway >>>>>>> >>>>>>> side. This allows SQLGateway to leverage some metadata caching >>>>>>> >>>>>>> and >>>>>>> >>>>>>> UDF >>>>>>> >>>>>>> JAR >>>>>>> >>>>>>> caching for better compiling performance. >>>>>>> >>>>>>> If we decide to submit ExecNodeGraph instead of SQL file, is it >>>>>>> >>>>>>> still >>>>>>> >>>>>>> necessary to support SQL Driver? Regarding non-interactive SQL >>>>>>> >>>>>>> jobs, >>>>>>> >>>>>>> users >>>>>>> >>>>>>> can use the Table API program for application mode. SQL Driver >>>>>>> >>>>>>> needs >>>>>>> >>>>>>> to >>>>>>> >>>>>>> serialize SessionState which is very challenging but not >>>>>>> >>>>>>> detailed >>>>>>> >>>>>>> covered >>>>>>> >>>>>>> in the FLIP. >>>>>>> >>>>>>> Regarding "K8S doesn't support shipping multiple jars", is that >>>>>>> >>>>>>> true? >>>>>>> >>>>>>> Is it >>>>>>> >>>>>>> possible to support it? 
>>>>>>> Best,
>>>>>>> Jark
>>>>>>>
>>>>>>> On Thu, 1 Jun 2023 at 16:58, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Weihua,
>>>>>>>
>>>>>>> You're right. Distributing the SQLs to the TMs is one of the
>>>>>>> challenging parts of this FLIP.
>>>>>>>
>>>>>>> Web submission is not enabled in application mode currently, as you
>>>>>>> said, but it could be changed if we have good reasons.
>>>>>>> What do you think about introducing a distributed storage for SQL
>>>>>>> Gateway? We could make use of Flink file systems [1] to distribute the
>>>>>>> SQL Gateway generated resources, which should solve the problem at its
>>>>>>> root cause. Users could specify Flink-supported file systems to ship
>>>>>>> files. It's only required when using SQL Gateway with K8s application
>>>>>>> mode.
>>>>>>>
>>>>>>> [1] https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/
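[Editor's note] The layout proposed earlier in the thread, `${kubernetes.storage.dir}/${cluster-id}`, can be sketched as follows. The path logic is the thread's proposal; the function names and the sample bucket are illustrative:

```python
def resource_dir(storage_dir: str, cluster_id: str) -> str:
    # Each Flink cluster gets a dedicated subdirectory under the
    # proposed `kubernetes.storage.dir` option:
    # ${kubernetes.storage.dir}/${cluster-id}
    return storage_dir.rstrip("/") + "/" + cluster_id

def remote_path(storage_dir: str, cluster_id: str, local_file: str) -> str:
    # A file:// resource keeps its original filename after upload, so the
    # SQL Driver can refer to it by the same name after download.
    name = local_file.rsplit("/", 1)[-1]
    return resource_dir(storage_dir, cluster_id) + "/" + name

print(remote_path("s3://bucket/flink", "sql-job-1", "/opt/sql/job.sql"))
# s3://bucket/flink/sql-job-1/job.sql
```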
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On Jun 1, 2023, at 13:55, Weihua Hu <huweihua....@gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks Paul for your reply.
>>>>>>>
>>>>>>> SQLDriver looks good to me.
>>>>>>>
>>>>>>> 2. Do you mean to pass the SQL string as a configuration or a program
>>>>>>> argument?
>>>>>>>
>>>>>>> I brought this up because we were unable to pass the SQL file to Flink
>>>>>>> using Kubernetes mode. For DataStream/Python users, they need to
>>>>>>> prepare their images for the jars and dependencies. But for SQL users,
>>>>>>> they can use a common image to run different SQL queries if there are
>>>>>>> no other UDF requirements. It would be great if the SQL query and
>>>>>>> image were not bound.
>>>>>>>
>>>>>>> Using strings is a way to decouple these, but just as you mentioned,
>>>>>>> it's not easy to pass complex SQL.
>>>>>>>
>>>>>>> use web submission
>>>>>>>
>>>>>>> AFAIK, we can not use web submission in Application mode. Please
>>>>>>> correct me if I'm wrong.
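[Editor's note] The escaping concern ("not easy to pass complex SQL") can be made concrete. The statement below is just an example; the two escapings show what a multi-line query looks like once flattened into a single config value or shell argument:

```python
import json
import shlex

sql = """INSERT INTO sink
SELECT name, COUNT(*) AS cnt
FROM src
WHERE note = 'it''s complicated'
GROUP BY name"""

# Flattening the statement into a single config value forces every newline
# and quote to be escaped, which quickly becomes unreadable and error-prone.
as_json_value = json.dumps(sql)   # as it would appear in a JSON/YAML config
as_cli_arg = shlex.quote(sql)     # as it would appear as a shell argument

print(as_json_value)
print(as_cli_arg)
```

This is why strings may be convenient for testing but are a poor fit for production-sized SQL scripts.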
>>>>>>> Best,
>>>>>>> Weihua
>>>>>>>
>>>>>>> On Wed, May 31, 2023 at 9:37 PM Paul Lam <paullin3...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Biao,
>>>>>>>
>>>>>>> Thanks for your comments!
>>>>>>>
>>>>>>> 1. Scope: is this FLIP only targeted for non-interactive Flink SQL
>>>>>>> jobs in Application mode? More specifically, if we use SQL
>>>>>>> client/gateway to execute some interactive SQLs like a SELECT query,
>>>>>>> can we ask Flink to use Application mode to execute those queries
>>>>>>> after this FLIP?
>>>>>>>
>>>>>>> Thanks for pointing it out. I think only DMLs would be executed via
>>>>>>> SQL Driver. I'll add the scope to the FLIP.
>>>>>>>
>>>>>>> 2. Deployment: I believe in YARN mode, the implementation is trivial
>>>>>>> as we can ship files via YARN's tool easily, but for K8s, things can
>>>>>>> be more complicated as Shengkai said.
>>>>>>>
>>>>>>> Your input is very informative.
>>>>>>> I'm thinking about using web submission, but it requires exposing the
>>>>>>> JobManager port, which could also be a problem on K8s.
>>>>>>>
>>>>>>> Another approach is to explicitly require a distributed storage to
>>>>>>> ship files, but we may need a new deployment executor for that.
>>>>>>>
>>>>>>> What do you think of these two approaches?
>>>>>>>
>>>>>>> 3. Serialization of SessionState: in SessionState, there are some
>>>>>>> unserializable fields like
>>>>>>> org.apache.flink.table.resource.ResourceManager#userClassLoader. It
>>>>>>> may be worthwhile to add more details about the serialization part.
>>>>>>>
>>>>>>> I agree. That's a missing part. But if we use ExecNodeGraph as
>>>>>>> Shengkai mentioned, do we eliminate the need for serialization of
>>>>>>> SessionState?
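[Editor's note] The SessionState serialization problem can be illustrated by analogy. This is a Python sketch, not Flink code: a lock stands in for the user ClassLoader, since both are live runtime resources that cannot be naively serialized:

```python
import pickle
import threading

class SessionState:
    # Analogous to Flink's SessionState: plain configuration serializes
    # fine, but a live runtime resource (here a lock; in Flink, the
    # ResourceManager's userClassLoader) does not.
    def __init__(self):
        self.config = {"catalog": "default"}
        self.user_class_loader = threading.Lock()  # unpicklable stand-in

try:
    pickle.dumps(SessionState())
except TypeError as e:
    print("cannot serialize SessionState:", e)

# Shipping a compiled, self-contained plan instead (as suggested with
# ExecNodeGraph/CompiledPlan) sidesteps the problem entirely:
plan = {"version": 1, "nodes": ["..."]}
assert pickle.loads(pickle.dumps(plan)) == plan
```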
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On May 31, 2023, at 13:07, Biao Geng <biaoge...@gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks Paul for the proposal! I believe it would be very useful for
>>>>>>> Flink users. After reading the FLIP, I have some questions:
>>>>>>>
>>>>>>> 1. Scope: is this FLIP only targeted for non-interactive Flink SQL
>>>>>>> jobs in Application mode? More specifically, if we use SQL
>>>>>>> client/gateway to execute some interactive SQLs like a SELECT query,
>>>>>>> can we ask Flink to use Application mode to execute those queries
>>>>>>> after this FLIP?
>>>>>>>
>>>>>>> 2. Deployment: I believe in YARN mode, the implementation is trivial
>>>>>>> as we can ship files via YARN's tool easily, but for K8s, things can
>>>>>>> be more complicated as Shengkai said.
>>>>>>> I have implemented a simple POC
>>>>>>> (https://github.com/bgeng777/flink/commit/5b4338fe52ec343326927f0fc12f015dd22b1133)
>>>>>>> based on SQL client before (i.e., consider the SQL client which
>>>>>>> supports executing a SQL file as the SQL driver in this FLIP). One
>>>>>>> problem I have met is how do we ship SQL files (or the Job Graph) to
>>>>>>> the K8s side.
>>>>>>> Without such support, users have to modify the initContainer or
>>>>>>> rebuild a new K8s image every time to fetch the SQL file. Like the
>>>>>>> Flink K8s operator, one workaround is to utilize the Flink config
>>>>>>> (transforming the SQL file to an escaped string like Weihua
>>>>>>> mentioned), which will be converted to a ConfigMap, but K8s has a size
>>>>>>> limit for ConfigMaps (no larger than 1MB, see
>>>>>>> https://kubernetes.io/docs/concepts/configuration/configmap/). Not
>>>>>>> sure if we have better solutions.
>>>>>>> 3. Serialization of SessionState: in SessionState, there are some
>>>>>>> unserializable fields like
>>>>>>> org.apache.flink.table.resource.ResourceManager#userClassLoader. It
>>>>>>> may be worthwhile to add more details about the serialization part.
>>>>>>>
>>>>>>> Best,
>>>>>>> Biao Geng
>>>>>>>
>>>>>>> On Wed, May 31, 2023 at 11:49, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Weihua,
>>>>>>>
>>>>>>> Thanks a lot for your input! Please see my comments inline.
>>>>>>>
>>>>>>> - Is SQLRunner the better name? We use this to run a SQL job. (Not
>>>>>>> strong, the SQLDriver is fine for me)
>>>>>>>
>>>>>>> I've thought about SQL Runner but picked SQL Driver for the following
>>>>>>> reasons FYI:
>>>>>>>
>>>>>>> 1. I have a PythonDriver doing the same job for PyFlink [1].
>>>>>>> 2. A Flink program's main class is sort of like a Driver in JDBC,
>>>>>>> which translates SQLs into database-specific languages.
>>>>>>>
>>>>>>> In general, I'm +1 for SQL Driver and +0 for SQL Runner.
>>>>>>>
>>>>>>> - Could we run SQL jobs using SQL in strings?
>>>>>>> Otherwise, we need to prepare a SQL file in an image for Kubernetes
>>>>>>> application mode, which may be a bit cumbersome.
>>>>>>>
>>>>>>> Do you mean to pass the SQL string as a configuration or a program
>>>>>>> argument?
>>>>>>>
>>>>>>> I thought it might be convenient for testing purposes, but not
>>>>>>> recommended for production, because Flink SQLs could be complicated
>>>>>>> and involve lots of characters that need to be escaped.
>>>>>>>
>>>>>>> WDYT?
>>>>>>>
>>>>>>> - I noticed that we don't specify the SQLDriver jar in the
>>>>>>> "run-application" command. Does that mean we need to perform automatic
>>>>>>> detection in Flink?
>>>>>>>
>>>>>>> Yes! It's like running a PyFlink job with the following command:
>>>>>>>
>>>>>>> ./bin/flink run \
>>>>>>>   --pyModule table.word_count \
>>>>>>>   --pyFiles examples/python/table
>>>>>>>
>>>>>>> The CLI determines if it's a SQL job, and if yes, applies the SQL
>>>>>>> Driver automatically.
>>>>>>> [1] https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonDriver.java
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>>
>>>>>>> On May 30, 2023, at 21:56, Weihua Hu <huweihua....@gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks Paul for the proposal.
>>>>>>>
>>>>>>> +1 for this. It is valuable in improving ease of use.
>>>>>>>
>>>>>>> I have a few questions.
>>>>>>> - Is SQLRunner the better name? We use this to run a SQL job. (Not
>>>>>>> strong, the SQLDriver is fine for me)
>>>>>>> - Could we run SQL jobs using SQL in strings? Otherwise, we need to
>>>>>>> prepare a SQL file in an image for Kubernetes application mode, which
>>>>>>> may be a bit cumbersome.
>>>>>>> - I noticed that we don't specify the SQLDriver jar in the
>>>>>>> "run-application" command. Does that mean we need to perform automatic
>>>>>>> detection in Flink?
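[Editor's note] The automatic detection in question can be sketched as follows. This is illustrative Python only: the PythonDriver class name comes from the linked source file, but the SQL driver class name and the `--sql-file` option are hypothetical placeholders, not real Flink flags:

```python
from typing import List, Optional

SQL_DRIVER_CLASS = "org.apache.flink.table.SqlDriver"  # hypothetical name
PYTHON_DRIVER_CLASS = "org.apache.flink.client.python.PythonDriver"

def select_main_class(args: List[str]) -> Optional[str]:
    # Mirrors how the CLI picks PythonDriver for --pyModule/--pyFiles jobs:
    # if SQL-specific options are present, default to the SQL Driver.
    if any(a in ("--pyModule", "--pyFiles") for a in args):
        return PYTHON_DRIVER_CLASS
    if any(a == "--sql-file" or a.endswith(".sql") for a in args):
        return SQL_DRIVER_CLASS
    return None  # fall back to the jar's own Main-Class

print(select_main_class(["--sql-file", "job.sql"]))
```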
>>>>>>> Best,
>>>>>>> Weihua
>>>>>>>
>>>>>>> On Mon, May 29, 2023 at 7:24 PM Paul Lam <paullin3...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi team,
>>>>>>>
>>>>>>> I'd like to start a discussion about FLIP-316 [1], which introduces a
>>>>>>> SQL driver as the default main class for Flink SQL jobs.
>>>>>>>
>>>>>>> Currently, Flink SQL could be executed out of the box either via SQL
>>>>>>> Client/Gateway or embedded in a Flink Java/Python program. However,
>>>>>>> each one has its drawback:
>>>>>>>
>>>>>>> - SQL Client/Gateway doesn't support the application deployment
>>>>>>> mode [2]
>>>>>>> - A Flink Java/Python program requires extra work to write a non-SQL
>>>>>>> program
>>>>>>>
>>>>>>> Therefore, I propose adding a SQL driver to act as the default main
>>>>>>> class for SQL jobs. Please see the FLIP docs for details and feel free
>>>>>>> to comment. Thanks!
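[Editor's note] A minimal driver along these lines might look like the sketch below. This is Python for shape only: the real driver would be Java and hand statements to a TableEnvironment, and Flink's statement splitting is far more robust than a naive split on `;`:

```python
def run_sql_file(path: str, execute) -> int:
    # Naive sketch of the driver's job: read the script, split it into
    # statements, and hand each non-empty one to the executor callback
    # (in Flink, roughly TableEnvironment.executeSql).
    with open(path) as f:
        script = f.read()
    count = 0
    for stmt in (s.strip() for s in script.split(";")):
        if stmt:
            execute(stmt)
            count += 1
    return count
```

Note the naive split breaks on semicolons inside string literals or comments, which is exactly why a production driver needs a proper SQL parser for this step.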
>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-316%3A+Introduce+SQL+Driver
>>>>>>> [2] https://issues.apache.org/jira/browse/FLINK-26541
>>>>>>>
>>>>>>> Best,
>>>>>>> Paul Lam