Hi Jark,

Thanks a lot for your input!
> If we decide to submit ExecNodeGraph instead of SQL file, is it still
> necessary to support SQL Driver?

I think so. Apart from its usage in SQL Gateway, SQL Driver could simplify Flink SQL
execution with the Flink CLI. As the FLIP says, it's good to have a default main class
for Flink SQL, which allows users to submit Flink SQLs in the same way as DataStream
jobs; otherwise users need to write their own main class.
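For reference, below is roughly the boilerplate main class a user has to write today to
run plain SQL in application mode (a minimal sketch; the class name, tables, and
connectors are made up for illustration, only TableEnvironment#executeSql is the real API):

```
// A minimal sketch of the main class a SQL user has to write today.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MySqlJob {
    public static void main(String[] args) {
        TableEnvironment tableEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // DDLs to register the source and sink tables (placeholder connectors).
        tableEnv.executeSql(
                "CREATE TABLE orders (id BIGINT, amount DOUBLE) WITH ('connector' = 'datagen')");
        tableEnv.executeSql(
                "CREATE TABLE orders_sink (id BIGINT, amount DOUBLE) WITH ('connector' = 'blackhole')");

        // The DML that actually becomes the Flink job.
        tableEnv.executeSql("INSERT INTO orders_sink SELECT id, amount FROM orders");
    }
}
```

A default SQL Driver would essentially ship this kind of boilerplate with Flink, so that
users only provide the SQL itself.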
> SQL Driver needs to serialize SessionState which is very challenging
> but not covered in detail in the FLIP.

With the help of ExecNodeGraph, do we still need the serialized SessionState?
If not, we could make SQL Driver accept two serialized formats:

- SQL files for user-facing public usage
- ExecNodeGraph for internal usage

It's kind of similar to the relationship between job JARs and JobGraphs.
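To make the ExecNodeGraph option more concrete, here is a rough sketch of how the work
could be split between SQL Gateway and SQL Driver using the existing CompiledPlan API
(FLIP-190). The class, method names, and file path are placeholders; whether SQL Driver
would use exactly these calls is still open:

```
// A rough sketch only: compilation stays on the gateway side, the driver
// merely executes a serialized plan.
import org.apache.flink.table.api.CompiledPlan;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

public class CompiledPlanSketch {

    // Gateway side: validate and optimize the DML against the session's
    // catalog objects, then serialize the plan to a JSON file.
    static void compileOnGateway(TableEnvironment tableEnv, String dml, String planFile) {
        CompiledPlan plan = tableEnv.compilePlanSql(dml);
        plan.writeToFile(planFile);
    }

    // Driver side: load the serialized plan and execute it without
    // re-planning, so no SessionState is needed on the cluster side.
    static void executeOnDriver(TableEnvironment tableEnv, String planFile) {
        tableEnv.loadPlan(PlanReference.fromFile(planFile)).execute();
    }
}
```

Since the plan would be compiled and validated on the gateway side, the driver would not
need the SessionState at all, which matches the two-format idea above.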
> Regarding "K8S doesn't support shipping multiple jars", is that true? Is it
> possible to support it?

Yes, K8s doesn't distribute any files. It's the users' responsibility to make sure the
resources are accessible in the containers. The common solutions I know of are to use
distributed file systems or to use init containers to localize the resources.

Now I lean toward introducing a file system to do the distribution job, as sketched below.
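For illustration only, shipping the gateway-generated artifacts (SQL files, compiled
plans, or UDF JARs) through Flink's FileSystem abstraction could look roughly like this;
the class, method names, and the remote URI are placeholders:

```
// Illustration only: shipping a gateway-generated artifact through Flink's
// FileSystem abstraction. The remote URI and file names are placeholders.
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

public class ResourceShippingSketch {

    // Gateway side: upload a locally generated file (e.g. a compiled plan)
    // to a Flink-supported file system such as s3:// or hdfs://.
    static void upload(String localFile, String remoteUri) throws Exception {
        copy(FileSystem.getLocalFileSystem(), new Path(localFile),
                FileSystem.get(new URI(remoteUri)), new Path(remoteUri));
    }

    // Driver side: localize the file in the container before running the job.
    static void download(String remoteUri, String localFile) throws Exception {
        copy(FileSystem.get(new URI(remoteUri)), new Path(remoteUri),
                FileSystem.getLocalFileSystem(), new Path(localFile));
    }

    private static void copy(FileSystem srcFs, Path src, FileSystem dstFs, Path dst)
            throws Exception {
        try (InputStream in = srcFs.open(src);
                OutputStream out = dstFs.create(dst, FileSystem.WriteMode.OVERWRITE)) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}
```

With something like this, users would only need to configure a Flink-supported file
system (e.g. s3:// or hdfs://), and only when SQL Gateway is used with K8s application
mode.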
WDYT?

Best,
Paul Lam

> On Jun 1, 2023, at 20:33, Jark Wu <imj...@gmail.com> wrote:
>
> Hi Paul,
>
> Thanks for starting this discussion. I like the proposal! This is a
> frequently requested feature!
>
> I agree with Shengkai that ExecNodeGraph as the submission object is a
> better idea than a SQL file. To be more specific, it should be JsonPlanGraph
> or CompiledPlan, which is the serializable representation. CompiledPlan is a
> clear separation between compiling/optimization/validation and execution.
> This can keep the validation and metadata access still on the SQLGateway
> side. This allows SQLGateway to leverage some metadata caching and UDF JAR
> caching for better compiling performance.
>
> If we decide to submit ExecNodeGraph instead of SQL file, is it still
> necessary to support SQL Driver? Regarding non-interactive SQL jobs, users
> can use the Table API program for application mode. SQL Driver needs to
> serialize SessionState, which is very challenging but not covered in detail
> in the FLIP.
>
> Regarding "K8S doesn't support shipping multiple jars", is that true? Is it
> possible to support it?
>
> Best,
> Jark
>
> On Thu, 1 Jun 2023 at 16:58, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi Weihua,
>>
>> You're right. Distributing the SQLs to the TMs is one of the challenging
>> parts of this FLIP.
>>
>> Web submission is not enabled in application mode currently, as you said,
>> but that could be changed if we have good reasons.
>>
>> What do you think about introducing a distributed storage for SQL Gateway?
>>
>> We could make use of Flink file systems [1] to distribute the SQL Gateway
>> generated resources; that should solve the problem at its root.
>>
>> Users could specify Flink-supported file systems to ship files. It's only
>> required when using SQL Gateway with K8s application mode.
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/overview/
>>
>> Best,
>> Paul Lam
>>
>>> On Jun 1, 2023, at 13:55, Weihua Hu <huweihua....@gmail.com> wrote:
>>>
>>> Thanks Paul for your reply.
>>>
>>> SQLDriver looks good to me.
>>>
>>>> 2. Do you mean to pass the SQL string as a configuration or a program
>>>> argument?
>>>
>>> I brought this up because we were unable to pass the SQL file to Flink
>>> using Kubernetes mode.
>>> For DataStream/Python users, they need to prepare their images for the
>>> jars and dependencies.
>>> But for SQL users, they can use a common image to run different SQL
>>> queries if there are no other UDF requirements.
>>> It would be great if the SQL query and image were not bound.
>>>
>>> Using strings is a way to decouple these, but just as you mentioned, it's
>>> not easy to pass complex SQL.
>>>
>>>> use web submission
>>>
>>> AFAIK, we can not use web submission in the Application mode. Please
>>> correct me if I'm wrong.
>>>
>>> Best,
>>> Weihua
>>>
>>> On Wed, May 31, 2023 at 9:37 PM Paul Lam <paullin3...@gmail.com> wrote:
>>>
>>>> Hi Biao,
>>>>
>>>> Thanks for your comments!
>>>>
>>>>> 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs
>>>>> in Application mode? More specifically, if we use SQL client/gateway to
>>>>> execute some interactive SQLs like a SELECT query, can we ask Flink to
>>>>> use Application mode to execute those queries after this FLIP?
>>>>
>>>> Thanks for pointing it out. I think only DMLs would be executed via SQL
>>>> Driver. I'll add the scope to the FLIP.
>>>>
>>>>> 2. Deployment: I believe in YARN mode, the implementation is trivial as
>>>>> we can ship files via YARN's tool easily, but for K8s, things can be
>>>>> more complicated as Shengkai said.
>>>>
>>>> Your input is very informative. I'm thinking about using web submission,
>>>> but it requires exposing the JobManager port, which could also be a
>>>> problem on K8s.
>>>>
>>>> Another approach is to explicitly require a distributed storage to ship
>>>> files, but we may need a new deployment executor for that.
>>>>
>>>> What do you think of these two approaches?
>>>>
>>>>> 3. Serialization of SessionState: in SessionState, there are some
>>>>> unserializable fields
>>>>> like org.apache.flink.table.resource.ResourceManager#userClassLoader.
>>>>> It may be worthwhile to add more details about the serialization part.
>>>>
>>>> I agree. That's a missing part. But if we use ExecNodeGraph as Shengkai
>>>> mentioned, do we eliminate the need for serialization of SessionState?
>>>>
>>>> Best,
>>>> Paul Lam
>>>>
>>>>> On May 31, 2023, at 13:07, Biao Geng <biaoge...@gmail.com> wrote:
>>>>>
>>>>> Thanks Paul for the proposal! I believe it would be very useful for
>>>>> Flink users.
>>>>> After reading the FLIP, I have some questions:
>>>>> 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs
>>>>> in Application mode? More specifically, if we use SQL client/gateway to
>>>>> execute some interactive SQLs like a SELECT query, can we ask Flink to
>>>>> use Application mode to execute those queries after this FLIP?
>>>>> 2. Deployment: I believe in YARN mode, the implementation is trivial as
>>>>> we can ship files via YARN's tool easily, but for K8s, things can be
>>>>> more complicated as Shengkai said. I have implemented a simple POC
>>>>> <https://github.com/bgeng777/flink/commit/5b4338fe52ec343326927f0fc12f015dd22b1133>
>>>>> based on SQL client before (i.e. consider the SQL client which supports
>>>>> executing a SQL file as the SQL driver in this FLIP). One problem I have
>>>>> met is how we ship SQL files (or JobGraphs) to the K8s side. Without
>>>>> such support, users have to modify the initContainer or rebuild a new
>>>>> K8s image every time to fetch the SQL file. Like the Flink K8s operator,
>>>>> one workaround is to utilize the Flink config (transforming the SQL file
>>>>> to an escaped string like Weihua mentioned), which will be converted to
>>>>> a ConfigMap, but K8s has a size limit for ConfigMaps (no larger than 1MB
>>>>> <https://kubernetes.io/docs/concepts/configuration/configmap/>). Not
>>>>> sure if we have better solutions.
>>>>> 3. Serialization of SessionState: in SessionState, there are some
>>>>> unserializable fields
>>>>> like org.apache.flink.table.resource.ResourceManager#userClassLoader.
>>>>> It may be worthwhile to add more details about the serialization part.
>>>>>
>>>>> Best,
>>>>> Biao Geng
>>>>>
>>>>> On Wed, May 31, 2023 at 11:49, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>
>>>>>> Hi Weihua,
>>>>>>
>>>>>> Thanks a lot for your input! Please see my comments inline.
>>>>>>
>>>>>>> - Is SQLRunner the better name? We use this to run a SQL Job. (Not
>>>>>>> strong, the SQLDriver is fine for me)
>>>>>>
>>>>>> I've thought about SQL Runner but picked SQL Driver for the following
>>>>>> reasons FYI:
>>>>>>
>>>>>> 1. We have a PythonDriver doing the same job for PyFlink [1].
>>>>>> 2. A Flink program's main class is sort of like a Driver in JDBC,
>>>>>> which translates SQLs into database-specific languages.
>>>>>>
>>>>>> In general, I'm +1 for SQL Driver and +0 for SQL Runner.
>>>>>>
>>>>>>> - Could we run SQL jobs using SQL in strings? Otherwise, we need to
>>>>>>> prepare a SQL file in an image for Kubernetes application mode, which
>>>>>>> may be a bit cumbersome.
>>>>>>
>>>>>> Do you mean to pass the SQL string as a configuration or a program
>>>>>> argument?
>>>>>>
>>>>>> I thought it might be convenient for testing purposes, but not
>>>>>> recommended for production, because Flink SQLs could be complicated
>>>>>> and involve lots of characters that need to be escaped.
>>>>>>
>>>>>> WDYT?
>>>>>>
>>>>>>> - I noticed that we don't specify the SQLDriver jar in the
>>>>>>> "run-application" command. Does that mean we need to perform
>>>>>>> automatic detection in Flink?
>>>>>>
>>>>>> Yes! It's like running a PyFlink job with the following command:
>>>>>>
>>>>>> ```
>>>>>> ./bin/flink run \
>>>>>>     --pyModule table.word_count \
>>>>>>     --pyFiles examples/python/table
>>>>>> ```
>>>>>>
>>>>>> The CLI determines whether it's a SQL job, and if so, applies the SQL
>>>>>> Driver automatically.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonDriver.java
>>>>>>
>>>>>> Best,
>>>>>> Paul Lam
>>>>>>
>>>>>>> On May 30, 2023, at 21:56, Weihua Hu <huweihua....@gmail.com> wrote:
>>>>>>>
>>>>>>> Thanks Paul for the proposal.
>>>>>>>
>>>>>>> +1 for this. It is valuable in improving ease of use.
>>>>>>>
>>>>>>> I have a few questions.
>>>>>>> - Is SQLRunner the better name? We use this to run a SQL Job. (Not
>>>>>>> strong, the SQLDriver is fine for me)
>>>>>>> - Could we run SQL jobs using SQL in strings? Otherwise, we need to
>>>>>>> prepare a SQL file in an image for Kubernetes application mode, which
>>>>>>> may be a bit cumbersome.
>>>>>>> - I noticed that we don't specify the SQLDriver jar in the
>>>>>>> "run-application" command. Does that mean we need to perform
>>>>>>> automatic detection in Flink?
>>>>>>>
>>>>>>> Best,
>>>>>>> Weihua
>>>>>>>
>>>>>>> On Mon, May 29, 2023 at 7:24 PM Paul Lam <paullin3...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi team,
>>>>>>>>
>>>>>>>> I'd like to start a discussion about FLIP-316 [1], which introduces
>>>>>>>> a SQL driver as the default main class for Flink SQL jobs.
>>>>>>>>
>>>>>>>> Currently, Flink SQL could be executed out of the box either via SQL
>>>>>>>> Client/Gateway or embedded in a Flink Java/Python program.
>>>>>>>>
>>>>>>>> However, each one has its drawback:
>>>>>>>>
>>>>>>>> - SQL Client/Gateway doesn't support the application deployment mode [2]
>>>>>>>> - A Flink Java/Python program requires extra work to write a non-SQL program
>>>>>>>>
>>>>>>>> Therefore, I propose adding a SQL driver to act as the default main
>>>>>>>> class for SQL jobs.
>>>>>>>> Please see the FLIP docs for details and feel free to comment. Thanks!
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-316%3A+Introduce+SQL+Driver
>>>>>>>> [2] https://issues.apache.org/jira/browse/FLINK-26541
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Paul Lam