Re: [DISCUSS]FLIP-163: SQL Client Improvements

Rui Li Sun, 31 Jan 2021 22:46:41 -0800

Thanks Shengkai for the update! The proposed changes look good to me.

On Fri, Jan 29, 2021 at 8:26 PM Shengkai Fang <[email protected]> wrote:


> Hi, Rui.
> You are right. I have already modified the FLIP.
>
> The main changes:
>
> # The -f parameter places no restriction on the statement type.
> Sometimes users pipe or redirect query results for debugging when
> submitting a job with the -f parameter. This is much more convenient than
> writing INSERT INTO statements.
>
> # Add a new SQL client option `sql-client.job.detach`.
> Users often prefer to execute jobs one by one in batch mode. Users can set
> this option to false, and the client will not process the next job until
> the current job finishes. The default value of this option is true, which
> means the client will execute the next job as soon as the current job is
> submitted.
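
The two behaviours of the proposed `sql-client.job.detach` option can be sketched in plain Python. This is only an illustration under assumptions: it does not use Flink, and `submit_job`, `run_script`, and the `detach` argument are hypothetical stand-ins rather than anything from the FLIP.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def submit_job(name, log, duration=0.05):
    """Hypothetical stand-in for a job: records when it starts and finishes."""
    log.append(f"start {name}")
    time.sleep(duration)
    log.append(f"finish {name}")


def run_script(jobs, detach):
    """Submit jobs in order; when detach is False, wait for each one to finish."""
    log = []
    # max(1, ...) because the executor rejects a worker count of zero.
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = []
        for name in jobs:
            future = pool.submit(submit_job, name, log)
            if not detach:
                future.result()  # block until the current job finishes
            futures.append(future)
        # Wait for any detached jobs so the returned log is complete.
        for future in futures:
            future.result()
    return log
```

With `detach=False`, `run_script(["a", "b"], detach=False)` records job `a` finishing before job `b` starts. With `detach=True` the jobs may overlap, so the order of log entries is not fixed.
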
>
> Best,
> Shengkai
>
>
>
> Rui Li <[email protected]> wrote on Fri, Jan 29, 2021 at 4:52 PM:
>
>> Hi Shengkai,
>>
>> Regarding #2, maybe the -f options in flink and hive have different
>> implications, and we should clarify the behavior. For example, if the
>> client just submits the job and exits, what happens if the file contains
>> two INSERT statements? I don't think we should treat them as a statement
>> set, because users should explicitly write BEGIN STATEMENT SET in that
>> case. And the client shouldn't asynchronously submit the two jobs, because
>> the 2nd may depend on the 1st, right?
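
The grouping rule described above (each INSERT is its own job unless the user explicitly writes BEGIN STATEMENT SET) can be sketched as follows. This is a simplified illustration, not the SQL Client's parser: `plan_jobs` is a hypothetical helper, and the naive split on `;` would break on semicolons inside string literals.

```python
def plan_jobs(script):
    """Group statements into jobs.

    Each statement becomes its own job unless it sits between an explicit
    BEGIN STATEMENT SET and END, in which case the enclosed statements
    form a single job.
    """
    # Naive split: assumes no semicolons inside string literals or comments.
    statements = [s.strip() for s in script.split(";") if s.strip()]
    jobs = []
    current_set = None  # None means we are outside a statement set
    for stmt in statements:
        upper = stmt.upper()
        if upper == "BEGIN STATEMENT SET":
            current_set = []
        elif upper == "END" and current_set is not None:
            jobs.append(current_set)
            current_set = None
        elif current_set is not None:
            current_set.append(stmt)
        else:
            jobs.append([stmt])
    return jobs
```

Under this sketch, a file with two bare INSERT statements yields two separate jobs, to be run in file order, while the same two statements wrapped in BEGIN STATEMENT SET ... END yield one job.
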
>>
>> On Fri, Jan 29, 2021 at 4:30 PM Shengkai Fang <[email protected]> wrote:
>>
>>> Hi Rui,
>>> Thanks for your feedback. I agree with your suggestions.
>>>
>>> For suggestion 1: Yes, we plan to strengthen the SET command. In the
>>> implementation, it will just put the key-value pair into the
>>> `Configuration`, which will be used to generate the table config. If Hive
>>> supports reading settings from the table config, users will be able to
>>> set Hive-related settings.
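
As a rough sketch of "put the key-value pair into the configuration", the snippet below parses a `SET key=value` statement into a plain dictionary. It is an assumption-laden illustration: `apply_set` and `SET_PATTERN` are hypothetical names, a `dict` stands in for Flink's `Configuration`, and the Hive key in the usage note is only an example value.

```python
import re

# Matches "SET key = value" with an optional trailing semicolon (case-insensitive).
SET_PATTERN = re.compile(r"^\s*SET\s+([^=\s]+)\s*=\s*(.+?)\s*;?\s*$", re.IGNORECASE)


def apply_set(config, statement):
    """Store the key-value pair of a SET statement in config and return config."""
    match = SET_PATTERN.match(statement)
    if match is None:
        raise ValueError(f"not a SET statement: {statement!r}")
    key, value = match.groups()
    # No key whitelist: arbitrary keys (e.g. connector-specific ones) are accepted.
    config[key] = value
    return config
```

For example, `apply_set({}, "SET hive.exec.dynamic.partition = true;")` stores the string `"true"` under the key `hive.exec.dynamic.partition`. Whether a connector actually reads that key is up to the connector.
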
>>>
>>> For suggestion 2: The -f parameter will submit the job and exit. If the
>>> queries never end, users have to cancel the jobs themselves, which is
>>> not reliable (people may forget their jobs). In most cases, queries are
>>> used to analyze data. Users should run queries in interactive mode.
>>>
>>> Best,
>>> Shengkai
>>>
>>> Rui Li <[email protected]> wrote on Fri, Jan 29, 2021 at 3:18 PM:
>>>
>>>> Thanks Shengkai for bringing up this discussion. I think it covers a
>>>> lot of useful features which will dramatically improve the usability of our
>>>> SQL Client. I have two questions regarding the FLIP.
>>>>
>>>> 1. Do you think we can let users set arbitrary configurations via the
>>>> SET command? A connector may have its own configurations and we don't have
>>>> a way to dynamically change such configurations in SQL Client. For example,
>>>> users may want to be able to change hive conf when using hive connector [1].
>>>> 2. Any reason why we have to forbid queries in SQL files specified with
>>>> the -f option? Hive supports a similar -f option but allows queries in the
>>>> file. And a common use case is to run some query and redirect the results
>>>> to a file. So I think maybe flink users would like to do the same,
>>>> especially in batch scenarios.
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-20590
>>>>
>>>> On Fri, Jan 29, 2021 at 10:46 AM Sebastian Liu <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Shengkai,
>>>>>
>>>>> Glad to see this improvement. I have some additional suggestions:
>>>>>
>>>>> #1. Unify the TableEnvironment in ExecutionContext to
>>>>> StreamTableEnvironment for both streaming and batch SQL.
>>>>> #2. Improve the way results are retrieved: at present the SQL client
>>>>> collects the results locally all at once using accumulators, which may
>>>>> cause memory issues in the JM or locally for a big query result.
>>>>> Accumulators are only suitable for testing purposes. We may change to
>>>>> use SelectTableSink, which is based on CollectSinkOperatorCoordinator.
>>>>> #3. Do we need to consider the Flink SQL gateway, which is in FLIP-91 [1]?
>>>>> It seems that this FLIP has not moved forward for a long time. Providing
>>>>> a long-running service out of the box to facilitate SQL submission is
>>>>> necessary.
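
The memory concern in #2 comes down to holding the full result at once versus fetching it in bounded pieces. The sketch below shows only that general idea in plain Python. It is not how SelectTableSink or CollectSinkOperatorCoordinator is implemented, and `fetch_in_batches` is a hypothetical helper.

```python
from itertools import islice


def fetch_in_batches(rows, batch_size):
    """Yield lists of at most batch_size rows.

    Only one batch is materialized at a time, unlike list(rows), which
    holds the entire result in memory.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

A client could display or write out each batch and then drop it, so peak memory depends on `batch_size` rather than on the size of the query result.
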
>>>>>
>>>>> What do you think of these?
>>>>>
>>>>> [1]
>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
>>>>>
>>>>>
>>>>> Shengkai Fang <[email protected]> wrote on Thu, Jan 28, 2021 at 8:54 PM:
>>>>>
>>>>> > Hi devs,
>>>>> >
>>>>> > Jark and I want to start a discussion about FLIP-163: SQL Client
>>>>> > Improvements.
>>>>> >
>>>>> > Many users have complained about problems with the SQL client. For
>>>>> > example, users cannot register the table proposed by FLIP-95.
>>>>> >
>>>>> > The main changes in this FLIP:
>>>>> >
>>>>> > - use the -i parameter to specify the SQL file that initializes the
>>>>> > table environment, and deprecate the YAML file;
>>>>> > - add -f to submit a SQL file, and deprecate the '-u' parameter;
>>>>> > - add more interactive commands, e.g. ADD JAR;
>>>>> > - support statement set syntax.
>>>>> >
>>>>> >
>>>>> > For more detailed changes, please refer to FLIP-163[1].
>>>>> >
>>>>> > Looking forward to your feedback.
>>>>> >
>>>>> >
>>>>> > Best,
>>>>> > Shengkai
>>>>> >
>>>>> > [1]
>>>>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-163%3A+SQL+Client+Improvements
>>>>> >
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> *With kind regards
>>>>> ------------------------------------------------------------
>>>>> Sebastian Liu 刘洋
>>>>> Institute of Computing Technology, Chinese Academy of Science
>>>>> Mobile\WeChat: +86—15201613655
>>>>> E-mail: [email protected] <[email protected]>
>>>>> QQ: 3239559*
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards!
>>>> Rui Li
>>>>
>>>
>>
>> --
>> Best regards!
>> Rui Li
>>
>

-- 
Best regards!
Rui Li
