Hi Jark,

WRT “DROP QUERY”, I agree that it’s not very intuitive, which is part of the reason why I proposed “STOP/CANCEL QUERY” in the first place. The downside is that it’s not ANSI-SQL compatible.
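To make the discussion concrete, the strawman syntax I have in mind is roughly the following (illustrative only, not the FLIP’s final grammar; the quoted query id is a placeholder):

```sql
-- Illustrative strawman, not final syntax:
STOP QUERY '<query_id>';    -- graceful: stop the job, taking a savepoint
CANCEL QUERY '<query_id>';  -- ungraceful: cancel the job without a savepoint
```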
Another question is: what should be the syntax for ungracefully canceling a query? As ShengKai pointed out in an offline discussion, “STOP QUERY” and “CANCEL QUERY” might confuse SQL users. The Flink CLI has both stop and cancel, mostly for historical reasons.

WRT “SHOW SAVEPOINT”, I agree it’s a missing piece. My concern is that savepoints are owned by users and outlive the lifecycle of a Flink cluster. For example, a user might take a savepoint at a custom path that differs from the default savepoint path; I don’t think the JobManager would remember that, not to mention that the JobManager may be a brand-new one after a cluster restart. Thus, if we support “SHOW SAVEPOINT”, it would probably be best-effort only.

WRT the savepoint syntax, I’m thinking about the semantics of savepoints. In the database world, savepoints are aliases for nested transactions [1], with global transactions as the corresponding outer scope. If we consider Flink jobs as global transactions and Flink checkpoints as nested transactions, then the savepoint semantics are close, so I think the SQL-standard savepoint syntax could be considered. But again, I don’t have a very strong preference. Ping @Timo to get more inputs.

[1] https://en.wikipedia.org/wiki/Nested_transaction

Best,
Paul Lam

> On 18 May 2022, at 17:48, Jark Wu <imj...@gmail.com> wrote:
>
> Hi Paul,
>
> 1) SHOW QUERIES
> +1 to add the finished time, but it would be better to call it "end_time" to keep it aligned with the names in the Web UI.
>
> 2) DROP QUERY
> I think we shouldn't throw exceptions for batch jobs; otherwise, how would users stop batch queries?
> At present, I don't think "DROP" is a suitable keyword for this statement. From the perspective of users, "DROP" sounds like the query should be removed from the list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more suitable and compliant with the commands of the Flink CLI.
>
> 3) SHOW SAVEPOINTS
> I think this statement is needed; otherwise, savepoints are lost after the SAVEPOINT command is executed. Savepoints can be retrieved from the REST API "/jobs/:jobid/checkpoints" with the filter "checkpoint_type"="savepoint". It's also worth considering providing "SHOW CHECKPOINTS" to list all checkpoints.
>
> 4) SAVEPOINT & RELEASE SAVEPOINT
> I'm a little concerned about the SAVEPOINT and RELEASE SAVEPOINT statements now.
> In other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both the same savepoint id. However, in our syntax, the first one is a query id and the second one is a savepoint path, which is confusing and inconsistent. When I came across SHOW SAVEPOINT, I thought maybe they should be in the same syntax set. For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT <sp_path>.
> That means we wouldn't follow the majority of vendors for the SAVEPOINT commands. I would say the purpose is different in Flink.
> What are others' opinions on this?
>
> Best,
> Jark
>
> [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>
>
> On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi Godfrey,
>>
>> Thanks a lot for your inputs!
>>
>> 'SHOW QUERIES' lists all jobs in the cluster, with no limit on APIs (DataStream or SQL) or clients (SQL client or CLI). Under the hood, it’s based on ClusterClient#listJobs, the same as the Flink CLI. I think it’s okay to have non-SQL jobs listed in the SQL client, because these jobs can be managed via the SQL client too.
>>
>> WRT the finished time, I think you’re right. I’m adding it to the FLIP. But I’m a bit afraid that the rows would be too long.
>>
>> WRT ‘DROP QUERY’:
>>> What's the behavior for batch jobs and non-running jobs?
>>
>> In general, the behavior would be aligned with the Flink CLI.
>> Triggering a savepoint for a non-running job would cause errors, and the error message would be printed to the SQL client. Triggering a savepoint for batch (bounded) jobs in streaming execution mode would be the same as for streaming jobs. However, for batch jobs in batch execution mode, I think there would be an error, because batch execution doesn’t currently support checkpoints (please correct me if I’m wrong).
>>
>> WRT ‘SHOW SAVEPOINTS’, I’ve thought about it, but Flink’s ClusterClient/JobClient don’t have such functionality at the moment, and neither does the Flink CLI. Maybe we could make it a follow-up FLIP, which would include the modifications to ClusterClient/JobClient and the Flink CLI. WDYT?
>>
>> Best,
>> Paul Lam
>>
>>> On 17 May 2022, at 20:34, godfrey he <godfre...@gmail.com> wrote:
>>>
>>> Godfrey
>>
>>
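P.S. On Jark's point 3: if we go the REST route, the client side of "SHOW SAVEPOINTS" could be as simple as filtering the checkpoints payload. A minimal sketch, assuming the response carries a "history" list whose entries have a "checkpoint_type" field (field name taken from Jark's mail; the exact payload shape should be verified against the REST docs for the target Flink version):

```python
# Sketch: client-side filtering of savepoints out of the checkpoints payload.
# The "history" / "checkpoint_type" field names are assumptions based on the
# discussion above, not a verified schema.

def list_savepoints(checkpoints_payload: dict) -> list:
    """Return history entries whose checkpoint_type marks them as savepoints."""
    history = checkpoints_payload.get("history", [])
    return [c for c in history if c.get("checkpoint_type", "").lower() == "savepoint"]

# Example with a stubbed payload (not a live REST call):
payload = {
    "history": [
        {"id": 1, "checkpoint_type": "CHECKPOINT", "external_path": None},
        {"id": 2, "checkpoint_type": "SAVEPOINT", "external_path": "file:/tmp/sp-2"},
    ]
}
print([c["id"] for c in list_savepoints(payload)])  # → [2]
```

Of course this only covers savepoints the current JobManager knows about, which is exactly the best-effort caveat I mentioned above.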