Hi Paul,

That's a fair point, but I still think we should not offer that capability via the CLI either. But that's a different discussion :)
Thanks,
Martijn

On Thu, 9 Jun 2022 at 10:08, Paul Lam <paullin3...@gmail.com> wrote:

> Hi Martijn,
>
> I think the `DROP SAVEPOINT` statement would not conflict with NO_CLAIM
> mode, since the statement is triggered by users instead of the Flink runtime.
>
> We're simply providing a tool for users to clean up savepoints, just
> like `bin/flink savepoint -d :savepointPath` in the Flink CLI [1].
>
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#disposing-savepoints
>
> Best,
> Paul Lam
>
> On 9 Jun 2022 at 15:41, Martijn Visser <martijnvis...@apache.org> wrote:
>
> Hi all,
>
> I would not include a DROP SAVEPOINT syntax. With the recently introduced
> CLAIM/NO_CLAIM modes, I would argue that we've just clarified snapshot
> ownership, and if you have a savepoint established "with NO_CLAIM it creates
> its own copy and leaves the existing one up to the user." [1] We shouldn't
> then again make it fuzzy by making it possible for Flink to remove
> snapshots.
>
> Best regards,
>
> Martijn
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
>
> On Thu, 9 Jun 2022 at 09:27, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi team,
>>
>> It's great to see our opinions are finally converging!
>>
>> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
>>
>> LGTM. Adding it to the FLIP.
>>
>> To Jark,
>>
>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>"
>>
>> Good point. The default savepoint dir should be enough for most cases.
>>
>> To Jing,
>>
>> DROP SAVEPOINT ALL
>>
>> I think it's valid to have such a statement, but I have two concerns:
>>
>> - `ALL` is already an SQL keyword, thus it may cause ambiguity.
>> - The Flink CLI and REST API don't provide the corresponding
>> functionalities, and we'd better keep them aligned.
>>
>> How about making this statement a follow-up task, which would touch the
>> REST API and the Flink CLI?
>>
>> Best,
>> Paul Lam
>>
>> On 9 Jun 2022 at 11:53, godfrey he <godfre...@gmail.com> wrote:
>>
>> Hi all,
>>
>> Regarding `PIPELINE`, it comes from the flink-core module; see the
>> `PipelineOptions` class for more details.
>> `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with
>> `JOBS`.
>>
>> +1 to discuss JOBTREE in another FLIP.
>>
>> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
>>
>> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
>>
>> Best,
>> Godfrey
>>
>> Jing Ge <j...@ververica.com> wrote on Thu, 9 Jun 2022 at 01:48:
>>
>> Hi Paul, Hi Jark,
>>
>> Re JOBTREE: agreed that it is out of the scope of this FLIP.
>>
>> Re `RELEASE SAVEPOINT ALL`: if the community prefers `DROP`, then `DROP
>> SAVEPOINT ALL` for housekeeping. WDYT?
>>
>> Best regards,
>> Jing
>>
>> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Jing,
>>
>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of
>> the scope of this FLIP and can be discussed in another FLIP.
>>
>> Job lineage is a big topic that may involve many problems:
>> 1) How to collect and report job entities, attributes, and lineages?
>> 2) How to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>> 3) How does the Flink SQL CLI/Gateway know the lineage information and
>> show the jobtree?
>> 4) ...
>>
>> Best,
>> Jark
>>
>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> I'm fine with using JOBS. The only concern is that this may conflict with
>> displaying more detailed information for queries (e.g. query content, plan)
>> in the future, e.g. SHOW QUERIES EXTENDED in ksqlDB [1].
>> This is not a big problem, as we can introduce SHOW QUERIES in the future
>> if necessary.
>>
>> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>> `table.job.stop-with-drain`)
>>
>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]?
>> It might be trivial and error-prone to set configuration before executing
>> a statement, and the configuration will affect all statements after that.
>>
>> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>
>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>> and always use the configuration "state.savepoints.dir" as the default
>> savepoint dir.
>> The concern with using "<savepoint_path>" is that this should be the
>> savepoint dir, while the savepoint path is the returned value.
>>
>> I'm fine with the other changes.
>>
>> Thanks,
>> Jark
>>
>> [1]: https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>
>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Jing,
>>
>> Thank you for your inputs!
>>
>> TBH, I haven't considered the ETL scenario that you mentioned. I think
>> those jobs are managed just like other jobs in terms of job lifecycles
>> (please correct me if I'm wrong).
>>
>> WRT the SQL statements about SQL lineage, I think that might be a
>> little bit out of the scope of the FLIP, since it's mainly about
>> lifecycles. By the way, do we have these functionalities in the Flink CLI
>> or REST API already?
>>
>> WRT `RELEASE SAVEPOINT ALL`, I'm sorry for the outdated FLIP docs; the
>> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I'm
>> updating the FLIP according to the latest discussions.
>>
>> Best,
>> Paul Lam
>>
>> On 8 Jun 2022 at 07:31, Jing Ge <j...@ververica.com> wrote:
>>
>> Hi Paul,
>>
>> Sorry that I am a little bit too late to join this thread. Thanks for
>> driving this and starting this informative discussion. The FLIP looks
>> really interesting. It will help us a lot to manage Flink SQL jobs.
>>
>> Have you considered the ETL scenario with Flink SQL, where multiple SQL
>> statements build a DAG, and there may be many such DAGs?
>>
>> 1)
>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to
>> support ETL jobs.
>> Briefly speaking, the SQL statements used to build the DAG are
>> responsible for *producing* data as the result (cube, materialized view,
>> etc.) for future consumption by queries. The INSERT INTO SELECT FROM
>> example in the FLIP and CTAS are typical SQL statements in this case.
>> I would prefer to call them jobs instead of queries.
>>
>> 2)
>> Speaking of the ETL DAG, we might want to see the lineage. Is it possible
>> to support syntax like:
>>
>> SHOW JOBTREE <job_id> // shows the downstream DAG from the given job_id
>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
>> SHOW JOBTREES // shows all DAGs
>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>
>> 3)
>> Could we also support savepoint housekeeping syntax? We ran into the
>> issue that a lot of savepoints had been created by customers (via their
>> apps). It takes extra (hacking) effort to clean them up.
>>
>> RELEASE SAVEPOINT ALL
>>
>> Best regards,
>> Jing
>>
>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org> wrote:
>>
>> Hi Paul,
>>
>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>> could imply that this will actually show the query, but we're returning
>> IDs of the running applications. At first I was also not very much in
>> favour of SHOW JOBS, since I prefer calling them 'Flink applications' and
>> not 'Flink jobs', but the glossary [1] made me reconsider. I would +1
>> SHOW/STOP JOBS.
>>
>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>
>> Best regards,
>>
>> Martijn
>>
>> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>
>> On Sat, 4 Jun 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Godfrey,
>>
>> Sorry for the late reply, I was on vacation.
>>
>> It looks like we have a variety of preferences on the syntax; how about
>> we choose the most acceptable one?
>>
>> WRT the keyword for SQL jobs, we use JOBS; thus the statements related to
>> jobs would be:
>>
>> - SHOW JOBS
>> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>> `table.job.stop-with-drain`)
>>
>> WRT savepoints for SQL jobs, we use the `CREATE/DROP` pattern with `FOR JOB`:
>>
>> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>> - SHOW SAVEPOINTS FOR JOB <job_id> (shows the savepoints the current job
>> manager remembers)
>> - DROP SAVEPOINT <savepoint_path>
>>
>> cc @Jark @ShengKai @Martijn @Timo .
>>
>> Best,
>> Paul Lam
>>
>> godfrey he <godfre...@gmail.com> wrote on Mon, 23 May 2022 at 21:34:
>>
>> Hi Paul,
>>
>> Thanks for the update.
>>
>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> (DataStream or SQL) or clients (SQL Client or CLI).
>>
>> Is a DataStream job a QUERY? I think not.
>> For a QUERY, the most important concept is the statement, but the
>> result does not contain this info.
>> If we need to contain all jobs in the cluster, I think the name should
>> be JOB or PIPELINE.
>> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
>>
>> SHOW SAVEPOINTS
>>
>> To list the savepoints for a specific job, we need to specify a
>> specific pipeline; the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>.
>>
>> Best,
>> Godfrey
>>
>> Paul Lam <paullin3...@gmail.com> wrote on Fri, 20 May 2022 at 11:25:
>>
>> Hi Jark,
>>
>> WRT "DROP QUERY", I agree that it's not very intuitive, and that's
>> part of the reason why I proposed "STOP/CANCEL QUERY" at the
>> beginning. The downside is that it's not ANSI-SQL compatible.
>>
>> Another question is: what should be the syntax for ungracefully
>> canceling a query? As ShengKai pointed out in an offline discussion,
>> "STOP QUERY" and "CANCEL QUERY" might confuse SQL users.
>> The Flink CLI has both stop and cancel, mostly due to historical reasons.
>>
>> WRT "SHOW SAVEPOINT", I agree it's a missing part.
>> My concern is
>> that savepoints are owned by users and live beyond the lifecycle of a
>> Flink cluster. For example, a user might take a savepoint at a custom
>> path that's different from the default savepoint path; I think the
>> jobmanager would not remember that, not to mention the jobmanager may be
>> a fresh new one after a cluster restart. Thus, if we support "SHOW
>> SAVEPOINT", it's probably a best-effort one.
>>
>> WRT the savepoint syntax, I'm thinking of the semantics of savepoints.
>> Savepoints are aliases for nested transactions in the DB area [1], and
>> there are corresponding global transactions. If we consider Flink jobs as
>> global transactions and Flink checkpoints as nested transactions,
>> then the savepoint semantics are close; thus I think the savepoint syntax
>> in the SQL standard could be considered. But again, I don't have a very
>> strong preference.
>>
>> Ping @Timo to get more inputs.
>>
>> [1] https://en.wikipedia.org/wiki/Nested_transaction
>>
>> Best,
>> Paul Lam
>>
>> On 18 May 2022 at 17:48, Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> 1) SHOW QUERIES
>> +1 to add the finished time, but it would be better to call it "end_time"
>> to keep it aligned with the names in the Web UI.
>>
>> 2) DROP QUERY
>> I think we shouldn't throw exceptions for batch jobs; otherwise, how
>> do we stop batch queries?
>> At present, I don't think "DROP" is a suitable keyword for this statement.
>> From the perspective of users, "DROP" sounds like the query should be
>> removed from the list of "SHOW QUERIES". However, it isn't. Maybe "STOP
>> QUERY" is more suitable and compliant with the commands of the Flink CLI.
>>
>> 3) SHOW SAVEPOINTS
>> I think this statement is needed; otherwise, savepoints are lost after
>> the SAVEPOINT command is executed. Savepoints can be retrieved from the
>> REST API "/jobs/:jobid/checkpoints" with the filter
>> "checkpoint_type"="savepoint".
>> It's also worth considering providing "SHOW CHECKPOINTS" to list all
>> checkpoints.
>>
>> 4) SAVEPOINT & RELEASE SAVEPOINT
>> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>> statements now.
>> In the other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT
>> are both the same savepoint id.
>> However, in our syntax, the first one takes a query id, and the second
>> one a savepoint path, which is confusing and not consistent. When I came
>> across SHOW SAVEPOINT, I thought maybe they should be in the same syntax
>> set. For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
>> SAVEPOINT <sp_path>.
>> That means we don't follow the majority of vendors in the SAVEPOINT
>> commands. I would say the purpose is different in Flink.
>> What are others' opinions on this?
>>
>> Best,
>> Jark
>>
>> [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>
>> On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Godfrey,
>>
>> Thanks a lot for your inputs!
>>
>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> (DataStream or SQL) or clients (SQL Client or CLI). Under the hood, it's
>> based on ClusterClient#listJobs, the same as the Flink CLI. I think it's
>> okay to have non-SQL jobs listed in the SQL Client, because these jobs
>> can be managed via the SQL Client too.
>>
>> WRT the finished time, I think you're right. Adding it to the FLIP. But
>> I'm a bit afraid that the rows would be too long.
>>
>> WRT 'DROP QUERY':
>>
>> What's the behavior for batch jobs and non-running jobs?
>>
>> In general, the behavior would be aligned with the Flink CLI. Triggering
>> a savepoint for a non-running job would cause errors, and the error
>> message would be printed to the SQL client.
>> Triggering a savepoint for batch (unbounded) jobs in streaming execution
>> mode would be the same as for streaming jobs. However, for batch jobs in
>> batch execution mode, I think there would be an error, because batch
>> execution doesn't support checkpoints currently (please correct me if
>> I'm wrong).
>>
>> WRT 'SHOW SAVEPOINTS', I've thought about it, but the Flink
>> clusterClient/jobClient doesn't have such a functionality at the moment,
>> and neither does the Flink CLI. Maybe we could make it a follow-up FLIP,
>> which includes the modifications to clusterClient/jobClient and the
>> Flink CLI. WDYT?
>>
>> Best,
>> Paul Lam
>>
>> On 17 May 2022 at 20:34, godfrey he <godfre...@gmail.com> wrote:
>>
>> Godfrey
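Pulling the proposals together, the statement set the thread converges on can be sketched as follows. Note that this is proposal-stage syntax from the FLIP discussion above, subject to change; it is not released Flink SQL:

```sql
-- List all jobs in the cluster (SQL and DataStream alike),
-- backed by ClusterClient#listJobs, same as the Flink CLI.
SHOW JOBS;

-- Stop a job, optionally taking a savepoint and/or draining
-- in-flight data, instead of setting configuration options
-- like `table.job.stop-with-savepoint` beforehand.
STOP JOB '<job_id>' [WITH SAVEPOINT] [WITH DRAIN];

-- Trigger a savepoint under the default `state.savepoints.dir`;
-- the savepoint path is the returned value.
CREATE SAVEPOINT FOR JOB '<job_id>';

-- Best-effort listing of the savepoints the current jobmanager
-- remembers (savepoints outlive the cluster, so this cannot be
-- exhaustive).
SHOW SAVEPOINTS FOR JOB '<job_id>';

-- Dispose of a savepoint, mirroring
-- `bin/flink savepoint -d :savepointPath`.
DROP SAVEPOINT '<savepoint_path>';
```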