Hi all,

I would not include a DROP SAVEPOINT syntax. With the recently introduced CLAIM/NO CLAIM mode, I would argue that we've just clarified snapshot ownership: if a savepoint is taken "with NO_CLAIM it creates its own copy and leaves the existing one up to the user." [1] We shouldn't make ownership fuzzy again by making it possible for Flink to remove snapshots.
Best regards,

Martijn

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership

On Thu, 9 Jun 2022 at 09:27, Paul Lam <paullin3...@gmail.com> wrote:

> Hi team,
>
> It's great to see our opinions are finally converging!
>
> > `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
>
> LGTM. Adding it to the FLIP.
>
> To Jark,
>
> > We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>"
>
> Good point. The default savepoint dir should be enough for most cases.
>
> To Jing,
>
> > DROP SAVEPOINT ALL
>
> I think it's valid to have such a statement, but I have two concerns:
>
> - `ALL` is already an SQL keyword, thus it may cause ambiguity.
> - Flink CLI and REST API don't provide the corresponding functionalities, and we'd better keep them aligned.
>
> How about making this statement a follow-up task, since it would also touch the REST API and Flink CLI?
>
> Best,
> Paul Lam
>
> On 9 Jun 2022, at 11:53, godfrey he <godfre...@gmail.com> wrote:
>
> Hi all,
>
> Regarding `PIPELINE`, it comes from the flink-core module; see the `PipelineOptions` class for more details. `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with `JOBS`.
>
> +1 to discuss JOBTREE in another FLIP.
>
> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
>
> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
>
> Best,
> Godfrey
>
> On Thu, 9 Jun 2022 at 01:48, Jing Ge <j...@ververica.com> wrote:
>
> Hi Paul, Hi Jark,
>
> Re JOBTREE, agree that it is out of the scope of this FLIP.
>
> Re `RELEASE SAVEPOINT ALL`, if the community prefers 'DROP', then 'DROP SAVEPOINT ALL' could do the housekeeping. WDYT?
>
> Best regards,
> Jing
>
> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <imj...@gmail.com> wrote:
>
> Hi Jing,
>
> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the scope of this FLIP and can be discussed in another FLIP.
>
> Job lineage is a big topic that may involve many problems:
> 1) how to collect and report job entities, attributes, and lineages?
> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
> 3) how does Flink SQL CLI/Gateway know the lineage information and show the jobtree?
> 4) ...
>
> Best,
> Jark
>
> On Wed, 8 Jun 2022 at 20:44, Jark Wu <imj...@gmail.com> wrote:
>
> Hi Paul,
>
> I'm fine with using JOBS. The only concern is that this may conflict with displaying more detailed information for queries (e.g. query content, plan) in the future, e.g. SHOW QUERIES EXTENDED in ksqldb [1]. This is not a big problem, as we can introduce SHOW QUERIES in the future if necessary.
>
> > STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
>
> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]? It might be tedious and error-prone to set configuration before executing a statement, and the configuration will affect all statements after that.
>
> > CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>
> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>", and always use the configuration "state.savepoints.dir" as the default savepoint dir. The concern with using "<savepoint_path>" is that the value here should be a savepoint dir, while the savepoint path is the returned value.
>
> I'm fine with the other changes.
>
> Thanks,
> Jark
>
> [1]: https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>
> On Wed, 8 Jun 2022 at 15:07, Paul Lam <paullin3...@gmail.com> wrote:
>
> Hi Jing,
>
> Thank you for your inputs!
>
> TBH, I haven't considered the ETL scenario that you mentioned. I think those jobs are managed just like other jobs in terms of job lifecycles (please correct me if I'm wrong).
>
> WRT the SQL statements about SQL lineages, I think they might be a little bit out of the scope of the FLIP, since it's mainly about lifecycles. By the way, do we have these functionalities in Flink CLI or REST API already?
>
> WRT `RELEASE SAVEPOINT ALL`, I'm sorry for the outdated FLIP docs; the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I'm updating the FLIP according to the latest discussions.
>
> Best,
> Paul Lam
>
> On 8 Jun 2022, at 07:31, Jing Ge <j...@ververica.com> wrote:
>
> Hi Paul,
>
> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs.
>
> Have you considered the ETL scenario with Flink SQL, where multiple SQLs build a DAG (or many DAGs)?
>
> 1)
> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, the SQLs that are used to build the DAG are responsible for *producing* data as the result (cube, materialized view, etc.) for future consumption by queries. The INSERT INTO ... SELECT FROM example in the FLIP and CTAS are typical SQLs in this case. I would prefer to call them Jobs instead of Queries.
>
> 2)
> Speaking of the ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
>
> SHOW JOBTREE <job_id> // shows the downstream DAG from the given job_id
> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
> SHOW JOBTREES // shows all DAGs
> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>
> 3)
> Could we also support savepoint housekeeping syntax? We ran into the issue that a lot of savepoints had been created by customers (via their apps), and it takes extra (hacking) effort to clean them up.
>
> RELEASE SAVEPOINT ALL
>
> Best regards,
> Jing
>
> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org> wrote:
>
> Hi Paul,
>
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could imply that this will actually show the query, but we're returning IDs of the running applications. At first I was also not very much in favour of SHOW JOBS, since I prefer calling them 'Flink applications' and not 'Flink jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS.
>
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>
> Best regards,
>
> Martijn
>
> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>
> On Sat, 4 Jun 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
>
> Hi Godfrey,
>
> Sorry for the late reply, I was on vacation.
>
> It looks like we have a variety of preferences on the syntax; how about we choose the most acceptable one?
>
> WRT the keyword for SQL jobs, we'd use JOBS, thus the statements related to jobs would be:
>
> - SHOW JOBS
> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
>
> WRT savepoints for SQL jobs, we'd use the `CREATE/DROP` pattern with `FOR JOB`:
>
> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> - SHOW SAVEPOINTS FOR JOB <job_id> (shows the savepoints the current job manager remembers)
> - DROP SAVEPOINT <savepoint_path>
>
> cc @Jark @ShengKai @Martijn @Timo .
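>
> For illustration only, a possible SQL client session combining the statements above might look like the following (the job id and savepoint path are made-up values, and the exact quoting would depend on the final grammar):
>
>   SHOW JOBS;
>   CREATE SAVEPOINT '/flink/savepoints' FOR JOB '228d70913eab60dda85c5e7f78b5782c';
>   SHOW SAVEPOINTS FOR JOB '228d70913eab60dda85c5e7f78b5782c';
>   SET 'table.job.stop-with-savepoint' = 'true';
>   STOP JOBS '228d70913eab60dda85c5e7f78b5782c';
>   DROP SAVEPOINT '/flink/savepoints/savepoint-228d70-5f20d6e2c1a1';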
>
> Best,
> Paul Lam
>
> On Mon, 23 May 2022 at 21:34, godfrey he <godfre...@gmail.com> wrote:
>
> Hi Paul,
>
> Thanks for the update.
>
> > 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or clients (SQL client or CLI).
>
> Is a DataStream job a QUERY? I think not.
> For a QUERY, the most important concept is the statement, but the result does not contain this info.
> If we need to cover all jobs in the cluster, I think the name should be JOB or PIPELINE.
> I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
>
> > SHOW SAVEPOINTS
>
> To list the savepoints for a specific job, we need to specify a specific pipeline; the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>.
>
> Best,
> Godfrey
>
> On Fri, 20 May 2022 at 11:25, Paul Lam <paullin3...@gmail.com> wrote:
>
> Hi Jark,
>
> WRT "DROP QUERY", I agree that it's not very intuitive, and that's part of the reason why I proposed "STOP/CANCEL QUERY" at the beginning. The downside of it is that it's not ANSI-SQL compatible.
>
> Another question is, what should be the syntax for ungracefully canceling a query? As ShengKai pointed out in an offline discussion, "STOP QUERY" and "CANCEL QUERY" might confuse SQL users. Flink CLI has both stop and cancel, mostly due to historical problems.
>
> WRT "SHOW SAVEPOINT", I agree it's a missing part. My concern is that savepoints are owned by users and live beyond the lifecycle of a Flink cluster. For example, a user might take a savepoint at a custom path that's different from the default savepoint path; I think the jobmanager would not remember that, not to mention that the jobmanager may be a brand-new one after a cluster restart. Thus if we support "SHOW SAVEPOINT", it's probably a best-effort one.
>
> WRT savepoint syntax, I'm thinking of the semantics of savepoints. Savepoints are an alias for nested transactions in the database area [1], and there are correspondingly global transactions. If we consider Flink jobs as global transactions and Flink checkpoints as nested transactions, then the savepoint semantics are close, thus I think the SQL-standard savepoint syntax could be considered. But again, I don't have a very strong preference.
>
> Ping @Timo to get more inputs.
>
> [1] https://en.wikipedia.org/wiki/Nested_transaction
>
> Best,
> Paul Lam
>
> On 18 May 2022, at 17:48, Jark Wu <imj...@gmail.com> wrote:
>
> Hi Paul,
>
> 1) SHOW QUERIES
> +1 to add the finished time, but it would be better to call it "end_time" to keep it aligned with the names in the Web UI.
>
> 2) DROP QUERY
> I think we shouldn't throw exceptions for batch jobs; otherwise, how do we stop batch queries?
> At present, I don't think "DROP" is a suitable keyword for this statement. From the perspective of users, "DROP" sounds like the query should be removed from the list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more suitable and compliant with the commands of the Flink CLI.
>
> 3) SHOW SAVEPOINTS
> I think this statement is needed; otherwise, savepoints are lost after the SAVEPOINT command is executed. Savepoints can be retrieved from the REST API "/jobs/:jobid/checkpoints" with the filter "checkpoint_type"="savepoint". It's also worth considering providing "SHOW CHECKPOINTS" to list all checkpoints.
>
> 4) SAVEPOINT & RELEASE SAVEPOINT
> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements now.
> In other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both the same savepoint id.
> However, in our syntax, the first one is a query id, and the second one is a savepoint path, which is confusing and not consistent. When I came across SHOW SAVEPOINT, I thought maybe they should be in the same syntax set.
> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT <sp_path>.
> That means we don't follow the majority of vendors in the SAVEPOINT commands. I would say the purpose is different in Flink.
> What are others' opinions on this?
>
> Best,
> Jark
>
> [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>
> On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
>
> Hi Godfrey,
>
> Thanks a lot for your inputs!
>
> 'SHOW QUERIES' lists all jobs in the cluster, with no limit on APIs (DataStream or SQL) or clients (SQL client or CLI). Under the hood, it's based on ClusterClient#listJobs, the same as the Flink CLI. I think it's okay to have non-SQL jobs listed in the SQL client, because these jobs can be managed via the SQL client too.
>
> WRT the finished time, I think you're right. Adding it to the FLIP. But I'm a bit afraid that the rows would be too long.
>
> WRT 'DROP QUERY',
>
> > What's the behavior for batch jobs and the non-running jobs?
>
> In general, the behavior would be aligned with the Flink CLI. Triggering a savepoint for a non-running job would cause errors, and the error message would be printed to the SQL client. Triggering a savepoint for batch (bounded) jobs in streaming execution mode would be the same as for streaming jobs. However, for batch jobs in batch execution mode, I think there would be an error, because batch execution doesn't support checkpoints currently (please correct me if I'm wrong).
>
> WRT 'SHOW SAVEPOINTS', I've thought about it, but Flink's ClusterClient/JobClient doesn't provide such functionality at the moment, and neither does the Flink CLI. Maybe we could make it a follow-up FLIP, which would include the modifications to ClusterClient/JobClient and the Flink CLI. WDYT?
>
> Best,
> Paul Lam
>
> On 17 May 2022, at 20:34, godfrey he <godfre...@gmail.com> wrote:
>
> Godfrey