Hi Paul,

That's a fair point, but I still think we should not offer that capability via the CLI either. But that's a different discussion :)
Thanks,
Martijn

On Thu, 9 Jun 2022 at 10:08, Paul Lam <paullin3...@gmail.com> wrote:

> Hi Martijn,
>
> I think the `DROP SAVEPOINT` statement would not conflict with NO_CLAIM
> mode, since the statement is triggered by users instead of the Flink runtime.
>
> We're simply providing a tool for users to clean up savepoints, just
> like `bin/flink savepoint -d :savepointPath` in the Flink CLI [1].
>
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#disposing-savepoints
>
> Best,
> Paul Lam
>
> On 9 Jun 2022 at 15:41, Martijn Visser <martijnvis...@apache.org> wrote:
>
> Hi all,
>
> I would not include a DROP SAVEPOINT syntax. With the recently introduced
> CLAIM/NO_CLAIM modes, I would argue that we've just clarified snapshot
> ownership, and if you have a savepoint established "with NO_CLAIM it creates
> its own copy and leaves the existing one up to the user." [1] We shouldn't
> then again make it fuzzy by making it possible for Flink to remove
> snapshots.
>
> Best regards,
>
> Martijn
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
>
> On Thu, 9 Jun 2022 at 09:27, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi team,
>>
>> It's great to see our opinions are finally converging!
>>
>> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
>>
>> LGTM. Adding it to the FLIP.
>>
>> To Jark,
>>
>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>"
>>
>> Good point. The default savepoint dir should be enough for most cases.
>>
>> To Jing,
>>
>> DROP SAVEPOINT ALL
>>
>> I think it's valid to have such a statement, but I have two concerns:
>>
>> - `ALL` is already an SQL keyword, thus it may cause ambiguity.
>> - The Flink CLI and REST API don't provide the corresponding
>> functionalities, and we'd better keep them aligned.
>>
>> How about making this statement a follow-up task, which would touch the
>> REST API and the Flink CLI?
>>
>> Best,
>> Paul Lam
>>
>> On 9 Jun 2022 at 11:53, godfrey he <godfre...@gmail.com> wrote:
>>
>> Hi all,
>>
>> Regarding `PIPELINE`, it comes from the flink-core module; see the
>> `PipelineOptions` class for more details.
>> `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with
>> `JOBS`.
>>
>> +1 to discuss JOBTREE in another FLIP.
>>
>> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
>>
>> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
>>
>> Best,
>> Godfrey
>>
>> Jing Ge <j...@ververica.com> wrote on Thu, 9 Jun 2022 at 01:48:
>>
>> Hi Paul, Hi Jark,
>>
>> Re JOBTREE: agreed that it is out of the scope of this FLIP.
>>
>> Re `RELEASE SAVEPOINT ALL`: if the community prefers `DROP`, then `DROP
>> SAVEPOINT ALL` for housekeeping. WDYT?
>>
>> Best regards,
>> Jing
>>
>> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Jing,
>>
>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of
>> the scope of this FLIP and can be discussed in another FLIP.
>>
>> Job lineage is a big topic that may involve many problems:
>> 1) How to collect and report job entities, attributes, and lineages?
>> 2) How to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>> 3) How does the Flink SQL CLI/Gateway know the lineage information and
>> show the jobtree?
>> 4) ...
>>
>> Best,
>> Jark
>>
>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> I'm fine with using JOBS. The only concern is that this may conflict with
>> displaying more detailed information for queries (e.g. query content, plan)
>> in the future, e.g. SHOW QUERIES EXTENDED in ksqlDB [1].
>> This is not a big problem, as we can introduce SHOW QUERIES in the future
>> if necessary.
>>
>> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>> `table.job.stop-with-drain`)
>>
>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]?
>> It might be trivial and error-prone to set configuration before executing
>> a statement, and the configuration will affect all statements after that.
>>
>> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>
>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>> and always use the configuration "state.savepoints.dir" as the default
>> savepoint dir.
>> The concern with using "<savepoint_path>" is that this should be the
>> savepoint dir, while the savepoint path is the returned value.
>>
>> I'm fine with the other changes.
>>
>> Thanks,
>> Jark
>>
>> [1]: https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>
>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Jing,
>>
>> Thank you for your inputs!
>>
>> TBH, I haven't considered the ETL scenario that you mentioned. I think
>> those jobs are managed just like other jobs in terms of job lifecycles
>> (please correct me if I'm wrong).
>>
>> WRT the SQL statements about SQL lineage, I think that might be a
>> little bit out of the scope of the FLIP, since it's mainly about
>> lifecycles. By the way, do we have these functionalities in the Flink CLI
>> or REST API already?
>>
>> WRT `RELEASE SAVEPOINT ALL`, I'm sorry for the outdated FLIP docs; the
>> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I'm
>> updating the FLIP according to the latest discussions.
>>
>> Best,
>> Paul Lam
>>
>> On 8 Jun 2022 at 07:31, Jing Ge <j...@ververica.com> wrote:
>>
>> Hi Paul,
>>
>> Sorry that I am a little bit too late to join this thread. Thanks for
>> driving this and starting this informative discussion. The FLIP looks
>> really interesting. It will help us a lot to manage Flink SQL jobs.
>>
>> Have you considered the ETL scenario with Flink SQL, where multiple SQL
>> statements build a DAG, and there may be many such DAGs?
>>
>> 1)
>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to
>> support ETL jobs.
>> Briefly speaking, the SQL statements used to build the DAG are
>> responsible for *producing* data as the result (cube, materialized view,
>> etc.) for future consumption by queries. The INSERT INTO SELECT FROM
>> example in the FLIP and CTAS are typical SQL statements in this case.
>> I would prefer to call them jobs instead of queries.
>>
>> 2)
>> Speaking of the ETL DAG, we might want to see the lineage. Is it possible
>> to support syntax like:
>>
>> SHOW JOBTREE <job_id> // shows the downstream DAG from the given job_id
>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
>> SHOW JOBTREES // shows all DAGs
>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>
>> 3)
>> Could we also support savepoint housekeeping syntax? We ran into the
>> issue that a lot of savepoints had been created by customers (via their
>> apps). It takes extra (hacking) effort to clean them up.
>>
>> RELEASE SAVEPOINT ALL
>>
>> Best regards,
>> Jing
>>
>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org> wrote:
>>
>> Hi Paul,
>>
>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>> could imply that this will actually show the query, but we're returning
>> IDs of the running applications. At first I was also not very much in
>> favour of SHOW JOBS, since I prefer calling them 'Flink applications' and
>> not 'Flink jobs', but the glossary [1] made me reconsider. I would +1
>> SHOW/STOP JOBS.
>>
>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>
>> Best regards,
>>
>> Martijn
>>
>> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>
>> On Sat, 4 Jun 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Godfrey,
>>
>> Sorry for the late reply, I was on vacation.
>>
>> It looks like we have a variety of preferences on the syntax; how about
>> we choose the most acceptable one?
>>
>> WRT the keyword for SQL jobs, we use JOBS; thus the statements related to
>> jobs would be:
>>
>> - SHOW JOBS
>> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>> `table.job.stop-with-drain`)
>>
>> WRT savepoints for SQL jobs, we use the `CREATE/DROP` pattern with `FOR JOB`:
>>
>> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>> - SHOW SAVEPOINTS FOR JOB <job_id> (shows the savepoints the current job
>> manager remembers)
>> - DROP SAVEPOINT <savepoint_path>
>>
>> cc @Jark @ShengKai @Martijn @Timo .
>>
>> Best,
>> Paul Lam
>>
>> godfrey he <godfre...@gmail.com> wrote on Mon, 23 May 2022 at 21:34:
>>
>> Hi Paul,
>>
>> Thanks for the update.
>>
>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> (DataStream or SQL) or clients (SQL Client or CLI).
>>
>> Is a DataStream job a QUERY? I think not.
>> For a QUERY, the most important concept is the statement, but the
>> result does not contain this info.
>> If we need to contain all jobs in the cluster, I think the name should
>> be JOB or PIPELINE.
>> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
>>
>> SHOW SAVEPOINTS
>>
>> To list the savepoints for a specific job, we need to specify a
>> specific pipeline; the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>.
>>
>> Best,
>> Godfrey
>>
>> Paul Lam <paullin3...@gmail.com> wrote on Fri, 20 May 2022 at 11:25:
>>
>> Hi Jark,
>>
>> WRT "DROP QUERY", I agree that it's not very intuitive, and that's
>> part of the reason why I proposed "STOP/CANCEL QUERY" at the
>> beginning. The downside is that it's not ANSI-SQL compatible.
>>
>> Another question is: what should be the syntax for ungracefully
>> canceling a query? As ShengKai pointed out in an offline discussion,
>> "STOP QUERY" and "CANCEL QUERY" might confuse SQL users.
>> The Flink CLI has both stop and cancel, mostly due to historical reasons.
>>
>> WRT "SHOW SAVEPOINT", I agree it's a missing part.
>> My concern is
>> that savepoints are owned by users and live beyond the lifecycle of a
>> Flink cluster. For example, a user might take a savepoint at a custom
>> path that's different from the default savepoint path; I think the
>> jobmanager would not remember that, not to mention the jobmanager may be
>> a fresh new one after a cluster restart. Thus, if we support "SHOW
>> SAVEPOINT", it's probably a best-effort one.
>>
>> WRT the savepoint syntax, I'm thinking of the semantics of savepoints.
>> Savepoints are aliases for nested transactions in the DB area [1], and
>> there are corresponding global transactions. If we consider Flink jobs as
>> global transactions and Flink checkpoints as nested transactions,
>> then the savepoint semantics are close; thus I think the savepoint syntax
>> in the SQL standard could be considered. But again, I don't have a very
>> strong preference.
>>
>> Ping @Timo to get more inputs.
>>
>> [1] https://en.wikipedia.org/wiki/Nested_transaction
>>
>> Best,
>> Paul Lam
>>
>> On 18 May 2022 at 17:48, Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Paul,
>>
>> 1) SHOW QUERIES
>> +1 to add the finished time, but it would be better to call it "end_time"
>> to keep it aligned with the names in the Web UI.
>>
>> 2) DROP QUERY
>> I think we shouldn't throw exceptions for batch jobs; otherwise, how
>> do we stop batch queries?
>> At present, I don't think "DROP" is a suitable keyword for this statement.
>> From the perspective of users, "DROP" sounds like the query should be
>> removed from the list of "SHOW QUERIES". However, it isn't. Maybe "STOP
>> QUERY" is more suitable and compliant with the commands of the Flink CLI.
>>
>> 3) SHOW SAVEPOINTS
>> I think this statement is needed; otherwise, savepoints are lost after
>> the SAVEPOINT command is executed. Savepoints can be retrieved from the
>> REST API "/jobs/:jobid/checkpoints" with the filter
>> "checkpoint_type"="savepoint".
>> It's also worth considering providing "SHOW CHECKPOINTS" to list all
>> checkpoints.
>>
>> 4) SAVEPOINT & RELEASE SAVEPOINT
>> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>> statements now.
>> In the other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT
>> are both the same savepoint id.
>> However, in our syntax, the first one takes a query id, and the second
>> one a savepoint path, which is confusing and not consistent. When I came
>> across SHOW SAVEPOINT, I thought maybe they should be in the same syntax
>> set. For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
>> SAVEPOINT <sp_path>.
>> That means we don't follow the majority of vendors in the SAVEPOINT
>> commands. I would say the purpose is different in Flink.
>> What are others' opinions on this?
>>
>> Best,
>> Jark
>>
>> [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>
>> On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Godfrey,
>>
>> Thanks a lot for your inputs!
>>
>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> (DataStream or SQL) or clients (SQL Client or CLI). Under the hood, it's
>> based on ClusterClient#listJobs, the same as the Flink CLI. I think it's
>> okay to have non-SQL jobs listed in the SQL Client, because these jobs
>> can be managed via the SQL Client too.
>>
>> WRT the finished time, I think you're right. Adding it to the FLIP. But
>> I'm a bit afraid that the rows would be too long.
>>
>> WRT 'DROP QUERY':
>>
>> What's the behavior for batch jobs and non-running jobs?
>>
>> In general, the behavior would be aligned with the Flink CLI. Triggering
>> a savepoint for a non-running job would cause errors, and the error
>> message would be printed to the SQL client.
>> Triggering a savepoint for batch (unbounded) jobs in streaming execution
>> mode would be the same as for streaming jobs. However, for batch jobs in
>> batch execution mode, I think there would be an error, because batch
>> execution doesn't support checkpoints currently (please correct me if
>> I'm wrong).
>>
>> WRT 'SHOW SAVEPOINTS', I've thought about it, but the Flink
>> clusterClient/jobClient doesn't have such a functionality at the moment,
>> and neither does the Flink CLI. Maybe we could make it a follow-up FLIP,
>> which includes the modifications to clusterClient/jobClient and the
>> Flink CLI. WDYT?
>>
>> Best,
>> Paul Lam
>>
>> On 17 May 2022 at 20:34, godfrey he <godfre...@gmail.com> wrote:
>>
>> Godfrey
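Pulling the proposals together, the statement set the thread converges on can be sketched as follows. Note that this is proposal-stage syntax from the FLIP discussion above, subject to change; it is not released Flink SQL:

```sql
-- List all jobs in the cluster (SQL and DataStream alike),
-- backed by ClusterClient#listJobs, same as the Flink CLI.
SHOW JOBS;

-- Stop a job, optionally taking a savepoint and/or draining
-- in-flight data, instead of setting configuration options
-- like `table.job.stop-with-savepoint` beforehand.
STOP JOB '<job_id>' [WITH SAVEPOINT] [WITH DRAIN];

-- Trigger a savepoint under the default `state.savepoints.dir`;
-- the savepoint path is the returned value.
CREATE SAVEPOINT FOR JOB '<job_id>';

-- Best-effort listing of the savepoints the current jobmanager
-- remembers (savepoints outlive the cluster, so this cannot be
-- exhaustive).
SHOW SAVEPOINTS FOR JOB '<job_id>';

-- Dispose of a savepoint, mirroring
-- `bin/flink savepoint -d :savepointPath`.
DROP SAVEPOINT '<savepoint_path>';
```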