Hi Jing,

Thank you for your inputs!
TBH, I haven't considered the ETL scenario that you mentioned. I think ETL jobs are managed just like other jobs in terms of job lifecycles (please correct me if I'm wrong).

WRT the SQL statements for SQL lineage, I think they might be a bit out of the scope of this FLIP, since it's mainly about lifecycles. By the way, do we have these functionalities in the Flink CLI or REST API already?

WRT `RELEASE SAVEPOINT ALL`, sorry for the outdated FLIP doc; the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I'm updating the FLIP according to the latest discussions.
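For reference, here is a rough sketch of how the lifecycle statements currently under discussion could look in a SQL client session. This only illustrates the proposed syntax and is not final; the job ID, the savepoint paths, the string quoting, and the use of SET for the stop options are made up for the example:

    -- list running jobs, then stop one with a savepoint
    SHOW JOBS;
    SET 'table.job.stop-with-savepoint' = 'true';  -- assumes the stop options are set as plain config options
    STOP JOBS '228d70913eab60dda85c5e7f78b5782c';  -- made-up job id

    -- savepoint lifecycle for a job
    CREATE SAVEPOINT '/flink/savepoints' FOR JOB '228d70913eab60dda85c5e7f78b5782c';
    SHOW SAVEPOINTS FOR JOB '228d70913eab60dda85c5e7f78b5782c';
    DROP SAVEPOINT '/flink/savepoints/savepoint-228d70-made-up';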
Best,
Paul

> On Jun 8, 2022, at 07:31, Jing Ge <j...@ververica.com> wrote:
> 
> Hi Paul,
> 
> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs.
> 
> Have you considered the ETL scenario with Flink SQL, where multiple SQLs build a DAG, for many DAGs?
> 
> 1)
> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, the SQLs that are used to build the DAG are responsible for *producing* data as the result (cube, materialized view, etc.) for future consumption by queries. The INSERT INTO SELECT FROM example in the FLIP and CTAS are typical SQL in this case. I would prefer to call them Jobs instead of Queries.
> 
> 2)
> Speaking of the ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
> 
> SHOW JOBTREE <job_id> // shows the downstream DAG from the given job_id
> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
> SHOW JOBTREES // shows all DAGs
> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
> 
> 3)
> Could we also support savepoint housekeeping syntax? We ran into the issue that a lot of savepoints had been created by customers (via their apps), and it takes extra (hacking) effort to clean them up.
> 
> RELEASE SAVEPOINT ALL
> 
> Best regards,
> Jing
> 
> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org> wrote:
> Hi Paul,
> 
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could imply that this will actually show the query, but we're returning IDs of the running applications. At first I was also not very much in favour of SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS.
> 
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
> 
> Best regards,
> 
> Martijn
> 
> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
> 
> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
> >
> > Hi Godfrey,
> >
> > Sorry for the late reply, I was on vacation.
> >
> > It looks like we have a variety of preferences on the syntax. How about we choose the most acceptable one?
> >
> > WRT the keyword for SQL jobs, we use JOBS, thus the statements related to jobs would be:
> >
> > - SHOW JOBS
> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
> >
> > WRT savepoints for SQL jobs, we use the `CREATE/DROP` pattern with `FOR JOB`:
> >
> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> > - SHOW SAVEPOINTS FOR JOB <job_id> (shows the savepoints that the current job manager remembers)
> > - DROP SAVEPOINT <savepoint_path>
> >
> > cc @Jark @ShengKai @Martijn @Timo.
> >
> > Best,
> > Paul Lam
> >
> >
> > On Mon, May 23, 2022 at 21:34, godfrey he <godfre...@gmail.com> wrote:
> >
> >> Hi Paul,
> >>
> >> Thanks for the update.
> >>
> >> > 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or clients (SQL client or CLI).
> >>
> >> Is a DataStream job a QUERY? I think not. For a QUERY, the most important concept is the statement, but the result does not contain this info. If we need to include all jobs in the cluster, I think the name should be JOB or PIPELINE. I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
> >>
> >> > SHOW SAVEPOINTS
> >> To list the savepoints for a specific job, we need to specify a specific pipeline; the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>.
> >>
> >> Best,
> >> Godfrey
> >>
> >> On Fri, May 20, 2022 at 11:25, Paul Lam <paullin3...@gmail.com> wrote:
> >> >
> >> > Hi Jark,
> >> >
> >> > WRT "DROP QUERY", I agree that it's not very intuitive, and that's part of the reason why I proposed "STOP/CANCEL QUERY" at the beginning. The downside of it is that it's not ANSI-SQL compatible.
> >> >
> >> > Another question is, what should the syntax be for ungracefully canceling a query? As ShengKai pointed out in an offline discussion, "STOP QUERY" and "CANCEL QUERY" might confuse SQL users. The Flink CLI has both stop and cancel, mostly due to historical reasons.
> >> >
> >> > WRT "SHOW SAVEPOINTS", I agree it's a missing part. My concern is that savepoints are owned by users and live beyond the lifecycle of a Flink cluster. For example, a user might take a savepoint at a custom path that's different from the default savepoint path; I think the jobmanager would not remember that, not to mention the jobmanager may be a brand-new one after a cluster restart. Thus, if we support "SHOW SAVEPOINTS", it's probably a best-effort one.
> >> >
> >> > WRT the savepoint syntax, I'm thinking about the semantics of savepoints. Savepoints are aliases for nested transactions in the database area [1], and there are correspondingly global transactions. If we consider Flink jobs as global transactions and Flink checkpoints as nested transactions, then the savepoint semantics are close, thus I think the SQL-standard savepoint syntax could be considered. But again, I don't have a very strong preference.
> >> >
> >> > Ping @Timo to get more inputs.
> >> >
> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
> >> >
> >> > Best,
> >> > Paul Lam
> >> >
> >> > > On May 18, 2022, at 17:48, Jark Wu <imj...@gmail.com> wrote:
> >> > >
> >> > > Hi Paul,
> >> > >
> >> > > 1) SHOW QUERIES
> >> > > +1 to add the finished time, but it would be better to call it "end_time" to keep it aligned with the names in the Web UI.
> >> > >
> >> > > 2) DROP QUERY
> >> > > I think we shouldn't throw exceptions for batch jobs; otherwise, how do we stop batch queries? At present, I don't think "DROP" is a suitable keyword for this statement. From the perspective of users, "DROP" sounds like the query should be removed from the list of "SHOW QUERIES". However, it isn't. Maybe "STOP QUERY" is more suitable and compliant with the commands of the Flink CLI.
> >> > >
> >> > > 3) SHOW SAVEPOINTS
> >> > > I think this statement is needed; otherwise, savepoints are lost after the SAVEPOINT command is executed. Savepoints can be retrieved from the REST API "/jobs/:jobid/checkpoints" [1] by filtering on "checkpoint_type"="savepoint". It's also worth considering providing "SHOW CHECKPOINTS" to list all checkpoints.
> >> > >
> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements now. In other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both the same savepoint id. However, in our syntax, the first one takes a query id and the second one takes a savepoint path, which is confusing and inconsistent. When I came across SHOW SAVEPOINTS, I thought maybe they should be in the same syntax set. For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT <sp_path>. That means we don't follow the majority of vendors in the SAVEPOINT commands; I would say the purpose is different in Flink. What are others' opinions on this?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > > [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> >> > >
> >> > >
> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
> >> > >
> >> > >> Hi Godfrey,
> >> > >>
> >> > >> Thanks a lot for your inputs!
> >> > >>
> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or clients (SQL client or CLI). Under the hood, it's based on ClusterClient#listJobs, the same as the Flink CLI. I think it's okay to have non-SQL jobs listed in the SQL client, because these jobs can be managed via the SQL client too.
> >> > >>
> >> > >> WRT the finished time, I think you're right. I'm adding it to the FLIP. But I'm a bit afraid that the rows would be too long.
> >> > >>
> >> > >> WRT 'DROP QUERY',
> >> > >>> What's the behavior for batch jobs and the non-running jobs?
> >> > >>
> >> > >> In general, the behavior would be aligned with the Flink CLI.
> >> > >> Triggering a savepoint for a non-running job would cause errors, and the error message would be printed to the SQL client. Triggering a savepoint for batch (unbounded) jobs in streaming execution mode would be the same as for streaming jobs. However, for batch jobs in batch execution mode, I think there would be an error, because batch execution doesn't support checkpoints currently (please correct me if I'm wrong).
> >> > >>
> >> > >> WRT 'SHOW SAVEPOINTS', I've thought about it, but the Flink clusterClient/jobClient doesn't have such functionality at the moment, and neither does the Flink CLI. Maybe we could make it a follow-up FLIP, which includes the modifications to clusterClient/jobClient and the Flink CLI. WDYT?
> >> > >>
> >> > >> Best,
> >> > >> Paul Lam
> >> > >>
> >> > >>> On May 17, 2022, at 20:34, godfrey he <godfre...@gmail.com> wrote:
> >> > >>>
> >> > >>> Godfrey