Hi Jing,

Thank you for your inputs!
TBH, I haven't considered the ETL scenario that you mentioned. I think ETL jobs are managed just like other jobs in terms of job lifecycles (please correct me if I'm wrong).

WRT the SQL statements for SQL lineage, I think they might be a bit out of the scope of this FLIP, since it's mainly about lifecycles. By the way, do we have these functionalities in the Flink CLI or REST API already?

WRT `RELEASE SAVEPOINT ALL`, sorry for the outdated FLIP doc; the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I'm updating the FLIP according to the latest discussions.
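For reference, here is a rough sketch of how the lifecycle statements currently under discussion could look in a SQL client session. This only illustrates the proposed syntax and is not final; the job ID, the savepoint paths, the string quoting, and the use of SET for the stop options are made up for the example:

    -- list running jobs, then stop one with a savepoint
    SHOW JOBS;
    SET 'table.job.stop-with-savepoint' = 'true';  -- assumes the stop options are set as plain config options
    STOP JOBS '228d70913eab60dda85c5e7f78b5782c';  -- made-up job id

    -- savepoint lifecycle for a job
    CREATE SAVEPOINT '/flink/savepoints' FOR JOB '228d70913eab60dda85c5e7f78b5782c';
    SHOW SAVEPOINTS FOR JOB '228d70913eab60dda85c5e7f78b5782c';
    DROP SAVEPOINT '/flink/savepoints/savepoint-228d70-made-up';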
Best,
Paul

> On Jun 8, 2022, at 07:31, Jing Ge <j...@ververica.com> wrote:
> 
> Hi Paul,
> 
> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs.
> 
> Have you considered the ETL scenario with Flink SQL, where multiple SQLs build a DAG, for many DAGs?
> 
> 1)
> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, the SQLs that are used to build the DAG are responsible for *producing* data as the result (cube, materialized view, etc.) for future consumption by queries. The INSERT INTO SELECT FROM example in the FLIP and CTAS are typical SQL in this case. I would prefer to call them Jobs instead of Queries.
> 
> 2)
> Speaking of the ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
> 
> SHOW JOBTREE <job_id> // shows the downstream DAG from the given job_id
> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
> SHOW JOBTREES // shows all DAGs
> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
> 
> 3)
> Could we also support savepoint housekeeping syntax? We ran into the issue that a lot of savepoints had been created by customers (via their apps), and it takes extra (hacking) effort to clean them up.
> 
> RELEASE SAVEPOINT ALL
> 
> Best regards,
> Jing
> 
> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org> wrote:
> Hi Paul,
> 
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could imply that this will actually show the query, but we're returning IDs of the running applications. At first I was also not very much in favour of SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS.
> 
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
> 
> Best regards,
> 
> Martijn
> 
> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
> 
> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
> >
> > Hi Godfrey,
> >
> > Sorry for the late reply, I was on vacation.
> >
> > It looks like we have a variety of preferences on the syntax. How about we choose the most acceptable one?
> >
> > WRT the keyword for SQL jobs, we use JOBS, thus the statements related to jobs would be:
> >
> > - SHOW JOBS
> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
> >
> > WRT savepoints for SQL jobs, we use the `CREATE/DROP` pattern with `FOR JOB`:
> >
> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> > - SHOW SAVEPOINTS FOR JOB <job_id> (shows the savepoints that the current job manager remembers)
> > - DROP SAVEPOINT <savepoint_path>
> >
> > cc @Jark @ShengKai @Martijn @Timo.
> >
> > Best,
> > Paul Lam
> >
> >
> > On Mon, May 23, 2022 at 21:34, godfrey he <godfre...@gmail.com> wrote:
> >
> >> Hi Paul,
> >>
> >> Thanks for the update.
> >>
> >> > 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or clients (SQL client or CLI).
> >>
> >> Is a DataStream job a QUERY? I think not. For a QUERY, the most important concept is the statement, but the result does not contain this info. If we need to include all jobs in the cluster, I think the name should be JOB or PIPELINE. I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
> >>
> >> > SHOW SAVEPOINTS
> >> To list the savepoints for a specific job, we need to specify a specific pipeline; the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>.
> >>
> >> Best,
> >> Godfrey
> >>
> >> On Fri, May 20, 2022 at 11:25, Paul Lam <paullin3...@gmail.com> wrote:
> >> >
> >> > Hi Jark,
> >> >
> >> > WRT "DROP QUERY", I agree that it's not very intuitive, and that's part of the reason why I proposed "STOP/CANCEL QUERY" at the beginning. The downside of it is that it's not ANSI-SQL compatible.
> >> >
> >> > Another question is, what should the syntax be for ungracefully canceling a query? As ShengKai pointed out in an offline discussion, "STOP QUERY" and "CANCEL QUERY" might confuse SQL users. The Flink CLI has both stop and cancel, mostly due to historical reasons.
> >> >
> >> > WRT "SHOW SAVEPOINTS", I agree it's a missing part. My concern is that savepoints are owned by users and live beyond the lifecycle of a Flink cluster. For example, a user might take a savepoint at a custom path that's different from the default savepoint path; I think the jobmanager would not remember that, not to mention the jobmanager may be a brand-new one after a cluster restart. Thus, if we support "SHOW SAVEPOINTS", it's probably a best-effort one.
> >> >
> >> > WRT the savepoint syntax, I'm thinking about the semantics of savepoints. Savepoints are aliases for nested transactions in the database area [1], and there are correspondingly global transactions. If we consider Flink jobs as global transactions and Flink checkpoints as nested transactions, then the savepoint semantics are close, thus I think the SQL-standard savepoint syntax could be considered. But again, I don't have a very strong preference.
> >> >
> >> > Ping @Timo to get more inputs.
> >> >
> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
> >> >
> >> > Best,
> >> > Paul Lam
> >> >
> >> > > On May 18, 2022, at 17:48, Jark Wu <imj...@gmail.com> wrote:
> >> > >
> >> > > Hi Paul,
> >> > >
> >> > > 1) SHOW QUERIES
> >> > > +1 to add the finished time, but it would be better to call it "end_time" to keep it aligned with the names in the Web UI.
> >> > >
> >> > > 2) DROP QUERY
> >> > > I think we shouldn't throw exceptions for batch jobs; otherwise, how do we stop batch queries? At present, I don't think "DROP" is a suitable keyword for this statement. From the perspective of users, "DROP" sounds like the query should be removed from the list of "SHOW QUERIES". However, it isn't. Maybe "STOP QUERY" is more suitable and compliant with the commands of the Flink CLI.
> >> > >
> >> > > 3) SHOW SAVEPOINTS
> >> > > I think this statement is needed; otherwise, savepoints are lost after the SAVEPOINT command is executed. Savepoints can be retrieved from the REST API "/jobs/:jobid/checkpoints" [1] by filtering on "checkpoint_type"="savepoint". It's also worth considering providing "SHOW CHECKPOINTS" to list all checkpoints.
> >> > >
> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements now. In other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both the same savepoint id. However, in our syntax, the first one takes a query id and the second one takes a savepoint path, which is confusing and inconsistent. When I came across SHOW SAVEPOINTS, I thought maybe they should be in the same syntax set. For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT <sp_path>. That means we don't follow the majority of vendors in the SAVEPOINT commands; I would say the purpose is different in Flink. What are others' opinions on this?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > > [1]: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> >> > >
> >> > >
> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
> >> > >
> >> > >> Hi Godfrey,
> >> > >>
> >> > >> Thanks a lot for your inputs!
> >> > >>
> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or clients (SQL client or CLI). Under the hood, it's based on ClusterClient#listJobs, the same as the Flink CLI. I think it's okay to have non-SQL jobs listed in the SQL client, because these jobs can be managed via the SQL client too.
> >> > >>
> >> > >> WRT the finished time, I think you're right. I'm adding it to the FLIP. But I'm a bit afraid that the rows would be too long.
> >> > >>
> >> > >> WRT 'DROP QUERY',
> >> > >>> What's the behavior for batch jobs and the non-running jobs?
> >> > >>
> >> > >> In general, the behavior would be aligned with the Flink CLI.
> >> > >> Triggering a savepoint for a non-running job would cause errors, and the error message would be printed to the SQL client. Triggering a savepoint for batch (unbounded) jobs in streaming execution mode would be the same as for streaming jobs. However, for batch jobs in batch execution mode, I think there would be an error, because batch execution doesn't support checkpoints currently (please correct me if I'm wrong).
> >> > >>
> >> > >> WRT 'SHOW SAVEPOINTS', I've thought about it, but the Flink clusterClient/jobClient doesn't have such functionality at the moment, and neither does the Flink CLI. Maybe we could make it a follow-up FLIP, which includes the modifications to clusterClient/jobClient and the Flink CLI. WDYT?
> >> > >>
> >> > >> Best,
> >> > >> Paul Lam
> >> > >>
> >> > >>> On May 17, 2022, at 20:34, godfrey he <godfre...@gmail.com> wrote:
> >> > >>>
> >> > >>> Godfrey