Hi Paul,

Sorry for joining this thread a bit late. Thanks for driving this and
starting this informative discussion. The FLIP looks really interesting
and will help us a lot in managing Flink SQL jobs.

Have you considered the ETL scenario with Flink SQL, where multiple SQL
statements build a DAG (or even many DAGs)?

1)
+1 for SHOW JOBS. I think sooner or later we will start to discuss how to
support ETL jobs. Briefly speaking, the SQL statements that build the DAG
are responsible for *producing* data as the result (cubes, materialized
views, etc.) for future consumption by queries. The INSERT INTO ... SELECT
FROM example in the FLIP and CTAS are typical statements in this case. I
would prefer to call them Jobs instead of Queries.
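
To make this concrete, a minimal sketch of such a job-producing statement
(the table names and connectors below are made up purely for illustration):

CREATE TABLE ods_orders (
  order_id BIGINT,
  amount DECIMAL(10, 2)
) WITH ('connector' = 'datagen');

CREATE TABLE dwd_large_orders (
  order_id BIGINT,
  amount DECIMAL(10, 2)
) WITH ('connector' = 'blackhole');

-- Submits a long-running job that keeps materializing data for
-- downstream consumption, i.e. a Job rather than an interactive Query.
INSERT INTO dwd_large_orders
SELECT order_id, amount
FROM ods_orders
WHERE amount > 100;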

2)
Speaking of ETL DAGs, we might want to see the lineage. Is it possible to
support syntax like:

SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given
job_id
SHOW JOBTREES // shows all DAGs
SHOW ANCESTORS <job_id> // shows all parents of the given job_id
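
To illustrate, a hypothetical session (the job ids, table names and output
format below are all made up) might look like:

SHOW JOBTREE 'a1b2c3';
-- a1b2c3 (writes dwd_large_orders)
--   +- d4e5f6 (reads dwd_large_orders, writes ads_order_report)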

3)
Could we also support savepoint housekeeping syntax? We ran into an issue
where a lot of savepoints had been created by customers (via their apps),
and it takes extra (hacky) effort to clean them up.

RELEASE SAVEPOINT ALL
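
With only a per-path statement (whether it ends up being RELEASE SAVEPOINT
or DROP SAVEPOINT), the cleanup needs one statement per savepoint, e.g.
(the paths below are made up):

RELEASE SAVEPOINT '/flink/savepoints/savepoint-0013f2-52da6a8e3c0b';
RELEASE SAVEPOINT '/flink/savepoints/savepoint-0013f2-8f1c2d3e4a5b';

whereas RELEASE SAVEPOINT ALL would collapse the housekeeping into a single
statement.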

Best regards,
Jing

On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Paul,
>
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
> imply that this will actually show the query, but we're returning IDs of
> the running application. At first I was also not very much in favour of
> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>
> Best regards,
>
> Martijn
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>
> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
>
> > Hi Godfrey,
> >
> > Sorry for the late reply, I was on vacation.
> >
> > It looks like we have a variety of preferences on the syntax, how about
> > we choose the most acceptable one?
> >
> > WRT keyword for SQL jobs, we use JOBS, thus the statements related to
> > jobs would be:
> >
> > - SHOW JOBS
> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> > `table.job.stop-with-drain`)
> >
> > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
> > JOB`:
> >
> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
> > manager remembers)
> > - DROP SAVEPOINT <savepoint_path>
> >
> > cc @Jark @ShengKai @Martijn @Timo .
> >
> > Best,
> > Paul Lam
> >
> >
> > godfrey he <godfre...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
> >
> >> Hi Paul,
> >>
> >> Thanks for the update.
> >>
> >> > 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> >> > (DataStream or SQL) or clients (SQL client or CLI).
> >>
> >> Is a DataStream job a QUERY? I think not.
> >> For a QUERY, the most important concept is the statement. But the
> >> result does not contain this info.
> >> If we need to contain all jobs in the cluster, I think the name should
> >> be JOB or PIPELINE.
> >> I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
> >>
> >> > SHOW SAVEPOINTS
> >> To list the savepoint for a specific job, we need to specify a
> >> specific pipeline,
> >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
> >>
> >> Best,
> >> Godfrey
> >>
> >> Paul Lam <paullin3...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
> >> >
> >> > Hi Jark,
> >> >
> >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
> >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
> >> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
> >> >
> >> > Another question is, what should be the syntax for ungracefully
> >> > canceling a query? As ShengKai pointed out in an offline discussion,
> >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
> >> > Flink CLI has both stop and cancel, mostly due to historical problems.
> >> >
> >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
> >> > that savepoints are owned by users and beyond the lifecycle of a Flink
> >> > cluster. For example, a user might take a savepoint at a custom path
> >> > that’s different than the default savepoint path, I think jobmanager
> >> > would not remember that, not to mention the jobmanager may be a fresh new
> >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
> >> > probably a best-effort one.
> >> >
> >> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
> >> > Savepoints are aliases for nested transactions in the DB area [1], and
> >> > there are, correspondingly, global transactions. If we consider Flink jobs as
> >> > global transactions and Flink checkpoints as nested transactions,
> >> > then the savepoint semantics are close, thus I think savepoint syntax
> >> > in the SQL standard could be considered. But again, I don’t have a very
> >> > strong preference.
> >> >
> >> > Ping @Timo to get more inputs.
> >> >
> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
> >> >
> >> > Best,
> >> > Paul Lam
> >> >
> >> > > On May 18, 2022, at 17:48, Jark Wu <imj...@gmail.com> wrote:
> >> > >
> >> > > Hi Paul,
> >> > >
> >> > > 1) SHOW QUERIES
> >> > > +1 to add finished time, but it would be better to call it
> >> > > "end_time" to keep aligned with names in Web UI.
> >> > >
> >> > > 2) DROP QUERY
> >> > > I think we shouldn't throw exceptions for batch jobs, otherwise, how
> >> > > to stop batch queries?
> >> > > At present, I don't think "DROP" is a suitable keyword for this
> >> > > statement. From the perspective of users, "DROP" sounds like the query
> >> > > should be removed from the list of "SHOW QUERIES". However, it doesn't.
> >> > > Maybe "STOP QUERY" is more suitable and compliant with commands of
> >> > > Flink CLI.
> >> > >
> >> > > 3) SHOW SAVEPOINTS
> >> > > I think this statement is needed, otherwise, savepoints are lost after
> >> > > the SAVEPOINT command is executed. Savepoints can be retrieved from
> >> > > REST API "/jobs/:jobid/checkpoints" with filtering
> >> > > "checkpoint_type"="savepoint". It's also worth considering providing
> >> > > "SHOW CHECKPOINTS" to list all checkpoints.
> >> > >
> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
> >> > > statements now.
> >> > > In other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
> >> > > both the same savepoint id.
> >> > > However, in our syntax, the first one is a query id, and the second one
> >> > > is a savepoint path, which is confusing and not consistent. When I came
> >> > > across SHOW SAVEPOINT, I thought maybe they should be in the same
> >> > > syntax set.
> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
> >> > > <sp_path>.
> >> > > That means we don't follow the majority of vendors in SAVEPOINT
> >> > > commands. I would say the purpose is different in Flink.
> >> > > What are others' opinions on this?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > > [1]:
> >> > > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> >> > >
> >> > >
> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
> >> > >
> >> > >> Hi Godfrey,
> >> > >>
> >> > >> Thanks a lot for your inputs!
> >> > >>
> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> >> > >> (DataStream or SQL) or clients (SQL client or CLI). Under the hood,
> >> > >> it's based on ClusterClient#listJobs, the same as the Flink CLI. I
> >> > >> think it's okay to have non-SQL jobs listed in the SQL client, because
> >> > >> these jobs can be managed via the SQL client too.
> >> > >>
> >> > >> WRT finished time, I think you're right. Adding it to the FLIP. But
> >> > >> I'm a bit afraid that the rows would be too long.
> >> > >>
> >> > >> WRT ‘DROP QUERY’,
> >> > >>> What's the behavior for batch jobs and the non-running jobs?
> >> > >>
> >> > >>
> >> > >> In general, the behavior would be aligned with Flink CLI. Triggering
> >> > >> a savepoint for a non-running job would cause errors, and the error
> >> > >> message would be printed to the SQL client. Triggering a savepoint for
> >> > >> batch (unbounded) jobs in streaming execution mode would be the same
> >> > >> with streaming jobs. However, for batch jobs in batch execution mode,
> >> > >> I think there would be an error, because batch execution doesn't
> >> > >> support checkpoints currently (please correct me if I'm wrong).
> >> > >>
> >> > >> WRT 'SHOW SAVEPOINTS', I've thought about it, but Flink clusterClient/
> >> > >> jobClient doesn't have such functionality at the moment, neither does
> >> > >> the Flink CLI. Maybe we could make it a follow-up FLIP, which includes
> >> > >> the modifications to clusterClient/jobClient and Flink CLI. WDYT?
> >> > >>
> >> > >> Best,
> >> > >> Paul Lam
> >> > >>
> >> > >>> On May 17, 2022, at 20:34, godfrey he <godfre...@gmail.com> wrote:
> >> > >>>
> >> > >>> Godfrey
> >> > >>
> >> > >>
> >> >
> >>
> >
>
