Hi all,

Regarding `PIPELINE`, it comes from the flink-core module; see the `PipelineOptions` class for more details. `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with `JOBS`.
+1 to discussing JOBTREE in another FLIP.
+1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`
+1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`

Best,
Godfrey

Jing Ge <j...@ververica.com> wrote on Thu, Jun 9, 2022 at 01:48:
>
> Hi Paul, Hi Jark,
>
> Re JOBTREE: agreed that it is out of the scope of this FLIP.
>
> Re `RELEASE SAVEPOINT ALL`: if the community prefers `DROP`, then `DROP SAVEPOINT ALL` for housekeeping. WDYT?
>
> Best regards,
> Jing
>
>
> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <imj...@gmail.com> wrote:
>>
>> Hi Jing,
>>
>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of
>> the scope of this FLIP and can be discussed in another FLIP.
>>
>> Job lineage is a big topic that may involve many problems:
>> 1) how to collect and report job entities, attributes, and lineages?
>> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>> 3) how does Flink SQL CLI/Gateway know the lineage information and show the jobtree?
>> 4) ...
>>
>> Best,
>> Jark
>>
>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <imj...@gmail.com> wrote:
>>>
>>> Hi Paul,
>>>
>>> I'm fine with using JOBS. The only concern is that this may conflict
>>> with displaying more detailed information for queries (e.g. query
>>> content, plan) in the future, e.g. SHOW QUERIES EXTENDED in ksqlDB [1].
>>> This is not a big problem, as we can introduce SHOW QUERIES in the
>>> future if necessary.
>>>
>>> > STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
>>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]?
>>> It might be trivial and error-prone to set configuration before
>>> executing a statement, and the configuration will affect all statements
>>> after that.
>>>
>>> > CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>>> and always use the configuration "state.savepoints.dir" as the default
>>> savepoint dir.
>>> The concern with using "<savepoint_path>" is that it should really be
>>> the savepoint dir, while savepoint_path is the returned value.
>>>
>>> I'm fine with the other changes.
>>>
>>> Thanks,
>>> Jark
>>>
>>> [1]:
>>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>>
>>>
>>>
>>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <paullin3...@gmail.com> wrote:
>>>>
>>>> Hi Jing,
>>>>
>>>> Thank you for your inputs!
>>>>
>>>> TBH, I haven't considered the ETL scenario that you mentioned. I think
>>>> they're managed just like other jobs in terms of job lifecycles (please
>>>> correct me if I'm wrong).
>>>>
>>>> WRT the SQL statements about SQL lineage, I think it might be a little
>>>> bit out of the scope of this FLIP, since it's mainly about lifecycles. By
>>>> the way, do we have these functionalities in the Flink CLI or REST API already?
>>>>
>>>> WRT `RELEASE SAVEPOINT ALL`, I'm sorry for the outdated FLIP docs; the
>>>> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I'm
>>>> updating the FLIP according to the latest discussions.
>>>>
>>>> Best,
>>>> Paul Lam
>>>>
>>>> On Wed, Jun 8, 2022 at 07:31, Jing Ge <j...@ververica.com> wrote:
>>>>
>>>> Hi Paul,
>>>>
>>>> Sorry that I am a little bit too late to join this thread. Thanks for
>>>> driving this and starting this informative discussion. The FLIP looks
>>>> really interesting. It will help us a lot to manage Flink SQL jobs.
>>>>
>>>> Have you considered the ETL scenario with Flink SQL, where multiple SQL
>>>> statements build a DAG, or even many DAGs?
>>>>
>>>> 1)
>>>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how
>>>> to support ETL jobs. Briefly speaking, the SQL statements used to build
>>>> the DAG are responsible for *producing* data as the result (cube,
>>>> materialized view, etc.) for future consumption by queries. The
>>>> INSERT INTO SELECT FROM example in the FLIP and CTAS are typical SQL
>>>> statements in this case. I would prefer to call them Jobs instead of
>>>> Queries.
>>>>
>>>> 2)
>>>> Speaking of the ETL DAG, we might want to see the lineage. Is it
>>>> possible to support syntax like:
>>>>
>>>> SHOW JOBTREE <job_id> // shows the downstream DAG from the given job_id
>>>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
>>>> SHOW JOBTREES // shows all DAGs
>>>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>>>
>>>> 3)
>>>> Could we also support savepoint housekeeping syntax? We ran into the
>>>> issue that a lot of savepoints had been created by customers (via their
>>>> apps). It takes extra (hacking) effort to clean them up.
>>>>
>>>> RELEASE SAVEPOINT ALL
>>>>
>>>> Best regards,
>>>> Jing
>>>>
>>>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvis...@apache.org> wrote:
>>>>>
>>>>> Hi Paul,
>>>>>
>>>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>>>>> could imply that this will actually show the query, but we're returning
>>>>> IDs of the running applications. At first I was also not very much in
>>>>> favour of SHOW JOBS, since I prefer calling it 'Flink applications' and
>>>>> not 'Flink jobs', but the glossary [1] made me reconsider. I would
>>>>> +1 SHOW/STOP JOBS.
>>>>>
>>>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Martijn
>>>>>
>>>>> [1]
>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>>>
>>>>> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3...@gmail.com> wrote:
>>>>>
>>>>> > Hi Godfrey,
>>>>> >
>>>>> > Sorry for the late reply, I was on vacation.
>>>>> >
>>>>> > It looks like we have a variety of preferences on the syntax; how
>>>>> > about we choose the most acceptable one?
>>>>> >
>>>>> > WRT the keyword for SQL jobs, we use JOBS, thus the statements
>>>>> > related to jobs would be:
>>>>> >
>>>>> > - SHOW JOBS
>>>>> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>>>> > `table.job.stop-with-drain`)
>>>>> >
>>>>> > WRT savepoints for SQL jobs, we use the `CREATE/DROP` pattern with `FOR JOB`:
>>>>> >
>>>>> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>>> > - SHOW SAVEPOINTS FOR JOB <job_id> (show the savepoints the current
>>>>> > job manager remembers)
>>>>> > - DROP SAVEPOINT <savepoint_path>
>>>>> >
>>>>> > cc @Jark @ShengKai @Martijn @Timo.
>>>>> >
>>>>> > Best,
>>>>> > Paul Lam
>>>>> >
>>>>> >
>>>>> > godfrey he <godfre...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
>>>>> >
>>>>> >> Hi Paul,
>>>>> >>
>>>>> >> Thanks for the update.
>>>>> >>
>>>>> >> > 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>> >> (DataStream or SQL) or clients (SQL client or CLI).
>>>>> >>
>>>>> >> Is a DataStream job a QUERY? I think not.
>>>>> >> For a QUERY, the most important concept is the statement, but the
>>>>> >> result does not contain this info.
>>>>> >> If we need to cover all jobs in the cluster, I think the name should
>>>>> >> be JOB or PIPELINE.
>>>>> >> I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
>>>>> >>
>>>>> >> > SHOW SAVEPOINTS
>>>>> >> To list the savepoints for a specific job, we need to specify a
>>>>> >> specific pipeline; the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>.
>>>>> >>
>>>>> >> Best,
>>>>> >> Godfrey
>>>>> >>
>>>>> >> Paul Lam <paullin3...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
>>>>> >> >
>>>>> >> > Hi Jark,
>>>>> >> >
>>>>> >> > WRT "DROP QUERY", I agree that it's not very intuitive, and that's
>>>>> >> > part of the reason why I proposed "STOP/CANCEL QUERY" at the
>>>>> >> > beginning. The downside of it is that it's not ANSI-SQL compatible.
>>>>> >> >
>>>>> >> > Another question is, what should be the syntax for ungracefully
>>>>> >> > canceling a query? As ShengKai pointed out in an offline discussion,
>>>>> >> > "STOP QUERY" and "CANCEL QUERY" might confuse SQL users.
>>>>> >> > The Flink CLI has both stop and cancel, mostly due to historical
>>>>> >> > reasons.
>>>>> >> >
>>>>> >> > WRT "SHOW SAVEPOINT", I agree it's a missing part. My concern is
>>>>> >> > that savepoints are owned by users and live beyond the lifecycle of
>>>>> >> > a Flink cluster. For example, a user might take a savepoint at a
>>>>> >> > custom path that's different from the default savepoint path; I
>>>>> >> > think the jobmanager would not remember that, not to mention that
>>>>> >> > the jobmanager may be a brand-new one after a cluster restart. Thus
>>>>> >> > if we support "SHOW SAVEPOINT", it would probably be best-effort.
>>>>> >> >
>>>>> >> > WRT the savepoint syntax, I'm thinking of the semantics of
>>>>> >> > savepoints. Savepoints are aliases for nested transactions in the
>>>>> >> > database area [1], and correspondingly there are global
>>>>> >> > transactions. If we consider Flink jobs as global transactions and
>>>>> >> > Flink checkpoints as nested transactions, then the savepoint
>>>>> >> > semantics are close, thus I think the SQL-standard savepoint syntax
>>>>> >> > could be considered. But again, I don't have a very strong
>>>>> >> > preference.
>>>>> >> >
>>>>> >> > Ping @Timo to get more inputs.
>>>>> >> >
>>>>> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
>>>>> >> >
>>>>> >> > Best,
>>>>> >> > Paul Lam
>>>>> >> >
>>>>> >> > > On Wed, May 18, 2022 at 17:48, Jark Wu <imj...@gmail.com> wrote:
>>>>> >> > >
>>>>> >> > > Hi Paul,
>>>>> >> > >
>>>>> >> > > 1) SHOW QUERIES
>>>>> >> > > +1 to add the finished time, but it would be better to call it
>>>>> >> > > "end_time" to keep it aligned with the names in the Web UI.
>>>>> >> > >
>>>>> >> > > 2) DROP QUERY
>>>>> >> > > I think we shouldn't throw exceptions for batch jobs; otherwise,
>>>>> >> > > how would we stop batch queries?
>>>>> >> > > At present, I don't think "DROP" is a suitable keyword for this
>>>>> >> > > statement. From the perspective of users, "DROP" sounds like the
>>>>> >> > > query should be removed from the list of "SHOW QUERIES".
>>>>> >> > > However, it isn't. Maybe "STOP QUERY" is more suitable and
>>>>> >> > > compliant with the commands of the Flink CLI.
>>>>> >> > >
>>>>> >> > > 3) SHOW SAVEPOINTS
>>>>> >> > > I think this statement is needed; otherwise, savepoints are lost
>>>>> >> > > after the SAVEPOINT command is executed. Savepoints can be
>>>>> >> > > retrieved from the REST API "/jobs/:jobid/checkpoints" [1]
>>>>> >> > > by filtering on "checkpoint_type"="savepoint". It's also worth
>>>>> >> > > considering providing "SHOW CHECKPOINTS" to list all checkpoints.
>>>>> >> > >
>>>>> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
>>>>> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>>>> >> > > statements now.
>>>>> >> > > In other vendors, the parameters of SAVEPOINT and RELEASE
>>>>> >> > > SAVEPOINT are both the same savepoint id.
>>>>> >> > > However, in our syntax, the first one is a query id, and the
>>>>> >> > > second one is a savepoint path, which is confusing and
>>>>> >> > > inconsistent. When I came across SHOW SAVEPOINT, I thought maybe
>>>>> >> > > they should be in the same syntax set.
>>>>> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
>>>>> >> > > SAVEPOINT <sp_path>.
>>>>> >> > > That means we wouldn't follow the majority of vendors in the
>>>>> >> > > SAVEPOINT commands. I would say the purpose is different in Flink.
>>>>> >> > > What are others' opinions on this?
>>>>> >> > >
>>>>> >> > > Best,
>>>>> >> > > Jark
>>>>> >> > >
>>>>> >> > > [1]:
>>>>> >> > > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>>>> >> > >
>>>>> >> > >
>>>>> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3...@gmail.com> wrote:
>>>>> >> > >
>>>>> >> > >> Hi Godfrey,
>>>>> >> > >>
>>>>> >> > >> Thanks a lot for your inputs!
>>>>> >> > >>
>>>>> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, with no limit on
>>>>> >> > >> APIs (DataStream or SQL) or clients (SQL client or CLI). Under
>>>>> >> > >> the hood, it's based on ClusterClient#listJobs, the same as the
>>>>> >> > >> Flink CLI. I think it's okay to have non-SQL jobs listed in the
>>>>> >> > >> SQL client, because these jobs can be managed via the SQL client
>>>>> >> > >> too.
>>>>> >> > >>
>>>>> >> > >> WRT the finished time, I think you're right. Adding it to the
>>>>> >> > >> FLIP. But I'm a bit afraid that the rows would be too long.
>>>>> >> > >>
>>>>> >> > >> WRT 'DROP QUERY',
>>>>> >> > >>> What's the behavior for batch jobs and the non-running jobs?
>>>>> >> > >>
>>>>> >> > >> In general, the behavior would be aligned with the Flink CLI.
>>>>> >> > >> Triggering a savepoint for a non-running job would cause errors,
>>>>> >> > >> and the error message would be printed to the SQL client.
>>>>> >> > >> Triggering a savepoint for batch (bounded) jobs in streaming
>>>>> >> > >> execution mode would be the same as for streaming jobs. However,
>>>>> >> > >> for batch jobs in batch execution mode, I think there would be
>>>>> >> > >> an error, because batch execution doesn't support checkpoints
>>>>> >> > >> currently (please correct me if I'm wrong).
>>>>> >> > >>
>>>>> >> > >> WRT 'SHOW SAVEPOINTS', I've thought about it, but Flink's
>>>>> >> > >> clusterClient/jobClient doesn't have such functionality at the
>>>>> >> > >> moment, and neither does the Flink CLI.
>>>>> >> > >> Maybe we could make it a follow-up FLIP, which includes the
>>>>> >> > >> modifications to clusterClient/jobClient and the Flink CLI. WDYT?
>>>>> >> > >>
>>>>> >> > >> Best,
>>>>> >> > >> Paul Lam
>>>>> >> > >>
>>>>> >> > >>> godfrey he <godfre...@gmail.com> wrote on Tue, May 17, 2022 at 20:34:
>>>>> >> > >>>
>>>>> >> > >>> Godfrey
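---

For reference, the statement set this thread appears to be converging on would read as follows. This is only a sketch of the proposed syntax as discussed above, not the final FLIP wording; the job id and savepoint path are placeholders.

```sql
-- List all jobs in the cluster (SQL and DataStream alike)
SHOW JOBS;

-- Stop a job, optionally taking a savepoint and draining in-flight data
STOP JOB '228d70913eab60dda85c5e7f78b5782c' WITH SAVEPOINT WITH DRAIN;

-- Trigger a savepoint for a job; the target directory defaults to the
-- configured state.savepoints.dir, so no path argument is needed
CREATE SAVEPOINT FOR JOB '228d70913eab60dda85c5e7f78b5782c';

-- List the savepoints the current jobmanager remembers for a job
SHOW SAVEPOINTS FOR JOB '228d70913eab60dda85c5e7f78b5782c';

-- Dispose of a savepoint by the path returned from CREATE SAVEPOINT
DROP SAVEPOINT '/tmp/savepoints/savepoint-228d70-599412d6cc2d';
```

Note the asymmetry Jark raises: CREATE SAVEPOINT takes a job id while DROP SAVEPOINT takes a savepoint path, which deliberately diverges from the vendor convention where both SAVEPOINT and RELEASE SAVEPOINT take the same savepoint id.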