Re: [DISCUSS] FLIP-280: Introduce a new explain mode to provide SQL advice

Jane Chan Fri, 06 Jan 2023 02:48:57 -0800

Hi, devs,

Thanks for all the feedback.


Based on the discussion[1], we seem to have a consensus so far, so I would
like to start a vote on FLIP-280[2], which begins on the following Monday
(Jan 9th at 10:00 AM GMT).

If you have any questions or doubts, please do not hesitate to follow up on
this discussion.

Best,
Jane

[1] https://lists.apache.org/thread/5xywxv7g43byoh0jbx1b6qo6gx6wjkcz
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-280%3A+Introduce+EXPLAIN+PLAN_ADVICE+to+provide+SQL+advice

On Fri, Jan 6, 2023 at 4:27 PM Jingsong Li <[email protected]> wrote:

> Thanks Jane for your feedback.
>
> `EXPLAIN PLAN_ADVICE` looks good to me.
>
> Best,
> Jingsong
>
> On Thu, Jan 5, 2023 at 5:20 PM Jane Chan <[email protected]> wrote:
> >
> > Hi, devs,
> >
> > After discussing with Godfrey <[email protected]>, Lincoln
> > <[email protected]>, and Jark <[email protected]>, I've updated the
> > FLIP document[1] and look forward to your opinions and suggestions.
> >
> > The highlight difference is listed as follows.
> >
> >    - *The proposed syntax changes from EXPLAIN ANALYZED_PHYSICAL_PLAN
> >    <query> to EXPLAIN PLAN_ADVICE <query>*.
> >       - The reason for changing the syntax is that the output format and
> >       analyzed target are two orthogonal concepts and better be
> > decoupled. On the
> >       other hand, users may care about the advice content instead of
> which plan
> >       is analyzed, and thus PHYSICAL should be kept from users.
> >
> >
> >    - *The output format changes from JSON to current tree-style text.
> >    Introduce ExplainFormat to classify the output format.*
> >       - The current output format is a mixture of plain text (AST,
> >       Optimized Physical Plan, and Optimized Execution Plan) and JSON
> (Physical
> >       Execution Plan,  via EXPLAIN JSON_EXECUTION_PLAN ), which is not
> > structured
> >       and categorized. By introducing ExplainFormat, we can better
> classify the
> >       output format and have more flexibility to extend more formats in
> the
> >       future.
> >
> >
> >    - *The PlanAnalyzer installation gets rid of SPI.*
> >       - PlanAnalyzer should be an internal interface and not be exposed
> to
> >       users. Therefore, the Factory mechanism is unsuitable for this.
> >
> >
> > To Godfrey <[email protected]>, Jingsong <[email protected]>, and
> > Shengkai <[email protected]>, Thanks for your comments and questions.
> >
> > @Jingsong
> >
> > > Can you give examples of other systems for the syntax?
> > > In other systems, is EXPLAIN ANALYZE already PHYSICAL_PLAN?
> > >
> >
> > For other systems like MySQL[2], PostgreSQL[3], Presto[4], and TiDB[5]
> >
> > EXPLAIN ANALYZE <query>
> > is the mainstream syntax.
> >
> > However, it represents an actual measurement of the cost, i.e., the
> > statement will execute the statement, which is unsuitable for this
> > condition.
> >
> >
> > `EXPLAIN ANALYZED_PHYSICAL_PLAN <query>` looks a bit strange, and even
> > > stranger that it contains `advice`. The purpose of FLIP seems to be a
> bit
> > > more to `advice`, so can we just
> > > introduce a syntax for `advice`?
> >
> >
> > Good point. After several discussions, the syntax has been updated to
> >
> > EXPLAIN PLAN_ADVICE <query>
> >
> > @Godfrey
> >
> > Do we really need to expose `PlanAnalyzerFactory` as public interface?
> > > I prefer we only expose ExplainDetail#ANALYZED_PHYSICAL_PLAN and the
> > > analyzed result.
> > > Which is enough for users and consistent with the results of `explain`
> > > method.The classes about plan analyzer are in table planner module,
> which
> > > does not public api (public interfaces should be defined in
> > > flink-table-api-java module). And PlanAnalyzer is depend on RelNode,
> which
> > > is internal class of planner, and not expose to users.
> >
> >
> > Good point. After reconsideration, the SPI mechanism is removed from the
> > FLIP. PlanAnalyzer should be an internal implementation much similar to a
> > RelOptRule, and should not be exposed to users.
> >
> > @Shengkai
> >
> > > 1. `PlanAnalyzer#analyze` uses the FlinkRelNode as the input. Could you
> > > share some thoughts about the motivation? In my experience, users
> mainly
> > > care about 2 things when they develop their job:
> >
> > a. Why their SQL can not work? For example, their streaming SQL contains
> an
> > > OVER window but their ORDER key is not ROWTIME. In this case, we may
> don't
> > > have a physical node or logical node because, during the optimization,
> the
> > > planner has already thrown the exception.
> > >
> >
> >  The prerequisite for providing advice is that the optimized physical can
> > be generated. The planner should throw exceptions if the query contains
> > syntax errors or other problems.
> >
> >
> >
> > > b. Many users care about whether their state is compatible after
> upgrading
> > > their Flink version. In this case, I think the old execplan and the SQL
> > > statement are the user's input.
> >
> >
> > Good point. State compatibility detection is beneficial, but it better be
> > decoupled with EXPLAIN PLAN_ADVICE. We could provide a separate mechanism
> > for cross-version validation.
> >
> >
> > 2. I am just curious how other people add the rules to the Advisor. When
> > > rules increases, all these rules should be added to the Flink codebase?
> >
> >
> > It is much similar to adding a RelOptRule to RuleSet. The number of
> > analyzers will not be growing too fast. So adding them to the Flink
> > codebase may not be a concern.
> >
> >
> > 3. How do users configure another advisor?
> >
> >
> >  After reconsideration, I would like to let PlanAdvisor be an internal
> > interface, which is different from implementing a custom
> connector/format.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-280%3A+Introduce+EXPLAIN+PLAN_ADVICE+to+provide+SQL+advice
> > [2] https://dev.mysql.com/doc/refman/8.0/en/explain.html#explain-analyze
> > [3] https://www.postgresql.org/docs/current/sql-explain.html
> > [4] https://prestodb.io/docs/current/sql/explain-analyze.html
> > [5] https://docs.pingcap.com/tidb/dev/sql-statement-explain-analyze
> >
> > Best regards,
> > Jane
> >
> > On Tue, Jan 3, 2023 at 6:20 PM Jingsong Li <[email protected]>
> wrote:
> >
> > > Thanks Jane for the FLIP! It looks very nice!
> > >
> > > Can you give examples of other systems for the syntax?
> > > In other systems, is EXPLAIN ANALYZE already PHYSICAL_PLAN?
> > >
> > > `EXPLAIN ANALYZED_PHYSICAL_PLAN <query>` looks a bit strange, and even
> > > stranger that it contains `advice`.
> > >
> > > The purpose of FLIP seems to be a bit more to `advice`, so can we just
> > > introduce a syntax for `advice`?
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Tue, Jan 3, 2023 at 3:40 PM godfrey he <[email protected]> wrote:
> > > >
> > > > Thanks for driving this discussion.
> > > >
> > > > Do we really need to expose `PlanAnalyzerFactory` as public
> interface?
> > > > I prefer we only expose ExplainDetail#ANALYZED_PHYSICAL_PLAN and the
> > > > analyzed result.
> > > > Which is enough for users and consistent with the results of
> `explain`
> > > method.
> > > >
> > > > The classes about plan analyzer are in table planner module, which
> > > > does not public api
> > > > (public interfaces should be defined in flink-table-api-java module).
> > > > And PlanAnalyzer is depend on RelNode, which is internal class of
> > > > planner, and not expose to users.
> > > >
> > > > Bests,
> > > > Godfrey
> > > >
> > > >
> > > > Shengkai Fang <[email protected]> 于2023年1月3日周二 13:43写道：
> > > > >
> > > > > Sorry for the missing answer about the configuration of the
> Analyzer.
> > > Users
> > > > > may don't need to configure this with SQL statements. In the SQL
> > > Gateway,
> > > > > users can configure the endpoints with the option
> > > `sql-gateway.endpoint.type`
> > > > > in the flink-conf.
> > > > >
> > > > > Best,
> > > > > Shengkai
> > > > >
> > > > > Shengkai Fang <[email protected]> 于2023年1月3日周二 12:26写道：
> > > > >
> > > > > > Hi, Jane.
> > > > > >
> > > > > > Thanks for bringing this to the discussion. I have some questions
> > > about
> > > > > > the FLIP:
> > > > > >
> > > > > > 1. `PlanAnalyzer#analyze` uses the FlinkRelNode as the input.
> Could
> > > you
> > > > > > share some thoughts about the motivation? In my experience, users
> > > mainly
> > > > > > care about 2 things when they develop their job:
> > > > > >
> > > > > > a. Why their SQL can not work? For example, their streaming SQL
> > > contains
> > > > > > an OVER window but their ORDER key is not ROWTIME. In this case,
> we
> > > may
> > > > > > don't have a physical node or logical node because, during the
> > > > > > optimization, the planner has already thrown the exception.
> > > > > >
> > > > > > b. Many users care about whether their state is compatible after
> > > upgrading
> > > > > > their Flink version. In this case, I think the old execplan and
> the
> > > SQL
> > > > > > statement are the user's input.
> > > > > >
> > > > > > So, I think we should introduce methods like
> > > `PlanAnalyzer#analyze(String
> > > > > > sql)` and `PlanAnalyzer#analyze(String sql, ExecnodeGraph)` here.
> > > > > >
> > > > > > 2. I am just curious how other people add the rules to the
> Advisor.
> > > When
> > > > > > rules increases, all these rules should be added to the Flink
> > > codebase?
> > > > > > 3. How do users configure another advisor?
> > > > > >
> > > > > > Best,
> > > > > > Shengkai
> > > > > >
> > > > > >
> > > > > >
> > > > > > Jane Chan <[email protected]> 于2022年12月28日周三 12:30写道：
> > > > > >
> > > > > >> Hi @yuxia, Thank you for reviewing the FLIP and raising
> questions.
> > > > > >>
> > > > > >> 1: Is the PlanAnalyzerFactory also expected to be implemented by
> > > users
> > > > > >> just
> > > > > >> > like DynamicTableSourceFactory or other factories? If so, I
> > > notice that
> > > > > >> in
> > > > > >> > the code of PlanAnalyzerManager#registerAnalyzers, the code
> is as
> > > > > >> follows:
> > > > > >> > FactoryUtil.discoverFactory(classLoader,
> > > PlanAnalyzerFactory.class,
> > > > > >> > StreamPlanAnalyzerFactory.STREAM_IDENTIFIER)); IIUC, it'll
> always
> > > find
> > > > > >> the
> > > > > >> > factory with the name
> > > StreamPlanAnalyzerFactory.STREAM_IDENTIFIER; Is
> > > > > >> it a
> > > > > >> > typo or by design ?
> > > > > >>
> > > > > >>
> > > > > >> This is a really good open question. For the short answer, yes,
> it
> > > is by
> > > > > >> design. I'll explain the consideration in more detail.
> > > > > >>
> > > > > >> The standard procedure to create a custom table source/sink is
> to
> > > > > >> implement
> > > > > >> the factory and the source/sink class. There is a strong 1v1
> > > relationship
> > > > > >> between the factory and the source/sink.
> > > > > >>
> > > > > >> SQL
> > > > > >>
> > > > > >> DynamicTableSourceFactory
> > > > > >>
> > > > > >> Source
> > > > > >>
> > > > > >> create table … with (‘connector’ = ‘foo’)
> > > > > >>
> > > > > >> #factoryIdentifer.equals(“foo”)
> > > > > >>
> > > > > >> FooTableSource
> > > > > >>
> > > > > >>
> > > > > >> *Apart from that, the custom function module is another kind of
> > > > > >> implementation. The factory creates a collection of functions.
> This
> > > is a
> > > > > >> 1vN relationship between the factory and the functions.*
> > > > > >>
> > > > > >> SQL
> > > > > >>
> > > > > >> ModuleFactory
> > > > > >>
> > > > > >> Function
> > > > > >>
> > > > > >> load module ‘bar’
> > > > > >>
> > > > > >> #factoryIdentifier.equals(“bar”)
> > > > > >>
> > > > > >> A collection of functions
> > > > > >>
> > > > > >> Back to the plan analyzers, if we choose the first style, we
> also
> > > need to
> > > > > >> expose a new SQL syntax to users, like "CREATE ANALYZER foo WITH
> > > ..." to
> > > > > >> specify the factory identifier. But I think it is too heavy
> because
> > > an
> > > > > >> analyzer is an auxiliary tool to help users write better
> queries,
> > > and thus
> > > > > >> it should be exposed at the API level other than the user syntax
> > > level.
> > > > > >>
> > > > > >> As a result, I propose to follow the second style. Then we don't
> > > need to
> > > > > >> introduce new syntax to create analyzers. Let
> > > StreamPlanAnalyzerFactory be
> > > > > >> the default factory to create analyzers under the streaming
> mode,
> > > and the
> > > > > >> custom analyzers will register themselves in
> > > StreamPlanAnalyzerFactory.
> > > > > >>
> > > > > >> @Override
> > > > > >> public List<PlanAnalyzer> createAnalyzers() {
> > > > > >>     return Arrays.asList(
> > > > > >>             FooAnalyzer.INSTANCE,
> > > > > >>             BarAnalyzer.INSTANCE,
> > > > > >>             ...);
> > > > > >> }
> > > > > >>
> > > > > >>
> > > > > >> 2: Is there any special reason make PlanAdvice be a final class?
> > > Would it
> > > > > >> > be better to make it an interface and we provide a default
> > > > > >> implementation?
> > > > > >> > My concern is some users may want have their own
> implementation
> > > for
> > > > > >> > PlanAdvice. But it may be overthinking. If you think it won't
> > > bring any
> > > > > >> > problem, I'm also fine with that.
> > > > > >>
> > > > > >>
> > > > > >> The reason why making PlanAdvice final is that I think users
> would
> > > prefer
> > > > > >> to implement the custom PlanAnalyzer than PlanAdvice.
> PlanAdvice is
> > > a POJO
> > > > > >> class to represent the analyzed result provided by PlanAnalyzer.
> > > > > >>
> > > > > >>
> > > > > >> 3: Is there a way only show advice? For me, it seems the advice
> > > will be
> > > > > >> > more useful and the nodes may contains to many details.
> > > > > >>
> > > > > >>
> > > > > >> The result contains two parts: the optimized physical plan
> itself +
> > > the
> > > > > >> analysis of the plan.
> > > > > >>
> > > > > >> For PlanAdvice with the scope as GLOBAL, it is possible to do
> so.
> > > While
> > > > > >> for
> > > > > >> a LOCAL scope, the advice content is specific to certain nodes
> > > (E.g., some
> > > > > >> certain rel nodes are sensitive to state TTL configuration). In
> this
> > > > > >> situation, the plan cannot be omitted. On the other hand, the
> plan
> > > is
> > > > > >> necessary from the visualization perspective. During the PoC
> phase,
> > > I made
> > > > > >> some attempts to adapt the Flink Visualizer to illustrate the
> > > analyzed
> > > > > >> plan, and it looks like the following pic. I think this is
> > > intuitive to
> > > > > >> help users understand their queries and what they can do
> according
> > > to the
> > > > > >> advice.
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> 4: I'm curious about what't the global advice will look like.
> Is it
> > > > > >> > possible to provide an example?
> > > > > >>
> > > > > >>
> > > > > >> Here is an example to illustrate the non-deterministic update
> issue.
> > > > > >>
> > > > > >> create temporary table cdc_with_meta (
> > > > > >>   a int,
> > > > > >>   b bigint,
> > > > > >>   c string,
> > > > > >>   d boolean,
> > > > > >>   metadata_1 int metadata,
> > > > > >>   metadata_2 string metadata,
> > > > > >>   metadata_3 bigint metadata,
> > > > > >>   primary key (a) not enforced
> > > > > >> ) with (
> > > > > >>   'connector' = 'values',
> > > > > >>   'changelog-mode' = 'I,UA,UB,D',
> > > > > >>   'readable-metadata' = 'metadata_1:INT, metadata_2:STRING,
> > > > > >> metadata_3:BIGINT'
> > > > > >> );
> > > > > >>
> > > > > >> create temporary table sink_without_pk (
> > > > > >>   a int,
> > > > > >>   b bigint,
> > > > > >>   c string
> > > > > >> ) with (
> > > > > >>   'connector' = 'values',
> > > > > >>   'sink-insert-only' = 'false'
> > > > > >> );
> > > > > >>
> > > > > >> insert into sink_without_pk
> > > > > >> select a, metadata_3, c
> > > > > >> from cdc_with_meta;
> > > > > >>
> > > > > >> And with compilation as SCHEMA, the result is as below.
> > > > > >>
> > > > > >> {
> > > > > >>   "nodes" : [ {
> > > > > >>     "id" : 1,
> > > > > >>     "type" : "StreamPhysicalTableSourceScan",
> > > > > >>     "digest" : "TableSourceScan(table=[[default_catalog,
> > > default_database,
> > > > > >> cdc_with_meta, project=[a, c], metadata=[metadata_3]]],
> fields=[a,
> > > c,
> > > > > >> metadata_3], upsertKeys=[[a]])",
> > > > > >>     "changelog_mode" : "I,UB,UA,D"
> > > > > >>   }, {
> > > > > >>     "id" : 2,
> > > > > >>     "type" : "StreamPhysicalCalc",
> > > > > >>     "digest" : "Calc(select=[a, metadata_3, c],
> upsertKeys=[[a]])",
> > > > > >>     "changelog_mode" : "I,UB,UA,D",
> > > > > >>     "predecessors" : [ {
> > > > > >>       "id" : 1,
> > > > > >>       "distribution" : "ANY",
> > > > > >>       "changelog_mode" : "I,UB,UA,D"
> > > > > >>     } ]
> > > > > >>   }, {
> > > > > >>     "id" : 3,
> > > > > >>     "type" : "StreamPhysicalSink",
> > > > > >>     "digest" :
> > > > > >> "Sink(table=[default_catalog.default_database.sink_without_pk],
> > > > > >> fields=[a, metadata_3, c])",
> > > > > >>     "changelog_mode" : "NONE",
> > > > > >>     "predecessors" : [ {
> > > > > >>       "id" : 2,
> > > > > >>       "distribution" : "ANY",
> > > > > >>       "changelog_mode" : "I,UB,UA,D"
> > > > > >>     } ]
> > > > > >>   } ],
> > > > > >>   "advice" : [ {
> > > > > >>     "kind" : "WARNING",
> > > > > >>     "scope" : "GLOBAL",
> > > > > >>     "content" : "The metadata column(s): 'metadata_3' in cdc
> source
> > > may
> > > > > >> cause wrong result or error on downstream operators, please
> consider
> > > > > >> removing these columns or use a non-cdc source that only has
> insert
> > > > > >> messages.\nsource
> node:\nTableSourceScan(table=[[default_catalog,
> > > > > >> default_database, cdc_with_meta, project=[a, c],
> > > metadata=[metadata_3]]],
> > > > > >> fields=[a, c, metadata_3], changelogMode=[I,UB,UA,D],
> > > upsertKeys=[[a]])\n"
> > > > > >>   } ]
> > > > > >> }
> > > > > >>
> > > > > >>
> > > > > >> Best regards,
> > > > > >> Jane Chan
> > > > > >>
> > > > > >> On Tue, Dec 27, 2022 at 8:06 PM yuxia <
> [email protected]>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Thanks for driving this FLIP. It should be a good improvement
> to
> > > users.
> > > > > >> > But I have few questions:
> > > > > >> > 1: Is the PlanAnalyzerFactory also expected to be implemented
> by
> > > users
> > > > > >> > just like DynamicTableSourceFactory or other factories? If
> so, I
> > > notice
> > > > > >> > that in the code of PlanAnalyzerManager#registerAnalyzers, the
> > > code is
> > > > > >> as
> > > > > >> > follows:
> > > > > >> > FactoryUtil.discoverFactory(classLoader,
> > > PlanAnalyzerFactory.class,
> > > > > >> > StreamPlanAnalyzerFactory.STREAM_IDENTIFIER));
> > > > > >> >
> > > > > >> > IIUC, it'll always find the factory with the name
> > > > > >> > StreamPlanAnalyzerFactory.STREAM_IDENTIFIER; Is it a typo or
> by
> > > design ?
> > > > > >> >
> > > > > >> > 2: Is there any special reason make PlanAdvice be a final
> class?
> > > Would
> > > > > >> it
> > > > > >> > be better to make it an interface and we provide a default
> > > > > >> implementation?
> > > > > >> > My concern is some users may want have their own
> implementation
> > > for
> > > > > >> > PlanAdvice. But it may be overthinking. If you think it won't
> > > bring any
> > > > > >> > problem, I'm also fine with that.
> > > > > >> >
> > > > > >> > 3: Is there a way only show advice? For me, it seems the
> advice
> > > will be
> > > > > >> > more useful and the nodes may contains to many details.
> > > > > >> >
> > > > > >> > 4: I'm curious about what't the global advice will look like.
> Is
> > > it
> > > > > >> > possible to provide an example?
> > > > > >> >
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> >
> > > > > >> > Best regards,
> > > > > >> > Yuxia
> > > > > >> >
> > > > > >> > ----- 原始邮件 -----
> > > > > >> > 发件人: "Jane Chan" <[email protected]>
> > > > > >> > 收件人: "dev" <[email protected]>
> > > > > >> > 发送时间: 星期一, 2022年 12 月 26日 下午 9:39:18
> > > > > >> > 主题: [DISCUSS] FLIP-280: Introduce a new explain mode to
> provide
> > > SQL
> > > > > >> advice
> > > > > >> >
> > > > > >> > Hi, devs,
> > > > > >> >
> > > > > >> > I would like to start a discussion on FLIP-280: Introduce a
> new
> > > explain
> > > > > >> > mode to provide SQL advice[1].
> > > > > >> >
> > > > > >> > Currently, Flink SQL EXPLAIN statement provides three details
> to
> > > display
> > > > > >> > the plan. However, there is a considerable gap between the
> current
> > > > > >> > explained result and the actual, applicable actions for users
> to
> > > improve
> > > > > >> > their queries.
> > > > > >> >
> > > > > >> > To provide more understandable, actionable advice closer to
> the
> > > user's
> > > > > >> > perspective, we propose a new explain mode that analyzes the
> > > physical
> > > > > >> plan
> > > > > >> > and attaches available tuning advice and data correctness
> > > warnings.
> > > > > >> >
> > > > > >> > EXPLAIN ANALYZED_PHYSICAL_PLAN <query>
> > > > > >> >
> > > > > >> > I've included more details in [1], and I look forward to your
> > > feedback.
> > > > > >> >
> > > > > >> > [1]
> > > > > >> >
> > > > > >> >
> > > > > >>
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-280%3A+Support+EXPLAIN+SQL+statements+with+advice
> > > > > >> > [2] POC: https://github.com/LadyForest/flink/tree/FLIP-280
> > > > > >> >
> > > > > >> > Best regards,
> > > > > >> > Jane Chan
> > > > > >> >
> > > > > >>
> > > > > >
> > >
>

Re: [DISCUSS] FLIP-280: Introduce a new explain mode to provide SQL advice

Reply via email to