[
https://issues.apache.org/jira/browse/HIVE-29654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18087894#comment-18087894
]
Thomas Rebele commented on HIVE-29654:
--------------------------------------
Extending {{hive.log.explain.output}} sounds like the easiest way. The only
drawback that I can think of: if a tool reads the log files and depends on the
particular format, it might break. I think that's an acceptable risk.
> Provide an easy way to execute a query and get the CBO plan at the same time
> ----------------------------------------------------------------------------
>
> Key: HIVE-29654
> URL: https://issues.apache.org/jira/browse/HIVE-29654
> Project: Hive
> Issue Type: New Feature
> Reporter: Thomas Rebele
> Priority: Major
>
> It would be nice if Hive provided an easy way to execute a query and get the
> CBO and stage plans of that execution. Currently this is easy to achieve for
> the stage plans, but not for the CBO plan (see background section below).
> Some approaches:
> * output CBO plans if {{hive.log.explain.output}} is enabled
> * add another config option (e.g., {{{}hive.log.cbo.plan{}}}) to output the
> CBO plans when the query is executed
> * add a new command EXPLAIN EXECUTE that prints the query plans (CBO and
> stage plans) and then executes these plans
> ----
> Background:
> I've looked into the results of a TPC-DS benchmark. I wanted to understand
> why the performance of a query has changed. I had the execution time of the
> query, some logs, and some EXPLAIN plans. Unfortunately the EXPLAIN plans
> were not helpful, as the plans were not the same (different operator ids).
> I did an experiment with TPC-DS metadata and KLL histograms enabled. I
> executed EXPLAIN CBO JOINCOST several times in the same session. For 8 runs,
> I got 5 different plans.
> I had a chat with an AI tool, and it suggested some options to get the plan.
> After a while I got it working:
> * stage plans: {{SET hive.server2.logging.operation.level = VERBOSE; SET
> hive.log.explain.output=true;}}
> * CBO plan: enable DEBUG logging for
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}} in
> {{{}hive-log4j2.properties{}}}. Also executing {{SET
> hive.server2.logging.operation.level = VERBOSE;}} helps, as the plans will
> then show up in beeline.
> It would be nice to make this a bit simpler. It might be the case that the
> user cannot easily change the log4j2 properties. Also, setting the DEBUG
> level globally just to get the plan for one query looks like a lot of
> overhead, if other users are running queries at the same time.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)