Thomas Rebele created HIVE-29654:
------------------------------------
Summary: Provide an easy way to execute a query and get the CBO
plan at the same time
Key: HIVE-29654
URL: https://issues.apache.org/jira/browse/HIVE-29654
Project: Hive
Issue Type: New Feature
Reporter: Thomas Rebele
It would be nice if Hive provided an easy way to execute a query and get the
CBO and stage plans of that execution. Currently this is easy to achieve for
the stage plans, but not for the CBO plan (see background section below).
Some approaches:
* output CBO plans if {{hive.log.explain.output}} is enabled
* add another config option (e.g., {{{}hive.log.cbo.plan{}}}) to output the
CBO plans when the query is executed
* add a new command EXPLAIN EXECUTE that prints the query plans (CBO and stage
plans) and then executes these plans
----
Background:
I've looked into the results of a TPC-DS benchmark. I wanted to understand why
the performance of a query has changed. I had the execution time of the query,
some logs, and some EXPLAIN plans. Unfortunately the EXPLAIN plans were not
helpful, as the plans were not the same (different operator ids).
I did an experiment with TPC-DS metadata and KLL histograms enabled. I executed
EXPLAIN CBO JOINCOST several times in the same session. For 8 runs, I got 5
different plans.
I had a chat with an AI tool, and it suggested some options to get the plan.
After a while I got it working:
* stage plans: {{SET hive.server2.logging.operation.level = VERBOSE; SET
hive.log.explain.output=true;}}
* CBO plan: enable DEBUG logging for
{{org.apache.hadoop.hive.ql.parse.CalcitePlanner}} in
{{{}hive-log4j2.properties{}}}. Also executing {{SET
hive.server2.logging.operation.level = VERBOSE;}} helps, as the plans will then
show up in beeline.
It would be nice to make this a bit simpler. It might be the case that the user
cannot easily change the log4j2 properties. Also, setting the DEBUG level
globally just to get the plan for one query looks like a lot of overhead, if
other users are running queries at the same time.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)