[ https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=696590&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696590 ]
ASF GitHub Bot logged work on HIVE-25792:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Dec/21 12:41
            Start Date: 15/Dec/21 12:41
    Worklog Time Spent: 10m
      Work Description: zabetak commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r769579986


##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
##########
@@ -673,8 +672,7 @@ Operator genOPTree(ASTNode ast, PlannerContext plannerCtx) throws SemanticExcept
         }
         this.ctx.setCboInfo(cboMsg);
 
-        // Determine if we should re-throw the exception OR if we try to mark plan as reAnalyzeAST to retry
-        // planning as non-CBO.
+        // Determine if we should re-throw the exception OR if we try to mark the query to retry as non-CBO.

Review comment:
   I am wondering if it would be better to just throw here and let the reexecution plugin deal with what to do afterwards.


##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -5561,7 +5563,8 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
         "Size of the runtime statistics cache. Unit is: OperatorStat entry; a query plan consist ~100."),
     HIVE_QUERY_PLANMAPPER_LINK_RELNODES("hive.query.planmapper.link.relnodes", true,
         "Whether to link Calcite nodes to runtime statistics."),
-
+    HIVE_QUERY_MAX_RECOMPILATION_COUNT("hive.query.recompilation.max.count", 1,
+        "Maximum number of re-compilations for a single query."),

Review comment:
   Why do we need the number of recompilations to be configurable?
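
For illustration only, the two comments above (throwing from the planner and leaving the retry decision to the re-execution layer, and bounding the number of recompilations) can be sketched as a small retry loop. This is a hypothetical sketch, not code from the PR: `RecompileSketch`, `CboFailure`, and `compileWithFallback` are invented names, and `maxRecompilations` only stands in for the proposed `hive.query.recompilation.max.count`.

```java
import java.util.function.Function;

/** Hypothetical sketch; none of these names exist in Hive. */
final class RecompileSketch {

    /** Stand-in for a planner failure on the CBO path. */
    static final class CboFailure extends RuntimeException {
        CboFailure(String message) {
            super(message);
        }
    }

    /**
     * Compiles with CBO first. On a CBO failure, retries without CBO,
     * but recompiles at most maxRecompilations times.
     *
     * @param compiler maps "use CBO?" to a plan; may throw CboFailure
     */
    static String compileWithFallback(Function<Boolean, String> compiler,
                                      int maxRecompilations) {
        boolean useCbo = true;
        int recompilations = 0;
        while (true) {
            try {
                return compiler.apply(useCbo);
            } catch (CboFailure e) {
                // The compiler only throws; the retry decision is made here.
                if (recompilations >= maxRecompilations) {
                    throw e;
                }
                recompilations++;
                useCbo = false;
            }
        }
    }

    public static void main(String[] args) {
        // Toy compiler: fails on the CBO path, succeeds otherwise.
        Function<Boolean, String> toyCompiler = useCbo -> {
            if (useCbo) {
                throw new CboFailure("CBO failed");
            }
            return "plan without CBO";
        };
        System.out.println(compileWithFallback(toyCompiler, 1));
    }
}
```

With a limit of 1 the toy query falls back once and succeeds; with a limit of 0 the original failure is rethrown to the caller.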


##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -5536,10 +5536,12 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
     HIVE_QUERY_REEXECUTION_ENABLED("hive.query.reexecution.enabled", true,
         "Enable query reexecutions"),
-    HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies", "overlay,reoptimize,reexecute_lost_am,dagsubmit",
+    HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies",
+        "overlay,reoptimize,reexecute_lost_am,dagsubmit,recompile_without_cbo",
         "comma separated list of plugin can be used:\n"
             + " overlay: hiveconf subtree 'reexec.overlay' is used as an overlay in case of an execution errors out\n"
             + " reoptimize: collects operator statistics during execution and recompile the query after a failure\n"
+            + " recompile_without_cbo: recompiles query after a CBO failure\n"
             + " reexecute_lost_am: reexecutes query if it failed due to tez am node gets decommissioned"),

Review comment:
   I didn't go through the whole PR but I get the impression that we could/should combine the `hive.query.reexecution.strategies` and `hive.cbo.fallback.strategy` configurations somehow. Having both does not seem necessary and raises some questions about the expected behavior.

   Consider for instance the following:
   ```
   set hive.cbo.fallback.strategy = always;
   -- Note that recompile_without_cbo is missing
   set hive.query.reexecution.strategies = overlay,reoptimize,reexecute_lost_am,dagsubmit;
   ```
   Reading the current configuration I am not sure what should happen when the CBO fails.

   The `hive.cbo.fallback.strategy` has not been released yet so we are free to drop it, or modify it to be consistent with `hive.query.reexecution.strategies`.


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 696590)
    Time Spent: 3h 40m  (was: 3.5h)

> Multi Insert query fails on CBO path
> -------------------------------------
>
>                 Key: HIVE-25792
>                 URL: https://issues.apache.org/jira/browse/HIVE-25792
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b
> insert overwrite table aa1 select stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)