[ https://issues.apache.org/jira/browse/HIVE-15458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849292#comment-15849292 ]
Vineet Garg commented on HIVE-15458: ------------------------------------ This is not really subquery issue. This is also reproducible with following query: {code:SQL} select part.p_type from part join (select p1.p_name from part p1, part p2 group by p1.p_name) pp where pp.p_name = part.p_name; {code} This will throw following exception in hive log {noformat} org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Invalid table alias or column reference '$hdt$_0': (possible column names are: $hdt$_1.p_name, $hdt$_2.dummy) {noformat} Note that after throwing this exception HIVE falls back to non-cbo path to execute this query successfully, so beeline/hivecli won't see this error. Issue is during conversion of calcite plan to AST, specifically following code in {{ASTConverter.java}} {code} else if (r instanceof Join) { Join join = (Join) r; QueryBlockInfo left = convertSource(join.getLeft()); QueryBlockInfo right = convertSource(join.getRight()); s = new Schema(left.schema, right.schema); ASTNode cond = join.getCondition().accept(new RexVisitor(s)); boolean semiJoin = join instanceof SemiJoin; if (join.getRight() instanceof Join) { // Invert join inputs; this is done because otherwise the SemanticAnalyzer // methods to merge joins will not kick in JoinRelType type; if (join.getJoinType() == JoinRelType.LEFT) { type = JoinRelType.RIGHT; } else if (join.getJoinType() == JoinRelType.RIGHT) { type = JoinRelType.LEFT; } else { type = join.getJoinType(); } ast = ASTBuilder.join(right.ast, left.ast, type, cond, semiJoin); } else { ast = ASTBuilder.join(left.ast, right.ast, join.getJoinType(), cond, semiJoin); } if (semiJoin) { s = left.schema; } {code} We should not be inverting join inputs for SEMI join since it change the semantics. Bypassing this for semi-join produces correct AST but further throws an exception while generating joinTree from AST in {{SemanticAnalyzer::genJoinTree()}} Plan after semi-join optimization looks like as follow: {code} HiveProject(p_type=[$1]) HiveSemiJoin(condition=[=($0, $2)], joinType=[inner]) HiveProject(p_name=[$1], p_type=[$4]) HiveFilter(condition=[IS NOT NULL($1)]) HiveTableScan(table=[[default.part]], table:alias=[part]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveProject(p_name=[$1]) HiveFilter(condition=[IS NOT NULL($1)]) HiveTableScan(table=[[default.part]], table:alias=[p1]) HiveProject(DUMMY=[0]) HiveTableScan(table=[[default.part]], table:alias=[p2]) {code} Since {{HiveSemiJoin}} has {{HiveJoin}} as it's right input following code in {{SemanticAnalyzer::genJoinTree()}} throws an error {code} ASTNode left = (ASTNode) joinParseTree.getChild(0); ASTNode right = (ASTNode) joinParseTree.getChild(1); boolean isValidLeftToken = isValidJoinSide(left); boolean isJoinLeftToken = !isValidLeftToken && isJoinToken(left); boolean isValidRightToken = isValidJoinSide(right); boolean isJoinRightToken = !isValidRightToken && isJoinToken(right); // TODO: if we didn't care about the column order, we could switch join sides here // for TOK_JOIN and TOK_FULLOUTERJOIN. if (!isValidLeftToken && !isJoinLeftToken) { throw new SemanticException("Invalid token on the left side of the join: " + left.getToken().getText() + "; please rewrite your query"); } else if (!isValidRightToken) { String advice= ""; if (isJoinRightToken && !isJoinLeftToken) { advice = "; for example, put the nested join on the left side, or nest joins differently"; } else if (isJoinRightToken) { advice = "; for example, nest joins differently"; } throw new SemanticException("Invalid token on the right side of the join: " + right.getToken().getText() + "; please rewrite your query" + advice); } {code} {{genJoinTree}} does not expect it's right input to be another join {code} private static boolean isValidJoinSide(ASTNode right) { return (right.getToken().getType() == HiveParser.TOK_TABREF) || (right.getToken().getType() == HiveParser.TOK_SUBQUERY) || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION); } {code} > Fix semi-join conversion rule for subquery > ------------------------------------------ > > Key: HIVE-15458 > URL: https://issues.apache.org/jira/browse/HIVE-15458 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer > Reporter: Vineet Garg > Assignee: Vineet Garg > > Subquery code in *CalcitePlanner* turns off *hive.enable.semijoin.conversion* > since it doesn't work for subqueries. -- This message was sent by Atlassian JIRA (v6.3.15#6346)