[ https://issues.apache.org/jira/browse/HIVE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290120#comment-16290120 ]
Ashutosh Chauhan commented on HIVE-18201:
-----------------------------------------

Yes, a config-driven approach is not the ideal solution. However, getting this costing right is currently non-trivial in our system because it is mostly a runtime cost: we have to decide whether shuffling over the network with distributed CPU is faster than no network transfer with CPU at lower parallelism. We need to model network, CPU, and parallelism in this case. Currently, we mostly do logical costing based on the cardinality of different operators, so we need to enhance our system to model these runtime parameters. Meanwhile, this patch is a step in the right direction: it makes it possible to switch between the different edges. The next step will be to estimate the threshold automatically using the costing I outlined above.

> Disable XPROD_EDGE for sq_count_check() created for scalar subqueries
> ----------------------------------------------------------------------
>
> Key: HIVE-18201
> URL: https://issues.apache.org/jira/browse/HIVE-18201
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Nita Dembla
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-18201.1.patch, query6.explain2.out
>
>
> sq_count_check() will either return an error at runtime or a single row. In
> the case of query6, the subquery has an avg() function that should return a
> single row. Attaching the explain output.
> This does not need an x-prod, because it is not useful to shuffle the big
> table side for a cross-product against 1 row.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)