[ https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555191#comment-13555191 ]
Ashutosh Chauhan commented on HIVE-3852:
----------------------------------------

Namit,

bq. Should we have this optimization now ?

I am not sure which particular optimization you are referring to. I assume you mean there is no need for reduce-side groupbys anymore, since we have map-side aggregates. If so, I think those are still required. As Navis pointed out, if the reduction ratio is not high enough, mappers may run out of memory, and then we suggest that users turn off map-side aggregation.

> Multi-groupby optimization fails when same distinct column is used twice or more
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-3852
>                 URL: https://issues.apache.org/jira/browse/HIVE-3852
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-3852.D7737.1.patch
>
>
> {code}
> FROM INPUT
> INSERT OVERWRITE TABLE dest1
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct substr(INPUT.value,5)) GROUP BY INPUT.key
> INSERT OVERWRITE TABLE dest2
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct substr(INPUT.value,5)) GROUP BY INPUT.key;
> {code}
> fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira