[ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555191#comment-13555191
 ] 

Ashutosh Chauhan commented on HIVE-3852:
----------------------------------------

Namit,
bq. Should we have this optimization now ?
I am not sure which optimization you are referring to. I assume you mean that
reduce-side group-bys are no longer needed now that we have map-side
aggregation. If so, I think they are still required. As Navis pointed out, if
the reduction ratio is not high enough, mappers may run out of memory, and in
that case we advise users to turn off map-side aggregation.
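
For illustration only, here is a minimal sketch of the session settings
involved. It assumes the standard Hive configuration properties hive.map.aggr
and hive.map.aggr.hash.min.reduction; the values are examples, not
recommendations from this issue.

{code}
-- Option 1: disable map-side (hash) aggregation for the session, so that
-- group-bys are evaluated on the reduce side only.
set hive.map.aggr=false;

-- Option 2: keep map-side aggregation enabled, but let Hive abandon the
-- in-memory hash table when it is not reducing the rows enough
-- (ratio of hash table size to input rows above this threshold).
set hive.map.aggr=true;
set hive.map.aggr.hash.min.reduction=0.5;
{code}

Either way, the reduce-side group-by still has to exist in the plan, since it
is what produces the final result when map-side aggregation is off or has been
abandoned.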

                
> Multi-groupby optimization fails when same distinct column is used twice or 
> more
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-3852
>                 URL: https://issues.apache.org/jira/browse/HIVE-3852
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-3852.D7737.1.patch
>
>
> {code}
> FROM INPUT
> INSERT OVERWRITE TABLE dest1
> SELECT INPUT.key,
>        sum(distinct substr(INPUT.value,5)),
>        count(distinct substr(INPUT.value,5))
> GROUP BY INPUT.key
> INSERT OVERWRITE TABLE dest2
> SELECT INPUT.key,
>        sum(distinct substr(INPUT.value,5)),
>        avg(distinct substr(INPUT.value,5))
> GROUP BY INPUT.key;
> {code}
> fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira