[jira] [Updated] (HIVE-12369) Native Vector GroupBy

Matt McCline (JIRA) Wed, 04 Apr 2018 13:28:21 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Matt McCline updated HIVE-12369:
--------------------------------
    Description: 
Implement Native Vector GroupBy using fast hash table technology developed for 
Native Vector MapJoin, etc.

The patch is currently limited to a single Long key with a single COUNT 
aggregation, or a single Long key with no aggregation (also known as 
duplicate reduction).

Three new classes are introduced that store the count in the slot table and 
don't allocate hash elements:
{noformat}
  COUNT(column)  VectorGroupByHashLongKeyCountColumnOperator      
  COUNT(key)     VectorGroupByHashLongKeyCountKeyOperator            
  COUNT(*)       VectorGroupByHashLongKeyCountStarOperator           
{noformat}
And the duplicate reduction operator for a single Long key:
{noformat}
  VectorGroupByHashLongKeyDuplicateReductionOperator
{noformat}
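
To illustrate the idea of keeping the count in the slot table itself, here is a minimal sketch. It is not the code from the patch: the class name LongKeyCountSlotTable and all of its methods are hypothetical, and it assumes a simple open-addressing layout with linear probing. Null keys are not modeled. The key and its running count live in parallel primitive arrays, so no per-key hash element object is allocated.

```java
/**
 * Hypothetical sketch (not Hive code): an open-addressing hash table whose
 * slots hold a long key and its running count in parallel arrays.
 */
final class LongKeyCountSlotTable {
    private long[] keys;
    private long[] counts;
    private boolean[] used;
    private int size;

    LongKeyCountSlotTable(int initialCapacity) {
        int cap = 4;
        while (cap < initialCapacity) {
            cap <<= 1; // keep capacity a power of two so masking works
        }
        keys = new long[cap];
        counts = new long[cap];
        used = new boolean[cap];
    }

    /** Adds delta to the count for key, creating the slot entry if needed. */
    void add(long key, long delta) {
        if ((size + 1) * 2 > keys.length) {
            resize(); // stay at most half full so probing always terminates
        }
        int slot = findSlot(keys, used, key);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key;
            size++;
        }
        counts[slot] += delta;
    }

    boolean contains(long key) {
        return used[findSlot(keys, used, key)];
    }

    /** Returns the stored count, or 0 if the key has never been added. */
    long getCount(long key) {
        int slot = findSlot(keys, used, key);
        return used[slot] ? counts[slot] : 0L;
    }

    int size() {
        return size;
    }

    /** Duplicate reduction: each distinct key once, in slot order. */
    long[] distinctKeys() {
        long[] result = new long[size];
        int n = 0;
        for (int i = 0; i < keys.length; i++) {
            if (used[i]) {
                result[n++] = keys[i];
            }
        }
        return result;
    }

    // Linear probing: returns the slot holding key, or the first free slot.
    private static int findSlot(long[] keys, boolean[] used, long key) {
        int mask = keys.length - 1;
        long h = key * 0x9E3779B97F4A7C15L; // arbitrary multiplicative mix
        int slot = (int) (h ^ (h >>> 32)) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    private void resize() {
        long[] oldKeys = keys;
        long[] oldCounts = counts;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        counts = new long[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                int slot = findSlot(keys, used, oldKeys[i]);
                used[slot] = true;
                keys[slot] = oldKeys[i];
                counts[slot] = oldCounts[i];
            }
        }
    }
}
```

Under these assumptions, COUNT(*) would call add(key, 1) for every row, COUNT(column) would call add(key, 0) when the column value is NULL and add(key, 1) otherwise, and duplicate reduction would ignore the counts and emit distinctKeys().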

  was:
Implement Native Vector GroupBy using fast hash table technology developed for 
Native Vector MapJoin, etc.

Patch is currently limited to a single Long key, aggregation on Long columns, 
no more than 31 columns.

Three new classes are introduced that store the count in the slot table and 
don't allocate hash elements:

{noformat}
  COUNT(column)  VectorGroupByHashOneLongKeyCountColumnOperator      
  COUNT(key)     VectorGroupByHashOneLongKeyCountKeyOperator            
  COUNT(*)       VectorGroupByHashOneLongKeyCountStarOperator           
{noformat}

And a new class that aggregates a single Long key:

{noformat}
  VectorGroupByHashOneLongKeyOperator
{noformat}


> Native Vector GroupBy
> ---------------------
>
>                 Key: HIVE-12369
>                 URL: https://issues.apache.org/jira/browse/HIVE-12369
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, 
> HIVE-12369.05.patch, HIVE-12369.06.patch
>
>
> Implement Native Vector GroupBy using fast hash table technology developed 
> for Native Vector MapJoin, etc.
> The patch is currently limited to a single Long key with a single COUNT 
> aggregation, or a single Long key with no aggregation (also known as 
> duplicate reduction).
> Three new classes are introduced that store the count in the slot table and 
> don't allocate hash elements:
> {noformat}
>   COUNT(column)  VectorGroupByHashLongKeyCountColumnOperator      
>   COUNT(key)     VectorGroupByHashLongKeyCountKeyOperator            
>   COUNT(*)       VectorGroupByHashLongKeyCountStarOperator           
> {noformat}
> And the duplicate reduction operator for a single Long key:
> {noformat}
>   VectorGroupByHashLongKeyDuplicateReductionOperator
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
