[ https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt McCline updated HIVE-12369:
--------------------------------
    Description:
Implement Native Vector GroupBy using fast hash table technology developed for Native Vector MapJoin, etc.

Patch is currently limited to a single Long key with a single COUNT aggregation, or a single Long key and no aggregation, also known as duplicate reduction.

Three new classes are introduced that store the count in the slot table and don't allocate hash elements:

{noformat}
  COUNT(column)      VectorGroupByHashLongKeyCountColumnOperator
  COUNT(key)         VectorGroupByHashLongKeyCountKeyOperator
  COUNT(*)           VectorGroupByHashLongKeyCountStarOperator
{noformat}

And the duplicate reduction operator for a single Long key:

{noformat}
  VectorGroupByHashLongKeyDuplicateReductionOperator
{noformat}

  was:
Implement Native Vector GroupBy using fast hash table technology developed for Native Vector MapJoin, etc.

Patch is currently limited to a single Long key, aggregation on Long columns, no more than 31 columns.

Three new classes are introduced that store the count in the slot table and don't allocate hash elements:

{noformat}
  COUNT(column)      VectorGroupByHashOneLongKeyCountColumnOperator
  COUNT(key)         VectorGroupByHashOneLongKeyCountKeyOperator
  COUNT(*)           VectorGroupByHashOneLongKeyCountStarOperator
{noformat}

And a new class that aggregates a single Long key:

{noformat}
  VectorGroupByHashOneLongKeyOperator
{noformat}

> Native Vector GroupBy
> ---------------------
>
>                 Key: HIVE-12369
>                 URL: https://issues.apache.org/jira/browse/HIVE-12369
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, HIVE-12369.05.patch, HIVE-12369.06.patch
>
>
> Implement Native Vector GroupBy using fast hash table technology developed for Native Vector MapJoin, etc.
> Patch is currently limited to a single Long key with a single COUNT aggregation, or a single Long key and no aggregation, also known as duplicate reduction.
> Three new classes are introduced that store the count in the slot table and don't allocate hash elements:
> {noformat}
>   COUNT(column)      VectorGroupByHashLongKeyCountColumnOperator
>   COUNT(key)         VectorGroupByHashLongKeyCountKeyOperator
>   COUNT(*)           VectorGroupByHashLongKeyCountStarOperator
> {noformat}
> And the duplicate reduction operator for a single Long key:
> {noformat}
>   VectorGroupByHashLongKeyDuplicateReductionOperator
> {noformat}
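
For readers unfamiliar with the idea of keeping the count in the slot table itself, here is a rough sketch. It is not taken from the patch: the class name LongKeyCountSlotTable, its methods, the hash function, and the resize policy are all hypothetical, chosen only for illustration. It shows an open-addressing table for a single long key where each slot holds the key and its running count in primitive arrays, so no per-key hash element object is allocated.

```java
// Hypothetical illustration only; not the HIVE-12369 implementation.
final class LongKeyCountSlotTable {
    private long[] keys;        // key stored directly in the slot
    private long[] counts;      // running count stored next to the key
    private boolean[] occupied; // marks used slots, so key 0 and count 0 stay valid
    private int size;

    LongKeyCountSlotTable(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new long[capacity];
        counts = new long[capacity];
        occupied = new boolean[capacity];
    }

    // Simple multiplicative mixing; the real hash function is an assumption here.
    private static int hash(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32));
    }

    // Adds delta to the count for key, claiming a slot on first sight.
    void add(long key, long delta) {
        if ((size + 1) * 2 > keys.length) {
            resize(); // keep the load factor at or below 0.5
        }
        int mask = keys.length - 1;
        int slot = hash(key) & mask;
        while (occupied[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask; // linear probing
        }
        if (!occupied[slot]) {
            occupied[slot] = true;
            keys[slot] = key;
            size++;
        }
        counts[slot] += delta;
    }

    // Returns the count for key, or -1 if the key was never added.
    long count(long key) {
        int mask = keys.length - 1;
        int slot = hash(key) & mask;
        while (occupied[slot]) {
            if (keys[slot] == key) {
                return counts[slot];
            }
            slot = (slot + 1) & mask;
        }
        return -1;
    }

    int size() {
        return size;
    }

    // Doubles the arrays and re-inserts every occupied slot.
    private void resize() {
        long[] oldKeys = keys;
        long[] oldCounts = counts;
        boolean[] oldOccupied = occupied;
        keys = new long[oldKeys.length * 2];
        counts = new long[oldKeys.length * 2];
        occupied = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldOccupied[i]) {
                add(oldKeys[i], oldCounts[i]);
            }
        }
    }
}
```

Under these assumptions, a COUNT(*) style operator would call add(key, 1) for every row in a batch, a COUNT(column) style operator would call add(key, 0) for rows where the column is null and add(key, 1) otherwise, and duplicate reduction would only need the key and occupied arrays.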