Mustafa İman created HIVE-24510:
-----------------------------------
Summary: Vectorize compute_bit_vector
Key: HIVE-24510
URL: https://issues.apache.org/jira/browse/HIVE-24510
Project: Hive
Issue Type: Improvement
Reporter: Mustafa İman
Assignee: Mustafa İman
After https://issues.apache.org/jira/browse/HIVE-23530, almost all compute
stats functions are vectorizable. The only function that is not vectorizable is
"compute_bit_vector", which is used for NDV (number of distinct values)
statistics computation. This causes "create table as select" and "insert
overwrite select" queries to run in non-vectorized mode.
Even a very naive implementation of vectorized compute_bit_vector gives about
a 50% performance improvement on simple "insert overwrite select" queries. That
is because the entire mapper or reducer can then run in vectorized mode.
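
To illustrate the difference between row mode and vectorized mode for
this kind of aggregate, here is a minimal, self-contained sketch. It is not
Hive code: the class NaiveBitVectorSketch, its methods, and the plain
linear-counting bitmap are all assumptions made for illustration. The real
compute_bit_vector builds a different, more elaborate sketch. The batch
method takes a plain long[] plus a size and an isRepeating flag, loosely
modelled on how a vectorized column batch is usually laid out.

```java
/** Hypothetical stand-in for an NDV sketch; NOT Hive's actual class. */
final class NaiveBitVectorSketch {
    // Bitmap size is an arbitrary choice for this example (must be a power of two).
    private static final int NUM_BITS = 1 << 12;
    private final long[] bits = new long[NUM_BITS / 64];

    // 64-bit finalizer from MurmurHash3, used here only to spread input values.
    private static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    // Hash the value and set the corresponding bit in the bitmap.
    private void set(long value) {
        int bit = (int) (mix(value) & (NUM_BITS - 1));
        bits[bit >>> 6] |= 1L << (bit & 63);
    }

    /** Row mode: the caller invokes this once per row. */
    void addRow(long value) {
        set(value);
    }

    /**
     * Batch mode: the caller invokes this once per batch of rows.
     * If isRepeating is true, every row in the batch holds values[0],
     * so a single update is enough.
     */
    void addBatch(long[] values, int size, boolean isRepeating) {
        if (size == 0) {
            return;
        }
        if (isRepeating) {
            set(values[0]);
            return;
        }
        // Tight loop over a primitive array, with no per-row call from the operator.
        for (int i = 0; i < size; i++) {
            set(values[i]);
        }
    }

    /** Linear-counting estimate: -m * ln(zeroBits / m). */
    double estimate() {
        int setBits = 0;
        for (long word : bits) {
            setBits += Long.bitCount(word);
        }
        int zeroBits = NUM_BITS - setBits;
        if (zeroBits == 0) {
            return Double.POSITIVE_INFINITY; // bitmap saturated
        }
        return -NUM_BITS * Math.log((double) zeroBits / NUM_BITS);
    }

    /** Copy of the bitmap, so two sketches can be compared. */
    long[] snapshot() {
        return bits.clone();
    }
}
```

Both entry points produce the same bitmap for the same input, so a
vectorized variant of this kind would not change the resulting statistics.
The benefit described above comes from the surrounding operators no longer
having to fall back to row mode because of one non-vectorized function.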
--
This message was sent by Atlassian Jira
(v8.3.4#803005)