Teddy Choi created HIVE-16704:
---------------------------------

             Summary: Replace vector code generation with stream and lambdas
                 Key: HIVE-16704
                 URL: https://issues.apache.org/jira/browse/HIVE-16704
             Project: Hive
          Issue Type: Improvement
            Reporter: Teddy Choi
            Assignee: Teddy Choi


Hive has a vectorized execution engine. It relies on a code generator to cover 
the various data types, because the Java compiler recognizes and optimizes only 
simple loops over primitive data types and operators, not loops that contain 
conditional branches such as IF or SWITCH. The code generator and its generated 
code are hard to read and maintain.

Meanwhile, Hive 3 uses Java 8+, which introduced lambdas and the new Stream API.

Lambdas with the new Stream API are an excellent replacement for vector code 
generation. The result is more concise, because it doesn't make a separate copy 
of the template code for each generated class. It's more precise, because 
template code with string replacement allows substituting only some data types 
and operators, not whole code blocks with compiler support. It's still fast, 
because the Java 8 compiler optimizes primitive data type operations in a 
lambda as a loop. Therefore, it will reduce memory use and give programmers 
more readable and extensible code.

The vector code generation part is huge, so it needs to be divided into small 
sub-tasks. I will start with ColumnArithmeticColumn for long, which covers 
LongColAddLongColumn, LongColSubtractLongColumn, and LongColMultiplyLongColumn.
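
As a rough illustration of the idea, the three long column operations could 
share one loop that takes the operator as a lambda. This is only a sketch under 
assumptions: the class LongColArithmeticSketch, its apply method, and the ADD, 
SUBTRACT, and MULTIPLY constants are hypothetical names, not Hive code. It 
works on plain long[] arrays rather than Hive's batch and column vector types, 
and it ignores nulls, repeating values, and selected rows.

```java
import java.util.Arrays;
import java.util.function.LongBinaryOperator;
import java.util.stream.IntStream;

// Hypothetical sketch, not Hive's actual classes: one generic loop replaces
// the three generated classes for long column arithmetic.
public class LongColArithmeticSketch {

    // Each operator is a primitive lambda; no boxing of long values occurs.
    static final LongBinaryOperator ADD = (a, b) -> a + b;
    static final LongBinaryOperator SUBTRACT = (a, b) -> a - b;
    static final LongBinaryOperator MULTIPLY = (a, b) -> a * b;

    // Applies op element-wise to two columns of equal length.
    static long[] apply(long[] left, long[] right, LongBinaryOperator op) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("column lengths differ");
        }
        // IntStream.range yields the row indexes 0..length-1 as primitive ints.
        return IntStream.range(0, left.length)
                .mapToLong(i -> op.applyAsLong(left[i], right[i]))
                .toArray();
    }

    public static void main(String[] args) {
        long[] left = {1L, 2L, 3L};
        long[] right = {10L, 20L, 30L};
        System.out.println(Arrays.toString(apply(left, right, ADD)));      // [11, 22, 33]
        System.out.println(Arrays.toString(apply(left, right, SUBTRACT))); // [-9, -18, -27]
        System.out.println(Arrays.toString(apply(left, right, MULTIPLY))); // [10, 40, 90]
    }
}
```

Adding another operator in this sketch means adding one lambda rather than 
generating another class. Whether such a loop matches the speed of the 
generated code would need to be measured.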


