Jingsong Lee created FLINK-12796:
------------------------------------

             Summary: Introduce BaseArray and BaseMap to reduce conversion 
overhead to blink
                 Key: FLINK-12796
                 URL: https://issues.apache.org/jira/browse/FLINK-12796
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Planner
            Reporter: Jingsong Lee
            Assignee: Jingsong Lee


Currently, in internal data format of flink, the array is only BinaryArray, and 
the map is only BinaryMap. If the user writes a UDAF with arrays as parameters 
and return values, it will lead to frequent conversion between Java arrays and 
BinaryArrays (each conversion is equivalent to the entire array of copys), 
which is very time-consuming.

In order to avoid copy in conversion, BaseArray and BaseMap are introduced as 
internal formats.

BaseArray is the parent of GenericArray and BinaryArray, providing various read 
and write operations on an array.

GenericArray is a wrapper class for Java arrays, which internally wraps a Java 
array. This array stores some elements of internal data format.

Conversion can be avoided when the element type is a primitive type or a type 
that is consistent internally format and externally format. (Detail see: 
DataFormatConverters)

After our benchmark, the performance of UDAF using primitive Array has been 
improved by 10 times.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to