[ 
https://issues.apache.org/jira/browse/HIVE-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843601#comment-13843601
 ] 

Eric Hanson commented on HIVE-5762:
-----------------------------------

I'm thinking about using the basic structure below as the column vector for 
limited-precision decimals. A utility package of static functions can then be 
implemented to do decimal arithmetic on individual values. It should be 
possible to make this a lot faster than code that relies on 
java.math.BigDecimal, because it is less general and because object allocation 
and garbage collection will be reduced.

{code}
public class DecimalColumnVector extends ColumnVector {
  public int precision; // precision of all elements in vector (max 38)
  public int scale;     // scale of all elements in vector (max 38)
  public static final int WORDS_PER_VALUE = 4;

  /**
   * Logically a vector of 128-bit unsigned integers stored "little-endian":
   * for a value v, v[0] is the least significant word. The four 32-bit
   * words of each value are treated as unsigned. However, the high-order
   * bit of the highest word (word 3) must be 0.
   */
  public int[][] vector;
  public byte[] sign;  // -1 if negative, 0 if zero, 1 if positive

  public DecimalColumnVector() {
    super(VectorizedRowBatch.DEFAULT_SIZE);
    final int len = VectorizedRowBatch.DEFAULT_SIZE;
    vector = new int[len][];
    for (int i = 0; i < len; i++) {
      vector[i] = new int[WORDS_PER_VALUE];
    }
    sign = new byte[len];
  }
...
}
{code}
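
As a rough illustration of the kind of static utility function mentioned above, here is a minimal sketch of adding two magnitudes stored in the 4-word little-endian layout. The class name DecimalMagnitudeSketch and the method addMagnitude are hypothetical and not part of any patch; sign handling and scale alignment are deliberately left out.

```java
// Hypothetical helper, for illustration only; not part of Hive.
final class DecimalMagnitudeSketch {
  private static final long MASK = 0xFFFFFFFFL; // reads an int word as unsigned
  private static final int WORDS_PER_VALUE = 4;

  private DecimalMagnitudeSketch() {}

  /**
   * Computes result = a + b, where each argument is a 4-word little-endian
   * unsigned magnitude (index 0 is least significant). No objects are
   * allocated. Returns false if the sum does not fit, i.e. a carry leaves
   * word 3 or the high-order bit of word 3 ends up set.
   * The result array may be the same array as a or b.
   */
  static boolean addMagnitude(int[] a, int[] b, int[] result) {
    long carry = 0;
    for (int i = 0; i < WORDS_PER_VALUE; i++) {
      long sum = (a[i] & MASK) + (b[i] & MASK) + carry;
      result[i] = (int) sum;   // keep the low 32 bits
      carry = sum >>> 32;      // 0 or 1, propagated to the next word
    }
    return carry == 0 && result[3] >= 0;
  }
}
```

Working on the int[] words directly keeps the inner loop free of allocation, which is the main advantage over BigDecimal argued for above. A caller would pass vector[i] from two input columns and the output column for each row.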


> Implement vectorized support for the DECIMAL data type
> ------------------------------------------------------
>
>                 Key: HIVE-5762
>                 URL: https://issues.apache.org/jira/browse/HIVE-5762
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>
> Add support to allow queries referencing DECIMAL columns and expression 
> results to run efficiently in vectorized mode.  Include unit tests and 
> end-to-end tests. 
> Before starting or at least going very far, please write design specification 
> (a new section for the design spec attached to HIVE-4160) for how support for 
> the different DECIMAL types should work in vectorized mode, and the roadmap, 
> and have it reviewed. 
> It may be feasible to re-use LongColumnVector and related VectorExpression 
> classes for fixed-point decimal in certain data ranges. That should be at 
> least considered to get faster performance and save code. For unlimited 
> precision DECIMAL, a new column vector subtype may be needed, or a 
> BytesColumnVector could be re-used.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
