[ 
https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834489#comment-13834489
 ] 

Teddy Choi commented on HIVE-5761:
----------------------------------

Eric,

I researched the history of the Hive DATE data type.

1. DATE in ORC: HIVE-4055 already implemented it. It uses an integer field, 
DateWritable#daysSinceEpoch, to represent a date. I think it would be hard to 
switch to the alternative representation, which I prefer.
2. Basic operations: We may need to convert to java.sql.Date for every 
operation. [~thejas] and [~jdere] have already suggested the Joda-Time library, 
which is significantly faster. However, there were objections to adding 
dependencies in HIVE-3910.
3. Complex operations: Fortunately, they will benefit from the 
DateWritable#daysSinceEpoch representation.
4. Vectorized plan: I'm not sure yet. I will run some tests.
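
To illustrate why a days-since-epoch integer suits vectorized execution, here 
is a minimal sketch. It assumes the dates of a batch are held in a plain long[] 
of day numbers; the class and method names are hypothetical and are not taken 
from Hive.

```java
// Hypothetical sketch, not Hive code: each date is a day count since 1970-01-01.
final class DaysSinceEpochSketch {

    // Adds a constant number of days to the first n entries (a date_add-style operation).
    static void addDays(long[] days, int n, long delta, long[] out) {
        for (int i = 0; i < n; i++) {
            out[i] = days[i] + delta;
        }
    }

    // Difference in days between two date columns (a datediff-style operation).
    static void dateDiff(long[] left, long[] right, int n, long[] out) {
        for (int i = 0; i < n; i++) {
            out[i] = left[i] - right[i];
        }
    }

    // Filter: records the indices of rows whose date is before the bound
    // and returns how many rows were selected.
    static int selectBefore(long[] days, int n, long bound, int[] selected) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (days[i] < bound) {
                selected[count++] = i;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long[] days = {10, 20, 30};   // arbitrary day numbers for illustration
        long[] out = new long[days.length];
        addDays(days, days.length, 7, out);
        System.out.println(java.util.Arrays.toString(out)); // [17, 27, 37]
    }
}
```

None of these loops allocates an object per row, so comparison and day 
arithmetic reduce to ordinary integer operations.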

The key question is how to improve the performance of basic operations on 
DateWritable#daysSinceEpoch. I found that org.joda.time.Chronology does not 
create objects during repeated calculations 
(http://stackoverflow.com/questions/6465330/any-good-high-performance-java-library-that-works-with-timestamp).
That gives me an idea, but it looks hard to implement.
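
As a rough illustration of allocation-free field extraction, the sketch below 
computes the year directly from a days-since-epoch value with integer 
arithmetic. It is not Joda-Time's implementation and not Hive code. It uses 
the well-known "civil from days" calculation for the proleptic Gregorian 
calendar, and the class and method names are hypothetical.

```java
// Hypothetical sketch: year extraction from a day count since 1970-01-01
// without creating any date objects.
final class EpochDayFields {

    // Returns the proleptic Gregorian year containing the given epoch day.
    static int yearOf(long epochDay) {
        long z = epochDay + 719468L;              // re-base so day 0 is 0000-03-01
        long era = Math.floorDiv(z, 146097L);     // number of 400-year cycles
        long doe = z - era * 146097L;             // day of era, 0..146096
        long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365; // year of era, 0..399
        long doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year, 0..365
        long mp = (5 * doy + 2) / 153;            // March-based month, 0..11
        long year = yoe + era * 400;
        // January and February belong to the following civil year.
        return (int) (mp >= 10 ? year + 1 : year);
    }

    // Applies yearOf to the first n entries of a batch.
    static void yearOf(long[] days, int n, long[] out) {
        for (int i = 0; i < n; i++) {
            out[i] = yearOf(days[i]);
        }
    }

    public static void main(String[] args) {
        System.out.println(yearOf(0));   // 1970
        System.out.println(yearOf(-1));  // 1969
    }
}
```

Whether something like this is worthwhile depends on how time zones and 
pre-Gregorian dates must be handled, which this sketch ignores.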

I'll start with a basic implementation using java.sql.Date, and then look for 
ways to optimize it.

Teddy

> Implement vectorized support for the DATE data type
> ---------------------------------------------------
>
>                 Key: HIVE-5761
>                 URL: https://issues.apache.org/jira/browse/HIVE-5761
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>
> Add support to allow queries referencing DATE columns and expression results 
> to run efficiently in vectorized mode. This should re-use the code for the 
> integer/timestamp types to the extent possible and beneficial. Include 
> unit tests and end-to-end tests. Consider re-using or extending existing 
> end-to-end tests for vectorized integer and/or timestamp operations.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
