Eugene Koifman created HIVE-6370:
------------------------------------

             Summary: LazySimpleSerDe doesn't handle Date and Timestamp properly
                 Key: HIVE-6370
                 URL: https://issues.apache.org/jira/browse/HIVE-6370
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
    Affects Versions: 0.12.0
            Reporter: Eugene Koifman


LazySimpleSerDe#serialize() calls LazyUtils.writePrimitiveUTF8() to handle 
primitive types.
When writing out java.sql.Date, this in turn calls LazyDate.writeUTF8(), which 
calls DateWritable.toString(), which is effectively Date.toString().
Date.toString() makes an implicit adjustment for the local timezone in its 
output.  Thus if Date.getTime() is on a day boundary (midnight UTC), toString() 
on it will write out the previous day when the local timezone is behind UTC.  
Date.valueOf(), which is used by this SerDe to read data, makes a similar 
adjustment for the current timezone.
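
To illustrate the behavior described above, here is a minimal, standalone 
sketch using only the JDK (it does not call any Hive code).  The class and 
method names are hypothetical, and America/Los_Angeles is just an example of 
a zone behind UTC.

```java
import java.sql.Date;
import java.util.TimeZone;

// Hypothetical demo class; not part of Hive.
class DateToStringDemo {

    // Runs the given conversion with zoneId as the JVM default timezone,
    // then restores the previous default.
    static String render(long epochMillis, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Date(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    // Parses a yyyy-mm-dd string with Date.valueOf() under the given zone
    // and returns the resulting epoch milliseconds.
    static long parse(String text, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return Date.valueOf(text).getTime();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        long midnightUtc = 0L; // 1970-01-01T00:00:00Z, a UTC day boundary

        System.out.println(render(midnightUtc, "UTC"));                 // 1970-01-01
        System.out.println(render(midnightUtc, "America/Los_Angeles")); // 1969-12-31

        // The read side shifts too: the same text maps to different instants.
        System.out.println(parse("1970-01-01", "UTC"));                 // 0
        System.out.println(parse("1970-01-01", "America/Los_Angeles")); // 28800000
    }
}
```

The same getTime() value therefore serializes to different text depending on 
the JVM default timezone, and the same text deserializes to different 
instants.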

This is wrong: it should write out Date.getTime() (possibly normalized to a day 
boundary).  This would make reads and writes independent of the current 
timezone.
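
One way the proposal could look is sketched below.  This is only an 
illustration under stated assumptions, not the actual Hive fix: 
EpochDateCodec and its methods are hypothetical names, the value is written 
as decimal epoch milliseconds, and Math.floorDiv requires Java 8 or later.

```java
import java.sql.Date;

// Hypothetical codec; not part of Hive.
class EpochDateCodec {
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Rounds an instant down to the start of its UTC day.  floorDiv (rather
    // than '/') keeps instants before 1970 on the correct day.
    static long normalizeToUtcDay(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    // Serializes from getTime() only; the JVM default timezone is never consulted.
    static String write(Date d) {
        return Long.toString(normalizeToUtcDay(d.getTime()));
    }

    // Deserializes back to the same instant regardless of the default timezone.
    static Date read(String text) {
        return new Date(Long.parseLong(text));
    }

    public static void main(String[] args) {
        Date d = new Date(0L);                         // midnight UTC
        String text = write(d);
        System.out.println(text);                      // 0
        System.out.println(read(text).getTime());      // 0
        System.out.println(write(new Date(-1L)));      // -86400000
    }
}
```

Because neither direction goes through toString() or valueOf(), a value 
written on one machine reads back as the same instant on a machine in a 
different timezone.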

I think java.sql.Timestamp has a similar issue.  When this is fixed, the work 
in HIVE-5814 should be adjusted to use getTime() rather than the deprecated 
day/month/year API it uses now.
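
As a quick check of the Timestamp suspicion, the following JDK-only sketch 
(hypothetical class name, example zone chosen arbitrarily) prints the same 
instant under two default timezones.  It says nothing about how Hive's 
Timestamp serialization path is wired, only about Timestamp.toString() 
itself.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

// Hypothetical demo class; not part of Hive.
class TimestampToStringDemo {

    // Renders a Timestamp via toString() with zoneId as the JVM default
    // timezone, then restores the previous default.
    static String render(long epochMillis, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Timestamp(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(0L, "UTC"));                 // 1970-01-01 00:00:00.0
        System.out.println(render(0L, "America/Los_Angeles")); // 1969-12-31 16:00:00.0
    }
}
```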




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
