Eugene Koifman created HIVE-6370:
------------------------------------

             Summary: LazySimpleSerDe doesn't handle Date and Timestamp properly
                 Key: HIVE-6370
                 URL: https://issues.apache.org/jira/browse/HIVE-6370
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
    Affects Versions: 0.12.0
            Reporter: Eugene Koifman

LazySimpleSerDe#serialize() calls LazyUtils.writePrimitiveUTF8() to handle primitive types. When writing out a java.sql.Date, this in turn calls LazyDate.writeUTF8(), which calls DateWritable.toString(), which is effectively Date.toString(). Date.toString() makes an implicit adjustment for the local timezone in its output. Thus, if Date.getTime() is on a day boundary (midnight UTC) and the local timezone is behind UTC, toString() will write out the previous day. Date.valueOf(), which this SerDe uses to read data, makes a similar adjustment for the current timezone.

This is wrong: the SerDe should write out Date.getTime() (possibly normalized to a day boundary). That will make reads and writes independent of the current timezone.

I think java.sql.Timestamp has a similar issue.

When this is fixed, the work in HIVE-5814 should be adjusted to use getTime() rather than the deprecated day/month/year API it uses now.
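
The day-boundary behavior described above can be reproduced outside Hive. What follows is a minimal sketch, not the proposed patch: the class DateSerDeSketch and the helpers writeUtcDay() and normalizeToDay() are hypothetical names, and America/Los_Angeles is only an example of a zone behind UTC. It shows Date.toString() printing the previous day for an instant at midnight UTC, next to one possible timezone-independent rendering of the same getTime() value.

```java
import java.sql.Date;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical demo class; none of these names exist in Hive.
final class DateSerDeSketch {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Assumed alternative to Date.toString(): render the instant as a
    // calendar day in UTC, ignoring the JVM default timezone.
    static String writeUtcDay(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(d);
    }

    // Round an epoch-millis value down to the preceding UTC day boundary.
    // floorDiv keeps the rounding direction correct for pre-1970 values.
    static long normalizeToDay(long millis) {
        return Math.floorDiv(millis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        TimeZone saved = TimeZone.getDefault();
        try {
            // Pretend the JVM runs in a zone that is behind UTC.
            TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));

            Date midnightUtc = new Date(0L); // 1970-01-01T00:00:00Z

            // Local-zone rendering, as DateWritable.toString() effectively does.
            System.out.println(midnightUtc.toString());   // prints 1969-12-31

            // UTC rendering of the same getTime() value.
            System.out.println(writeUtcDay(midnightUtc)); // prints 1970-01-01
        } finally {
            TimeZone.setDefault(saved);
        }
    }
}
```

Whether the SerDe should emit a formatted UTC day, as here, or the raw getTime() value is a design choice this issue leaves open. The sketch only illustrates that the output no longer depends on the default timezone.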