Kunal Sharma created SQOOP-3448:
-----------------------------------

             Summary: Pulling timestamps past year 2038/2039 and storing them in a 
Parquet file causes the stored Unix timestamps to be inaccurate.
                 Key: SQOOP-3448
                 URL: https://issues.apache.org/jira/browse/SQOOP-3448
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.6
            Reporter: Kunal Sharma


Background:

We pull data from the source and store it directly in a Parquet file. With the 
default configuration, Sqoop converts every timestamp/date value to a Unix 
timestamp (*milliseconds*) and stores it in the Parquet file as data type long. 
(This is what we want, since it avoids the timezone issues that Parquet files 
have across different data engines.)

 

Issue:

When pulling timestamps beyond the classic year 2038 limit 
(https://en.wikipedia.org/wiki/Year_2038_problem), we get a negative number. 
This is odd, because the Unix timestamp is stored in milliseconds, and 
milliseconds have to be stored as a big int anyway. So somewhere in the process 
the value is apparently handled as an int or double, multiplied by 1000, and 
then truncated into the big int data type. That is the end result we see stored 
in the Parquet file, which has data type long (big int).
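
As an illustration of the suspected mechanism, here is a minimal Java sketch. 
It is only a guess at how the symptom could arise, not Sqoop's actual code: the 
class and method names are hypothetical, and the assumption is that the epoch 
value passes through a 32-bit int of seconds before being scaled to 
milliseconds.

```java
import java.time.Instant;

// Hypothetical sketch, not taken from the Sqoop code base.
class Year2038Sketch {

    // Assumed faulty path: epoch seconds are narrowed to a 32-bit int,
    // which wraps for instants after 2038-01-19T03:14:07Z, and only then
    // scaled back to milliseconds in a 64-bit long.
    static long viaInt32Seconds(long epochMillis) {
        int seconds = (int) (epochMillis / 1000L);
        return seconds * 1000L;
    }

    public static void main(String[] args) {
        long before = Instant.parse("2037-01-01T00:00:00Z").toEpochMilli();
        long after = Instant.parse("2040-01-01T00:00:00Z").toEpochMilli();

        // Before the 2038 limit the round trip preserves the value.
        System.out.println(before + " -> " + viaInt32Seconds(before));
        // After the limit the int wraps and the result is negative.
        System.out.println(after + " -> " + viaInt32Seconds(after));
    }
}
```

If the conversion stayed in 64-bit long arithmetic throughout, the 2040 value 
would remain positive, which matches the expectation stated above.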

 

Key Configuration

Oracle jars  - "HADOOP_CLASSPATH=ojdbc6.jar"

See the attached file for the sqoop command reference.


--
This message was sent by Atlassian Jira
(v8.3.2#803003)