Then let me provide a PR so that we can discuss an alternative approach (a rough sketch follows below the quoted thread).

2017-06-02 8:26 GMT+02:00 Reynold Xin <r...@databricks.com>:
> Seems like a bug we should fix? I agree some form of truncation makes more
> sense.
>
> On Thu, Jun 1, 2017 at 1:17 AM, Anton Okolnychyi <anton.okolnyc...@gmail.com> wrote:
>
>> Hi all,
>>
>> I would like to ask what the community thinks about the way Spark handles
>> nanoseconds in the Timestamp type.
>>
>> As far as I see in the code, Spark assumes microsecond precision. Therefore,
>> I expect either a timestamp truncated to microseconds or an exception if I
>> specify a timestamp with nanoseconds. However, the current implementation
>> just silently sets the nanoseconds as microseconds in [1], which results in
>> a wrong timestamp. Consider the example below:
>>
>> spark.sql("SELECT cast('2015-01-02 00:00:00.000000001' as TIMESTAMP)").show(false)
>> +------------------------------------------------+
>> |CAST(2015-01-02 00:00:00.000000001 AS TIMESTAMP)|
>> +------------------------------------------------+
>> |2015-01-02 00:00:00.000001                      |
>> +------------------------------------------------+
>>
>> This issue was already raised in SPARK-17914, but I do not see any decision
>> there.
>>
>> [1] - org.apache.spark.sql.catalyst.util.DateTimeUtils, toJavaTimestamp, line 204
>>
>> Best regards,
>> Anton
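
To make the discussion concrete, here is a minimal Scala sketch (not the actual
DateTimeUtils code, and not the PR itself) of what truncation to microsecond
precision could look like when handling the fractional-seconds digits of a
timestamp string; the object and method names are hypothetical:

object TimestampTruncationSketch {

  // Converts the digits after the decimal point (e.g. "000000001") into
  // microseconds, discarding anything below microsecond precision instead of
  // misreading the extra digits.
  def fractionToMicros(fraction: String): Long = {
    require(fraction.nonEmpty && fraction.forall(_.isDigit), s"invalid fraction: $fraction")
    // Keep at most 6 digits (microseconds); pad shorter fractions with zeros.
    fraction.take(6).padTo(6, '0').toLong
  }

  def main(args: Array[String]): Unit = {
    println(fractionToMicros("000000001")) // 0, not the wrong value 1 from the example above
    println(fractionToMicros("000001"))    // 1
    println(fractionToMicros("123456789")) // 123456
  }
}

With this kind of truncation, '2015-01-02 00:00:00.000000001' would come out as
2015-01-02 00:00:00 rather than 2015-01-02 00:00:00.000001.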