Hey Sean

Thanks for the reply. Indeed, I verified that the values are consistent
with Java, so the behavior comes from Java's time-zone handling rather than
from Spark. Thanks!
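
A minimal sketch of such a check, assuming java.time and the JVM's bundled
tzdata (not necessarily the exact code I ran):

import java.time.{LocalDateTime, ZoneId, ZoneOffset}

// Resolve "IST" the way java.time does (it is a legacy short ID for
// Asia/Kolkata) and print the historical offset tzdata records for each
// instant. tzdata gives Kolkata local mean time (+05:53:28) for the oldest
// dates and Madras time (+05:21:10) around 1900, matching Spark's output
// below.
val zone = ZoneId.of("IST", ZoneId.SHORT_IDS)
Seq("0001-01-01T00:00:00", "1799-12-31T00:00:00", "1850-12-31T00:00:00",
    "1900-01-01T00:00:00", "1970-01-01T00:00:00").foreach { s =>
  val instant = LocalDateTime.parse(s).toInstant(ZoneOffset.UTC)
  println(s"$s UTC -> ${zone.getRules.getOffset(instant)}")
}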

Regards.

Ankit Prakash Gupta

On Fri, Sep 6, 2024 at 8:37 AM Sean Owen <sro...@gmail.com> wrote:

> Are you sure those are incorrect? Or at least, are they not consistent with
> Java? For dates really far in the past, the exact mapping gets complex due
> to calendar changes over time; the calendar we all use today didn't even
> exist 2000 years ago.
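>
> For instance, a minimal sketch of the calendar difference (assuming JVM
> defaults; java.sql.Timestamp uses the hybrid Julian/Gregorian calendar,
> while java.time uses the proleptic Gregorian calendar):
>
> java.util.TimeZone.setDefault(java.util.TimeZone.getTimeZone("UTC"))
> // Timestamp interprets the label on the hybrid Julian/Gregorian calendar...
> val hybrid = java.sql.Timestamp.valueOf("0001-01-01 00:00:00").toInstant
> // ...while java.time interprets the same label proleptically.
> val proleptic = java.time.LocalDateTime.parse("0001-01-01T00:00:00")
>   .toInstant(java.time.ZoneOffset.UTC)
> println(hybrid)    // a couple of days away from 0001-01-01T00:00:00Z
> println(proleptic) // 0001-01-01T00:00:00Z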
>
> On Thu, Sep 5, 2024, 9:55 PM Ankit Gupta <info.ank...@gmail.com> wrote:
>
>> Hi Dev Community
>>
>> I came across a weird bug in the Spark SQL function `from_utc_timestamp`:
>> its values are not consistent. When converting from UTC to another time
>> zone, such as IST below, the results look erratic. I have already created a
>> JIRA ticket for this.
>>
>> Any thoughts on how we can avoid this?
>>
>> For example
>>
>>
>>> // auto-imported in spark-shell; needed in a standalone application
>>> import org.apache.spark.sql.functions.from_utc_timestamp
>>> import spark.implicits._
>>>
>>> java.util.TimeZone.setDefault(java.util.TimeZone.getTimeZone("UTC"))
>>> val df = Seq(
>>>   java.sql.Timestamp.valueOf("0001-01-01 00:00:00"),
>>>   java.sql.Timestamp.valueOf("1900-01-01 00:00:00"),
>>>   java.sql.Timestamp.valueOf("1799-12-31 00:00:00"),
>>>   java.sql.Timestamp.valueOf("1850-12-31 00:00:00"),
>>>   new java.sql.Timestamp(0)  // the epoch, 1970-01-01 00:00:00 UTC
>>> ).toDF("ts")
>>> df.withColumn("ts_trans", from_utc_timestamp($"ts", "IST")).show
>>>
>>>
>>> // Exiting paste mode, now interpreting.
>>>
>>> +-------------------+-------------------+
>>> |                 ts|           ts_trans|
>>> +-------------------+-------------------+
>>> |0001-01-01 00:00:00|0001-01-01 05:53:28|
>>> |1900-01-01 00:00:00|1900-01-01 05:21:10|
>>> |1799-12-31 00:00:00|1799-12-31 05:53:28|
>>> |1850-12-31 00:00:00|1850-12-31 05:53:28|
>>> |1970-01-01 00:00:00|1970-01-01 05:30:00|
>>> +-------------------+-------------------+
>>
>>
>>  Thanks and Regards.
>>
>> Ankit Prakash Gupta
>>
>
