Re: Change default timestamp offset on data load

2023-09-07 Thread Jack Goodson
Thanks Mich, I figured that might be the case. Regardless, I appreciate the help :)

On Thu, Sep 7, 2023 at 8:36 PM Mich Talebzadeh wrote:
> Hi,
>
> As far as I am aware there is no Spark or JVM setting that can make Spark
> assume a different timezone during the initial load from Parquet as Parquet
>

Re: Change default timestamp offset on data load

2023-09-07 Thread Mich Talebzadeh
Hi,

As far as I am aware there is no Spark or JVM setting that can make Spark assume a different timezone during the initial load from Parquet, as Parquet files store timestamps in UTC. The timezone conversion can be done (as I described before) after the load.

HTH

Mich Talebzadeh, Distinguished

Re: Change default timestamp offset on data load

2023-09-06 Thread Jack Goodson
Thanks Mich, sorry, I might have been a bit unclear in my original email. The timestamps are getting loaded as 2003-11-24T09:02:32+ for example, but I want them loaded as 2003-11-24T09:02:32+1300. I know how to do this with various transformations; however, I'm wondering if there's any Spark or JVM s

Re: Change default timestamp offset on data load

2023-09-06 Thread Mich Talebzadeh
Hi Jack,

You may use from_utc_timestamp and to_utc_timestamp to see if they help.

from pyspark.sql.functions import from_utc_timestamp

You can read your Parquet file into a DF:

df = spark.read.parquet('parquet_file_path')

# Convert timestamps (assuming your column name) from UTC to Pacific/Auckla
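For readers following the thread, it may help to see that "convert from UTC to Pacific/Auckland" and "keep the same clock reading but label it +1300" are two different operations. The sketch below is plain Python using the standard zoneinfo module, not Spark; it only illustrates the arithmetic on the sample timestamp from the thread. The variable names are made up for illustration, and the mapping to from_utc_timestamp is an analogy rather than Spark's actual implementation.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

AUCKLAND = ZoneInfo("Pacific/Auckland")

# The wall-clock value from the thread, with no offset attached yet.
naive = datetime(2003, 11, 24, 9, 2, 32)

# Interpretation 1 (analogous to from_utc_timestamp): treat the value as a
# UTC instant and express that same instant in Auckland local time.
shifted = naive.replace(tzinfo=timezone.utc).astimezone(AUCKLAND)

# Interpretation 2 (what the +1300 request describes): keep the clock
# reading unchanged and declare that it was Auckland local time all along.
relabelled = naive.replace(tzinfo=AUCKLAND)

print(shifted.isoformat())     # 2003-11-24T22:02:32+13:00
print(relabelled.isoformat())  # 2003-11-24T09:02:32+13:00

# The relabelled value is a different instant: 13 hours earlier in UTC.
print(relabelled.astimezone(timezone.utc).isoformat())  # 2003-11-23T20:02:32+00:00
```

The second interpretation changes which instant the value refers to, which is why it has to be applied as an explicit transformation after the load.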