Hi Lucas,
The assessments from Wes and Li are right on. Just to add to that, and
unfortunately make things even more complicated: Spark does not always
use the config "spark.sql.session.timeZone", so it doesn't really help
with your example. It would be used if instead you generated timestamps t…
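A minimal sketch of the kind of thing being described (not code from
this thread; it assumes a local Spark session, and exactly what show()
prints depends on your local time zone):

    from datetime import datetime
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Timestamps built from Python datetime objects are internalized
    # when the DataFrame is created, so changing the session time zone
    # afterwards does not re-interpret the original wall-clock values.
    df = spark.createDataFrame([(datetime(2017, 8, 28, 12, 0),)], ["ts"])
    spark.conf.set("spark.sql.session.timeZone", "UTC")
    df.show()  # rendered in UTC, which may not match the input wall clock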
Lucas,
Wes' explanation is correct. If you are using Spark 2.2, you can set the
Spark config "spark.sql.session.timeZone" to "UTC".
I have written some documentation explaining this. I can clean it up for
ARROW-1425.
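For example, a minimal sketch of setting that config up front (the
builder-style config behaves the same as putting it in
spark-defaults.conf):

    from pyspark.sql import SparkSession

    # Set the session time zone to UTC when building the session
    # (the config is honored in Spark 2.2+).
    spark = (SparkSession.builder
             .config("spark.sql.session.timeZone", "UTC")
             .getOrCreate())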
On Mon, Aug 28, 2017 at 5:23 PM, Wes McKinney wrote:
see https://issues.apache.org/jira/browse/ARROW-1425
On Mon, Aug 28, 2017 at 12:32 PM, Wes McKinney wrote:
hi Lucas,
Bryan Cutler, Holden Karau, Li Jin, or someone with deeper knowledge
of the Spark timestamp issue (which is known, and not a bug per se)
should be able to give some extra context about this.
My understanding is that when you read timezone-naive data in Spark,
it is treated as session-local time…
Here is the pyspark script I used to see this difference.
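(The script itself didn't come through above; a minimal sketch of that
kind of comparison, assuming a Spark build where the Arrow-backed
toPandas path can be toggled via spark.sql.execution.arrow.enabled, as
in Spark 2.3+:)

    from datetime import datetime
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(datetime(2017, 8, 28, 12, 0),)], ["ts"])

    # Plain toPandas() versus the Arrow-accelerated path.
    spark.conf.set("spark.sql.execution.arrow.enabled", "false")
    print(df.toPandas())

    spark.conf.set("spark.sql.execution.arrow.enabled", "true")
    print(df.toPandas())  # may differ from the above by the UTC offset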
On Mon, 28 Aug 2017 at 09:20 Lucas Pickup wrote:
> Hi all,
>
> Very sorry if people already responded to this at:
> lucas.pic...@microsoft.com. There was an INVALID identifier attached to
> the end of the reply address for some reason, which…
Quick follow up. I'm trying to work around this myself in the meantime.
The goal is to qualify the TimestampValue with a timezone (by creating a
new column in the Arrow table based off the previous one). If this can
be done before the values are converted to Python, it may fix the issue
I was having.
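For what it's worth, with current pyarrow that kind of column swap looks
roughly like this (a sketch; the column name "ts" and the table
construction are stand-ins, and the 2017-era pyarrow API differed):

    import datetime
    import pyarrow as pa

    # Stand-in for the table Spark hands back: a single timezone-naive
    # timestamp column named "ts".
    table = pa.table({"ts": pa.array(
        [datetime.datetime(2017, 8, 28, 12, 0)], type=pa.timestamp("us"))})

    idx = table.schema.get_field_index("ts")
    tz_type = pa.timestamp("us", tz="UTC")

    # Cast the naive column to a timezone-qualified type and swap it in,
    # so a later to_pandas() produces tz-aware values.
    table = table.set_column(idx, pa.field("ts", tz_type),
                             table.column(idx).cast(tz_type))
    print(table.schema)  # ts: timestamp[us, tz=UTC]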