Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-26 Thread Bart Samwel
On Thu, Jun 25, 2020 at 10:41 PM Rylan Dmello wrote:
> Hello Bart,
>
> Thank you for sharing these links, this was exactly what Tahsin and I were looking for.

Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-26 Thread Maxim Gekk
> ng for. It looks like there has been a lot of discussion about this already, which is good to see.
>
> In one of these pull requests, there is a comment about the number of real-world use-cases for some kind of TimeType in Spark. We cou

Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-26 Thread Bart Samwel
> ld add our use-case of compatibility with Parquet's TimeType as a use-case for a new Spark TimeType.
>
> Would it be helpful to collect/document these TimeType use-cases to gauge interest? We could add a new story or comment in the Spark JIRA or

Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-26 Thread Maxim Gekk
> meType use-cases to gauge interest? We could add a new story or comment in the Spark JIRA or a page on the Apache Confluence if that helps.
>
> Rylan
> ----------
> *From:* Bart Samwel
> *Sent:* Wednesday, June 24, 2020 4:08

Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-26 Thread Bart Samwel
> Confluence if that helps.
>
> Rylan
> --
> *From:* Bart Samwel
> *Sent:* Wednesday, June 24, 2020 4:08 PM
> *To:* Rylan Dmello
> *Cc:* dev@spark.apache.org; Tahsin Hassan <thas...@mathworks.com>
> *Subject:* Re: [Spark SQL] Question

Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-25 Thread Rylan Dmello
pache Confluence if that helps.
Rylan
From: Bart Samwel
Sent: Wednesday, June 24, 2020 4:08 PM
To: Rylan Dmello
Cc: dev@spark.apache.org; Tahsin Hassan
Subject: Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files
The relevant ea

Re: [Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-24 Thread Bart Samwel
The relevant earlier discussion is here: https://github.com/apache/spark/pull/25678#issuecomment-531585556. (FWIW, a recent PR tried adding this again: https://github.com/apache/spark/pull/28858.)
On Wed, Jun 24, 2020 at 10:01 PM Rylan Dmello wrote:
> Hello,
>
> Tahsin and I are trying to use

[Spark SQL] Question about support for TimeType columns in Apache Parquet files

2020-06-24 Thread Rylan Dmello
Hello, Tahsin and I are trying to use the Apache Parquet file format with Spark SQL, but are running into errors when reading Parquet files that contain TimeType columns. We're wondering whether this is unsupported in Spark SQL due to an architectural limitation, or due to lack of resources?
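For context on the type in question: in the Parquet format, a TIME logical value is a time of day with no date part, stored as an integer count since midnight (int32 for milliseconds, int64 for microseconds or nanoseconds). The snippet below is only a rough sketch of what a TIME(MICROS) value means, written in plain Python. It is not from this thread and is not Spark or Parquet API; the name `micros_to_time` is hypothetical and used for illustration only.

```python
from datetime import time

MICROS_PER_SECOND = 1_000_000
MICROS_PER_DAY = 86_400 * MICROS_PER_SECOND


def micros_to_time(micros_since_midnight: int) -> time:
    """Interpret a raw int64 as a time of day (hypothetical helper).

    Assumes the value is a count of microseconds since midnight,
    which is how a Parquet TIME(MICROS) column is stored.
    """
    if not 0 <= micros_since_midnight < MICROS_PER_DAY:
        raise ValueError("value is outside a single day")
    # Split off the sub-second part, then peel off seconds and minutes.
    total_seconds, micros = divmod(micros_since_midnight, MICROS_PER_SECOND)
    total_minutes, second = divmod(total_seconds, 60)
    hour, minute = divmod(total_minutes, 60)
    return time(hour, minute, second, micros)


# 45_296 seconds after midnight is 12:34:56, plus 123 microseconds.
print(micros_to_time(45_296_000_123))
```

Since the stored value is just an integer, one conceivable stopgap is to carry such a column as a plain integer and convert it in application code, at the cost of losing the type information in the schema. Whether that fits a given pipeline is a separate question from the one asked above.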