
Re: Dataset schema incompatibility bug when reading column partitioned data

2019-04-13 Thread Felix Cheung

I kinda agree it is confusing when a parameter is not used...

From: Ryan Blue
Sent: Thursday, April 11, 2019 11:07:25 AM
To: Bruce Robbins
Cc: Dávid Szakállas; Spark Dev List
Subject: Re: Dataset schema incompatibility bug when reading column partitioned data

Re: Dataset schema incompatibility bug when reading column partitioned data

2019-04-11 Thread Ryan Blue

I think the confusion is that the schema passed to spark.read is not a projection schema. I don’t think it is even used in this case because the Parquet dataset has its own schema. You’re getting the schema of the table. I think the correct behavior is to reject a user-specified schema in this case
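To make the distinction in the message above concrete, here is a minimal sketch in plain Python. It is not Spark code and does not reproduce Spark's actual behavior. The names `resolve_read_schema`, `project`, and `UserSchemaRejected` are hypothetical, invented only for illustration. It models a table that has its own schema, a projection as a separate column-selection step, and the suggested behavior of rejecting a user-specified schema rather than silently ignoring it.

```python
from typing import List, Optional, Tuple

# A schema is modeled as an ordered list of (column name, type name) pairs.
Schema = List[Tuple[str, str]]


class UserSchemaRejected(ValueError):
    """Hypothetical error: the source already defines its own schema."""


def resolve_read_schema(table_schema: Schema,
                        user_schema: Optional[Schema]) -> Schema:
    """Model of the suggested behavior: the table's schema always wins,
    and a user-specified schema is rejected instead of being ignored."""
    if user_schema is not None:
        raise UserSchemaRejected(
            "source has its own schema; select columns to project instead"
        )
    return table_schema


def project(schema: Schema, columns: List[str]) -> Schema:
    """A projection: keep only the named columns, in the requested order."""
    by_name = dict(schema)
    missing = [c for c in columns if c not in by_name]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return [(c, by_name[c]) for c in columns]


# Illustrative table schema (made-up columns, with "part" standing in
# for a partition column).
table = [("id", "long"), ("value", "string"), ("part", "int")]

full = resolve_read_schema(table, None)  # the schema of the table
subset = project(full, ["part", "id"])   # an explicit projection
```

In this model, narrowing or reordering columns is a step applied after the read, so an unused schema parameter can never give the impression that it changed the result.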

Re: Dataset schema incompatibility bug when reading column partitioned data

2019-04-11 Thread Bruce Robbins

I see a Jira: https://issues.apache.org/jira/browse/SPARK-21021

On Thu, Apr 11, 2019 at 9:08 AM Dávid Szakállas wrote:
> +dev for more visibility. Is this a known issue? Is there a plan for a fix?
>
> Thanks,
> David
>
> Begin forwarded message:
>
> *From: *Dávid Szakállas
> *Subject: **Datase