On Fri, Oct 21, 2016 at 8:40 PM, Koert Kuipers wrote:
> This rather innocent looking optimization flag nullable has caused a lot
> of bugs... Makes me wonder if we are better off without it
>
Yes... my most regretted design decision :(
Please give thoughts here: https://issues.apache.org/jira/b
This rather innocent looking optimization flag nullable has caused a lot of
bugs... Makes me wonder if we are better off without it
On Oct 21, 2016 8:37 PM, "Muthu Jayakumar" wrote:
Thanks Cheng Lian for opening the JIRA. I found this with Spark 2.0.0.
Thanks,
Muthu
On Fri, Oct 21, 2016 at 3:30 PM, Cheng Lian wrote:
Yea, confirmed. While analyzing unions, we treat StructTypes with
different field nullabilities as incompatible types and throw this error.
Opened https://issues.apache.org/jira/browse/SPARK-18058 to track this
issue. Thanks for reporting!
Cheng
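For readers following the thread, here is a minimal sketch of the behavior Cheng describes. It is a toy model in plain Python, not Spark's analyzer code: the `Field` and `Struct` classes and the `same_type_ignoring_nullability` helper are hypothetical names invented for illustration, and the type of the nested `values` field is simplified to the string "array". It shows how strict equality rejects two schemas that differ only in a nested nullable flag, while a comparison that ignores nullability accepts them.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Field:
    """Toy stand-in for a schema field (hypothetical, not Spark's StructField)."""
    name: str
    dtype: Union[str, "Struct"]
    nullable: bool = True


@dataclass(frozen=True)
class Struct:
    """Toy stand-in for a struct type (hypothetical, not Spark's StructType)."""
    fields: Tuple[Field, ...]


def same_type_ignoring_nullability(a, b):
    """Compare two toy types structurally, ignoring every nullable flag."""
    if isinstance(a, Struct) and isinstance(b, Struct):
        return len(a.fields) == len(b.fields) and all(
            x.name == y.name and same_type_ignoring_nullability(x.dtype, y.dtype)
            for x, y in zip(a.fields, b.fields)
        )
    # Leaf types (plain strings here) must match exactly.
    return a == b


# Schema as read from disk: everything nullable.
from_parquet = Struct((
    Field("task_id", "string", nullable=True),
    Field("some_histogram", Struct((Field("values", "array", nullable=True),)), nullable=True),
))

# Schema built in memory: the nested field is marked non-nullable.
in_memory = Struct((
    Field("task_id", "string", nullable=True),
    Field("some_histogram", Struct((Field("values", "array", nullable=False),)), nullable=True),
))

strictly_equal = from_parquet == in_memory
compatible = same_type_ignoring_nullability(from_parquet, in_memory)
```

Under these assumptions, `strictly_equal` is false and `compatible` is true, which mirrors the gap between what the union check did and what a user would expect.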
On 10/21/16 3:15 PM, Cheng Lian wrote:
Hi Muthu,
What version of Spark are you using? This seems to be a bug in
the analysis phase.
Cheng
On 10/21/16 12:50 PM, Muthu Jayakumar wrote:
Sorry for the late response. Here is what I am seeing...
Schema from parquet file.
d1.printSchema()
root
 |-- task_id: string (nullable = true)
 |-- task_name: string (nullable = true)
 |-- some_histogram: struct (nullable = true)
 |    |-- values: array (nullable = true)
 |    |    |-- element
What is the issue you see when unioning?
On Wed, Oct 19, 2016 at 6:39 PM, Muthu Jayakumar wrote:
Hello Michael,
Thank you for looking into this query. In my case there seems to be an issue
when I union a parquet file read from disk versus another dataframe that I
construct in-memory. The only difference I see is the containsNull = true.
In fact, I do not see any errors with union on the simple
Nullable is just a hint to the optimizer that it's impossible for there to
be a null value in this column, so that it can avoid generating code for
null-checks. When in doubt, we set nullable=true since it is always safer
to check.
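
As a rough illustration of the point above, the sketch below builds a per-column function with or without a null check depending on a nullable flag. It is a toy model in plain Python and does not reflect how Spark actually generates code; `make_upper` is a hypothetical helper invented for this example.

```python
def make_upper(nullable):
    """Build an upper-casing function for a column (toy stand-in for generated code)."""
    if nullable:
        def upper(value):
            # Null check is included because the column may contain None.
            return None if value is None else value.upper()
    else:
        def upper(value):
            # No null check: the schema promised the column never holds None.
            return value.upper()
    return upper


safe = make_upper(nullable=True)   # always correct, slightly more work per row
fast = make_upper(nullable=False)  # correct only if the promise holds
```

If the promise is wrong, `fast(None)` fails with an `AttributeError`, while `safe(None)` simply returns `None`. That is the sense in which setting nullable=true is the safer default.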
Why in particular are you trying to change the nullability of the