I just revisited the issue and realized it is resolved in
Spark 3.0.0. ExpressionEncoder was refactored in Spark 3.0.0, and schema was
removed as part of the refactor. That seems to have been the root cause, as schema and
the data types of the serializer don't match in such a case. ExpressionEncoder in
I meant that how Spark interprets Java Beans is not consistently defined.
Contrary to your guess, in most paths Spark uses "read-only" properties.
(All the existing tests that failed in my experiment have "read-only"
properties.) The problematic case is when a Java bean is used for read-write;
one case i
Java Beans are well-defined; it's valid to have a getter- or
setter-only property. That doesn't mean Spark can meaningfully use
such a property, as it typically has to both read and write it. I
guess it depends on context. For example, I don't see how you can have
a deserializer without setters,
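To make the getter-only vs. read-write distinction concrete, here is a minimal sketch using the JDK's java.beans.Introspector. It is not Spark's code; the Event bean and the helper method names are hypothetical and exist only for illustration. It lists the properties that have a getter, and then the subset that also has a setter.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BeanPropertyDemo {

    // Hypothetical bean: "id" is read-write, "label" has a getter only.
    public static class Event {
        private long id;

        public long getId() { return id; }
        public void setId(long id) { this.id = id; }

        // Derived value: valid as a bean property, but there is nothing to set.
        public String getLabel() { return "event-" + id; }
    }

    // Properties that can be read (a getter exists, setter optional).
    public static List<String> readableProperties(Class<?> beanClass) {
        return select(beanClass, false);
    }

    // Properties that can be both read and written (getter and setter exist).
    public static List<String> readWriteProperties(Class<?> beanClass) {
        return select(beanClass, true);
    }

    private static List<String> select(Class<?> beanClass, boolean requireSetter) {
        List<String> names = new ArrayList<>();
        try {
            // Stop at Object.class so the synthetic "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                boolean hasGetter = pd.getReadMethod() != null;
                boolean hasSetter = pd.getWriteMethod() != null;
                if (hasGetter && (hasSetter || !requireSetter)) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        // Sort explicitly so the result does not depend on descriptor order.
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("readable:   " + readableProperties(Event.class));
        System.out.println("read-write: " + readWriteProperties(Event.class));
    }
}
```

For this bean the two lists differ: the readable set contains both id and label, while the read-write set contains only id. A schema built from the first set and a deserializer built from the second would disagree on the columns, which is the kind of mismatch being discussed.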
OK, I just went through the change, and it breaks a bunch of existing
UTs.
https://github.com/apache/spark/pull/28611
Note that I modified all the cases where Spark extracts the columns for
"read method"-only properties to require both "read" & "write". It doesn't only
change the code path of Encode
The first case is not tied to batch / streaming, as Encoders.bean simply
fails when inferring the schema.
The second case is tied to streaming, and I've described the reason in my
last reply. I'm not sure we don't have a similar case for batch, though. (If
there're some operators only relying on the sequ
Is it a problem only for streaming, or does it affect batch queries as well?
On Fri, May 8, 2020 at 11:42 PM Jungtaek Lim
wrote:
> The first case in the user reports is straightforward: according to the report,
> the Avro-generated code contains a getter that refers to itself, so Spark
> disallows it (throws exc
The first case in the user reports is straightforward: according to the report,
the Avro-generated code contains a getter that refers to itself, so Spark
disallows it (throws an exception). However, it has no matching setter method
(if I understand correctly), so technically it shouldn't matter.
For the second c
Can you give some simple examples to demonstrate the problem? I think the
inconsistency could cause problems, but I don't see how.
On Fri, May 8, 2020 at 3:49 PM Jungtaek Lim
wrote:
> (bump to expose the discussion to more readers)
>
> On Mon, May 4, 2020 at 4:57 PM Jungtaek Lim
> wrote:
>
>> Hi
(bump to expose the discussion to more readers)
On Mon, May 4, 2020 at 4:57 PM Jungtaek Lim
wrote:
> Hi devs,
>
> A couple of issues reported on the user@ mailing list turn out to be
> caused by an inconsistent schema in Encoders.bean.
>
> 1. Typed datataset from Avro generat
Hi devs,
A couple of issues reported on the user@ mailing list turn out to be
caused by an inconsistent schema in Encoders.bean.
1. Typed datataset from Avro generated classes? [1]
2. spark structured streaming GroupState returns weird values from sate [2]
Below is a part of