for example (the log shows when it creates a kryo encoder):

scala> implicitly[EncoderEvidence[Option[Seq[String]]]].encoder
res5: org.apache.spark.sql.Encoder[Option[Seq[String]]] = class[value[0]:
array<string>]

scala> implicitly[EncoderEvidence[Option[Set[String]]]].encoder
dataframe.EncoderEvidence$: using kryo encoder for scala.Option[Set[String]]
res6: org.apache.spark.sql.Encoder[Option[Set[String]]] = class[value[0]:
binary]
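
for reference, here is a minimal sketch of one way such a fallback
typeclass can be wired up (assumes spark 2.x; EncoderEvidence is the name
from the session above, the other names and instances are illustrative,
not the actual implementation):

import scala.reflect.ClassTag
import scala.reflect.runtime.universe.TypeTag
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

// evidence that T can be encoded; isExpression records whether the
// encoder is an ExpressionEncoder, so containers can refuse to nest kryo
class EncoderEvidence[T](val encoder: Encoder[T], val isExpression: Boolean)

trait LowPriorityEncoderEvidence {
  // generic fallback: serialize the whole value as kryo binary
  implicit def kryoEvidence[T](implicit ct: ClassTag[T]): EncoderEvidence[T] = {
    println(s"using kryo encoder for ${ct.runtimeClass.getName}")
    new EncoderEvidence(Encoders.kryo[T], isExpression = false)
  }
}

object EncoderEvidence extends LowPriorityEncoderEvidence {
  implicit val stringEvidence: EncoderEvidence[String] =
    new EncoderEvidence(Encoders.STRING, isExpression = true)

  private def derive[C: TypeTag : ClassTag](elemIsExpression: Boolean): EncoderEvidence[C] =
    if (elemIsExpression) new EncoderEvidence(ExpressionEncoder[C](), isExpression = true)
    else new EncoderEvidence(Encoders.kryo[C], isExpression = false)

  // Seq[T] and Option[T] stay expression-encoded only if T is; there is
  // deliberately no instance for Set, so anything containing a Set drops
  // down to the kryo fallback
  implicit def seqEvidence[T](implicit e: EncoderEvidence[T],
      tt: TypeTag[Seq[T]], ct: ClassTag[Seq[T]]): EncoderEvidence[Seq[T]] =
    derive[Seq[T]](e.isExpression)

  implicit def optionEvidence[T](implicit e: EncoderEvidence[T],
      tt: TypeTag[Option[T]], ct: ClassTag[Option[T]]): EncoderEvidence[Option[T]] =
    derive[Option[T]](e.isExpression)
}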

On Wed, Oct 26, 2016 at 4:00 PM, Koert Kuipers <ko...@tresata.com> wrote:

> why would generating implicits for ProductN, where you also require the
> elements of the Product to have an expression encoder, not work?
>
> we do this, and then we have a generic fallback that produces a kryo
> encoder.
>
> for us the result is that, say, an implicit for Seq[(Int, Seq[(String,
> Int)])] will create a new ExpressionEncoder(), while an implicit for
> Seq[(Int, Set[(String, Int)])] produces an Encoders.kryo() encoder
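>
> as a sketch (reusing the illustrative EncoderEvidence from the first
> message above), the Tuple2 instance can look something like this: the
> element evidences act as compile-time witnesses, and the encoder itself
> is still built from the TypeTag:
>
> implicit def tuple2Evidence[A, B](implicit a: EncoderEvidence[A],
>     b: EncoderEvidence[B], tt: TypeTag[(A, B)],
>     ct: ClassTag[(A, B)]): EncoderEvidence[(A, B)] =
>   if (a.isExpression && b.isExpression)
>     // both elements are expression-encodable: use a new ExpressionEncoder
>     new EncoderEvidence(ExpressionEncoder[(A, B)](), isExpression = true)
>   else
>     // some element needs kryo: fall back for the whole tuple
>     new EncoderEvidence(Encoders.kryo[(A, B)], isExpression = false)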
>
> On Wed, Oct 26, 2016 at 3:50 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> Sorry, I realize that set is only one example here, but I don't think
>> that making the type of the implicit more narrow to include only ProductN
>> or something eliminates the issue.  Even with that change, we will fail to
>> generate an encoder with the same error if you, for example, have a field
>> of your case class that is an unsupported type.
>>
>> Short of changing this to compile-time macros, I think we are stuck with
>> this class of errors at runtime.  The simplest solution seems to be to
>> expand the set of things we can handle as much as possible and allow users
>> to turn on a kryo fallback for expression encoders.  I'd be hesitant to
>> make this the default though, as behavior would change with each release
>> that adds support for more types.  I would be very supportive of making
>> this fallback a built-in option though.
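>>
>> to make the failure mode concrete, a hypothetical example (error message
>> approximate):
>>
>> // compiles, because Foo <: Product, but throws at runtime when the
>> // encoder is generated for the unsupported Set[String] field
>> case class Foo(id: Int, tags: Set[String])
>> val enc = org.apache.spark.sql.Encoders.product[Foo]
>> // java.lang.UnsupportedOperationException: No Encoder found for
>> //   scala.collection.immutable.Set[String]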
>>
>> On Wed, Oct 26, 2016 at 11:47 AM, Koert Kuipers <ko...@tresata.com>
>> wrote:
>>
>>> yup, it doesn't really solve the underlying issue.
>>>
>>> we fixed it internally by having our own typeclass that produces
>>> encoders and that does check the contents of the products, but we did this
>>> by simply supporting Tuple1 - Tuple22 and Option explicitly, and not
>>> supporting Product, since we don't have a need for case classes
>>>
>>> if case classes extended ProductN (which they will, i think, in scala
>>> 2.12?) then we could drop Product and support Product1 - Product22 and
>>> Option explicitly while checking the classes they contain. that would be
>>> the cleanest.
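>>>
>>> (a quick scala 2.11 REPL check of that distinction, for reference:
>>> tuples extend ProductN, but case classes only extend Product)
>>>
>>> scala> (1, "a").isInstanceOf[Product2[_, _]]
>>> res0: Boolean = true
>>>
>>> scala> case class P(a: Int, b: String)
>>> defined class P
>>>
>>> scala> P(1, "a").isInstanceOf[Product2[_, _]]
>>> res1: Boolean = false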
>>>
>>>
>>> On Wed, Oct 26, 2016 at 2:33 PM, Ryan Blue <rb...@netflix.com> wrote:
>>>
>>>> Isn't the problem that Option is a Product and the class it contains
>>>> isn't checked? Adding support for Set fixes the example, but the problem
>>>> would happen with any class there isn't an encoder for, right?
>>>>
>>>> On Wed, Oct 26, 2016 at 11:18 AM, Michael Armbrust <
>>>> mich...@databricks.com> wrote:
>>>>
>>>>> Hmm, that is unfortunate.  Maybe the best solution is to add support
>>>>> for sets?  I don't think that would be super hard.
>>>>>
>>>>> On Tue, Oct 25, 2016 at 8:52 PM, Koert Kuipers <ko...@tresata.com>
>>>>> wrote:
>>>>>
>>>>>> i am trying to use encoders as a typeclass where, if it fails to find
>>>>>> an ExpressionEncoder, it falls back to a kryo encoder.
>>>>>>
>>>>>> the issue seems to be that ExpressionEncoder claims a little more
>>>>>> than it can handle here:
>>>>>>
>>>>>>   implicit def newProductEncoder[T <: Product : TypeTag]: Encoder[T] =
>>>>>>     Encoders.product[T]
>>>>>>
>>>>>> this "claims" to handle for example Option[Set[Int]], but it really
>>>>>> cannot handle Set so it leads to a runtime exception.
>>>>>>
>>>>>> would it be useful to make this a little more specific? i guess the
>>>>>> challenge is going to be case classes, which unfortunately don't extend
>>>>>> Product1, Product2, etc.
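>>>>>>
>>>>>> for example (hypothetical session, error message approximate):
>>>>>>
>>>>>> scala> org.apache.spark.sql.Encoders.product[Option[Set[Int]]]
>>>>>> java.lang.UnsupportedOperationException: No Encoder found for
>>>>>>   scala.collection.immutable.Set[Int]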
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>>>
>>>
>>>
>>
>
