for example (the log shows when it creates a kryo encoder):

    scala> implicitly[EncoderEvidence[Option[Seq[String]]]].encoder
    res5: org.apache.spark.sql.Encoder[Option[Seq[String]]] = class[value[0]: array<string>]

    scala> implicitly[EncoderEvidence[Option[Set[String]]]].encoder
    dataframe.EncoderEvidence$: using kryo encoder for scala.Option[Set[String]]
    res6: org.apache.spark.sql.Encoder[Option[Set[String]]] = class[value[0]: binary]
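[A minimal sketch of what such an EncoderEvidence typeclass might look like. The EncoderEvidence name appears in the log above; everything else, including the ExpressionEvidence marker trait used to vet element types, is an assumed reconstruction, not the actual internal code:]

    import scala.reflect.ClassTag
    import scala.reflect.runtime.universe.TypeTag
    import org.apache.spark.sql.{Encoder, Encoders}
    // ExpressionEncoder is internal Catalyst API, but it is what backs
    // the non-binary encoders shown in the log above.
    import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

    // General evidence: some Encoder for T, possibly kryo-based.
    trait EncoderEvidence[T] extends Serializable { def encoder: Encoder[T] }

    // Marker for evidence backed by an ExpressionEncoder. Container
    // derivations require this of their element types, so an element that
    // needs kryo forces the whole outer type onto the kryo fallback.
    trait ExpressionEvidence[T] extends EncoderEvidence[T]

    trait LowPriorityEncoderEvidence {
      // Fallback: chosen only when no expression-backed evidence is derivable.
      implicit def kryoEvidence[T: ClassTag]: EncoderEvidence[T] =
        new EncoderEvidence[T] { def encoder: Encoder[T] = Encoders.kryo[T] }
    }

    object EncoderEvidence extends LowPriorityEncoderEvidence {
      private def expression[T: TypeTag]: ExpressionEvidence[T] =
        new ExpressionEvidence[T] { def encoder: Encoder[T] = ExpressionEncoder[T]() }

      implicit val intEvidence: ExpressionEvidence[Int] = expression[Int]
      implicit val stringEvidence: ExpressionEvidence[String] = expression[String]

      // The element evidence is demanded purely as a compile-time check;
      // the encoder itself is built from the full TypeTag.
      implicit def seqEvidence[T](
          implicit ev: ExpressionEvidence[T],
          tt: TypeTag[Seq[T]]): ExpressionEvidence[Seq[T]] =
        expression[Seq[T]]

      implicit def optionEvidence[T](
          implicit ev: ExpressionEvidence[T],
          tt: TypeTag[Option[T]]): ExpressionEvidence[Option[T]] =
        expression[Option[T]]

      implicit def tuple2Evidence[A, B](
          implicit a: ExpressionEvidence[A],
          b: ExpressionEvidence[B],
          tt: TypeTag[(A, B)]): ExpressionEvidence[(A, B)] =
        expression[(A, B)]
    }

[With these instances, Option[Seq[String]] resolves to the expression-backed evidence, while Option[Set[String]] finds no ExpressionEvidence for Set and falls through to the kryo fallback, matching res5 and res6 above.]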
On Wed, Oct 26, 2016 at 4:00 PM, Koert Kuipers <ko...@tresata.com> wrote:

> Why would generating implicits for ProductN, where you also require the
> elements of the Product to have an expression encoder, not work?
>
> We do this, and then we have a generic fallback that produces a kryo
> encoder.
>
> For us the result is that, say, an implicit for
> Seq[(Int, Seq[(String, Int)])] will create a new ExpressionEncoder(),
> while an implicit for Seq[(Int, Set[(String, Int)])] produces an
> Encoders.kryoEncoder().
>
> On Wed, Oct 26, 2016 at 3:50 PM, Michael Armbrust <mich...@databricks.com> wrote:
>
>> Sorry, I realize that Set is only one example here, but I don't think
>> that narrowing the type of the implicit to include only ProductN or
>> something similar eliminates the issue. Even with that change, we will
>> fail to generate an encoder with the same error if you, for example,
>> have a field of your case class that is an unsupported type.
>>
>> Short of changing this to compile-time macros, I think we are stuck
>> with this class of errors at runtime. The simplest solution seems to be
>> to expand the set of things we can handle as much as possible and allow
>> users to turn on a kryo fallback for expression encoders. I'd be
>> hesitant to make this the default, though, as behavior would change
>> with each release that adds support for more types. I would be very
>> supportive of making this fallback a built-in option.
>>
>> On Wed, Oct 26, 2016 at 11:47 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> Yup, it doesn't really solve the underlying issue.
>>>
>>> We fixed it internally by having our own typeclass that produces
>>> encoders and that does check the contents of the products, but we did
>>> this by simply supporting Tuple1 through Tuple22 and Option explicitly,
>>> and not supporting Product, since we don't have a need for case
>>> classes.
>>>
>>> If case classes extended ProductN (which I think they will in Scala
>>> 2.12?), then we could drop Product and support Product1 through
>>> Product22 and Option explicitly while checking the classes they
>>> contain. That would be the cleanest.
>>>
>>> On Wed, Oct 26, 2016 at 2:33 PM, Ryan Blue <rb...@netflix.com> wrote:
>>>
>>>> Isn't the problem that Option is a Product and the class it contains
>>>> isn't checked? Adding support for Set fixes the example, but the
>>>> problem would happen with any class there isn't an encoder for, right?
>>>>
>>>> On Wed, Oct 26, 2016 at 11:18 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>>>
>>>>> Hmm, that is unfortunate. Maybe the best solution is to add support
>>>>> for sets? I don't think that would be super hard.
>>>>>
>>>>> On Tue, Oct 25, 2016 at 8:52 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>
>>>>>> I am trying to use encoders as a typeclass where, if it fails to
>>>>>> find an ExpressionEncoder, it falls back to a kryo encoder.
>>>>>>
>>>>>> The issue seems to be that ExpressionEncoder claims a little more
>>>>>> than it can handle here:
>>>>>>
>>>>>>   implicit def newProductEncoder[T <: Product : TypeTag]: Encoder[T] =
>>>>>>     Encoders.product[T]
>>>>>>
>>>>>> This "claims" to handle, for example, Option[Set[Int]], but it
>>>>>> really cannot handle Set, so it leads to a runtime exception.
>>>>>>
>>>>>> Would it be useful to make this a little more specific?
>>>>>> I guess the challenge is going to be case classes, which
>>>>>> unfortunately don't extend Product1, Product2, etc.
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
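[For reference, a minimal standalone reproduction of the issue discussed in the thread. Encoders.product is the public entry point behind newProductEncoder; the exact exception text is an assumption based on Spark 2.0-era behavior and may differ across versions:]

    import org.apache.spark.sql.{Encoder, Encoders}

    object SetEncoderRepro {
      def main(args: Array[String]): Unit = {
        // Compiles fine: Option extends Product, and Encoders.product only
        // requires T <: Product, without inspecting what T contains...
        val enc: Encoder[Option[Set[Int]]] = Encoders.product[Option[Set[Int]]]

        // ...but the line above fails at runtime while building the
        // encoder, with something like:
        //   java.lang.UnsupportedOperationException:
        //     No Encoder found for scala.collection.immutable.Set[Int]
        println(enc.schema) // never reached
      }
    }

[Checking element types at implicit-resolution time, as in the sketch near the top of the thread, turns this runtime failure into a compile-time fallback decision.]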