Re: Serialization issues with case classes with Flink

2020-04-19 Thread Som Lima
o 10).map(y => C(y.toInt)) > > xs.count > > > > benv.execute("Count Collection") > > > > Cheers > > Ravi Pullareddy > > > > *From:* Juan Rodríguez Hortalá > *Sent:* Monday, 20 April 2020 2:44 AM > *To:* users@zeppelin.apache.org > *Subjec

RE: Serialization issues with case classes with Flink

2020-04-19 Thread Ravi Pullareddy
ot;Count Collection") Cheers Ravi Pullareddy *From:* Juan Rodríguez Hortalá *Sent:* Monday, 20 April 2020 2:44 AM *To:* users@zeppelin.apache.org *Subject:* Re: Serialization issues with case classes with Flink Hi Jeff, I’ll try using POJO clases instead. Thanks, Juan El E

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Juan Rodríguez Hortalá
;> >>> Note I was running Zeppelin in a docker container in my laptop, also >>> using Flink's local mode >>> >>> I think this is a problem with the Zeppelin integratin with Flink, and >>> how it processes case class definitions in a cell. >>> >>>

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Jeff Zhang
gt; >> I think this is a problem with the Zeppelin integratin with Flink, and >> how it processes case class definitions in a cell. >> >> Thanks for your answer, >> >> Juan >> >> >> On Sun, Apr 19, 2020 at 12:22 PM Ravi Pullareddy < >&g

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Juan Rodríguez Hortalá
Sun, Apr 19, 2020 at 12:22 PM Ravi Pullareddy < > ravi.pullare...@minlog.com.au> wrote: > >> Hi Juan >> >> >> >> I have written various applications for continuous processing of csv >> files. Please post your entire code and how you are mapping. It be

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Juan Rodríguez Hortalá
> > > > > *From:* Juan Rodríguez Hortalá > *Sent:* Sunday, 19 April 2020 6:25 PM > *To:* users@zeppelin.apache.org > *Subject:* Re: Serialization issues with case classes with Flink > > > > Just for the record, the Spark version of that works fine: > >

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Zahid Rahman
You may also want to look at this https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html On Sun, 19 Apr 2020, 09:05 Juan Rodríguez Hortalá, < juan.rodriguez.hort...@gmail.com> wrote: > Hi, > > I'm using the Flink interpreter and the benv environment. I'm reading some > cs

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Som Lima
You may wish to have a look at Apache avro project. It shows you how to Generare source code for schema classes for creating objects to be used in reading csv files etc. for the purpose of serialisation and deserialization. https://avro.apache.org/docs/current/gettingstartedjava.html The avro

RE: Serialization issues with case classes with Flink

2020-04-19 Thread Ravi Pullareddy
@zeppelin.apache.org *Subject:* Re: Serialization issues with case classes with Flink Just for the record, the Spark version of that works fine: ``` %spark case class C2(x: Int) val xs = sc.parallelize(1 to 10) val csSpark = xs.map{C2(_)} csSpark.collect res3: Array[C2] = Array(C2(1), C2(2

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Juan Rodríguez Hortalá
Just for the record, the Spark version of that works fine: ``` %spark case class C2(x: Int) val xs = sc.parallelize(1 to 10) val csSpark = xs.map{C2(_)} csSpark.collect res3: Array[C2] = Array(C2(1), C2(2), C2(3), C2(4), C2(5), C2(6), C2(7), C2(8), C2(9), C2(10)) ``` Thanks, Juan On Sun, Ap

Re: Serialization issues with case classes with Flink

2020-04-19 Thread Juan Rodríguez Hortalá
Minimal reproduction: - Fist option ```scala case class C(x: Int) val xs = benv.fromCollection(1 to 10) val cs = xs.map{C(_)} cs.count ``` defined class C xs: org.apache.flink.api.scala.DataSet[Int] = org.apache.flink.api.scala.DataSet@39c713c6 cs: org.apache.flink.api.scala.DataSet[C] = or

Serialization issues with case classes with Flink

2020-04-19 Thread Juan Rodríguez Hortalá
Hi, I'm using the Flink interpreter and the benv environment. I'm reading some csv files using benv.readCsvFile and it works ok. I have also defined a case class C for the csv records. The problem happens when I apply a map operation on the DataSet of tuples returned by benv.readCsvFile, to conver