Thanks again Eugen! I don't get the point: why do you prefer to avoid Kryo serialization for closures? Is there any problem with that?
On Apr 17, 2014 11:10 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
> There are two kinds of serialization: data and closures. Both use Java
> serialization by default. If your function references an object defined
> outside of it, that object gets serialized along with your task. To enable
> Kryo serialization for closures, set the spark.closure.serializer property.
> But I usually don't, as the default lets me detect such unwanted references.
> On Apr 17, 2014 22:17, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:
>
>> Now I have another problem: I have to pass one of these non-serializable
>> objects to a PairFunction, and I get another NotSerializableException. It
>> seems that Kryo doesn't work within functions. Am I wrong, or is this a
>> limit of Spark?
>> On Apr 15, 2014 1:36 PM, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:
>>
>>> Ok, thanks for the help!
>>>
>>> Best,
>>> Flavio
>>>
>>> On Tue, Apr 15, 2014 at 12:43 AM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:
>>>
>>>> Nope, those operations are lazy: they create the RDDs but don't trigger
>>>> any "action". The computation is launched by operations such as collect,
>>>> count, save to HDFS, etc. And even if they were not lazy, no serialization
>>>> would happen. Serialization occurs only when data is transferred (collect,
>>>> shuffle, maybe persist to disk, but I am not sure about that one).
>>>>
>>>> 2014-04-15 0:34 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>
>>>>> Ok, that's fair enough. But why do things work up to the collect? During
>>>>> map and filter, are objects not serialized?
>>>>> On Apr 15, 2014 12:31 AM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>>>>>
>>>>>> Sure. As you have pointed out, those classes don't implement
>>>>>> Serializable, and Spark uses Java serialization by default (when you do
>>>>>> collect, the data from the workers is serialized, "collected" by the
>>>>>> driver and then deserialized on the driver side). Kryo (like most other
>>>>>> decent serialization libs) doesn't require you to implement Serializable.
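[The two kinds of serialization Eugen mentions above map to two configuration properties. A minimal sketch of what enabling Kryo for both would look like in spark-defaults, assuming the Spark 0.9-era property names; note that spark.closure.serializer was removed in later Spark releases, so only the first line applies to modern versions:]

```
spark.serializer          org.apache.spark.serializer.KryoSerializer
spark.closure.serializer  org.apache.spark.serializer.KryoSerializer
```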
>>>>>> As for the missing attributes: Java serialization does not
>>>>>> serialize/deserialize attributes of classes that don't implement
>>>>>> Serializable (in your case, the parent classes).
>>>>>>
>>>>>> 2014-04-14 23:17 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>
>>>>>>> Thanks Eugen for the reply. Could you explain to me why I have the
>>>>>>> problem? Why doesn't my serialization work?
>>>>>>> On Apr 14, 2014 6:40 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> As an easy workaround you can enable Kryo serialization:
>>>>>>>> http://spark.apache.org/docs/latest/configuration.html
>>>>>>>>
>>>>>>>> Eugen
>>>>>>>>
>>>>>>>> 2014-04-14 18:21 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>>>
>>>>>>>>> Hi to all,
>>>>>>>>>
>>>>>>>>> In my application I read objects that are not serializable, and I
>>>>>>>>> cannot modify their sources. So I tried a workaround: I created a
>>>>>>>>> dummy class that extends the unmodifiable one but implements
>>>>>>>>> Serializable. All attributes of the parent class are Lists of
>>>>>>>>> objects (some of them serializable and some not, e.g. List<String>).
>>>>>>>>>
>>>>>>>>> Up until I do map and filter on the RDD, the objects are filled
>>>>>>>>> correctly (I checked via the Eclipse debugger), but when I do
>>>>>>>>> collect, all the attributes of my objects are empty. Could you
>>>>>>>>> help me please?
>>>>>>>>> I'm using spark-core_2.10, version 0.9.0-incubating.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Flavio
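[The parent-class behavior Eugen describes can be reproduced with plain JDK serialization, no Spark required. The names below (Parent, Child, roundTrip) are hypothetical stand-ins for illustration; the key point is that Java serialization does not restore the fields of a non-Serializable superclass, and instead re-runs its no-arg constructor:]

```java
import java.io.*;

// Hypothetical stand-in for the unmodifiable class: NOT Serializable.
class Parent {
    public String label = "unset";   // reinitialized on deserialization, not restored
    public Parent() {}               // accessible no-arg ctor required by Java serialization
}

// The "dummy class" workaround: extend the parent, add Serializable.
class Child extends Parent implements Serializable {
    private static final long serialVersionUID = 1L;
    public int id;
    public Child(int id, String label) { this.id = id; this.label = label; }
}

public class SerDemo {
    // Serialize to bytes and back, as collect() effectively does over the wire.
    static Child roundTrip(Child in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(in);
        oos.close();
        ObjectInputStream ois =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (Child) ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        Child back = roundTrip(new Child(42, "flavio"));
        System.out.println(back.id);     // 42: declared in the Serializable child
        System.out.println(back.label);  // "unset": parent field silently lost
    }
}
```

[This would also explain why the objects look fine in the debugger during map and filter but come back empty after collect: in local debugging the objects never cross a serialization boundary until collect forces the round-trip, at which point the parent's fields revert to their constructor defaults.]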