Now I have another problem: I have to pass one of these non-serializable objects to a PairFunction, and I get another non-serializable exception. It seems that Kryo doesn't work within Functions. Am I wrong, or is this a limitation of Spark?
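For what it's worth, Kryo (spark.serializer) only covers the data Spark moves around; the function objects themselves go through the closure serializer, which is Java serialization by default (spark.closure.serializer in the 0.9-era configuration). A common workaround is to create the non-serializable object inside the function instead of capturing it. Below is a minimal sketch against the 0.9-era Java API, where LegacyParser and its key/weight methods are hypothetical stand-ins for the real class:

import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

// Hypothetical stand-in for the real unmodifiable, non-serializable class.
class LegacyParser {
    String key(String line) { return line.split(",")[0]; }
    int weight(String line) { return line.length(); }
}

// In Spark 0.9 PairFunction is an abstract, Serializable class.
public class ParseToPair extends PairFunction<String, String, Integer> {

    // transient: excluded from the serialized closure sent to the workers
    private transient LegacyParser parser;

    private LegacyParser parser() {
        if (parser == null) {
            parser = new LegacyParser(); // rebuilt lazily on each worker
        }
        return parser;
    }

    @Override
    public Tuple2<String, Integer> call(String line) {
        return new Tuple2<String, Integer>(parser().key(line), parser().weight(line));
    }
}

With the 0.9 Java API this would be used as lines.map(new ParseToPair()), producing a JavaPairRDD<String, Integer>; thanks to the transient field, nothing non-serializable ever travels with the closure.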
On Apr 15, 2014 1:36 PM, "Flavio Pompermaier" <pomperma...@okkam.it> wrote:

> Ok thanks for the help!
>
> Best,
> Flavio
>
> On Tue, Apr 15, 2014 at 12:43 AM, Eugen Cepoi <cepoi.eu...@gmail.com> wrote:
>
>> Nope, those operations are lazy, meaning they will create the RDDs but
>> won't trigger any "action". The computation is launched by operations such
>> as collect, count, save to HDFS etc. And even if they were not lazy, no
>> serialization would happen. Serialization occurs only when data will be
>> transferred (collect, shuffle, maybe persist to disk - but I am not sure
>> about that one).
>>
>> 2014-04-15 0:34 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>
>>> Ok, that's fair enough. But why do things work up to the collect?
>>> Aren't objects serialized during map and filter?
>>> On Apr 15, 2014 12:31 AM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>>>
>>>> Sure. As you have pointed out, those classes don't implement
>>>> Serializable, and Spark uses Java serialization by default (when you do
>>>> collect, the data from the workers will be serialized, "collected" by the
>>>> driver and then deserialized on the driver side). Kryo (like most other
>>>> decent serialization libs) doesn't require you to implement Serializable.
>>>>
>>>> As for the missing attributes, that's because Java serialization does
>>>> not ser/deser attributes from classes that don't implement Serializable
>>>> (in your case the parent classes); a short standalone demo follows at the
>>>> end of this thread.
>>>>
>>>> 2014-04-14 23:17 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>
>>>>> Thanks Eugen for the reply. Could you explain why I have the problem?
>>>>> Why doesn't my serialization work?
>>>>> On Apr 14, 2014 6:40 PM, "Eugen Cepoi" <cepoi.eu...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> as an easy workaround you can enable Kryo serialization (a
>>>>>> configuration sketch follows at the end of this thread):
>>>>>> http://spark.apache.org/docs/latest/configuration.html
>>>>>>
>>>>>> Eugen
>>>>>>
>>>>>> 2014-04-14 18:21 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>
>>>>>>> Hi to all,
>>>>>>>
>>>>>>> in my application I read objects that are not serializable because I
>>>>>>> cannot modify the sources. So I tried a workaround: creating a dummy
>>>>>>> class that extends the unmodifiable one but implements Serializable.
>>>>>>> All attributes of the parent class are Lists of objects (some of them
>>>>>>> are still not serializable and some of them are, e.g. List<String>).
>>>>>>>
>>>>>>> Until I do map and filter on the RDD, those objects are filled
>>>>>>> correctly (I checked that via the Eclipse debugger), but when I do
>>>>>>> collect, all the attributes of my objects are empty. Could you help
>>>>>>> me please? I'm using spark-core-2.10, version 0.9.0-incubating.
>>>>>>>
>>>>>>> Best,
>>>>>>> Flavio
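To make the "missing attributes" explanation above concrete, here is a minimal, self-contained sketch (plain Java serialization, with hypothetical Parent and Child classes standing in for the real ones). Fields declared in a non-Serializable parent are never written to the stream; on deserialization the parent's no-arg constructor runs again, so they come back empty:

import java.io.*;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the unmodifiable third-party class: NOT Serializable.
class Parent {
    List<String> items = new ArrayList<String>();
    public Parent() {} // re-invoked on deserialization, wiping any state
}

// The "dummy subclass" workaround from the first message.
class Child extends Parent implements Serializable {}

public class MissingAttributesDemo {
    public static void main(String[] args) throws Exception {
        Child c = new Child();
        c.items.add("hello");

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(c);
        out.close();

        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        Child back = (Child) in.readObject();

        // Prints [] : the field declared in the non-Serializable parent was
        // never written; Parent's no-arg constructor ran again instead.
        System.out.println(back.items);
    }
}

This matches the symptom in the original message: map and filter never serialize anything, so the objects look fine in the debugger, while collect round-trips them through Java serialization and the parent-declared lists arrive empty.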
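And for the Kryo workaround Eugen links to, enabling it in the 0.9-era configuration looks roughly like this; a sketch only, with KryoSetup, MyRegistrator, and SomeUnmodifiableClass as placeholder names:

import com.esotericsoftware.kryo.Kryo;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.serializer.KryoRegistrator;

// Placeholder for the real third-party type you want Kryo to handle.
class SomeUnmodifiableClass {}

public class KryoSetup {

    // Tells Kryo which classes to expect (spark.kryo.registrator points here).
    public static class MyRegistrator implements KryoRegistrator {
        @Override
        public void registerClasses(Kryo kryo) {
            kryo.register(SomeUnmodifiableClass.class);
        }
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setMaster("local[2]") // placeholder master
                .setAppName("kryo-example")
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .set("spark.kryo.registrator", MyRegistrator.class.getName());
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build RDDs as usual; collected/shuffled data now goes through Kryo
    }
}

Note this only switches data serialization; as discussed at the top of the thread, closures still go through Java serialization.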