Re: Cartesian issue with user defined objects

2015-02-26 Thread Marco Gaido
Thanks, that was exactly my issue: the function that extracts the objects from the file reused a single object and only mutated it for each item. Creating a new object for each item solved the problem. Thank you very much for your reply. Best regards. > On 26 Feb 2015, at 22:25, Imran Rashid wrote
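
For readers hitting the same symptom, here is a minimal sketch of the difference between the two parsing styles described in this message. It is plain Python rather than Spark, and the `Item` class and both parser functions are hypothetical names made up for illustration; the actual code from the thread is not shown in the archive.

```python
from itertools import product


class Item:
    """Hypothetical user-defined record (not from the original thread)."""

    def __init__(self, name=None):
        self.name = name


def parse_reusing(lines):
    # Problematic pattern: a single object is mutated and handed out
    # again for every line, so all results alias the same instance.
    shared = Item()
    for line in lines:
        shared.name = line
        yield shared


def parse_fresh(lines):
    # Fix reported in the message: allocate a new object for each item.
    for line in lines:
        yield Item(line)


lines = ["a", "b", "c"]
buggy = list(parse_reusing(lines))
fixed = list(parse_fresh(lines))

# Cartesian product of each collection with itself.
buggy_pairs = [(x.name, y.name) for x, y in product(buggy, buggy)]
fixed_pairs = [(x.name, y.name) for x, y in product(fixed, fixed)]

print(len(set(buggy_pairs)), len(set(fixed_pairs)))
```

With the shared object, every pair in the product ends up showing the last value written, while the version that allocates per item yields all distinct combinations.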

Re: Cartesian issue with user defined objects

2015-02-26 Thread Imran Rashid
Any chance your input RDD is being read from HDFS, and you are running into this issue (from the docs on SparkContext#hadoopFile)? Note: Because Hadoop's RecordReader class re-uses the same Writable object for each record, directly caching the returned RDD or directly passing it to an aggr
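
The object re-use behaviour quoted above can be imitated without Hadoop or Spark. The sketch below is only an analogy in plain Python: `Record` and `read_records` are hypothetical stand-ins for a Writable and a RecordReader, and copying each record before storing it is shown as one possible workaround, not as the documented procedure.

```python
import copy


class Record:
    """Hypothetical stand-in for a mutable Writable."""

    def __init__(self):
        self.value = None


def read_records(values):
    # Imitates a reader that re-uses one mutable record for every row.
    record = Record()
    for v in values:
        record.value = v
        yield record


values = [1, 2, 3]

# Consuming each record immediately is fine: the value is read
# before the reader overwrites it.
streamed = [r.value for r in read_records(values)]

# Storing the records first ("caching") keeps three references
# to the same object, which holds only the last value.
cached_directly = [r.value for r in list(read_records(values))]

# Copying each record before storing it preserves every value.
cached_copies = [r.value for r in [copy.copy(r) for r in read_records(values)]]

print(streamed, cached_directly, cached_copies)
```

The point of the comparison is that the data only looks wrong once the records are held onto past the next read, which matches the case where the problem appears only after caching or combining the collection.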