Un-serializable 3rd-party classes (Spark, Java)

2014-06-17 Thread Daedalus
I'm trying to use matrix-toolkit-java for an application of mine, particularly the FlexCompRowMatrix class (used to store sparse matrices). I have a class Dataframe, which contains an int array, two double values, and one FlexCompRowMatrix.

Re: Un-serializable 3rd-party classes (Spark, Java)

2014-06-18 Thread Daedalus
tions, you could also use those inside a wrapper object. Take a look at https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SerializableWritable.scala for an example where we make Hadoop’s Writables serializable. > > Matei > > On Jun 17, 2
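The wrapper-object idea quoted above can be sketched in plain Java, without Spark on the classpath. This is only an illustration under stated assumptions: `ThirdPartyMatrix` is a hypothetical stand-in for a library class that does not implement `Serializable` (it is not the real FlexCompRowMatrix API), and `SerializableMatrix` and `WrapperDemo` are made-up names. The wrapper marks the wrapped field `transient` and writes and reads its contents by hand in `writeObject` / `readObject`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a third-party class that is NOT Serializable.
final class ThirdPartyMatrix {
    private final int rows;
    private final int cols;
    private final double[] values; // row-major storage

    ThirdPartyMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.values = new double[rows * cols];
    }

    int numRows() { return rows; }
    int numColumns() { return cols; }
    double get(int r, int c) { return values[r * cols + c]; }
    void set(int r, int c, double v) { values[r * cols + c] = v; }
}

// Serializable wrapper: the wrapped object is transient, so default
// serialization skips it; its contents are written and rebuilt manually.
final class SerializableMatrix implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient ThirdPartyMatrix matrix;

    SerializableMatrix(ThirdPartyMatrix matrix) { this.matrix = matrix; }

    ThirdPartyMatrix get() { return matrix; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(matrix.numRows());
        out.writeInt(matrix.numColumns());
        // Writes every entry for simplicity; a sparse matrix would
        // normally write only its non-zero entries.
        for (int r = 0; r < matrix.numRows(); r++) {
            for (int c = 0; c < matrix.numColumns(); c++) {
                out.writeDouble(matrix.get(r, c));
            }
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int rows = in.readInt();
        int cols = in.readInt();
        matrix = new ThirdPartyMatrix(rows, cols);
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                matrix.set(r, c, in.readDouble());
            }
        }
    }
}

final class WrapperDemo {
    // Serializes the wrapper to a byte array and reads it back.
    static SerializableMatrix roundTrip(SerializableMatrix wrapper)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(wrapper);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SerializableMatrix) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ThirdPartyMatrix m = new ThirdPartyMatrix(2, 3);
        m.set(1, 2, 4.5);
        SerializableMatrix copy = roundTrip(new SerializableMatrix(m));
        System.out.println(copy.get().get(1, 2));
    }
}
```

A containing class would then hold a `SerializableMatrix` field instead of the raw third-party object, so that Java serialization of the containing class no longer fails on that field.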

Re: Serialization problem in Spark

2014-06-19 Thread Daedalus
I'm not sure if this is a Hadoop-centric issue or not. I had similar issues with non-serializable external library classes. I used a Kryo config (as illustrated here) and registered the one troublesome class. It seemed to wor

Repeated Broadcasts

2014-06-19 Thread Daedalus
I'm trying to use Spark (Java) for an optimization algorithm that needs repeated server-node exchanges of information (the ADMM algorithm, for anyone familiar with it). In each iteration, I need to update a set of values on the nodes, and collect them on the server, which will update its own set of va

Re: Repeated Broadcasts

2014-06-21 Thread Daedalus
Has anyone used this sort of construct? (Read: bump) -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Repeated-Broadcasts-tp7977p8063.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Persistent Local Node variables

2014-06-22 Thread Daedalus
*TL;DR:* I want to run a pre-processing step on the data from each partition (such as parsing) and retain the parsed object on each node for future processing calls, to avoid repeated parsing. /More detail:/ I have a server and two nodes in my cluster, and data partitioned using HDFS. I am trying

Re: Persistent Local Node variables

2014-06-22 Thread Daedalus
Will using mapPartitions and creating a new RDD of ParsedData objects avoid multiple parsing?