Re: all values for a key must fit in memory

2014-05-25 Thread Nilesh
The Iterator implementation works OK for me here, though it might turn out to be slow. Cheers, Nilesh. PS: Can't wait for 1.0! ^_^ Looks like it's up to RC10 by now.
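For readers of the archive, a minimal sketch of the iterator-style consumption being discussed, assuming the 0.9/1.0-era RDD API; the example data and the running-sum logic are illustrative, not from the thread. Note that groupByKey still shuffles all of a key's values into a single task, which is exactly the limitation the thread is about; the iterator only avoids building a second per-key collection inside the closure.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // implicit conversion to PairRDDFunctions (pre-1.3 style)

object IteratorConsumptionSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "iterator-consumption-sketch")
    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))

    // Walk each key's values with an iterator instead of materializing
    // another per-key collection inside the closure.
    val sums = pairs.groupByKey().map { case (key, values) =>
      val it = values.iterator
      var total = 0L
      while (it.hasNext) total += it.next()
      (key, total)
    }

    sums.collect().foreach(println)
    sc.stop()
  }
}
```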

Re: all values for a key must fit in memory

2014-05-25 Thread Nilesh
…from 1.0, the logic of which thankfully works with 0.9.1 too; no new API changes there. Cheers, Nilesh

Re: all values for a key must fit in memory

2014-05-25 Thread Nilesh
Will something like groupByKey().values.map(x => while(x.hasNext) ...) work, assuming x: Iterable[Value] is larger than the RAM on a single machine? Or will this be possible later, in subsequent versions? Could you please propose a workaround for this in the meantime? I'm out of ideas. Thanks, Nilesh
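A sketch of one possible workaround (an editorial suggestion, not something stated in the thread), assuming the per-key computation can be phrased as an associative merge rather than needing all values at once: replace groupByKey with reduceByKey or combineByKey, so values are combined map-side and no single task ever has to hold a whole key's value set.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // implicit conversion to PairRDDFunctions

object AvoidGroupByKeySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "avoid-groupByKey-sketch")
    val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3), ("a", 4)))

    // reduceByKey merges values map-side, so only one partially combined
    // value per key crosses the shuffle boundary.
    val sums = pairs.reduceByKey(_ + _)

    // combineByKey covers merges whose state differs from the value type,
    // e.g. a (sum, count) pair for computing a per-key mean.
    val means = pairs.combineByKey(
      (v: Int) => (v.toLong, 1L),                                      // createCombiner
      (acc: (Long, Long), v: Int) => (acc._1 + v, acc._2 + 1),         // mergeValue
      (a: (Long, Long), b: (Long, Long)) => (a._1 + b._1, a._2 + b._2) // mergeCombiners
    ).mapValues { case (sum, count) => sum.toDouble / count }

    sums.collect().foreach(println)
    means.collect().foreach(println)
    sc.stop()
  }
}
```

If the computation genuinely needs every value of a key at once (for example, a sort within a key), this does not apply; the merge has to be associative for reduceByKey/combineByKey to help.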

Kryo serialization for closures: a workaround

2014-05-24 Thread Nilesh
Suppose my mappers are functions (defs) that internally call other classes, create objects, and do different things inside. (Or they can even be classes that extend (Foo) => Bar and do the processing in their apply method, but let's ignore that case for now.) Spark supports only Java serialization for closures…
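Since the post is truncated in this archive, here is a sketch of one common pattern for this situation (not necessarily the workaround Nilesh goes on to describe): keep a non-Java-serializable object out of the closure by wrapping it in a small Serializable holder whose writeObject/readObject delegate to Kryo. The KryoClosureWrapper name and the buffer size are illustrative assumptions, not part of Spark's API.

```scala
import java.io.{ObjectInputStream, ObjectOutputStream}

import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.io.{Input, Output}

/** Hypothetical wrapper: lets `value` pass through Spark's Java closure
  * serializer by round-tripping it through Kryo as a length-prefixed
  * byte array. */
class KryoClosureWrapper[T](@transient private var value: T) extends Serializable {
  def get: T = value

  private def writeObject(out: ObjectOutputStream): Unit = {
    val buffer = new Output(4096, -1)  // growable Kryo buffer, no size limit
    new Kryo().writeClassAndObject(buffer, value)
    val bytes = buffer.toBytes
    out.writeInt(bytes.length)
    out.write(bytes)
  }

  private def readObject(in: ObjectInputStream): Unit = {
    val bytes = new Array[Byte](in.readInt())
    in.readFully(bytes)
    value = new Kryo().readClassAndObject(new Input(bytes)).asInstanceOf[T]
  }
}
```

Inside a map or mapPartitions closure one would capture the wrapper and call .get, so only the wrapper and its Kryo-encoded payload go through Java serialization; the other common alternative is to construct the heavyweight object lazily inside mapPartitions so it never has to be shipped at all.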