Re: Convert Iterable to RDD

2016-02-12 Thread Jerry Lam
Not sure if I understand your problem well, but why don't you create the file locally and then upload it to HDFS?
> On 12 Feb, 2016, at 9:09 am, "seb.arzt" wrote:
>
> I have an Iterator of several million elements, which unfortunately won't fit
> into the driver memory at the s
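A minimal sketch of the "create the file locally" step suggested above, assuming the elements can be rendered as one line of text each. The names `LocalDump`, `writeLines`, and `render` are hypothetical and not from the thread. The iterator is consumed one element at a time, so the full data set is never held in memory.

```scala
import java.io.{BufferedWriter, File}
import java.nio.charset.StandardCharsets
import java.nio.file.Files

// Hypothetical helper, not part of Spark or of the original thread.
object LocalDump {
  // Writes one element per line and returns the number of lines written.
  // Only the current element and the writer's buffer are kept in memory.
  def writeLines[T](elements: Iterator[T], target: File)(render: T => String): Long = {
    val writer: BufferedWriter =
      Files.newBufferedWriter(target.toPath, StandardCharsets.UTF_8)
    var count = 0L
    try {
      elements.foreach { e =>
        writer.write(render(e))
        writer.newLine()
        count += 1
      }
    } finally writer.close()
    count
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a synthetic iterator stands in for the real data source.
    val tmp = File.createTempFile("elements", ".txt")
    val written = writeLines(Iterator.range(0, 1000000), tmp)(_.toString)
    println(s"wrote $written lines to ${tmp.getAbsolutePath}")
  }
}
```

The resulting file could then be uploaded with something like `hdfs dfs -put <local-file> <hdfs-path>` and read back with `sc.textFile`. Note that this produces a plain text file, not the format written by `saveAsObjectFile`.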

Re: Convert Iterable to RDD

2016-02-12 Thread seb.arzt
I have an Iterator of several million elements, which unfortunately won't fit into the driver memory at the same time. I would like to save them as an object file in HDFS. Doing so, I am running out of memory on the driver. Using a stream also won't work. I cannot further increase the driver memory.

Re: Convert Iterable to RDD

2014-10-21 Thread Olivier Girardot
I don't think this is provided out of the box, but you can call toSeq on your Iterable, and if the Iterable is lazy, it should stay that way for the Seq. Then you can use sc.parallelize(myIterable.toSeq) to get your RDD. For the Iterable[Iterable[T]] you can flatten it and then create y
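The suggestion above can be sketched roughly as follows, assuming a local Spark installation on the classpath. The object and method names (`IterableToRdd`, `toRdd`, `flattenToRdd`) are hypothetical. `sc.parallelize` distributes a collection that lives on the driver, so this sketch is only suitable for data that fits in driver memory.

```scala
import scala.reflect.ClassTag

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helpers illustrating the toSeq / flatten approach.
object IterableToRdd {
  // Iterable[T] -> RDD[T] via toSeq and parallelize.
  def toRdd[T: ClassTag](sc: SparkContext, items: Iterable[T]): RDD[T] =
    sc.parallelize(items.toSeq)

  // Iterable[Iterable[T]] -> RDD[T]: flatten first, then parallelize.
  def flattenToRdd[T: ClassTag](sc: SparkContext, nested: Iterable[Iterable[T]]): RDD[T] =
    sc.parallelize(nested.flatten.toSeq)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, for illustration only.
    val conf = new SparkConf().setAppName("iterable-to-rdd").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val flat: Iterable[Int] = List(1, 2, 3, 4)
      val nested: Iterable[Iterable[Int]] = List(List(1, 2), List(3), List.empty[Int])

      println(toRdd(sc, flat).count())
      println(flattenToRdd(sc, nested).collect().mkString(","))
    } finally sc.stop()
  }
}
```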

RE: Convert Iterable to RDD

2014-10-20 Thread Dai, Kevin
In addition, how can I convert Iterable[Iterable[T]] to RDD[T]?
Thanks, Kevin.
From: Dai, Kevin [mailto:yun...@ebay.com]
Sent: 21 October 2014 10:58
To: user@spark.apache.org
Subject: Convert Iterable to RDD
Hi all,
Is there any way to convert an Iterable to an RDD?
Thanks, Kevin.