Not sure if I understand your problem well, but why don't you create the file
locally and then upload it to HDFS?
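If that suggestion fits, here is a minimal sketch of the idea (all paths and the record type are assumptions): stream the iterator to a local file one element at a time, then copy the finished file into HDFS with the Hadoop FileSystem API.

```scala
import java.io.{BufferedWriter, FileWriter}

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object LocalThenHdfs {
  def main(args: Array[String]): Unit = {
    // Hypothetical source: an iterator too large to materialize at once.
    val records: Iterator[String] = Iterator.range(0, 1000000).map(_.toString)

    // 1) Stream to a local file; only one element is in memory at a time.
    val writer = new BufferedWriter(new FileWriter("/tmp/records.txt"))
    try records.foreach { r => writer.write(r); writer.newLine() }
    finally writer.close()

    // 2) Copy the finished file into HDFS.
    val fs = FileSystem.get(new Configuration())
    fs.copyFromLocalFile(new Path("/tmp/records.txt"),
                         new Path("hdfs:///user/seb/records.txt"))
  }
}
```

This sidesteps the driver-memory limit entirely, at the cost of local disk space for the intermediate file.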
> On 12 Feb, 2016, at 9:09 am, "seb.arzt" wrote:
I have an Iterator of several million elements, which unfortunately won't fit
into the driver memory at the same time. I would like to save them as an
object file in HDFS:

Doing so, I am running out of memory on the driver:

Using a stream also won't work, and I cannot further increase the driver
memory.
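One way around this kind of driver OOM (not suggested in this thread, just a common workaround) is to parallelize the iterator in fixed-size chunks, so only one chunk lives in driver memory at a time; the chunk size, output path, and `sc` below are assumptions:

```scala
// Assumes an existing SparkContext `sc` and an Iterator[String] `records`.
// grouped() pulls at most `chunkSize` elements into memory per step.
val chunkSize = 100000
records.grouped(chunkSize).zipWithIndex.foreach { case (chunk, i) =>
  sc.parallelize(chunk).saveAsObjectFile(s"hdfs:///user/seb/out/part-$i")
}
```

Each chunk becomes its own object-file directory; they can be read back together with `sc.objectFile("hdfs:///user/seb/out/part-*")`.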
I don't think this is provided out of the box, but you can call toSeq on
your Iterable, and if the Iterable is lazy, it should stay that way for the
Seq.
Then sc.parallelize(myIterable.toSeq) gives you your RDD.
For an Iterable[Iterable[T]], you can flatten it first and then create your
RDD the same way.
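A minimal sketch of the toSeq/flatten advice above (the sample data and the local master are assumptions):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IterableToRdd {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("iterable-to-rdd").setMaster("local[*]"))

    // A lazy Iterable; calling toSeq on a Stream keeps it lazy.
    val myIterable: Iterable[Int] = Stream.from(1).take(4)
    val rdd = sc.parallelize(myIterable.toSeq)

    // Iterable[Iterable[T]] -> RDD[T]: flatten first, then parallelize.
    val nested: Iterable[Iterable[Int]] = Seq(Seq(1, 2), Seq(3, 4))
    val flatRdd = sc.parallelize(nested.flatten.toSeq)

    println(flatRdd.count()) // 4
    sc.stop()
  }
}
```

Note that sc.parallelize still materializes the Seq on the driver when partitioning it, so this suits collections that fit in driver memory.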
In addition, how can I convert an Iterable[Iterable[T]] to an RDD[T]?
Thanks,
Kevin.
From: Dai, Kevin [mailto:yun...@ebay.com]
Sent: 21 October 2014, 10:58
To: user@spark.apache.org
Subject: Convert Iterable to RDD
Hi, All
Is there any way to convert an Iterable to an RDD?
Thanks,
Kevin.