Thanks very much for the detailed explanations. I suspected there were
architectural reasons why an RDD of RDDs is not supported, but my
understanding of Spark, and of distributed computing in general, is not deep
enough to work them out, so this really helps!
I ended up going with List[RDD].
>> On Tue, Jun 9, 2015 at 1:47 PM, kiran lonikar wrote:
>>
>>> Similar question was asked before:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Rdd-of-Rdds-td17025.html
>>>
>>> Here is one of the reasons why I think RDD[RDD[T]] is not possible:
>>>
>>> - RDD is only a handle to the actual data partitions. It has a
>>> reference/pointer to the *SparkContext* object (*sc*) and a …
Hope it helps. You need to consider List[RDD] or some other collection.
Possibly in the future, if and when the Spark architecture allows workers to
launch Spark jobs (the functions passed to the transformation or action APIs
of RDD), it will be possible to have an RDD of RDDs.
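Kiran's point above can be illustrated with plain JDK serialization: an RDD is a driver-side handle that holds a reference to the SparkContext, and anything shipped to workers must be serializable. The class names below (`Context`, `Handle`) are hypothetical stand-ins, not Spark types; this is only a sketch of why a context-bearing handle cannot travel as an element of another RDD.

```java
import java.io.*;

// Stand-in for SparkContext: lives only on the driver and is not serializable.
class Context { }

// Stand-in for an RDD: a handle pointing back at the driver's context.
class Handle implements Serializable {
    final Context ctx = new Context(); // non-serializable field, like an RDD's sc
}

public class WhyNotNested {
    // Returns true iff the object survives JDK serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false; // NotSerializableException: the context can't be shipped
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("plain data"));  // true: ordinary elements ship fine
        System.out.println(canSerialize(new Handle()));  // false: a context-bearing handle doesn't
    }
}
```

The same reasoning is why the functions passed to transformations must not capture the SparkContext: only plain data and serializable closures cross the driver/worker boundary.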
…ap or DataFrame operations on them. (I already had the function coded; I am
therefore reluctant to work with the ResultIterable object coming out of
rdd.groupByKey() …)
I've searched the mailing list and googled "RDD of RDDs", and it seems like
it isn't a thing at all.
A few c…
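On the ResultIterable concern above: a common workaround is a small adapter that copies the grouped values into a list once, so an existing list-based function can be reused unchanged. A minimal stand-in sketch, using plain Java collections in place of the iterable that groupByKey hands back per key; `existingSum` and `viaAdapter` are hypothetical names:

```java
import java.util.*;

public class AdaptIterable {
    // The function the poster "already had coded": it expects a List.
    static int existingSum(List<Integer> xs) {
        int s = 0;
        for (int x : xs) s += x;
        return s;
    }

    // Adapter: copy the grouped Iterable into a List once, then reuse as-is.
    static int viaAdapter(Iterable<Integer> grouped) {
        List<Integer> copy = new ArrayList<>();
        for (int x : grouped) copy.add(x);
        return existingSum(copy);
    }

    public static void main(String[] args) {
        Iterable<Integer> grouped = Arrays.asList(1, 2, 3); // per-key values, stand-in
        System.out.println(viaAdapter(grouped)); // 6
    }
}
```

The copy costs one pass per key but keeps the existing function untouched; it is only safe when a single key's group fits in memory.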
> I no longer get the NPE, but further down I am getting an IndexOutOfBounds,
> so trying to figure out if the original problem is manifesting itself as a
> new one.
>
> Regards
> -Ravi
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-merge-a-RDD-of-RDDs-into-one-uber-RDD-tp20986p21012.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
… an array of RDDs, which you can fold over and merge.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-merge-a-RDD-of-RDDs-into-one-uber-RDD-tp20986p21007.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
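The fold-and-merge suggestion can be sketched as below, with plain lists standing in for RDDs; `merge` is a hypothetical helper name. With Spark on the classpath you would fold with rdd.union instead of addAll, or pass the whole collection to SparkContext.union at once, which keeps the lineage shallower than a chain of pairwise unions.

```java
import java.util.*;

public class FoldMerge {
    // Fold a collection of "RDDs" (plain lists standing in) into one merged dataset.
    static List<Integer> merge(List<List<Integer>> rdds) {
        List<Integer> uber = new ArrayList<>();
        for (List<Integer> rdd : rdds) {
            uber.addAll(rdd); // with Spark: uber = uber.union(rdd)
        }
        return uber;
    }

    public static void main(String[] args) {
        List<List<Integer>> rdds = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5));
        System.out.println(merge(rdds)); // [1, 2, 3, 4, 5]
    }
}
```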
On Wednesday, October 22, 2014 9:06 AM, Sean Owen wrote:
> No, there's no such thing as an RDD of RDDs in Spark.
> Here though, why not just operate on an RDD of Lists? or a List of RDDs?
> Usually one of these two is the right approach whenever you feel
> inclined to operate on an RDD of RDDs.
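The two shapes Sean suggests can be sketched with plain collections; the method names are hypothetical. In Spark, the first shape would be a JavaRDD<List<Integer>> processed with map or flatMap inside one job, while the second is a driver-side collection looped over with one job per RDD.

```java
import java.util.*;

public class TwoShapes {
    // Shape 1: an "RDD of Lists" — one dataset whose elements are lists,
    // processed by a single flatMap-then-reduce style pass.
    static int sumRddOfLists(List<List<Integer>> rddOfLists) {
        int total = 0;
        for (List<Integer> row : rddOfLists)   // flatMap analogue
            for (int x : row) total += x;
        return total;
    }

    // Shape 2: a "List of RDDs" — a driver-side collection of independent
    // datasets, each handled by its own pass (its own job, in Spark).
    static int sumListOfRdds(List<List<Integer>> listOfRdds) {
        int total = 0;
        for (List<Integer> rdd : listOfRdds)
            total += rdd.stream().mapToInt(Integer::intValue).sum();
        return total;
    }

    public static void main(String[] args) {
        List<List<Integer>> data = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3, 4));
        System.out.println(sumRddOfLists(data)); // 10
        System.out.println(sumListOfRdds(data)); // 10
    }
}
```

The results coincide here; the difference is where the outer loop runs: inside the cluster (shape 1) or on the driver (shape 2).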
On Wed, Oct 22, 2014 at 3:58 PM, Tomer Benyamini wrote:
Hello,
I would like to parallelize my work on the multiple RDDs I have. I wanted to
know if Spark can support a "foreach" on an RDD of RDDs. Here's a Java
example:
public static void main(String[] args) {
    SparkConf sparkConf = new SparkConf().setA…
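On parallelizing work across several RDDs: instead of an RDD of RDDs, a common pattern is to submit one job per RDD from separate driver threads, since Spark's scheduler accepts concurrent job submissions from multiple threads. A stand-in sketch with plain lists in place of RDDs; `ParallelJobs` and `totalOf` are hypothetical names.

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelJobs {
    // Process each "RDD" (a plain list stands in) on its own driver thread
    // and combine the per-dataset results.
    static int totalOf(List<List<Integer>> rdds) {
        ExecutorService pool = Executors.newFixedThreadPool(rdds.size());
        List<Future<Integer>> sums = new ArrayList<>();
        for (List<Integer> rdd : rdds) {
            // In Spark, this callable would run one action per RDD (e.g. a count or foreach).
            Callable<Integer> job = () -> rdd.stream().mapToInt(Integer::intValue).sum();
            sums.add(pool.submit(job));
        }
        int total = 0;
        try {
            for (Future<Integer> f : sums) total += f.get(); // wait for all jobs
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalOf(Arrays.asList(
            Arrays.asList(1, 2, 3), Arrays.asList(10, 20), Arrays.asList(100)))); // 136
    }
}
```

Each thread only submits work and collects a result; the datasets themselves stay independent, which is exactly what the RDD-of-RDDs formulation was trying to express.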