Learn the Parallel Programming Model of an OO Framework like Spark – in any
OO Framework, lots of behavior is hidden / encapsulated by the Framework,
and the client code gets invoked at specific points in the Flow of Control /
Data based on callback functions.

That’s why stuff like RDD.filter(), RDD.filter() may look “sequential” to
you, but it is not.

From: Bill Q [mailto:bill.q@gmail.com]
Sent: Thursday, May 7, 2015 6:27 PM
To: Evo Eftimov
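Evo's point can be seen directly in code: `filter` is a lazy transformation that only records a step in the RDD lineage graph, so two consecutive `filter` calls build two branches of a DAG rather than running one after the other. A minimal local sketch (illustrative only, not code from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LazinessSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("lazy-filter").setMaster("local[*]"))
    val nums = sc.parallelize(1 to 10)

    // filter() returns immediately: it records a step in the lineage graph
    // and does no work until an action (count, collect, save...) runs.
    val evens = nums.filter(_ % 2 == 0)
    println(evens.toDebugString) // prints the DAG Spark built, not results

    println(evens.count()) // 5 -- only now does a job actually execute
    sc.stop()
  }
}
```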
From: Bill Q [mailto:bill.q@gmail.com]
Sent: Thursday, May 7, 2015 6:27 PM
To: Evo Eftimov
Cc: user@spark.apache.org
Subject: Re: Map one RDD into two RDD
The multi-threading code in Scala is quite simple and you can google it
pretty easily. We used the Future framework. You can use Akka also.
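Bill's actual code was never posted, so the following is only a sketch of the approach he describes: submitting the two mappings as concurrent jobs over the same cached source RDD using Scala Futures. The map functions and output paths are placeholders:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import org.apache.spark.{SparkConf, SparkContext}

object TwoMappingsInParallel {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("two-maps").setMaster("local[*]"))
    val source = sc.parallelize(1 to 100).cache() // shared source RDD

    // Each Future triggers its own action, so the two jobs are submitted
    // concurrently; Spark's scheduler runs both against the cached parent.
    val jobA = Future { source.map(_ * 2).saveAsTextFile("/tmp/typeA") }
    val jobB = Future { source.map(_ + 1).saveAsTextFile("/tmp/typeB") }

    Await.result(Future.sequence(Seq(jobA, jobB)), 10.minutes)
    sc.stop()
  }
}
```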
These will run in Parallel Pipelines / DAGs within the Spark Framework:

RDD1 = RDD.filter()

RDD2 = RDD.filter()

From: Bill Q [mailto:bill.q@gmail.com]
Sent: Thursday, May 7, 2015 4:55 PM
To: Evo Eftimov
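Concretely, Evo's two-filter suggestion amounts to something like the sketch below. The `Record` shape and the `kind` field are hypothetical, since the thread never shows Bill's actual schema; caching the parent avoids recomputing it when both filtered RDDs are materialized:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical record shape; the thread never shows Bill's actual schema.
case class Record(kind: String, payload: String)

object SplitByFilter {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("split").setMaster("local[*]"))
    val source = sc.parallelize(
      Seq(Record("A", "x"), Record("B", "y"))).cache()

    // Two lineage branches over one shared parent RDD.
    val rdd1 = source.filter(_.kind == "A")
    val rdd2 = source.filter(_.kind == "B")

    println(rdd1.count()) // 1
    println(rdd2.count()) // 1
    sc.stop()
  }
}
```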
From: Bill Q [mailto:bill.q@gmail.com]
Sent: Thursday, May 7, 2015 4:55 PM
To: Evo Eftimov
Cc: user@spark.apache.org
Subject: Re: Map one RDD into two RDD

Thanks for the replies. We decided to use concurrency in Scala to do the two
mappings using the same source RDD in parallel. So far, it seems to be
working. Any comments?
Hi Bill,
Could you show a snippet of code to illustrate your choice?
-Gerard.
On Thu, May 7, 2015 at 5:55 PM, Bill Q wrote:
On Wednesday, May 6, 2015, Evo Eftimov wrote:
RDD1 = RDD.filter()

RDD2 = RDD.filter()

From: Bill Q [mailto:bill.q@gmail.com]
Sent: Tuesday, May 5, 2015 10:42 PM
To: user@spark.apache.org
Subject: Map one RDD into two RDD
Hi all,

I have a large RDD that I map a function to it. Based on the nature of each
record in the input RDD, I will generate two types of data. I would like to
save each type into its own RDD.
Have you looked at RDD#randomSplit() (as example)?

Cheers

On Tue, May 5, 2015 at 2:42 PM, Bill Q wrote:
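For reference, `randomSplit` divides an RDD by random weights rather than by record content, so it fits sampling use cases more than the type-based split Bill describes. A minimal sketch of its use (weights and seed are arbitrary):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RandomSplitSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("random-split").setMaster("local[*]"))
    val nums = sc.parallelize(1 to 1000)

    // Weights are normalized; each element lands in exactly one split,
    // so the splits together cover the whole RDD.
    val Array(part1, part2) = nums.randomSplit(Array(0.7, 0.3), seed = 42L)

    println(part1.count() + part2.count()) // 1000
    sc.stop()
  }
}
```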